backup.shA Backup Script for Oracle on Unix with NetApp Storage. The backup.sh script has gone through many changes over the years and still contains remnants of RMAN, various Tivoli Storage Manager (TSM) clients, direct to disk backups and support for clones for tape among other items. Today the backup.sh is literally a massive amount of checks, reports, and maintenance task that are wrapped around a hand full of NetApp command for creating and dropping snapshots, and updating snapmirrors. Download: backup.sh Features
UsageThe usage output for the backup.sh script is overwhelming at first but much of the functional details can be ignored until possibly needed, and the important items will be discussed on below.
Hot Backups Without Archive LogsThe NetApp documentation states: "In order for Snapshot copies to
be effectively used with Oracle Databases, they must be coordinated
with the Oracle hot backup facility." But after
thousands of successful clones of
production databases from hot snapshots, without a single failure; I
am convinced that any possible risk is acceptably small. Backup Oracles Supporting FilesSupporting files include everything that will or even might be needed to reconstruct the database on a new host. In theory this information is also on tape due to TSM scheduled sweeps of the host, but these files take very little space so it is much more convenient to have them readily available. Some of the files are also just snips that pertain to a specific database. The reason for parsing the file, such as crontab entries, is so the process of moving a database and all of it's supporting configuration can be automated. For example we have a script named "dr.sh" originally written of a Disaster Recovery plan, that can mount, configure and start any database and any of it's related host side applications with one command: "dr.sh move {SID} to here" The control, initialization, and password files will all be copied to tape when the database is being copied to tape. This makes sure that we have a control file that is accurate to each tape backup. For snapshot backup, the control file is rewritten in to a script associated to the snapshot by name. Together the script and the snapshot are all that is needed to clone the database. The primary destination for all of the supporting files is directory with the same name as the database. These directories all reside on a single NetApp volume which is NFS mounted on all of the Oracle database host. This volume scheduled to be copied to tape daily by TSM, but it is also mirrored to the companies off site NetApp storage using NetApp snapmirror. The shared storage volume also has a directory for each Oracle database host. And these directories contain the host configuration files and information that would be necessary for recreating the Oracle environment. Below are a list of the files created for the database and a second list of the files gathered for the host. The process of collecting the supporting files is done by the backup_config.sh script which is called by the backup.sh script. It is separate because it is also scheduled to routinely run from crontab. This insures that the supporting files are current for any parameter changes or data files added, even if a backup has not been run. Download: backup_config.sh Below are a list of the files created for the database and a second list of the files gathered for the host.
|
lsnr:/u01/app/oracle/admin/scripts>n05.sh
snap list tips Volume tips working... %/used %/total date name ---------- ---------- ------------ -------- 1% ( 1%) 1% ( 1%) Feb 21 15:35 n09(0151701781)_xtips.8767 (snapmirror) 3% ( 2%) 2% ( 1%) Feb 21 12:03 tips_110221_1203_R7 (busy,vclone) 6% ( 3%) 3% ( 1%) Feb 21 02:21 tips_110221_0221_R7 11% ( 6%) 6% ( 3%) Feb 18 12:02 tips_110218_1202_R7 14% ( 4%) 8% ( 2%) Feb 18 02:20 tips_110218_0220_R7 18% ( 5%) 10% ( 3%) Feb 17 12:02 tips_110217_1202_R7 20% ( 3%) 12% ( 1%) Feb 17 02:20 tips_110217_0220_R7 23% ( 5%) 14% ( 2%) Feb 16 12:02 tips_110216_1202_R7 25% ( 4%) 16% ( 2%) Feb 16 02:20 tips_110216_0220_R7 27% ( 3%) 17% ( 2%) Feb 15 12:02 tips_110215_1202_R7 29% ( 3%) 19% ( 2%) Feb 15 02:20 tips_110215_0220_R7 30% ( 3%) 20% ( 1%) Feb 14 12:02 tips_110214_1202_R7 32% ( 5%) 22% ( 2%) Feb 14 02:20 tips_110214_0220_R7 34% ( 3%) 24% ( 1%) Feb 11 02:20 tips_110211_0220_R7 (busy,vclone) 38% ( 8%) 28% ( 4%) Jan 24 12:02 tips_110124_1202_R7 (busy,vclone) 43% (13%) 35% ( 7%) Nov 23 02:20 tips_101123_0220_R7 (busy,vclone) |
The oldest snapshot and by nature, the one using the most snapshot reserve space is: "tips_101123_0220_R7". To find the database volume's that are using this snapshot we can grep the latest NetApp Volume Snapshot Report" for the same filer. To make it easier to read, it's combined with the reports column headers:
head -5 backup_netapp_vol_snapshot_report_n05.txt ;\ > grep tips_101123_0220_R7 backup_netapp_vol_snapshot_report_n05.txt NetApp Host: n05 Volume Snapshot Report Status: C=Clone or mounted. M=Mirror Mon Feb 21 05:38:09 CST 2011 Cloned From Volume Snapshot Date S Cloned To Volume Snapshot ttips ttips_110217_2346_R4 Feb 17 23:47 tips tips_101123_0220_R7 tips tips_101123_0220_R7 Nov 23 02:20 C ttips |
The results show that the snapshot is only being used by the "ttips" database volume. It's only one volume that is not frequently cloned so initiating a split would be the favorable choice. Once the split is complete the tips_101123_0220_R7 snapshot will no longer be busy which means it will be dropped during the next time: "backup.sh -snap_purge -filer=N05" is run.
It is possible and routinely done that a "non-split" clone has snapshots that are themselves being used for a clone. Note that the "ttips" clone of "tips" also has a snapshot but the report indicates that the "ttips" snapshot is not being used for a clone. While this will not be effected by the split, it would prevent you from destroying the "ttips" volume in order to refresh the clone from "tips". In that case the above example would again show you the volume that needed to be split.
Using the "resourceutilization" setting, TSM allows you to
control the number of concurrent sessions used to copy files to the
TSM storage. But it is limited to a specific set of files. If
you backup multiple databases and hence multiple sets of files at
the same time, then each set would open multiple sessions and the
impact on the host could be severe. The backup.sh script
allows you to control the total number of sessions for
both the each set of backup files as as well as all backups running on
the host. The following parameter is used:
-c=<Concurrent_Tape_Sessions>/<Total_Tape_Sessions_For_The_Host>
For example when the concurrent parameter "-c15/30" is passed then 15 data files will be sent to TSM at the same time as long as no more that 30 sessions are running on the host. If 30 sessions are already in progress when the backup script is started then it will start only 1 session and continue to monitor the total number of sessions on the host until the number drops below 30 and then it will start additional sessions as long as the total number of sessions for the host is less than 30. The default value for the concurrent settings is 10/20. Once the last data file for a set is has been launched into the background; the script will wait for all of the spawned sessions to complete before moving on. A file set for a cold backup can include all of the database data files, but for a hot backup it will only include the data files for the tablespace currently in backup mode. The concurrent backup feature only works with Hot and Cold backups to the normal TSM management classes. It does not support backup to disk or an archive management class.
Download: backup_remote.sh
A snapshot backup has no performance impact. But copying it to tape does utilized cpu and lots of bandwidth between the database host and the TSM host. This can be avoided performing the copy to TSM from another host. This is especially a good strategy when a backup to tape must run during business hours.
The remote backup to TSM is done by a companion script named "backup_remote.sh" the is scheduled to run periodically on the remote host. To initiate a remote backup the "-rtape" parameter replaces the normal "-tape" parameter to the backup.sh script. For example: "backup.sh hot TIPS -snap=7 -rtape -21dy". This will make a hot snapshot backup of the database and set a flag file in the REMOTE_TAPE_DIR as configured in the backup.sh script. This location is on a volume which is mounted on both the database host as well as the host that runs the backup_remote.sh script.
The backup_remote.sh script will check for the existence of flag files, working from the oldest to the newest. For each flag file it will mount the database volume using NFS, and then copy the files in the specific snapshot to TSM using a management class that was specified as a parameter to the backup.sh script.
The status of the copy to TSM is updated in the flag file, and any subsequent runs of the backup.sh script will check the status and update the Master log file on the database host as well as the bkup.bkup_log table.
TSM management classes are defined in the backup_remote.sh script. Unlike the backup.sh script only one backup to TSM will run at any given time on the remote host. For that reason, there is only one subdirectory named backup_remote_dsm for the TSM configurations files. The concurrent copy of files to TSM is control by "resourceutilization" setting in the dsm.sys configuration files in the backup_remote_dsm directory. The inclexcl.def file is this directory is overwritten with the specified management class prior starting each volumes copy to TSM.
Once a copy to TSM completes the flag file is updated with the status and the volume is unmounted from the remote host.
If necessary you can use the backup.sh script without the NetApp snapshot feature. In these case the backup.sh script will temporarily rename the files with a data and time suffix before they are copied to TSM. The format of the file name is: <file_name>_MonDD_YYYY_HHMM, for example: altra_data_04.dbf is renamed to: altra_data_04.dbf_Dec05_2002_2330. Once the tape copy is completed the file is renamed to its original name before the script proceeds to the next file. By renaming the files it is easier to identify which files in TSM belong with each other as the will all have the same suffix name.
When a snapshot is used the files are not renamed because the copy to TSM includes the name of the snapshot in the path, and the name of the snapshot clearly identifies the data and time the snapshot was created. For example: /tips/.snapshot/tips_110112_1200_R2.
When a backup is successful it will create scripts that can be assist with restoring or cloning the database. This script is time stamped and the most recent scripts are maintained on the host in the directory with the backup.sh script. An example is RESTORE_PROD_SCRIPT_Jun29_2011_0230.sh. will restore the Prod database. You only need to shutdown the target database, change the rights on the script to 700, run it, then start the database when the script completes. If a snapshot was used in the backup then a script to restore the backup from the snapshot is also created. For example the RESTORE_TIPS_SNAPSHOT_tips_110227_0221_R7.sh will restore the TIPS database from the tips_110227_0221_R7 snapshot.
The backup.sh script also creates a scripts that are used to clone the database from either TSM or a snapshot. The most recent backup creates a clone scripts without a time stamp such as CLONE_TIPS_SCRIPT.sh and CLONE_TIPS_SNAPSHOT.txt. These are the source scripts that will be used by the clone script on the target host. Each backup also creates a time stamped version of the clone scripts. For example CLONE_TIPS_SNAPSHOT_tips_110226_0220_R7.txt is available if you don't want to clone from the most recent snapshot that is in CLONE_TIPS_SNAPSHOT.txt. To clone from the a previous snapshot you can simply copy the appropriate text file over the current CLONE_TIPS_SNAPSHOT.txt. The clone script will use ftp to retrieve the CLONE_TIPS_SNAPSHOT.txt file.
There are multiple log files available for reviewing and troubleshooting backups. The backup logs will be found in the same directory as the backup.sh script. The backup_master.log contains a summary of all backup.sh backups, including mirror sync backups since the last time the file was deleted and it is delinated with a pipe, "|". Below is an example of viewing the last backups completed, omitting the mirror syncs. Most of the columns are obvious, but the 3rd column is Normal indicating no errors, or it could be Error or Pending. Pending indicates that the copy to tape is being preformed on a remote host. When it is completed the log file will be updated.
>grep -v MIRROR
backup_master.log | tail 110227|PM|Normal|UNIX|rsprod12|APWRX|COLD R14 SNAP R2|21:10|21:16|TAPE|apwrx_110227_2110_R2 110227|PM|Normal|UNIX|rsprod12|PTM|HOT SNAP R7|23:35|23:35|SNAP|ptm_110227_2335_R7 110227|PM|Normal|UNIX|rsprod12|RISK|COLD R14 SNAP R2|23:45|00:03|TAPE|risk_110227_2347_R2 110227|PM|Normal|UNIX|rsprod12|PWP|HOT R14 SNAP R7|23:16|00:20|TAPE|pwp_110227_2318_R7 110228|AM|Normal|UNIX|rsprod12|FN|COLD R14 SNAP R2|00:05|00:26|TAPE|fn_110228_0006_R2 110227|PM|Normal|UNIX|rsprod12|ONKI|COLD R14 SNAP R7|23:02|00:33|TAPE|onki_110227_2303_R7 110227|PM|Normal|UNIX|rsprod12|BQES|COLD R14 SNAP R7|23:10|00:35|TAPE|bqes_110227_2311_R7 110228|AM|Normal|UNIX|rsprod12|MAX|HOT R14 SNAP R7|01:05|01:26|TAPE|max_110228_0105_R7 110228|AM|Normal|UNIX|rsprod12|MAX|HOT SNAP R3|04:05|04:05|SNAP|max_110228_0405_R3 110228|AM|Normal|UNIX|rsprod12|TIPS|HOT R21 SNAP R7|02:20|04:11|TAPE|tips_110228_0220_R7 |
A backup_daily.log file has the same data as the master log but it it fixed length and only includes backups that were completed since 7 am on the previous day.
Backup data is also uploaded to the bkup.bkup_log table in database configured in the backup.sh script. The bkup.bkup_log table shown below contains additional information to the backup_master.log. It is the data source for reporting backup status to the application administrators and developers as well as a process that verifies the backup files on the TSM system.
Name ---------------------- START_DATE AM_PM STATUS OS HOST SID SCHEMA BKUP_TYPE TAPE_RETENSION_DAYS SNAP_RETENSION_DAYS SNAPSHOT START_TIME END_TIME REND_DATE REND_TIME RHOST TSM_CHECK_DATE TSM_FILE_COUNT TSM_MGMT_CLASS |
Type -------------- CHAR(6) CHAR(2) VARCHAR2(10) VARCHAR2(10) VARCHAR2(10) VARCHAR2(20) VARCHAR2(20) VARCHAR2(30) NUMBER NUMBER VARCHAR2(40) CHAR(5) CHAR(5) CHAR(6) CHAR(5) VARCHAR2(10) CHAR(6) NUMBER VARCHAR2(16) |
The most useful data for application administrators and developers is the bkup_available_snap_tape_vw view which list snapshots and backups on tape that are still available for restore. You simply need to add a where clause to limit the results to a specific database. For example the following select displays any backup of the TIPS database which are still available for restore.
column
database format a10 column source format a10 column status format a10 column backup_type format a25 column snapshot_name format a30 select database, start_time, end_time, source, status, backup_type, snapshot_name from bkup.bkup_available_snap_tape_vw where database = 'TIPS' order by start_time desc; |
|||||||
DATABASE
---------- TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS TIPS |
START_TIME
------------------- 02/28/2011 12:02 PM 02/28/2011 02:20 AM 02/27/2011 02:20 AM 02/26/2011 02:20 AM 02/25/2011 12:02 PM 02/25/2011 02:20 AM 02/24/2011 12:02 PM 02/24/2011 02:20 AM 02/23/2011 12:02 PM 02/23/2011 02:20 AM 02/22/2011 12:02 PM 02/22/2011 02:20 AM 02/21/2011 12:02 PM 02/21/2011 02:20 AM 02/18/2011 02:20 AM 02/17/2011 02:20 AM 02/16/2011 02:20 AM 02/15/2011 02:20 AM 02/14/2011 02:20 AM 02/13/2011 02:20 AM 02/12/2011 02:20 AM 02/11/2011 02:20 AM 02/10/2011 02:20 AM 02/09/2011 02:20 AM 02/08/2011 02:20 AM 02/07/2011 02:20 AM |
END_T ----- 12:02 04:11 04:15 04:33 12:02 04:14 12:02 04:15 12:02 04:14 12:02 04:11 12:03 04:19 04:23 04:17 04:17 04:15 04:13 04:13 04:38 04:15 04:15 04:14 04:16 04:12 |
SOURCE --------- Snap Snap/Tape Snap/Tape Snap/Tape Snap Snap/Tape Snap Snap/Tape Snap Snap/Tape Snap Snap/Tape Snap Snap/Tape Tape Tape Tape Tape Tape Tape Tape Tape Tape Tape Tape Tape |
STATUS --------- Completed Completed Completed Completed Completed Completed Completed Completed Completed Completed Completed Completed Completed Completed Completed Completed Completed Completed Completed Completed Completed Completed Completed Completed Completed Completed |
BACKUP_TYPE ------------------------- HOT SNAP:R7 HOT TAPE:R21 SNAP:R7 COLD TAPE:R21 SNAP:R7 HOT TAPE:R21 SNAP:R7 HOT SNAP:R7 HOT TAPE:R21 SNAP:R7 HOT SNAP:R7 HOT TAPE:R21 SNAP:R7 HOT SNAP:R7 HOT TAPE:R21 SNAP:R7 HOT SNAP:R7 HOT TAPE:R21 SNAP:R7 HOT SNAP:R7 HOT TAPE:R21 SNAP:R7 HOT TAPE:R21 SNAP:R7 HOT TAPE:R21 SNAP:R7 HOT TAPE:R21 SNAP:R7 HOT TAPE:R21 SNAP:R7 HOT TAPE:R21 SNAP:R7 COLD TAPE:R21 SNAP:R7 HOT TAPE:R21 SNAP:R7 HOT TAPE:R21 SNAP:R7 HOT TAPE:R21 SNAP:R7 HOT TAPE:R21 SNAP:R7 HOT TAPE:R21 SNAP:R7 HOT TAPE:R21 SNAP:R7 |
SNAPSHOT_NAME --------------- tips_110228_1202_R7 tips_110228_0220_R7 tips_110227_0221_R7 tips_110226_0220_R7 tips_110225_1202_R7 tips_110225_0220_R7 tips_110224_1202_R7 tips_110224_0220_R7 tips_110223_1202_R7 tips_110223_0220_R7 tips_110222_1202_R7 tips_110222_0220_R7 tips_110221_1203_R7 tips_110221_0221_R7 tips_110218_0220_R7 tips_110217_0220_R7 tips_110216_0220_R7 tips_110215_0220_R7 tips_110214_0220_R7 tips_110213_0221_R7 tips_110212_0220_R7 tips_110211_0220_R7 tips_110210_0220_R7 tips_110209_0220_R7 tips_110208_0220_R7 tips_110207_0220_R7 |
For troubleshooting purposes a detail of each backup is found in a file named for the host and the date. For example: backup_rsprod12_Dec12.log. For example the detail includes the copy of each data file to TSM including start and stop time stamps.
Backups to TSM and run long sometimes due to load on the network, the TSM system, or growth of the database. It is desirable that the backups complete copies to TSM prior to the daily user workload. If a backup is running long the DBA can be notified by the backup_check_complete.sh which runs from crontab just after 7 AM each morning.