Thursday, November 7, 2013

Extraction of Grants from a schema


The following should extract all the grants given by a particular schema to other schemas.

This is particularly useful when you have a bunch of schemas (with cross grants) and you are planning to move them schema by schema, applying the grants only after all the schemas have been moved over.

expdp \'/ as sysdba\' schemas=scott dumpfile=scott_role.dmp nologfile=yes  include=owner_grant


impdp \'/ as sysdba\' dumpfile=scott_role.dmp sqlfile=scott_role.sql nologfile=yes
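
The impdp run above does not import anything; the SQLFILE option just writes the grant DDL into scott_role.sql. Once all schemas are in place on the target, the file can be reviewed and executed, for example (a minimal sketch, assuming the generated scott_role.sql is in the current directory):

sqlplus / as sysdba <<EOF
@scott_role.sql
exit
EOF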

Script to monitor Alert log files for an Oracle Database

Recently I was asked to write a script to monitor the Oracle alert logs for "ORA-" errors and send an automated email to the DBA team.

Here are the challenges:
1. How do I make the script scan only the lines that were not scanned by the previous run?
Another way to frame the question: if scan x happened at 3pm, how do I ensure that scan (x+1) picks up only the new lines generated after 3pm? Should I store the timestamp of the last scan, or the high-water line count? I finally decided to go with the HWM line number.

2. How do I deal with the alert log being recycled? Will my script be able to detect that the alert log has been recycled and automatically reset my line HWM so that it starts from line 1 of the new alert log?
Comparing the stored HWM line count with the current line count of the alert log helps here: if the alert log's line count is smaller than the stored HWM, it is safe to assume the alert log was recycled. In the extraordinary case where the alert log was recycled and the next run happened so late (say a couple of weeks later) that the new alert log's line count already exceeded the stored HWM, my logic would skip the first HWM lines of the alert log for that particular run only (regaining normal operation on subsequent scans). I chose to ignore this possibility as it is extremely remote, and even then only that one scan would fail to detect the ORA- errors.

#!/bin/bash
#Alert Log Monitoring Script
#Usage: alert_monitor.sh
#Author:Gautham Chandrasekaran
#Version 1.0 06 NOVEMBER 2013

EMAIL_LIST=gautha@hotmail.com
ALERT_LOG_FILE=/u01/app/oracle/diag/rdbms/prod/PROD_1/trace/alert_PROD_1.log
EMAIL_FILE=/tmp/alert_mail.log

#Error out if Alert Log is missing
if [  ! -f $ALERT_LOG_FILE ] ; then
 echo "**** $ALERT_LOG_FILE FILE MISSING FROM ALERT LOG MONITOR ******" > $EMAIL_FILE
 cat $EMAIL_FILE|mailx -s "Alert Log Errors" $EMAIL_LIST
 exit 1
fi



ALERT_LOG_LINE_CNT_FILE=${ALERT_LOG_FILE}.ctr

#First run of the script, or somebody deleted the counter file. Therefore start from zero.
if [  ! -f $ALERT_LOG_LINE_CNT_FILE ] ; then
echo 0 > $ALERT_LOG_LINE_CNT_FILE
fi

STORED_LINE_CTR=`cat $ALERT_LOG_LINE_CNT_FILE`


ALERT_LOG_LINE_CTR=`cat $ALERT_LOG_FILE|wc -l`

#check to see whether somebody has recycled alert log file.
#in this case the STORED_LINE_CTR will be higher than ALERT_LOG_LINE_CTR
#If so just reset STORED_LINE_CTR to 0.
if [ $ALERT_LOG_LINE_CTR -lt $STORED_LINE_CTR ] ; then
STORED_LINE_CTR=0
fi


LINES_TO_MONITOR=`expr $ALERT_LOG_LINE_CTR - $STORED_LINE_CTR`


#Scan only the new lines for ORA- errors; any hits go to the email file
tail -n $LINES_TO_MONITOR $ALERT_LOG_FILE|grep -i "ora-" > $EMAIL_FILE

#Reset line number to ctr file to skip the scanned rows in the next run
echo $ALERT_LOG_LINE_CTR > $ALERT_LOG_LINE_CNT_FILE


#Email only if the file is not empty
if [ -s $EMAIL_FILE ] ; then
 cat $EMAIL_FILE|mailx -s "Alert Log Errors" $EMAIL_LIST
fi
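
To run the monitor on a schedule, a cron entry along these lines would do (a sketch, assuming the script is saved as /home/oracle/scripts/alert_monitor.sh and made executable; adjust the path and interval to suit):

#Scan the alert log for new ORA- errors every 15 minutes
*/15 * * * * /home/oracle/scripts/alert_monitor.sh > /dev/null 2>&1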

Wednesday, May 8, 2013

ASM Persistent Device Mapping for Oracle 11gR2 RAC for RHEL 6

Recently, when I was setting up RAC at home using some Dell desktop servers, I hit a unique problem.
Before going into that, here are my setup details:

RAC Node Names : siddhu.gautham.com, surya.gautham.com (CentOS 6.4)
Openfiler : openfiler.gautham.com (Openfiler 2.3 or 2.4 (not sure), 32-bit; IPs 192.168.1.195 (eth0),
                 192.168.2.195 (eth1)) -> Used to provide shared storage to the RAC Nodes.

The Openfiler server was deployed to provide iSCSI storage to the two RAC nodes.

After installing the 3 servers and provisioning the storage to the 2 RAC nodes, I reached the point where I needed to install the ASMLib libraries for device persistence. That is when I realized that Oracle Corp had stopped supplying ASMLib packages starting with RHEL 6.

So now I had to look at other alternatives for device persistence.

The below should tell you what I mean by device persistence
=========================================


[root@siddhu by-path]# pwd
/dev/disk/by-path



[root@siddhu by-path]# ls -lt |grep 192.168.2

lrwxrwxrwx 1 root root  9 May  6 23:27 ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.crs1-lun-0 -> ../../sdb
lrwxrwxrwx 1 root root  9 May  6 23:27 ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.data1-lun-0 -> ../../sde
lrwxrwxrwx 1 root root  9 May  6 23:27 ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.fra1-lun-0 -> ../../sdc

The reason I had to filter on 192.168.2 was that Openfiler had 2 network interfaces and I wanted to use only the storage presented via eth1, which was 192.168.2.195.

The problem above is that the CRS LUN is available as /dev/sdb, the DATA LUN as /dev/sde and the FRA LUN as /dev/sdc.
The moment the system is rebooted, this could change to a different mapping.

Also, if you go to the other RAC node, i.e. surya.gautham.com, you would see completely different mappings, i.e. the CRS LUN would not be presented as /dev/sdb but as something else.

Therefore what I needed was a reliable mapping, say /dev/asm-crs-disk1 always pointing to the CRS LUN, /dev/asm-data-disk1 always pointing to the DATA LUN, and so on, consistently across all nodes of the RAC.

The way we get this is to use udev rules.

Before you do this, please partition the 3 devices above using the fdisk command.
Create just 1 primary partition on each (I am not going into the details of this).

fdisk /dev/sdb
fdisk /dev/sde
fdisk /dev/sdc

Also perform partprobe on the other node so that it sees the resulting /dev/sd[x]1 partitions.
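
For instance, on the second node something like this re-reads the partition tables (a sketch; use whatever device name the LUN has on that node, or run it with no arguments to re-read all devices):

partprobe
#or, for a specific device on that node:
partprobe /dev/sdb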

PLEASE NOTE THAT I HAD TO REBOOT BOTH THE RAC NODES AND THEREFORE THE MAPPING CHANGED FROM THE ABOVE. PLEASE DO NOT LET THE MISMATCH BETWEEN THE OUTPUT ABOVE AND BELOW CONFUSE YOU.


We identified that /dev/sdd now points to CRS on siddhu and /dev/sdc points to CRS on surya:
[root@siddhu ~]# /sbin/scsi_id -g -u -d /dev/sdd
14f504e46494c45526834373035642d374d73772d794c386b

[root@surya iscsi]# /sbin/scsi_id -g -u -d /dev/sdc
14f504e46494c45526834373035642d374d73772d794c386b

The above shows that the identifier is the same irrespective of whether the device is /dev/sdc (surya) or /dev/sdd (siddhu),
and this identifier will always be the same for the CRS LUN.

For FRA
=======
[root@surya iscsi]#  /sbin/scsi_id -g -u -d /dev/sdf
14f504e46494c455252635354667a2d4a6153492d6c307a66

[root@siddhu ~]# /sbin/scsi_id -g -u -d /dev/sde
14f504e46494c455252635354667a2d4a6153492d6c307a66



For DATA
========
[root@siddhu ~]# /sbin/scsi_id -g -u -d /dev/sdf
14f504e46494c45523473334e397a2d70347a612d554a394f

[root@surya iscsi]# /sbin/scsi_id -g -u -d /dev/sdg
14f504e46494c45523473334e397a2d70347a612d554a394f


vi /etc/udev/rules.d/99-oracle-asmdevices.rules



KERNEL=="sd?1", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -d /dev/$parent", RESULT=="14f504e46494c45526834373035642d374d73772d794c386b", NAME="asm-crs-disk1", OWNER="oracle", GROUP="dba", MODE="0660"
KERNEL=="sd?1", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -d /dev/$parent", RESULT=="14f504e46494c455252635354667a2d4a6153492d6c307a66", NAME="asm-fra-disk1", OWNER="oracle", GROUP="dba", MODE="0660"
KERNEL=="sd?1", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -d /dev/$parent", RESULT=="14f504e46494c45523473334e397a2d70347a612d554a394f", NAME="asm-data-disk1", OWNER="oracle", GROUP="dba", MODE="0660"



After reboot
+++++++++++++

[root@surya dev]# (cd /dev/disk/by-path; ls -l *openfiler* |grep -i 192.168.2| awk '{FS=" "; print $9 " " $10 " " $11}')
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.crs1-lun-0 -> ../../sde
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.crs1-lun-0-part1 -> ../../asm-crs-disk1
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.data1-lun-0 -> ../../sdg
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.data1-lun-0-part1 -> ../../asm-data-disk1
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.fra1-lun-0 -> ../../sdc
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.fra1-lun-0-part1 -> ../../asm-fra-disk1
[root@surya dev]#


[root@surya dev]# ls -lt /dev/*asm*
brw-rw---- 1 oracle dba 8, 65 May  1 00:50 /dev/asm-crs-disk1
brw-rw---- 1 oracle dba 8, 33 May  1 00:50 /dev/asm-fra-disk1
brw-rw---- 1 oracle dba 8, 97 May  1 00:50 /dev/asm-data-disk1
[root@surya dev]# 



Monday, March 4, 2013

Restore and recover from cold backup using rman

Often, before an application upgrade, we would like to take a cold backup of the database just to protect against any "mess-up", and simply restore it if something goes wrong.

The following is my preferred way of taking an RMAN cold backup.





Take the BACKUP
===============
Put the DB in mount mode.
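
A minimal sketch of getting the database into mount mode cleanly (assuming it is currently open):

sqlplus / as sysdba <<EOF
shutdown immediate;
startup mount;
exit
EOF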

rman target / nocatalog <<EOF > backup.log

run{
allocate channel d1  device type disk format  '/oracle/arch/rmanbkp/HOMEDB_FULL_%d_%T_%U';

backup as compressed backupset tag rman_backfulcold_HOMEDB database include current controlfile ;
}
EOF

Query RMAN to see which backup piece contains the control file:


list backup;


List of Backup Sets
===================


BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
13      Full    39.13M     DISK        00:00:06     04-MAR-13  
        BP Key: 13   Status: AVAILABLE  Compressed: YES  Tag: RMAN_BACKFULCOLD_HOMEDB
        Piece Name: /oracle/arch/rmanbkp/HOMEDB_FULL_HOMEDB_20130304_0do3ltu7_1_1
  List of Datafiles in backup set 13

..


..
..
..

        Piece Name: /oracle/arch/rmanbkp/HOMEDB_FULL_HOMEDB_20130304_0no3ltuo_1_1
  List of Datafiles in backup set 23
  File LV Type Ckp SCN    Ckp Time  Name
  ---- -- ---- ---------- --------- ----
  28      Full 301818     04-MAR-13 /oracle/oradata02/HOMEDB/data/forumindextabspace01.dbf

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
24      Full    1.03M      DISK        00:00:01     04-MAR-13  
        BP Key: 24   Status: AVAILABLE  Compressed: YES  Tag: RMAN_BACKFULCOLD_HOMEDB
        Piece Name: /oracle/arch/rmanbkp/HOMEDB_FULL_HOMEDB_20130304_0oo3ltup_1_1
  Control File Included: Ckp SCN: 301818       Ckp time: 04-MAR-13





STEPS TO RESTORE
================

startup nomount

Restore the CONTROL FILE from the backup piece identified above.

run
{
allocate channel c1 device type disk format '/oracle/arch/rmanbkp/HOMEDB_FULL_%d_%T_%U';
restore controlfile from '/oracle/arch/rmanbkp/HOMEDB_FULL_HOMEDB_20130304_0oo3ltup_1_1';
}

alter database mount;

run
{
allocate channel c1 device type disk format '/oracle/arch/rmanbkp/HOMEDB_FULL_%d_%T_%U';
restore database from tag='rman_backfulcold_HOMEDB';
}
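
Since the control file was restored from a backup, finish by opening the database with RESETLOGS; something along these lines (a sketch; if Oracle reports that recovery is still needed, run a recover database first):

rman target / nocatalog <<EOF
alter database open resetlogs;
EOF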

Tuesday, February 19, 2013

Data Guard Catch-Up via RMAN using TSM

Many times, we DBAs are faced with a Data Guard standby that has been out of sync with the production database for quite a long time.
A typical example: we ran an overnight RMAN duplicate script to create a standby database, and by the time we got to look at the status of the restore, hundreds of logs had been generated on the primary.
In all probability, in a typical production environment, an hourly RMAN archivelog backup-and-delete script would have run and deleted those archivelogs (after backing them up, obviously).

One option for getting the standby back in sync with the primary is to restore the archivelogs onto the primary and put the standby in recovery mode, hoping the FAL services will pick them up.

Once we realize that the archivelogs number in the hundreds or thousands, this becomes nearly impractical and risks filling up the archive destination on the primary.

I have explained a method below that uses RMAN to get the standby in sync with the primary.
The example uses TSM (Tivoli) as the storage medium; please adjust it for your environment's tape setup.

For TSM, we need to copy the primary node's TSM password file over to the standby so that the auxiliary channel can access the tape backups created from the primary database's node. Along with that, we need to ensure that both primary and standby are talking to the same TSM server.

On the standby, cancel the managed recovery process if it is running and ensure that the database is in mount mode.

alter database recover managed standby database cancel;


1) Do the following on the primary.

Create a file called recoverystandby.cmd with the following contents:


#++++++++++++++++++++++++++++++++++
#++RECOVERYSTANDBY.CMD++++++++++
#++++++++++++++++++++++++++++++++++
backup archivelog all not backed up;
run {
#Adjust it as per the archivelogs already present in standby
set until time 'sysdate-(1/24)';
#Make sure below /tmp/tdpo.opt file is available in standby database.
allocate auxiliary channel d2 type 'sbt_tape' parms 'ENV=(TDPO_OPTFILE=/tmp/tdpo.opt)';
recover standby clone database delete archivelog;
}

2) Do the following on the standby.

This is to make sure that the standby can access the tape backups.
******These steps are specific to TSM******
Make sure the /tmp/tdpo.opt file is available. This TDPO options file is used by the auxiliary channel in step 1.

> cat /tmp/tdpo.opt
DSMI_LOG                /tmp
DSMI_ORC_CONFIG         /tmp/dsm.opt
TDPO_FS                 xxx
TDPO_NUM_BUFFERS        10
TDPO_NODE               prmyserver
TDPO_OWNER              oracle
TDPO_PSWDPATH           /tmp

Make sure the /tmp/dsm.opt file, referred to above as DSMI_ORC_CONFIG, is present.


> cat /tmp/dsm.opt
SERVERNAME <tsm server>

Make sure you have physically copied the primary node's TSM password file. The name of the password file would be TDPO.<tdpo_node>, i.e. TDPO.prmyserver, and it should be available under the TDPO_PSWDPATH path.
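
For example, copying the password file over from the primary node could look like this (a sketch with a hypothetical source path; use whatever location TDPO_PSWDPATH points to on the primary):

scp oracle@prmyserver:/tmp/TDPO.prmyserver /tmp/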





3) Do the following on the primary.

Set environment variables to connect to the local primary database:
export ORACLE_HOME='<path of ORACLE_HOME>'
export ORACLE_SID='<SID of Primary Database>'
export PATH=$ORACLE_HOME/bin:$PATH

Run the above command file via RMAN as:

rman target / auxiliary sys/manager@<standby database> catalog rman/rman@<rman db> cmdfile=recoverystandby.cmd log=catchup_stdby.log

Please allow the above script to run; it should let the standby catch up with the primary without the need to restore the archivelogs onto the primary.

Once it has caught up, you can put the standby back into managed recovery.
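
Restarting managed recovery on the standby is just the reverse of the cancel we issued earlier:

alter database recover managed standby database disconnect from session;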

Please note that there is one more way to achieve the same thing, using an RMAN incremental backup such as
backup incremental from scn <scn number> database. You can look at other blogs to read more about this and decide for yourself which one to use. If there is a missing archivelog, then you cannot use my method and will be better off using the other approach just mentioned.
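
As a rough outline of that alternative (a sketch only; the full procedure in the Oracle documentation has additional steps, such as refreshing the standby control file):

# On the standby (SQL*Plus): note the SCN to roll forward from
select current_scn from v$database;

# On the primary (RMAN): take an incremental backup starting from that SCN
backup incremental from scn <scn number> database format '/tmp/stdby_incr_%U';

# Copy the backup pieces to the standby, then on the standby (RMAN):
catalog start with '/tmp/stdby_incr';
recover database noredo;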




Thursday, January 31, 2013

Golden Gate Add SCHEMATRANDATA Verification

I just conjured up a small script to verify whether the GG Command

add schematrandata <schema name>

worked correctly.
By the way, please keep in mind that once you execute the above command, all future objects will be automatically covered. That is why I highly recommend setting supplemental logging at the schema level rather than at the individual table level.

Basically, the below query should be able to give you the info:

select * from table(logmnr$always_suplog_columns('name of the schema in upper case','table name in upper case'));

Please be aware that dba_log_groups will not show any entry if you added logging at the schema level. Only if you added logging at the table level using add trandata will there be an entry in dba_log_groups.

A small PL/SQL snippet to go through an entire schema and report back is below:

declare
  l_schema     varchar2(30)  := upper('schema name');
  l_table_name varchar2(200);
  x_ctr        number(5);
  x_refcur     sys_refcursor;
begin
  open x_refcur for 'select table_name from dba_tables where owner='''||l_schema||''' order by table_name';
  loop
    fetch x_refcur into l_table_name;
    exit when x_refcur%NOTFOUND;

    -- a table covered by supplemental logging returns rows from logmnr$always_suplog_columns
    execute immediate 'select count(*) from table(logmnr$always_suplog_columns(:1,:2))'
      into x_ctr using l_schema, l_table_name;

    if x_ctr = 0 then
      dbms_output.put_line('Analyzing '||l_table_name||' ..**ERROR**');
    else
      dbms_output.put_line('Analyzing '||l_table_name||' ..OK');
    end if;
  end loop;
  close x_refcur;
end;
/
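
To run it, something like the below should do; remember to enable serveroutput so the Analyzing lines are displayed (the file name check_schematrandata.sql is just one I made up for this example):

sqlplus / as sysdba <<EOF
set serveroutput on size unlimited
@check_schematrandata.sql
exit
EOF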