Saturday, August 9, 2014

Part 2 - GoldenGate supplemental logging, KEYCOLS, LOGALLSUPCOLS test case

This is a continuation of my series of findings, and I recommend that you start from the first post
http://gauthamc.blogspot.com/2014/08/goldengate-supplemental-logging-keycols.html to understand my environment better.

TEST CASE 2

==========
This test case is pretty similar to test case 1, except that I have modified the extract parameter file to add
LOGALLSUPCOLS.

Otherwise the setup is exactly the same as test case 1.


You might want to read more about LOGALLSUPCOLS (an extract parameter), which is a GG 12c feature and appears to be mandatory if you are using an INTEGRATED replicat (again a 12c feature) with parallelism > 1 (which should be true in most cases). Here we are going to look at what changes in the trail file when this parameter is set.


So here it goes again, the same steps as in test case 1.


--extract parameter file

..
..

DDL include Mapped

LOGALLSUPCOLS
table SCOTT.junk;

sequence SCOTT.*;

--end of extract parameter file

--create table in source with no PK/UK defined

SQL> create table junk(col1 number,col2 number,col3 number,col4 number,col5 number,col6 number);


Table created.

--make sure the table DDL has moved to target

SQL> select * from junk;


no rows selected

--set supplemental logging for col1,col2 and col3


GGSCI (titan.gautham.com) 140> add trandata scott.junk, cols(col1,col2,col3),nokey


2014-08-09 15:52:09  WARNING OGG-06439  No unique key is defined for table JUNK. All viable columns will be used to represent the key, but may not guarantee uniqueness. KEYCOLS may be used to define the key.


Logging of supplemental redo data enabled for table SCOTT.JUNK.


TRANDATA for scheduling columns has been added on table 'SCOTT.JUNK'.


--verify


GGSCI (titan.gautham.com) 142> info trandata scott.junk


Logging of supplemental redo log data is enabled for table SCOTT.JUNK.



Columns supplementally logged for table SCOTT.JUNK: COL1, COL2, COL3.

--perform an insert on source

SQL> insert into junk values(100,100,100,100,100,100);

1 row created.


SQL> commit;



Commit complete.

--verify insert has travelled to target

SQL> select * from junk;

      COL1       COL2       COL3       COL4       COL5       COL6

---------- ---------- ---------- ---------- ---------- ----------

       100        100        100        100        100        100


--perform an update of col4 on the source. Remember, only col1, col2, and col3 are added for supplemental logging


SQL> update junk set col4=500 where col1=100;


1 row updated.


SQL> commit;



Commit complete.

Look at the trail file from logdump for the update

--------------------------------------------------------------------
The interesting thing is that you will see 2 records in the trail file for the above update:
a FieldComp (before image) followed by a GGSPKUpdate (before/after image).


It does look like LOGALLSUPCOLS is responsible for the first FieldComp record.
Based on my interactions with Oracle Support, an update operation always appears as either a FieldComp or a GGSPKUpdate (when there is a PK update). In most practical situations, where a table always has a PK and logging is set up either on the PK or on all columns, I have only seen a FieldComp when an update operation takes place.
I am assuming that if the source had a PK, or KEYCOLS defined in the parameter file (assuming supplemental logging covered the PK or the KEYCOLS), we would see the GGSPKUpdate replaced with a FieldComp. That will be my test case 3.
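For reference, the logdump output below came from a session along these lines, where ghdr on shows the record header, detail data shows the column values, and usertoken detail shows the GGS tokens (the trail file name is illustrative; substitute your own):

Logdump> open ./dirdat/hv000014
Logdump> ghdr on
Logdump> detail on
Logdump> detail data
Logdump> usertoken detail
Logdump> filter include filename SCOTT.JUNK
Logdump> n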

Logdump 752 >n

___________________________________________________________________ 
Hdr-Ind    :     E  (x45)     Partition  :     .  (x04)  
UndoFlag   :     .  (x00)     BeforeAfter:     B  (x42)  
RecLength  :    44  (x002c)   IO Time    : 2014/08/09 16:17:18.000.000   
IOType     :    15  (x0f)     OrigNode   :   255  (xff) 
TransInd   :     .  (x00)     FormatType :     R  (x52) 
SyskeyLen  :     0  (x00)     Incomplete :     .  (x00) 
AuditRBA   :      10578       AuditPos   : 243605008 
Continued  :     N  (x00)     RecCount   :     1  (x01) 

2014/08/09 16:17:18.000.000 FieldComp            Len    44 RBA 8711 

Name: SCOTT.JUNK 
Before Image:                                             Partition 4   G  b   
 0000 0007 0000 0003 3130 3000 0100 0700 0000 0331 | ........100........1  
 3030 0002 0007 0000 0003 3130 3000 0300 0700 0000 | 00........100.......  
 0331 3030                                         | .100  
Column     0 (x0000), Len     7 (x0007)  
 0000 0003 3130 30                                 | ....100  
Column     1 (x0001), Len     7 (x0007)  
 0000 0003 3130 30                                 | ....100  
Column     2 (x0002), Len     7 (x0007)  
 0000 0003 3130 30                                 | ....100  
Column     3 (x0003), Len     7 (x0007)  
 0000 0003 3130 30                                 | ....100  
  
GGS tokens: 
TokenID x52 'R' ORAROWID         Info x00  Length   20 
 4141 4155 4763 4141 4641 4141 414b 4641 4141 0001 | AAAUGcAAFAAAAKFAAA..  
TokenID x4c 'L' LOGCSN           Info x00  Length    9 
 3539 3131 3038 3032 37                            | 591108027  
TokenID x36 '6' TRANID           Info x00  Length   12 
 3130 2e32 392e 3637 3134 3533                     | 10.29.671453  
   
Important findings from the above record -
As expected, we see col1, col2, and col3 because of supplemental logging.
Since we updated col4 (the 4th column of the table, or column 3 in GG, as GG numbers columns from 0),
it also stores the before image of col4, which is 100.




Logdump 753 >n

___________________________________________________________________ 
Hdr-Ind    :     E  (x45)     Partition  :     .  (x04)  
UndoFlag   :     .  (x00)     BeforeAfter:     A  (x41)  
RecLength  :    90  (x005a)   IO Time    : 2014/08/09 16:17:18.000.000   
IOType     :   115  (x73)     OrigNode   :   255  (xff) 
TransInd   :     .  (x02)     FormatType :     R  (x52) 
SyskeyLen  :     0  (x00)     Incomplete :     .  (x00) 
AuditRBA   :      10578       AuditPos   : 243605008 
Continued  :     N  (x00)     RecCount   :     1  (x01) 

2014/08/09 16:17:18.000.000 GGSPKUpdate          Len    90 RBA 8873 

Name: SCOTT.JUNK 
After  Image:                                             Partition 4   G  e   
 002c 0000 0007 0000 0003 3130 3000 0100 0700 0000 | .,........100.......  
 0331 3030 0002 0007 0000 0003 3130 3000 0300 0700 | .100........100.....  
 0000 0331 3030 0000 0007 0000 0003 3130 3000 0100 | ...100........100...  
 0700 0000 0331 3030 0002 0007 0000 0003 3130 3000 | .....100........100.  
 0300 0700 0000 0335 3030                          | .......500  
Before Image          Len    46 (x0000002e) 
KeyLen    44 (x0000002c) 
KeyCol     0 (x0000), Len     7 (x0007)  
 0000 0003 3130 30                                 | ....100  
KeyCol     1 (x0001), Len     7 (x0007)  
 0000 0003 3130 30                                 | ....100  
KeyCol     2 (x0002), Len     7 (x0007)  
 0000 0003 3130 30                                 | ....100  
KeyCol     3 (x0003), Len     7 (x0007)  
 0000 0003 3130 30                                 | ....100  

After Image           Len    44 (x0000002c) 

Column     0 (x0000), Len     7 (x0007)  
 0000 0003 3130 30                                 | ....100  
Column     1 (x0001), Len     7 (x0007)  
 0000 0003 3130 30                                 | ....100  
Column     2 (x0002), Len     7 (x0007)  
 0000 0003 3130 30                                 | ....100  
Column     3 (x0003), Len     7 (x0007)  
 0000 0003 3530 30                                 | ....500  
  
GGS tokens: 
TokenID x52 'R' ORAROWID         Info x00  Length   20 
 4141 4155 4763 4141 4641 4141 414b 4641 4141 0001 | AAAUGcAAFAAAAKFAAA..  
   
As with test case 1, the replicat abended here too -
REPLICAT    ABENDED     RPHVN1D     00:00:00      00:02:35    

report file

---------------
...
...
2014-08-09 16:16:54  WARNING OGG-06439  No unique key is defined for table JUNK. All viable columns will be used to represent the key, but may not guarantee uniqueness. KEYCOLS may be used to define the key.
Using following columns in default map by name:
  COL1, COL2, COL3, COL4, COL5, COL6

Using the following key columns for target table SCOTT1.JUNK: COL1, COL2, COL3, COL4, COL5, COL6.

..
..

It does seem like we had the same abend as test case 1, and we can fix this issue either by putting KEYCOLS in the replicat parameter file or by just adding a PK (on the target table) on one of col1, col2, col3, or col4.
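If you go the KEYCOLS route instead of a PK, the replicat MAP would look something like this (a sketch; the column list is illustrative and should match the supplementally logged columns):

map scott.junk, target scott1.junk, keycols (col1, col2, col3);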


Just out of curiosity, instead of adding a PK on col1, col2, or col3 (the unchanged columns), let me add the PK on the changed column col4.


SQL> alter table junk add constraint pk_junk primary key(col4);



Table altered.

--restart the replicat and check the target


SQL> select * from junk;


      COL1       COL2       COL3       COL4       COL5       COL6

---------- ---------- ---------- ---------- ---------- ----------

       100        100        100        500        100        100


Looks like we are all good now.



Part 1 - GoldenGate supplemental logging, KEYCOLS, LOGALLSUPCOLS test case

My GoldenGate setup is an Oracle 11.2.0.4 database with GoldenGate 12c.
I am running a series of tests to understand how GoldenGate writes data into the trail file and how the replicat behaves based on supplemental logging (all columns vs. a few columns), no PK/UK defined on the tables, the effect of KEYCOLS in the extract and replicat parameter files, and LOGALLSUPCOLS (a 12c GG parameter).

I hope to document a bunch of them and will be writing multiple test cases.

TEST CASE 1
=============

The table has no PK or unique key defined on either source or target, and the table structure is identical.
Supplemental logging is done for 3 columns out of 6.
The extract has no KEYCOLS defined.
The extract has no LOGALLSUPCOLS.
The replicat has no KEYCOLS defined.

SQL> create table junk(col1 number,col2 number,col3 number,col4 number,col5 number,col6 number);

Table created.


GGSCI (titan.gautham.com) 119> add trandata scott.junk, cols(col1,col2,col3),nokey

2014-08-09 13:07:44  WARNING OGG-06439  No unique key is defined for table JUNK. All viable columns will be used to represent the key, but may not guarantee uniqueness. KEYCOLS may be used to define the key.

Logging of supplemental redo data enabled for table SCOTT.JUNK.
TRANDATA for scheduling columns has been added on table 'SCOTT.JUNK'.
GGSCI (titan.gautham.com) 120> info trandata scott.junk

Logging of supplemental redo log data is enabled for table SCOTT.JUNK.

Columns supplementally logged for table SCOTT.JUNK: COL1, COL2, COL3.

Extract parameter file

..
--LOGALLSUPCOLS -->commented out
table SCOTT.junk; -->no keycols mentioned here

Perform the following from sqlplus at the source-
SQL> insert into junk values(100,100,100,100,100,100);

1 row created.

SQL> commit;

Commit complete.

SQL> update junk set col4=500 where col1=100;

1 row updated.

SQL> commit;

Commit complete.

The logdump portion below is relevant only to the update statement. I have not included the logdump portion for the insert, as it does not serve any purpose for this discussion.


logdump of trail file


1. Has info on col1, col2, col3 (as they were added via add trandata)
2. Has info on col4 as it was changed
3. Shows a GGSPKUpdate (for the update) with before and after images

Logdump 692 >n
___________________________________________________________________
Hdr-Ind    :     E  (x45)     Partition  :     .  (x04)
UndoFlag   :     .  (x00)     BeforeAfter:     A  (x41)
RecLength  :    90  (x005a)   IO Time    : 2014/08/09 13:23:00.000.000
IOType     :   115  (x73)     OrigNode   :   255  (xff)
TransInd   :     .  (x03)     FormatType :     R  (x52)
SyskeyLen  :     0  (x00)     Incomplete :     .  (x00)
AuditRBA   :      10578       AuditPos   : 230433296
Continued  :     N  (x00)     RecCount   :     1  (x01)

2014/08/09 13:23:00.000.000 GGSPKUpdate          Len    90 RBA 4260
Name: SCOTT.JUNK
After  Image:                                             Partition 4   G  s
 002c 0000 0007 0000 0003 3130 3000 0100 0700 0000 | .,........100.......
 0331 3030 0002 0007 0000 0003 3130 3000 0300 0700 | .100........100.....
 0000 0331 3030 0000 0007 0000 0003 3130 3000 0100 | ...100........100...
 0700 0000 0331 3030 0002 0007 0000 0003 3130 3000 | .....100........100.
 0300 0700 0000 0335 3030                          | .......500
Before Image          Len    46 (x0000002e)
KeyLen    44 (x0000002c)
KeyCol     0 (x0000), Len     7 (x0007)
 0000 0003 3130 30                                 | ....100
KeyCol     1 (x0001), Len     7 (x0007)
 0000 0003 3130 30                                 | ....100
KeyCol     2 (x0002), Len     7 (x0007)
 0000 0003 3130 30                                 | ....100
KeyCol     3 (x0003), Len     7 (x0007)
 0000 0003 3130 30                                 | ....100

After Image           Len    44 (x0000002c)
Column     0 (x0000), Len     7 (x0007)
 0000 0003 3130 30                                 | ....100
Column     1 (x0001), Len     7 (x0007)
 0000 0003 3130 30                                 | ....100
Column     2 (x0002), Len     7 (x0007)
 0000 0003 3130 30                                 | ....100
Column     3 (x0003), Len     7 (x0007)
 0000 0003 3530 30                                 | ....500

GGS tokens:
TokenID x52 'R' ORAROWID         Info x00  Length   20
 4141 4155 4751 4141 4641 4141 4149 7641 4141 0001 | AAAUGQAAFAAAAIvAAA..
TokenID x4c 'L' LOGCSN           Info x00  Length    9
 3539 3130 3932 3935 36                            | 591092956
TokenID x36 '6' TRANID           Info x00  Length   11
 352e 3132 2e32 3138 3937 35                       | 5.12.218975


replicat parameter file
=========================
DDL INCLUDE MAPPED &
EXCLUDE OBJNAME GGADM.EXCEPTIONS
ASSUMETARGETDEFS
GETTRUNCATES
REPERROR (-1, EXCEPTION)
REPERROR (-1403, EXCEPTION)
..
..
map scott.junk, target scott1.junk, mapexception (TARGET ggadm.exceptions
, INSERTALLRECORDS
-- , EXCEPTIONSONLY
, colmap ( rep_name = 'rphvn1d'
, table_name = @GETENV ('GGHEADER', 'TABLENAME')
, errno = @GETENV ('LASTERR', 'DBERRNUM')
, dberrmsg = @GETENV ( 'LASTERR', 'DBERRMSG')
, optype = @GETENV ( 'LASTERR', 'OPTYPE')
, errtype = @GETENV ( 'LASTERR', 'ERRTYPE')
, logrba = @GETENV ( 'GGHEADER', 'LOGRBA')
, logposition = @GETENV ( 'GGHEADER', 'LOGPOSITION')
, TRAILRBA = @GETENV('RECORD', 'FILERBA')
, TRAILSEQUENCE = @GETENV('RECORD', 'FILESEQNO')
, committimestamp = @GETENV ( 'GGHEADER', 'COMMITTIMESTAMP')
) );

EFFECT ON REPLICAT
==================

REPLICAT CRASHES

ggserr.log
------------

2014-08-09 13:23:05  WARNING OGG-01151  Oracle GoldenGate Delivery for Oracle, rphvn1d.prm:  Error mapping from SCOTT.JUNK to SCOTT1.JUNK.
2014-08-09 13:23:05  WARNING OGG-01003  Oracle GoldenGate Delivery for Oracle, rphvn1d.prm:  Repositioning to rba 4260 in seqno 14.
2014-08-09 13:23:05  ERROR   OGG-01296  Oracle GoldenGate Delivery for Oracle, rphvn1d.prm:  Error mapping from SCOTT.JUNK to SCOTT1.JUNK.
2014-08-09 13:23:14  ERROR   OGG-01668  Oracle GoldenGate Delivery for Oracle, rphvn1d.prm:  PROCESS ABENDING.

report file
-----------
2014-08-09 13:40:18  WARNING OGG-06439  No unique key is defined for table JUNK. All viable columns will be used to represent the key, but may not guarantee uniqueness. KEYCOLS may be used to define the key.
Using following columns in default map by name:
  COL1, COL2, COL3, COL4, COL5, COL6
Using the following key columns for target table SCOTT1.JUNK: COL1, COL2, COL3, COL4, COL5, COL6.



From the above error message, it is clear that on the replicat side, since it does not see any PK (on the target) or KEYCOLS defined (in the replicat parameter file), it needs all 6 columns in the trail file. But the trail file has before and after images for only the first 4 columns (col1, col2, col3 as defined by supplemental logging, and col4 as it was the changed column).

So to remediate, temporarily add a PK to target-
SQL> alter table junk add constraint pk_junk primary key(col1);

Table altered.

And restart the replicat

GGSCI (titan.gautham.com) 137> start rphvn1d

Sending START request to MANAGER ...
REPLICAT RPHVN1D starting


GGSCI (titan.gautham.com) 138> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                      
JAGENT      STOPPED                                      
EXTRACT     RUNNING     XTHVN1D     00:00:09      00:00:07
REPLICAT    RUNNING     RPHVN1D     02:18:58      00:00:02

--Check the data on the target
SQL> select * from junk;

      COL1       COL2       COL3       COL4       COL5       COL6
---------- ---------- ---------- ---------- ---------- ----------
       100        100        100        500        100        100

Now that the testing is complete, get the source and target ready for the next test case by dropping the junk table at the source, and make sure the DDL got replicated to the target -

source
--------
SQL> drop table junk;

Table dropped.

target
------

SQL> select * from junk;
select * from junk
              *
ERROR at line 1:
ORA-00942: table or view does not exist



Wednesday, February 19, 2014

GG Procedure calling
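The commented-out MAP below is a workaround for a character set issue: SQLEXEC calls the stored procedure higherone.p_string_transform, passing the source column sbody in as in_str, and COLMAP writes the procedure's out_str back into sbody on the target.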




--begin character set issue handler
--MAP HIGHERONE.tblemailqueue ,target HIGHERONE.tblemailqueue ,
--SQLEXEC(SPNAME higherone.p_string_transform,ID x,PARAMS(in_str=sbody),TRACE ALL),
--COLMAP(
--usedefaults,
--sbody=@getval(x.out_str));
--end character set issue handler

Tuesday, January 7, 2014

Convert Single Instance Database to Oracle RAC One Node Database / RAC 1 node database


I recently needed to convert a single instance DB to a RAC One Node. I haven't seen many documents
that talk about the steps involved, as most docs focus on conversion from single instance to a regular RAC.

In the past, I have used DBCA to create a RAC One Node DB and always used the "Generate database creation scripts" option at the end. I then manually edited the generated scripts to suit my purpose, and this approach has served me very well, as allowing DBCA to create a DB directly has caused many unnecessary schemas to creep in.

I dove into the scripts and, to the best of my abilities, tried to reproduce what they do in the post below.


Briefly, here are the steps to convert a single instance DB to a 1-Node RAC.

Before you start, ensure
+++++++++++++++++++++

I assume you have only single instance init.ora parameters enabled.

No RAC init.ora parameters should be enabled at this point, especially CLUSTER_DATABASE.

You might want to consider setting the below init.ora parameters to valid ASM disk groups:

db_create_file_dest="+DATA_DEV"
db_recovery_file_dest="+FRA_DEV"

Please make sure that your control files, data files, and redo logs are already in ASM before you perform these steps.
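A quick sanity check I'd run for this (the file names should all show an ASM disk group prefix such as +DATA_DEV):

SQL> select name from v$controlfile;
SQL> select name from v$datafile;
SQL> select member from v$logfile;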



I also recommend that you use an init.ora file which internally points to an spfile, as shown below.

Also make sure you have x init.ora files (where x is the number of nodes) in each node's ORACLE_HOME/dbs directory,
with the naming convention init<DB Name>_1.ora, init<DB Name>_2.ora, init<DB Name>_3.ora, etc.

For example, since my setup is a 2-node RAC, I have 2 init.ora files, named init<DB Name>_1.ora and init<DB Name>_2.ora, in the
ORACLE_HOME/dbs directory of both RAC nodes.

$ cat initTESTDB_1.ora
SPFILE='+DATA_DEV/TESTDB/spfileTESTDB.ora'          


$ cat initTESTDB_2.ora
SPFILE='+DATA_DEV/TESTDB/spfileTESTDB.ora'          





My Oracle Home is /u02/app/oracle/product/11.2.0.4/db_1 and the DB name is TESTDB.

The Grid Home is /u02/app/11.2.0/grid.


(1) Create the password file if it does not exist (recommended) on one node and copy it to the other nodes.
Please note that there is NO suffix to the password file name.

/u02/app/oracle/product/11.2.0.4/db_1/bin/orapwd file=/u02/app/oracle/product/11.2.0.4/db_1/dbs/orapwTESTDB force=y

Make the following change in your init.ora file (if using an spfile, perform an alter system set):
remote_login_passwordfile=exclusive

Please shut down the DB at this point.



(2) This can be skipped if already performed (note that DBCA does it automatically if you have ever created a DB from this
OH in the past), but there is no harm in doing it again.

/u02/app/11.2.0/grid/bin/setasmgidwrap o=/u02/app/oracle/product/11.2.0.4/db_1/bin/oracle

(3) Add the database to CRS.

/u02/app/oracle/product/11.2.0.4/db_1/bin/srvctl add database -d TESTDB -o /u02/app/oracle/product/11.2.0.4/db_1 -p +DATA_DEV/TESTDB/spfileTESTDB.ora -n TESTDB -c RACONENODE -e node1,node2 -i TESTDB -a "DATA_DEV,FRA_DEV"

Please pay particular attention to
(a) the option "-c RACONENODE"
(b) the places where you will need to plug in your ASM disk groups. In my case, I have two of them, called DATA_DEV and FRA_DEV.
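To sanity-check the registration, srvctl config should report the database type as RAC One Node along with the candidate servers (output varies by environment):

/u02/app/oracle/product/11.2.0.4/db_1/bin/srvctl config database -d TESTDB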

(4) Add a service.
/u02/app/oracle/product/11.2.0.4/db_1/bin/srvctl add service -s TESTDB.gautham.com -d TESTDB -P BASIC

(5) Disable the database
/u02/app/oracle/product/11.2.0.4/db_1/bin/srvctl disable database -d TESTDB

(6) Start up the database normally via

sqlplus '/ as sysdba'
startup

(7) Add the redo log groups for thread 2.

I assume that GROUP 1 and GROUP 2 are the only 2 groups in the single instance DB, belonging to thread 1. By default, thread 1 should already exist. If there
are more than 2 groups in thread 1, then change the group numbers below accordingly.

ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 3  SIZE 250m,
 GROUP 4  SIZE 250m;

ALTER DATABASE ENABLE PUBLIC THREAD 2;

Repeat with THREAD 3 if it is a 3-node RAC, and so on.
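A quick check that the new thread's groups are in place (the group and thread numbers will match whatever you created):

SQL> select thread#, group#, status from v$log;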


(8) There is no need to add a second undo tablespace, as this is a 1-Node RAC and only 1 instance will be active at any point.

(9) Set the following 2 parameters in the spfile:


ALTER SYSTEM SET cluster_database=TRUE SCOPE=spfile;
ALTER SYSTEM SET remote_listener='<scan name>:1521' scope=spfile ;


(10) Shut down the DB via a normal "shutdown immediate" command.

(11) /u02/app/oracle/product/11.2.0.4/db_1/bin/srvctl enable database -d TESTDB;
(12) /u02/app/oracle/product/11.2.0.4/db_1/bin/srvctl start database -d TESTDB;

Thursday, November 7, 2013

Extraction of Grants from a schema


The following should extract all the grants given from a particular schema to the other schemas.

This is particularly useful when you have a bunch of schemas (with cross grants) and you are planning to move them schema by schema, finally applying the grants once all schemas have been moved over.

expdp \'/ as sysdba\' schemas=scott dumpfile=scott_role.dmp nologfile=yes  include=owner_grant


impdp \'/ as sysdba\' dumpfile=scott_role.dmp sqlfile=scott_role.sql nologfile=yes
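Once generated, review the SQL file and apply it at the target, for example:

sqlplus /nolog
SQL> connect / as sysdba
SQL> @scott_role.sql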

Script to monitor Alert log files for an Oracle Database

Recently I was asked to write a script to monitor the Oracle alert log for "ORA-" errors and send out an automated email to the DBA team.

Here are the challenges:
1. How do I make the script scan only the lines which were not scanned by the previous run?
Another way to frame the question: if scan x happened at 3pm, how do I ensure that scan x+1 picks up only the new lines generated after 3pm? Should I store the timestamp of the last scan, or should I store the high-water-mark (HWM) line count? I finally decided to go with the HWM line count.

2. Also, how do I deal with the alert log being recycled? Will my script be able to find out that the alert log has been recycled and then automatically reset my line HWM to start from line 1 of the new alert log?
Comparing the stored HWM line count with the current line count of the alert log helps here: if the alert log's line count is smaller than the stored HWM line count, it is safe to assume the alert log was recycled. In extraordinary circumstances where the alert log was recycled and the subsequent run happened so late (say, after a couple of weeks) that the new alert log's line count exceeded the stored HWM, my logic would skip the first HWM lines of the alert log for that particular run only (regaining normal operation on subsequent scans). I chose to ignore this possibility, as it is extremely remote, and even then only that one scan would fail to detect the ORA- errors.

#!/bin/bash
#Alert Log Monitoring Script
#Usage: alert_monitor.sh
#Author:Gautham Chandrasekaran
#Version 1.0 06 NOVEMBER 2013

EMAIL_LIST=gautha@hotmail.com
ALERT_LOG_FILE=/u01/app/oracle/diag/rdbms/prod/PROD_1/trace/alert_PROD_1.log
EMAIL_FILE=/tmp/alert_mail.log

#Error out if Alert Log is missing
if [  ! -f $ALERT_LOG_FILE ] ; then
 echo "**** $ALERT_LOG_FILE FILE MISSING FROM ALERT LOG MONITOR ******" > $EMAIL_FILE
 cat $EMAIL_FILE|mailx -s "Alert Log Errors" $EMAIL_LIST
 exit 1
fi



ALERT_LOG_LINE_CNT_FILE=${ALERT_LOG_FILE}.ctr

#First run of the script, or somebody deleted the counter file. Therefore start from zero.
if [  ! -f $ALERT_LOG_LINE_CNT_FILE ] ; then
echo 0 > $ALERT_LOG_LINE_CNT_FILE
fi

STORED_LINE_CTR=`cat $ALERT_LOG_LINE_CNT_FILE`


ALERT_LOG_LINE_CTR=`cat $ALERT_LOG_FILE|wc -l`

#check to see whether somebody has recycled alert log file.
#in this case the STORED_LINE_CTR will be higher than ALERT_LOG_LINE_CTR
#If so just reset STORED_LINE_CTR to 0.
if [ $ALERT_LOG_LINE_CTR -lt $STORED_LINE_CTR ] ; then
STORED_LINE_CTR=0
fi


LINES_TO_MONITOR=`expr $ALERT_LOG_LINE_CTR - $STORED_LINE_CTR`


tail -n $LINES_TO_MONITOR $ALERT_LOG_FILE|grep -i "ora-" > $EMAIL_FILE

#Reset line number to ctr file to skip the scanned rows in the next run
echo $ALERT_LOG_LINE_CTR > $ALERT_LOG_LINE_CNT_FILE


#Email only if the file is not empty
if [ -s $EMAIL_FILE ] ; then
 cat $EMAIL_FILE|mailx -s "Alert Log Errors" $EMAIL_LIST
fi
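
To run this periodically, a crontab entry for the oracle user along these lines would work (the script path is illustrative):

*/15 * * * * /home/oracle/scripts/alert_monitor.sh > /dev/null 2>&1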

Wednesday, May 8, 2013

ASM Persistent Device Mapping for Oracle 11gR2 RAC for RHEL 6

Recently, when I was setting up a RAC at home using some Dell desktop servers, I hit a unique problem.
Before going into that, here are my setup details

RAC Node Names : siddhu.gautham.com, surya.gautham.com (CentOS 6.4)
Openfiler : openfiler.gautham.com (Openfiler 2.3 or 2.4 (not sure) 32-bit) (ip 192.168.1.195(eth0),
                 192.168.2.195(eth1)) ->Used to provide shared storage to the RAC Nodes.

The openfiler server was deployed to provide iscsi storage to the other RAC Nodes.

After the install of the 3 servers and provisioning of the storage to the 2 RAC nodes, I reached a point where I needed to install the ASMLib libraries for device persistence. That is when I realized that Oracle Corp had stopped supplying ASMLib packages starting with RHEL 6.

So now I had to look at other alternatives for device persistence.

The below should tell you what I mean by device persistence
=========================================


[root@siddhu by-path]# pwd
/dev/disk/by-path



[root@siddhu by-path]# ls -lt |grep 192.168.2

lrwxrwxrwx 1 root root  9 May  6 23:27 ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.crs1-lun-0 -> ../../sdb
lrwxrwxrwx 1 root root  9 May  6 23:27 ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.data1-lun-0 -> ../../sde
lrwxrwxrwx 1 root root  9 May  6 23:27 ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.fra1-lun-0 -> ../../sdc

The reason I had to filter on 192.168.2 was that openfiler has 2 network interfaces and I wanted to use only the storage presented via eth1, which is 192.168.2.195.

The problem above is that the crs LUN is available as /dev/sdb, the data LUN as /dev/sde, and the fra LUN as /dev/sdc.
The moment the system is rebooted, this could change to a different mapping.

Also, if you go to the other RAC node, i.e. surya.gautham.com, you would see completely different mappings, i.e. the CRS LUN would not be presented as /dev/sdb but as something else.

Therefore, what I needed was a reliable mapping, say /dev/asm-crs-disk1, which would always map to the crs LUN, /dev/asm-data-disk1, which would always map to the data LUN, and so on, across all nodes of the RAC.

The way we get this is to use udev rules.

Before you do this, please partition the 3 devices above using the fdisk command.
Create just 1 primary partition on each (I am not going into the details of this).

fdisk /dev/sdb
fdisk /dev/sde
fdisk /dev/sdc

Also perform partprobe on the other node to see the resulting /dev/sd[x]1 partitions.

PLEASE NOTE THAT I HAD TO REBOOT BOTH THE RAC NODES AND THEREFORE THE MAPPING CHANGED FROM ABOVE. PLEASE DO NOT LET THE MISMATCH ABOVE AND BELOW CONFUSE YOU.


From the above, we identified that /dev/sdd points to CRS on siddhu and /dev/sdc points to CRS on surya:
[root@siddhu ~]# /sbin/scsi_id -g -u -d /dev/sdd
14f504e46494c45526834373035642d374d73772d794c386b

[root@surya iscsi]# /sbin/scsi_id -g -u -d /dev/sdc
14f504e46494c45526834373035642d374d73772d794c386b

The above shows that the identifier is the same irrespective of whether it is /dev/sdc (surya) or /dev/sdd (siddhu),
and this identifier will always be the same for CRS.

For FRA
=======
[root@surya iscsi]#  /sbin/scsi_id -g -u -d /dev/sdf
14f504e46494c455252635354667a2d4a6153492d6c307a66

[root@siddhu ~]# /sbin/scsi_id -g -u -d /dev/sde
14f504e46494c455252635354667a2d4a6153492d6c307a66



For DATA
========
[root@siddhu ~]# /sbin/scsi_id -g -u -d /dev/sdf
14f504e46494c45523473334e397a2d70347a612d554a394f

[root@surya iscsi]# /sbin/scsi_id -g -u -d /dev/sdg
14f504e46494c45523473334e397a2d70347a612d554a394f


vi /etc/udev/rules.d/99-oracle-asmdevices.rules



KERNEL=="sd?1", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -d /dev/$parent", RESULT=="14f504e46494c45526834373035642d374d73772d794c386b", NAME="asm-crs-disk1", OWNER="oracle", GROUP="dba", MODE="0660"
KERNEL=="sd?1", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -d /dev/$parent", RESULT=="14f504e46494c455252635354667a2d4a6153492d6c307a66", NAME="asm-fra-disk1", OWNER="oracle", GROUP="dba", MODE="0660"
KERNEL=="sd?1", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -d /dev/$parent", RESULT=="14f504e46494c45523473334e397a2d70347a612d554a394f", NAME="asm-data-disk1", OWNER="oracle", GROUP="dba", MODE="0660"



After reboot
+++++++++++++

[root@surya dev]# (cd /dev/disk/by-path; ls -l *openfiler* |grep -i 192.168.2| awk '{FS=" "; print $9 " " $10 " " $11}')
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.crs1-lun-0 -> ../../sde
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.crs1-lun-0-part1 -> ../../asm-crs-disk1
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.data1-lun-0 -> ../../sdg
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.data1-lun-0-part1 -> ../../asm-data-disk1
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.fra1-lun-0 -> ../../sdc
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.fra1-lun-0-part1 -> ../../asm-fra-disk1
[root@surya dev]#


[root@surya dev]# ls -lt /dev/*asm*
brw-rw---- 1 oracle dba 8, 65 May  1 00:50 /dev/asm-crs-disk1
brw-rw---- 1 oracle dba 8, 33 May  1 00:50 /dev/asm-fra-disk1
brw-rw---- 1 oracle dba 8, 97 May  1 00:50 /dev/asm-data-disk1
[root@surya dev]#