Thursday, June 7, 2012

Golden Gate Bounded Recovery ( BR) Process

I created a sample script called test.sh and introduced a sleeptime of 5 hours(or 18,000 seconds).

Make sure below user has DBA privs so that he can query from v$log(to check the log sequence).


++++++Contents of test.sh +++++++++++++++++++++++++

#!/bin/ksh
sqlplus ops\$gchandra/test123 << EOF >test.log

select max(sequence#) from v\$log;
select to_char(sysdate,'dd-mon-yyyy hh24:mi:ss') from dual;
drop table testtab;
create table testtab (col1 number);
insert into testtab values(1);
execute dbms_lock.sleep(18000);
commit;
exit
EOF

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

I run the script in background with no interruption.
jupiter<oracle>/tmp> nohup ./test.sh &
[1]     28917




jupiter<oracle>/oracle/arch/athenadv/arch> cat /tmp/test.log

SQL*Plus: Release 11.2.0.3.0 Production on Thu Jun 7 20:29:25 2012

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options

SQL> SQL>
MAX(SEQUENCE#)
--------------
         15625

SQL>
TO_CHAR(SYSDATE,'DD-MON-YYYYH
-----------------------------
07-jun-2012 20:29:26

SQL>
Table dropped.

SQL>
Table created.

SQL>
1 row created.





The point to note is if the GG process is stopped or aborts at this point of time and restarted later, it needs
access to redo log from sequence 15625 . This is because uncomitted transactions are never written to the trail file
by golden gate. Only commited transactions are written.

There is something called bounded recovery which is basically a time period i.e 4 hours by default.
After every BR time interval , GG looks at the active transactions which qualify as old transactions ( i.e
they have been active for more than BR time period) and writes it to a temp location GG HOME/BR/<extract name> .

For example if the BR is set to 4 hours(default), and say a BR flush happens at 9am. At that time it scans all the active transactions
which started earlier than 5am (9am -4 hrs).
Say there were say 2 active transactions A and B  .Transaction A started at 5.01am and transaction B started at
3.59am. Therefore transaction B is flushed to the BR location and transaction A is ignored as it has not been running for 4 hours.

Therefore if the golden gate crashed just a few seconds before the next flushout i.e.1pm , then transaction A which started
at 5.01am is lost.Therefore when you restart the GG it needs archive logs generated from 5.01am.

So in this worst case scenario , it needs access to log files(or archive files) from 2 times BR interval.
So prepare to retain archivelogs (or have capacity to restore archive logs) for 2 X (BR Interval period).



Now via RMAN, I delete the archivelog 15625.
Please note that the transaction is still active but now if GG is stopped and restarted, it will not have access to archive sequence 15625
as we have deleted it via rman using a simple command

backup
filesperset = 4
format 'al_%d%s%t_%U'
archivelog until time 'sysdate'
delete all input;



Now let us see if bounded recovery comes to our rescue.

I have on purpose set BR to 20 minutes and will patiently wait for the BR to happen.
Then I will note the log sequence as of the BR . Ideally after this point if GG process is stopped and restarted, it should
not need the original log sequence 15625 but should need only the log sequence during the BR flushout.

The BR flushout happens (as shown by GG log at 8.50pm)

2012-06-07 20:50:20  INFO    OGG-01738  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  BOUNDED RECOVERY: CHECKPOINT: for object pool 1: p28822_extr: start=SeqNo: 15642, RBA: 14661648, SCN: 1932.1131459784 (8299008275656), Timestamp: 2012-06-07 20:50:19.000000, end=SeqNo: 15642, RBA: 14662656, SCN: 1932.1131459784 (8299008275656), Timestamp: 2012-06-07 20:50:19.000000.


Also see it here
jupiter<oracle>/oracle/software/goldengate/11.1.1.1.2/BR/XTATHDV> ls -lt
total 64
-rw-rw----   1 oracle     dba          65536 Jun  7 20:50 CP.XTATHDV.000000003
-rw-rw----   1 oracle     dba          65536 Jun  7 20:50 PO.XTATHDV.0000001.0000000
drwxrwxr-x   2 oracle     dba             96 Jun  7 15:54 stale



SQL> select sequence# from v$archived_log
  2   where to_date('07-jun-2012 20:50:00','dd-mon-yyyy hh24:mi:ss') between first_time and next_time;

 SEQUENCE#
----------
     15641

SQL> select sequence# from v$archived_log
  2    where to_date('07-jun-2012 20:50:59','dd-mon-yyyy hh24:mi:ss') between first_time and next_time;

 SEQUENCE#
----------
     15642


So at this point if I stop GG and restart I should need only from sequence 15641 or 15642 . To be on safer side, make
sure you have archive logs from sequence 15641.

let us test this !!!!


GGSCI (jupiter) 39> stop ext xtathdv

Sending STOP request to EXTRACT XTATHDV ...
Request processed.


Restart the extract and carefully observe the ggserr.log file


2012-06-07 20:53:46  INFO    OGG-00975  Oracle GoldenGate Manager for Oracle, mgr.prm:  EXTRACT XTATHDV starting.
2012-06-07 20:53:46  INFO    OGG-00992  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  EXTRACT XTATHDV starting.
2012-06-07 20:53:47  INFO    OGG-01639  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  BOUNDED RECOVERY:  ACTIVE: for object pool 1: p28822_extr.
2012-06-07 20:53:47  INFO    OGG-01640  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  BOUNDED RECOVERY: recovery start XID: 0.0.0.
2012-06-07 20:53:47  INFO    OGG-01641  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  BOUNDED RECOVERY: recovery start position: SeqNo: 15642, RBA: 14661648, SCN: 1932.1131459784 (8299008275656), Timestamp: 2012-06-07 20:50:19.000000.
2012-06-07 20:53:47  INFO    OGG-01642  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  BOUNDED RECOVERY: recovery end position: SeqNo: 15642, RBA: 14662656, SCN: 1932.1131459784 (8299008275656), Timestamp: 2012-06-07 20:50:19.000000.
2012-06-07 20:53:47  INFO    OGG-01579  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  BOUNDED RECOVERY: VALID BCP: CP.XTATHDV.000000003.
2012-06-07 20:53:47  INFO    OGG-01629  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  BOUNDED RECOVERY: PERSISTED OBJECTS RECOVERED: 1.
2012-06-07 20:53:50  INFO    OGG-01513  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  Positioning to Sequence 15642, RBA 14661648.
2012-06-07 20:53:51  INFO    OGG-01516  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  Positioned to Sequence 15642, RBA 14661648, Jun 7, 2012 8:50:19 PM.
2012-06-07 20:53:51  INFO    OGG-00993  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  EXTRACT XTATHDV started.
2012-06-07 20:53:51  INFO    OGG-01055  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  Recovery initialization completed for target file /oracle/goldengatedata/dirdat/at000005, at RBA 2213190.
2012-06-07 20:53:51  INFO    OGG-01478  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  Output file /oracle/goldengatedata/dirdat/at is using format RELEASE 10.4/11.1.
2012-06-07 20:53:51  INFO    OGG-01026  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  Rolling over remote file /oracle/goldengatedata/dirdat/at000005.
2012-06-07 20:53:51  INFO    OGG-01053  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  Recovery completed for target file /oracle/goldengatedata/dirdat/at000006, at RBA 1032.
2012-06-07 20:53:51  INFO    OGG-01057  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  Recovery completed for all targets.
2012-06-07 20:53:51  INFO    OGG-01517  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  Position of first record processed Sequence 15642, RBA 14661648, SCN 1932.1131459784, Jun 7, 2012 8:50:19 PM.
2012-06-07 20:53:51  INFO    OGG-01644  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  BOUNDED RECOVERY: COMPLETE: for object pool 1: p28822_extr at SeqNo: 15642, RBA: 14662672, SCN: 1932.1131459785 (8299008275657).
2012-06-07 20:54:02  WARNING OGG-00869  Oracle GoldenGate Capture for Oracle, xtathdv.prm:  No unique key is defined for table DV3_STATUS. All viable columns will be used to represent the key, but may not guarantee uniqueness.  KEYCOLS may be used to define the key.


If you look above it starts looking at archive logs only from sequence 15642. This is indeed a big relief as some transactions
run longer than 4 hours and if i left it to default I might have to restore as much as 8 hours  of archive logs.

1 comment: