Sunday, September 30, 2012

Oow day 1

Score so far 2 great sessions, 1 disapointing and more basic than the title implied and but i will not disclose which one. ;)

Christo Kutrovsky s session was really great with humor and up to speed

"The Answers to Free Memory, Swap, Oracle Database, and Everything"

"Resolving Child Cursor Issues Resulting in Mutex Waits" by. Martin Klier was very very interesting...

If you have the chance to follow these sessions at other conferences please do.

More impressions from SFO

Some pics of today in chronological order





some cat racing ....


and then we saw this beauty pasing by .




















Saturday, September 29, 2012

First impressions of sfo and oow

Some impressions of oracle openworld and san francisco... More to come in next days...

Some streets will be closed during the convention, not everybody is very happy with this :)

Just got bag and pass










Friday, September 28, 2012

Logical standby how to use for upgrading (2)


In my previous article i blogged about the use of logical standby  for upgrading purposes.

As said the client wants to have a physical standby per site at all times during the upgrade process and to have a limited downtime. 
So that the production database will nourish the logical standby which we can upgrade to 11.2.0.3 and which in it's turn will feed the redo it generates to other physical standby …..


I will not go through the documentation here but describe what we did.


i would like to refer to an excellent paper  which I followed but adapted to our situation


Database Rolling Upgrade Using Transient Logical Standby
Oracle Database 10g Release 2
Oracle Maximum Availability Architecture White Paper August, 2010

which can be found here  





It was decided not to do the upgrade in a Rolling fashion but to put a parallel environment in place, that way there is always a fallback. we didn't have to buy new hardware to do that but made use of containers in solaris ... 
Since we are not doing a rolling upgrade this means that we didn't use the KEEP identity feature of logical standby but that the logical standby received a new database namen with its own db id


so briefly explained you start of with a physical database
and then you convert it to a logical standby.


the DB_UNIQUE_NAME of the logical standby will be PRODSITEB



First step after you checked the physical standby is following correctly, all datatypes are supported ...  is to stop the application of redo on the physical standby



    SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  CANCEL;

on the primary db you have to build a dictionary in the redologs and also enable supplemental loging this is done by executing following



    SQL> begin
           dbms_logstdby.build();
         end;
         /



==> this makes use of flashback query make sure the undo_retention is at least 3600
==> this can have some nasty consequences so best to do it in off peak periods Alter Database Add Supplemental Log Data Hangs [ID 406498.1] I scheduled a maintenance window to do this to avoid all risk.

A logical standby will generate redo from its  own so make sure you create a directory, since we use ASM i created a directory with asmcmd

ASMCMD>cd +ARCHDG1/PRDSITEB
ASMCMD>mkdir standbyARCHIVE


on the future logical standby I also put following :

SQL> alter system set log_archive_Dest_2='LOCATION=+ARCHDG1/PRDSITEB/standbyarchive VALID_FOR=(STANDBY_LOGFILES,STANDBY_ROLE) DB_UNIQUE_NAME=PRODSITEB';


This way the redo logs generated by the primary and source redo logfiles for mining  are put here.


the redo log files that are generated by reconstructing the sql statements are put in a different place as you can see here


 SQL>alter system set log_archive_dest_1='LOCATION=+ARCHDG1 VALID_FOR=(ONLINE_LOGFILE,STANDBY_ROLE)';


this took me a while to get, because it sounds a bit contradictory, ONLINE_LOGFILE in a STANDBY_ROLE but if you think about it, it makes sense ...


once this was set we actually converted the physical standby to logical standby



SQL> ALTER DATABASE RECOVER TO LOGICAL STANDBY DB11G

SQL> BEGIN
DBMS_LOGSTDBY.APPLY_SET(„LOG_AUTO_DELETE‟,‟FALSE‟);
END;



Now is time to really open the logical standby

SQL> ALTER DATABASE OPEN RESETLOGS; 

start mining ….

SQL> ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE;

check if that works
Now you do whatever you want with the logical standby, e.g. upgrade it but first stop logical apply


SQL> ALTER DATABASE STOP LOGICAL STANDBY APPLY ;



After the upgrade finished I duplicated the standby via rman …. from active database … see here for more info about that great feature…


to ship the redo that the logical standby generated following log_Archive_dest needs to be defined;


SQL> ALTER system set log_archive_dest_3='SERVICE=PHYSTDB LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,STANDBY_ROLE)  DB_UNIQUE_NAME=PHYSTDB'




done …

cave eat and known issues :
  • FAL and correct settings for the log_Archive_dests are two different things, sometimes FAL works but not regular sending for redo, in fact it manifested itself after putting SQL apply on, FAL kicked in but afterwards the primary didn't send the redo ….thanks to @GeertDepaep my buddy for pointing this out
  • our primary database version 10203 was pretty bug prone regarding to logical standby

I encountered following bugs :

  • 3313487 : A failed Create Table As Select (CTAS) causes SQL Apply to shutdown on logical standby with ORA-95
  • 5685296: Apply process stops with PLS-0103
  • also other one unfortunately no reference to that were indexes were being created twice,at least that is what was tried but which failed miserably … solution was to skip the transaction altogether..

         





Wednesday, September 12, 2012

rman duplicate from active database for standby cave-eats

today I started to clone the future production 11.2.0.3.3 database which is currently a logical standby to different physical standby's this way i have a parallel environment ready and downtime will be minimal for the actual upgrade.

I used rman and a new feature duplicate target database from active database.


Prerequisites :

1. make a static listener entry for you standby
2. copy over the password file
3. create a minimal init.ora to be able startup nomount  (DB_NAME,DB_BLOCKSIZE,DB_UNIQUE_NAME...)

Test to  make sure you can connect to the primary ( in my case my logical standby ) and vice versa
you connect to the target database.


the manual recommends to use an spfile so that way you could do following



connect target sys@prdsitex
connect auxiliary sys@prdsiteyrun 
{
allocate channel prmy1 type disk;
allocate channel prmy2 type disk;
allocate channel prmy3 type disk;
allocate channel prmy4 type disk;
allocate channel prmy5 type disk;
allocate auxiliary channel stby type disk;
duplicate target database for standby from active database
spfile 
  set db_unique_name='PRDSITEY' 
set fal_client='PRDSITEY'
  set fal_server='PRDSITEX' 
set log_archive_config='dg_config=(PRDSITEX,PRDSITEY)' 
 nofilenamecheck;
}






This didn't work out in my case : since I was still running in compatible 10.2.0.3.0 on the logical standby (my primary) i received following error ;


startup failed: ORA-32012: SPFILE format is inconsistent with value of COMPATIBLE parameter

I circumvented this by creating an init.ora for the standby however it was a procession of echternach (3 steps forward and 2 backwards) to get it working.

i had issues with DB_FILES not correctly set,  remote_listener (this is a setup) that couldn't be resolved, log_archive_config not correctly set, .... all things that are arranged automatically when you are able to use an spfile .... 

in the end i put following in the init.ora


db_unique_name=PRDSITEY
compatible='10.2.0.3.0'
db_name=PRD
db_block_size=8192
service_names=PRDSITEY.eurid.eu
remote_login_passwordfile=EXCLUSIVE
CONTROL_FILES='+DATADG1/prdsite/controlfile/current1.ctl'parameter_value_convert 'PRDSITEX','PRDSITEY'
fal_client='PRDSITEY'
fal_server=PRDSITEX'
log_archive_config='dg_config=(PRDSITEX,PRDSITEY)'
remote_listener='racsiteyprd-scan:1521'
db_files=1024
sga_max_size=32G
db_cache_size=20G
large_pool_size=1G
shared_pool_size=4G
pga_aggregate_target=16G
recovery_parallelism=16



NOTE one error here : 
I copy pasted parameter_value_convert 'PRDSITEY.... but without the = sign .... 
since it comes after the controlfile section it was seen as names of control files... I noticed files named parameter_value_convert, PRDSITEY, PRDSITEX in my $ORACLE_HOME/dbs directory the default place.


this worked out for me the controlfile was copied over and the new db restarted and was mounted awaiting the restore of the datafiles (which is currently still in progress)



run 
{
allocate channel prmy1 type disk;
allocate channel prmy2 type disk;
allocate channel prmy3 type disk;
allocate channel prmy4 type disk;
allocate channel prmy5 type disk;
allocate auxiliary channel stby type disk;
duplicate target database for standby from active database
nofilenamecheck;
}


note : i changed the names of db's for privacy issues 

thx to my twitter buddies @ik_zelf and @leight0nn for the nice moral support and tips ;-)

and here


Tuesday, September 11, 2012

services in 11.2.0.3 with logical standby

Today i was experimenting with services to make them run on a preferred node and with an available node in case of issues.


I created the service as follows


srvctl add service -d   PRDCLU -s myservice -r PRDN2 -a PRDN1 -y AUTOMATIC 

for some reason the service didn't failover to the other service when i performed a shutdown abort.

I experimented a bit with the different options to create a service but the results remained the same.

I googled a bit and came on the site of Freek  and found this interesting article Freek noticed the same behaviour as I.

his article pointed to note 1324574.1 – “11gR2 RAC Service Not Failing Over To Other Node When Instance Is Shut Down”.

This made me think that i hit a bug or something. 

But then I  saw that there was a parameter with srvctl add service to indicate the role of the database the -l option .

I had a look in the 

oraagent_oracle.log


2012-09-11 17:14:14.685: [ora.prd.myservice.xxx.xxx.svc][45] {2:39545:58589} [start] SvcAgent::start: not starting service myservice.xxx.xxx. Role mismatch - Service role: PRIMARY, current DB role: LOGICAL_STANDBY



and my assumptions were confirmed

just creating the service with the -l option solved my problem

srvctl add service -d   PRDCLU -s myservice -r PRDN2 -a PRDN1 -y AUTOMATIC -l logical_standby


The service fails now over correctly.

Thanks Freek for the nice article