Monday, May 4, 2015

upgrading to GI 12.1.0.2

Some weeks ago I upgraded the GI from 11.2.0.4 to 12.1.0.2. I had implemented the fix for the listener poisoning issue (you can read about it here). Here is the original listener.ora:
#CVE-2012-1675
VALID_NODE_CHECKING_REGISTRATION_LISTENER=1
VALID_NODE_CHECKING_REGISTRATION_LISTENER_DG=1
VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN1=1
REGISTRATION_INVITED_NODES_LISTENER_SCAN2=(x.y.z.61,x.y.z.64)
 
VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN2=1
REGISTRATION_INVITED_NODES_LISTENER_SCAN1=(x.y.z.61,x.y.z.64)
 
VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN3=1
REGISTRATION_INVITED_NODES_LISTENER_SCAN3=(x.y.z.61,x.y.z.64)
REGISTRATION_INVITED_NODES_LISTENER_DG=(172.20.20.72,172.20.20.73)

Here is the upgraded one:
#CVE-2012-1675
VALID_NODE_CHECKING_REGISTRATION_LISTENER=1
VALID_NODE_CHECKING_REGISTRATION_LISTENER_DG=1
VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN1=OFF             # line added by Agent
REGISTRATION_INVITED_NODES_LISTENER_SCAN2=()            # line added by Agent
 
VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN2=OFF             # line added by Agent
REGISTRATION_INVITED_NODES_LISTENER_SCAN1=()            # line added by Agent
 
VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN3=OFF             # line added by Agent
REGISTRATION_INVITED_NODES_LISTENER_SCAN3=()            # line added by Agent
REGISTRATION_INVITED_NODES_LISTENER_DG=(172.20.20.72,172.20.20.73)
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER_CLONE=ON                # line added by Agent
VALID_NODE_CHECKING_REGISTRATION_LISTENER_CLONE=SUBNET          # line added by Agent
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_MGMTLSNR=ON              # line added by Agent
VALID_NODE_CHECKING_REGISTRATION_MGMTLSNR=SUBNET                # line added by Agent

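The installer / upgrade process resets these entries: for the SCAN listeners, valid node checking is switched OFF and the invited-nodes lists are emptied. I saw similar behaviour with dbca when you have IFILE entries in the tnsnames.ora; most annoyingly, those entries get removed as well ....

NOTE: I anonymized the IP addresses.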

Sunday, April 19, 2015

OLTP compression and Updates



I did some small tests (actually more troubleshooting) with OLTP compression a couple of weeks back.

Some execution times nearly doubled. I wanted to get a clear understanding of what happened, so I asked to trace the procedure.

That was easy, since dbms_application_info.set_module could be used in the procedure to set the module and the action.

This made it much less cumbersome to enable the trace: the dbms_monitor package has a procedure that starts tracing whenever the combination of service_name and module is used. Very handy.

So in the end I used this to trace:


DBMS_MONITOR.SERV_MOD_ACT_TRACE_ENABLE('SERVICE_NAME','MODULE');


That made it easy to trace as soon as the procedure was launched, and it avoided playing around with logon triggers and tracing more than necessary.
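For completeness, here is a minimal sketch of the whole cycle: tagging the session, switching the trace on, checking it, and switching it off again. 'SERVICE_NAME', 'MODULE' and 'ACTION' are placeholders (the real names are not in this post), and the waits/binds settings are just one reasonable choice, not necessarily what was used here.

```sql
-- In the procedure to be traced: tag the session with a module and an action.
BEGIN
  DBMS_APPLICATION_INFO.SET_MODULE(
    module_name => 'MODULE',   -- placeholder
    action_name => 'ACTION');  -- placeholder
END;
/

-- As DBA: trace every session that uses this service / module combination.
BEGIN
  DBMS_MONITOR.SERV_MOD_ACT_TRACE_ENABLE(
    service_name => 'SERVICE_NAME',  -- placeholder
    module_name  => 'MODULE',
    waits        => TRUE,    -- include wait events in the trace
    binds        => FALSE);  -- leave bind values out
END;
/

-- Check which traces are currently enabled.
SELECT trace_type, primary_id, qualifier_id1, waits, binds
FROM   dba_enabled_traces;

-- Switch the trace off again once the test run is done.
BEGIN
  DBMS_MONITOR.SERV_MOD_ACT_TRACE_DISABLE(
    service_name => 'SERVICE_NAME',
    module_name  => 'MODULE');
END;
/
```

Forgetting the disable call means every later session with that service and module keeps writing trace files, so it is worth checking DBA_ENABLED_TRACES afterwards.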

I did the tests both with a warmed-up buffer cache and with a flushed one, and the results were in the same league. I will do them again next week.


But back to the problem: the update was much slower on the compressed table.


I checked both:



NONCOMPRESSED

45810 rows updated.
Elapsed: 00:00:01.59

COMPRESSED

45810 rows updated.
Elapsed: 00:00:04.38


Quite a difference ...



The query looks a bit like this (I had to obfuscate it for privacy reasons):

UPDATE TAB1 SET
  (COLUMNS ....)
  = (SELECT COLUMNS ....
     FROM TAB2
     WHERE TAB2.ID = TAB1.TAB2_ID)
WHERE TAB2.DATE_FIELD = TO_DATE('31/03/2015', 'DD/MM/YYYY')







This is what the trace file looks like after being processed with the Method R Profiler, a tool I can really recommend. It saves me lots of time, time you can use to solve your performance issue instead of crawling through raw trace files ...


Here is the uncompressed trace:



Here is the compressed trace:


Many more blocks in current mode for the compressed update.
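A rough way to cross-check the current-mode block counts without a profiler is to snapshot a few session statistics right before and right after the update and compare the values. This is only a sketch and not how the numbers in this post were obtained; it assumes the session may query V$MYSTAT and V$STATNAME, and the list of statistics is my own pick.

```sql
-- Run in the session that executes the UPDATE, once before and once after,
-- then compare the two result sets.
-- 'db block gets'   = blocks read in current mode
-- 'consistent gets' = blocks read in consistent mode
SELECT n.name, s.value
FROM   v$mystat   s
JOIN   v$statname n ON n.statistic# = s.statistic#
WHERE  n.name IN ('db block gets',
                  'consistent gets',
                  'db block changes',
                  'redo size')
ORDER  BY n.name;
```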


In both cases the last step of the execution plan takes quite a bit of time:

the last step of the compressed update took ~ 3,25 sec (4,37 - 0,34 ....)
the last step of the non-compressed update took ~ 0,77 sec

This is so hard to believe that I will redo the tests next week and add them to the post.

Thursday, April 2, 2015

12c adr support

I noticed that with the upgrade of GI to 12c there are some new things in ADRCI:
the SCAN listeners are now under adrci control as well.
In 12c (12.1.0.2.4):

adrci

ADRCI: Release 12.1.0.2.0 - Production on Tue Mar 31 18:17:21 2015

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

ADR base = "/u01/app/grid"
adrci> show control
DIA-48448: This command does not support multiple ADR homes

adrci> show homes
ADR Homes:
diag/tnslsnr/node2/listener_clone
diag/tnslsnr/node2/listener_scan2
diag/tnslsnr/node2/listener_scan1
diag/tnslsnr/node2/listener
diag/tnslsnr/node2/listener_scan3
diag/tnslsnr/node2/listener_dg
diag/tnslsnr/node2/mgmtlsnr
diag/rdbms/_mgmtdb/-MGMTDB
diag/crs/node2/crs
diag/asm/+asm/+ASM2
diag/asm/user_grid/host_4252752997_82
adrci> set home diag/tnslsnr/node2/listener


On 11g this was not the case. On 11.2.0.4:

ADRCI: Release 11.2.0.4.0 - Production on Tue Mar 31 18:21:18 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

ADR base = "/u01/app/grid"
adrci> show homes
ADR Homes:
diag/asm/+asm/+ASM2
diag/tnslsnr/other_node2/listener
diag/tnslsnr/other_node2/listener_dg
diag/tnslsnr/other_node2/listener_clone

Tuesday, March 31, 2015

assumption is the mother of ....

We had a complaint about a batch job that took much longer to complete on environment A than on environment B.

I checked, and indeed the plans were different.

As we had recently updated environment A from 11.2.0.4 BP12 to BP15, my first idea was that this caused the issue. To rule out everything else, I imported the table stats from the working environment B into environment A.
Still I got the same plan.
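Copying table statistics between environments can be done with DBMS_STATS and a staging table. The sketch below shows the general idea only; APP_OWNER and STATS_XFER are made-up names, and how the staging table gets from B to A (Data Pump, database link, ...) is left open.

```sql
-- On environment B (the one with the good plan): stage the statistics.
BEGIN
  DBMS_STATS.CREATE_STAT_TABLE(
    ownname => 'APP_OWNER',    -- hypothetical schema
    stattab => 'STATS_XFER');  -- hypothetical staging table
  DBMS_STATS.EXPORT_TABLE_STATS(
    ownname => 'APP_OWNER',
    tabname => 'TABLE_X',
    stattab => 'STATS_XFER',
    cascade => TRUE);          -- include column and index statistics
END;
/

-- Copy APP_OWNER.STATS_XFER to environment A, then load it there.
BEGIN
  DBMS_STATS.IMPORT_TABLE_STATS(
    ownname => 'APP_OWNER',
    tabname => 'TABLE_X',
    stattab => 'STATS_XFER',
    cascade => TRUE);
END;
/
```

Identical statistics with different plans is already a hint that something other than the statistics differs between the two environments.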


I checked the system stats; they were the same.
I used Mauro Pagano's excellent SQLd360 tool (get it here) and couldn't see anything (that's ENTIRELY my fault: I couldn't really focus on this and was multitasking a lot due to other constraints ;-)).


I ran a 10053 optimizer trace and saw that there were fix controls that were not used on env B, and that only full table scans (FTS) were considered in the SINGLE TABLE ACCESS PATH section on env A .... bizarre.
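For reference, a 10053 trace for a single statement can be captured roughly like this. The trace file identifier and the query are placeholders (the real statement is not shown in this post); note also that EXPLAIN PLAN with literals may not produce exactly the plan the batch job gets with bind variables.

```sql
-- Tag the trace file name so it is easy to find (identifier is illustrative).
ALTER SESSION SET tracefile_identifier = 'cbo_table_x';

-- Switch the optimizer trace on for this session.
ALTER SESSION SET EVENTS '10053 trace name context forever, level 1';

-- The trace is only written on a hard parse; EXPLAIN PLAN forces one.
-- Placeholder statement: substitute the real problem query.
EXPLAIN PLAN FOR
SELECT * FROM table_x WHERE col2 = 1 AND col3 = 1;

-- Switch the trace off again.
ALTER SESSION SET EVENTS '10053 trace name context off';

-- Location of the trace file for this session.
SELECT value FROM v$diag_info WHERE name = 'Default Trace File';
```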

Environment A

SINGLE TABLE ACCESS PATH 
  Single Table Cardinality Estimation for TABLE_X[TABLE_X
  Column (#2): 
    NewDensity:0.000210, OldDensity:0.000476 BktCnt:254, PopBktCnt:2, PopValCnt:1, NDV:4725
  Column (#2): COL2(
    AvgLen: 7 NDV: 4725 Nulls: 0 Density: 0.000210 Min: 113898 Max: 814396873
    Histogram: HtBal  #Bkts: 254  UncompBkts: 254  EndPtVals: 254
  Column (#3): 
    NewDensity:0.000019, OldDensity:0.001339 BktCnt:254, PopBktCnt:18, PopValCnt:9, NDV:48180
  Column (#3): COL3(
    AvgLen: 7 NDV: 48180 Nulls: 0 Density: 0.000019 Min: 113328 Max: 816880773
    Histogram: HtBal  #Bkts: 254  UncompBkts: 254  EndPtVals: 246

  Table: TABLE_X  Alias: TABLE_X
    Card: Original: 633107.000000  Rounded: 1  Computed: 0.00  Non Adjusted: 0.00
  Access Path: TableScan
    Cost:  1113.98  Resp: 1113.98  Degree: 0
      Cost_io: 1100.00  Cost_cpu: 180830851
      Resp_io: 1100.00  Resp_cpu: 180830851
  Best:: AccessPath: TableScan
         Cost: 1113.98  Degree: 1  Resp: 1113.98  Card: 0.00  Bytes: 0

While on the other env B

SINGLE TABLE ACCESS PATH 
  Single Table Cardinality Estimation for TABLE_X[TABLE_X
  Column (#2): 
    NewDensity:0.000210, OldDensity:0.000476 BktCnt:254, PopBktCnt:2, PopValCnt:1, NDV:4725
  Column (#2): COL2(
    AvgLen: 7 NDV: 4725 Nulls: 0 Density: 0.000210 Min: 113898 Max: 814396873
    Histogram: HtBal  #Bkts: 254  UncompBkts: 254  EndPtVals: 254
  Column (#3): 
    NewDensity:0.000019, OldDensity:0.001339 BktCnt:254, PopBktCnt:18, PopValCnt:9, NDV:48180
  Column (#3): COL3(
    AvgLen: 7 NDV: 48180 Nulls: 0 Density: 0.000019 Min: 113328 Max: 816880773
    Histogram: HtBal  #Bkts: 254  UncompBkts: 254  EndPtVals: 246
  ColGroup (#1, Index) IX_4
    Col#: 4 5    CorStregth: -1.00
  ColGroup Usage:: PredCnt: 2  Matches Full:  Partial: 
  Table: TABLE_X  Alias: TABLE_X
    Card: Original: 633107.000000  Rounded: 1  Computed: 0.00  Non Adjusted: 0.00
  Access Path: TableScan
    Cost:  1113.55  Resp: 1113.55  Degree: 0
      Cost_io: 1100.00  Cost_cpu: 180830851
      Resp_io: 1100.00  Resp_cpu: 180830851
  Access Path: index (AllEqRange)
    Index: IX_2
    resc_io: 9.00  resc_cpu: 69774
    ix_sel: 0.000019  ix_sel_with_filters: 0.000019 
    Cost: 9.01  Resp: 9.01  Degree: 1
  Access Path: index (AllEqRange)
    Index: IX_3
    resc_io: 28.00  resc_cpu: 257919
    ix_sel: 0.000210  ix_sel_with_filters: 0.000210 
    Cost: 28.02  Resp: 28.02  Degree: 1
  ****** trying bitmap/domain indexes ******
  Access Path: index (AllEqRange)
    Index: IX_2
    resc_io: 3.00  resc_cpu: 23964
    ix_sel: 0.000019  ix_sel_with_filters: 0.000019 
    Cost: 3.00  Resp: 3.00  Degree: 0
  Access Path: index (AllEqRange)
    Index: IX_3
    resc_io: 3.00  resc_cpu: 47964
    ix_sel: 0.000210  ix_sel_with_filters: 0.000210 
    Cost: 3.00  Resp: 3.00  Degree: 0
  Bitmap nodes:
    Used IX_2
      Cost = 3.001919, sel = 0.000019
    Used IX_3
      Cost = 3.004938, sel = 0.000210
  Access path: Bitmap index - accepted
    Cost: 6.007407 Cost_io: 6.000536 Cost_cpu: 91721.496173 Sel: 0.000000
    Not Believed to be index-only
  ****** finished trying bitmap/domain indexes ******
******** Begin index join costing ********
  Access Path: index (FullScan)
    Index: IX_1
    resc_io: 1439.00  resc_cpu: 136869152
    ix_sel: 1.000000  ix_sel_with_filters: 1.000000 
    Cost: 1449.25  Resp: 1449.25  Degree: 0



Being in a hurry because I was busy with plenty of stuff, I assumed that fix_control was the problem.

However, setting fix_control didn't change a thing, so I went back to check the trace with a colleague, and then we saw the following:


Table Stats::
  Table: TABLE_X  Alias: TABLE_X
    #Rows: 633107  #Blks:  4056  AvgRowLen:  43.00  ChainCnt:  0.00
Index Stats::
  Index: IX_1  Col#: 1
    LVLS: 2  #LB: 1437  #DK: 633107  LB/K: 1.00  DB/K: 1.00  CLUF: 37633.00
  Index: IX_2  Col#: 3
    LVLS: 2  #LB: 1665  #DK: 48180  LB/K: 1.00  DB/K: 5.00  CLUF: 287723.00
    INVISIBLE
  Index: IX_3  Col#: 2
    LVLS: 2  #LB: 2103  #DK: 4725  LB/K: 1.00  DB/K: 25.00  CLUF: 118278.00
    INVISIBLE
  Index: IX_4  Col#: 4 5
    LVLS: 2  #LB: 2490  #DK: 14906  LB/K: 1.00  DB/K: 2.00  CLUF: 36217.00
    INVISIBLE



We checked the status of the indexes: they were INVISIBLE. When they were put back to VISIBLE, the correct plan was chosen again ...
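The basic check that would have saved the detour looks something like this. APP_OWNER is a placeholder schema; the index names are the obfuscated ones from the trace above.

```sql
-- List the indexes on the table together with their visibility.
SELECT owner, index_name, visibility, status
FROM   dba_indexes
WHERE  table_name = 'TABLE_X'
ORDER  BY index_name;

-- Make the invisible ones visible to the optimizer again.
ALTER INDEX app_owner.ix_2 VISIBLE;
ALTER INDEX app_owner.ix_3 VISIBLE;
ALTER INDEX app_owner.ix_4 VISIBLE;
```

Running the first query on both environments and comparing the output shows the difference right away.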


Moral of the story: as one of my previous managers (probably the best I have had so far, but that is another story) said, assumption is the mother of all f*ck ups. Don't think that everything is a complicated problem: check the basic things first, take your time to check them, and don't do 50 things at the same time. Men can't multitask anyway ;-------------)
Luckily, in the end it didn't take too much time to identify the issue...



Thanks @MartinDBA, @Mautro and everyone interacting in the discussion on twitter


Saturday, March 14, 2015

wow what a seminar

Wow, it's behind us again: a great OUGN Vårseminar 2015. Big shout-out to the boa(r)(t)d for making this a success again.


You've set the bar very high for the coming years, and I hope to be able to be part of future editions!
It was a very good idea to put the exhibition hall on a different floor from the talks, giving more breathing room around the rooms.

Apart from the fact that a seminar on a boat creates a special bond with the people aboard, the quality of the talks was really very good. I learned plenty of things...

I really enjoyed being on a boat with people who are passionate about their job. In fact, for most of them it is more than just a job; it is a passion, a vocation, ...

It was great to meet up with old friends and make new ones.

My personal highlights, in no particular order, were the sessions of:

Neil Johnson (@neiljdba) about contentious small tables: really a very good presentation, and I learned a lot about the Hakan factor. I strongly recommend attending this session at other conferences this year; besides that, Neil is a great guy to hang out with. Unfortunately I missed his talk about jumbo frames... but the good news is that I already found his slides ;)

Kellyn Potvin-Gorman, also known as @DBAKevlar, talking about AWR Warehouse. Thanks to her presentation I finally know that this comes with a "free" limited license if you have licensed the Tuning and Diagnostics packs for your databases. I will start pushing for it at my current customer, because it can really save your bacon and lots of discussions. Her talk about EM 12c was also great, by the way, even though she was pretty ill, suffering from her sinuses, so really well done.

Jan Karremans’ session ‘ok now my database crashed’ was a very good reminder on how important it is to take this part of your oracle installation not too lightly, btw Jan is pretty (hyper) active on twitter ;-) here (@johnnyq72)

James Morle's (@JamesMorle) session about optimizing table scanning and how to tune all layers of the cake, even the blue ones.

Luis Marques' (@drune) session about RAT was of very high quality, as was his talk about Resource Manager; this is a subject that needs more attention, especially in the current spirit of consolidation.

I really enjoyed the presentations about the broker and FRA by the 'Instructor', aka Uwe Hesse (@UweHesse). I envy his courage in doing a complete talk with live demos.

I had a blast doing the SE Round Table BE (boat edition) with SE lady Ann Sjokvist (@annsjokvist) and my good friend Jan Karremans. I liked the chemistry that was going on on stage.
SE can be a good solution for you; it depends on your business needs, and it is an easy step up to EE when you later need those features. Each of the three of us looked at the product from a different angle.
Ann really has lots of experience with it and is a real advocate, Jan looked at it from the business side, and I, for my part, am a spoiled EE DBA who realises that you need to do correct size matching and see if the ...

I had my own talk about a great, underestimated product in the engineered systems line-up: the SPARC Super Cluster. I've been working with this product since the beginning of 2014, and for me it is very much the Swiss Army knife of the engineered Exadata systems, providing unparalleled versatility and flexibility.
I liked the questions and the offline discussion with Jacco Landlust after my talk about Super Cluster.

I would like to explicitly thank Oyvind Isene (@OyvindIsene) and Kjell Tore: for many of us international speakers, you guys are the public face of OUGN. Speaking for myself, if I hadn't come to OUGN as a delegate in 2013 after talking to Oyvind at #OOW2012, I wouldn't be speaking now, so Oyvind, I owe you a lot. I hope the next board will continue your work and the investment of blood, sweat and tears you two made.

I would also like to thank portrix (Florian, Connie and Bjoern (@brost)) for the great gin / buffet you organised in Kiel. It is clearly becoming a tradition.

Too bad I missed several sessions due to overlap ;-(


Nice meeting you all girls and guys

Thank you 

Jan Karremans
Heli Helskyaho ;-)
Debra Lilley
Brendan Tierney
Oyvind Isene
Kjell Tore
Frits Hoogland
Christian Antognini
Neil Johnson
Kellyn & Tim Gorman
Gurcan Orhan
Bjoern Roest Florian 
Andy Colvin & wife
James Morle
Luis Marques
Kelly Potvin
Tim Gorman
Jacco Landlust
Alex Nuijten
Patrick Barrel
Magnus Fagertun
Mark Rittman
Eric Van Roon



Thank you all for making my OUGN 15 so great

If you haven't decided yet, maybe this small video and some pictures (sorry, I am not so good with an iPhone; I am a DSLR user) can persuade you to attend or submit for OUGN16.


video












Saturday, February 14, 2015

patching super cluster


We patched our X4 cells to the latest and greatest version, 12.1.2.1.0, last weekend.
We had some issues with a cell that wasn't reachable anymore during patching; luckily, after an init 6 of that cell the patching continued.
Unfortunately the default 3.6 h ASM disk repair timeout was reached and the disks were dropped from the disk group…
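The 3.6 hours come from the disk_repair_time disk group attribute. Checking it before a patching window, and temporarily raising it, could look like the sketch below. The disk group name DATA and the value '8h' are only examples, and whether a longer timeout would have been enough in this case is an assumption on my side.

```sql
-- On the ASM instance: show the current repair timeout per disk group.
SELECT g.name AS diskgroup, a.value AS disk_repair_time
FROM   v$asm_diskgroup g
JOIN   v$asm_attribute a ON a.group_number = g.group_number
WHERE  a.name = 'disk_repair_time';

-- Raise it for the patching window (example disk group and value).
ALTER DISKGROUP data SET ATTRIBUTE 'disk_repair_time' = '8h';
```

After patching, the attribute can be set back to its previous value in the same way.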

The next day we needed to upgrade DEV: first the GI upgrade from 11.2.0.4 to 12.1.0.2, followed by the 12.1.0.2.4 patch of January… then we upgraded the DBs from 11.2.0.4.12 to BP15 (11.2.0.4.15).

All went OK, with just one big caveat (it is in the patch notes): once you upgrade to GI 12c, opatch auto no longer works for databases lower than 12c, so we had to apply those patches manually without opatch auto.

So after a long day all databases were patched (long mostly because I could only start late; it takes a while to stop all applications)…

The next day, however, ASM had generated 64 GB of core dumps.

Support quickly found that this was related to:

  Bug 20313024 - Exadata Solaris: ORA-7445 [ossdisk_ioctl_compl] on XDMG startup with 12.1.0.2.4 DBBP ( Doc ID 20313024.8 ) 


But the worst was still to come.

The unlocking of the GI and the apply of the patch went fine, however

 /u01/app/grid/product/12.1.0/grid/crs/install/roothas.pl -patch  didn't complete…….

Not knowing if I could stop this, I opened an SR, where they told me that it could be done. Re-executing didn't change anything.

After a while I found info in the logs that pointed to EVM issues, and I passed this observation on to support.

I also told them that I had found old(er) files in /var/tmp/.oracle and asked if I could remove them.
They said no several times (however, apparently I already blogged about this years ago here ;)).
In the meantime I had to raise the severity to 1 because of an upcoming dev deadline; the system had already been down 1 day longer than expected ...
I had some phone calls with support in the US; one guy didn't know that the logging had moved from $GRID_HOME/log to $ORACLE_BASE/diag ....
I asked them again if I could remove the .oracle files, but again the answer was no, because I could break things.

The next morning I got contacted by support and pointed again to the .oracle files, and this time he said that it was OK to remove them.
I did, and I could execute roothas.pl -patch after rebooting with the ohas entries in inittab disabled and removing those files in /var/tmp/.oracle.


The point is that I mentioned this very early in the SR and nobody listened....

Anyway, I was happy that everything was up and running again. I hope that blogging about this will help other people, and that I will remember it the next time I have this issue ;-)

Friday, January 30, 2015

Rac One Node and Data Guard Broker some thoughts on bugs, notes.....


Currently we are moving user acceptance testing db's from regular single instances with ZFS cross-site mirroring to Super Cluster with RAC One and Data Guard.

The initial setup of this was finished a couple of months ago.

Apart from Martin Bach's (@MartinDBA) suggestions here and here, and in real life together with Marcin Przepiorowski (@pioro) in Ireland in March last year, I also used the following Metalink note:



Bug 17781373 : ORAAGENT DOES NOT SET LOCAL_LISTENER WHEN LISTENER_NETWORKS IS SET IN SPFILE

This looked exactly like what happened; I could easily reproduce it by removing listener_networks.

The suggested workaround was to set the node VIP as local listener, but that wasn't what the DG broker wanted, because its static connect identifier was set to this VIP ... and I want to avoid at all costs having to modify it after role changes or relocations of the RAC One instances.

So I finally changed the LISTENER_NETWORKS parameter from


((NAME=network1)(LOCAL_LISTENER=RACONE)(REMOTE_LISTENER=scan01-uat:1521)), ((NAME=network_dg)(LOCAL_LISTENER=DG_VIP)(REMOTE_LISTENER=REMOTE_NET2))

To this :


((NAME=network1)(LOCAL_LISTENER=NODE_VIP,RACONE)(REMOTE_LISTENER=scan01-ora:1521)), ((NAME=network_dg)(LOCAL_LISTENER=DG_VIP)(REMOTE_LISTENER=REMOTE_NET2))




Note that NODE_VIP , RACONE , DG_VIP and REMOTE_NET2 resolve differently on each node in the RAC One Cluster.

E.g. on node1 this maps to:


DG_VIP, NODE01_LOCAL_NET2=
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = node01dg-vip)(PORT = 1522))
  )


On node2:
DG_VIP, NODE02_LOCAL_NET2=
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = node02dg-vip)(PORT = 1522))
  )
...

Also, if you use alter system to change this, don't forget the quotes around each network entry, and make sure the parentheses are balanced.

E.g.

alter system set listener_networks='((NAME=network1)(LOCAL_LISTENER=NODE_VIP,RACONE)(REMOTE_LISTENER=scan01-ora:1521))', '((NAME=network_dg)(LOCAL_LISTENER=DG_VIP)(REMOTE_LISTENER=REMOTE_NET2))';


While discussing some other things with my ex-colleague Freek D'Hooge (you can follow him on twitter and on his blog), he pointed me to a note:

How to Configure A Second Listener on a Separate Network in 11.2 Grid Infrastructure (Doc ID 1063571.1)

which says the following:

" Listeners specified by the LISTENER_NETWORKS parameter should not be used in the LOCAL_LISTENER and REMOTE_LISTENER parameters. Otherwise, cross registration will happen and connections will be redirected cross networks."


However, this is not explicitly mentioned in the first note mentioned in this post.
For the time being I left the remote_listener as is, and I will test on newly created databases before removing it from the already migrated UAT databases...
Thanks Freek, Martin and Marcin for your insights ;-) you guys rock!



PS: the host names in this blog post are made up for privacy reasons.