Friday, April 20, 2018

Setting up standby of PDB 12.2.0.1.170818

We have a single-tenant production database.

We wanted to create a standby database for it.

The duplicate itself worked perfectly fine with this RMAN script:





# Open the wallet so the encrypted data can be read during the duplicate
SET DECRYPTION WALLET OPEN IDENTIFIED BY "password";
run
{
  # Four disk channels on the primary (target) side
  allocate channel target01 type disk;
  allocate channel target02 type disk;
  allocate channel target03 type disk;
  allocate channel target04 type disk;
  # Four disk channels on the standby (auxiliary) side
  allocate auxiliary channel aux01 type disk;
  allocate auxiliary channel aux02 type disk;
  allocate auxiliary channel aux03 type disk;
  allocate auxiliary channel aux04 type disk;
  duplicate target database for standby from active database password file section size 2000M;
}
However ... my datafiles were not put in the right place on the standby. On the primary they had this structure:
+DATA/db_unique_name/DATAFILE

+DATA/db_unique_name/pdb_guid/DATAFILE

On the standby, on the other hand, everything, including the PDB datafiles, ended up in:
+DATA/db_unique_name/DATAFILE
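One way to spot the misplaced files could be to check every PDB datafile path for the GUID directory. This is only a minimal Python sketch under assumptions: the paths would come from something like v$datafile, a PDB GUID is taken to be a 32-character hexadecimal directory name, and the function names and sample paths are made up for illustration.

```python
import re

# Assumption: an OMF path for a PDB datafile looks like
#   +DATA/<db_unique_name>/<pdb_guid>/DATAFILE/<file>
# where <pdb_guid> is a 32-character hexadecimal string.
GUID_DIR = re.compile(r"^[0-9A-F]{32}$", re.IGNORECASE)


def has_pdb_guid_dir(path):
    """Return True if the directory right above DATAFILE looks like a PDB GUID."""
    parts = path.split("/")
    if "DATAFILE" not in parts:
        return False
    idx = parts.index("DATAFILE")
    return idx > 0 and bool(GUID_DIR.match(parts[idx - 1]))


def misplaced_pdb_files(paths):
    """Given the datafile paths of a PDB, return those without a GUID directory."""
    return [p for p in paths if not has_pdb_guid_dir(p)]


if __name__ == "__main__":
    # Hypothetical paths, only to show the two layouts described above.
    sample = [
        "+DATA/MYDB/6A1B2C3D4E5F60718293A4B5C6D7E8F9/DATAFILE/users.301.1",
        "+DATA/MYDB_STBY/DATAFILE/users.301.1",
    ]
    for path in misplaced_pdb_files(sample):
        print("no GUID directory:", path)
```

Run against the PDB datafiles of the standby, a non-empty result would point at the layout problem described here.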
After checking all my parameters I went looking on MOS; apparently this is a known issue:

Datafile on Standby Database is created under incorrect directory (Doc ID 2346623.1)

Patches are available.
Update: Installed the patch and still had the same issue! :-( Opening an SR.
Update 2: Twitter is fantastic. Piotr Wrzosek pointed me to bug 17613474:

INCONSISTENT OMF BEHAVIOUR WITH ALTER DATABASE MOVE DATAFILE
https://twitter.com/pewu78/status/987335828958564352
It seems to have been around for some time already; however, no patch seems to be available for our version ;(
Patch 24836489: DATAFILES ARE CREATED WRONG LOCATION IN OMF DEFINED PDB DATABASE



Update and solution: REMOVE SECTION SIZE from the RMAN duplicate; then the files are created correctly. I still need to roll back the patch to see whether just REMOVING section size is enough or whether the above-mentioned patch 25576813 is still needed.



Tested without the patch: it seems that section size is the culprit.
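To make the difference explicit, here is a small Python sketch that generates the run block with or without the SECTION SIZE clause. The helper name and its parameters are hypothetical, and the wallet command is left out; it only rebuilds the channel allocations and the duplicate command shown earlier.

```python
def build_duplicate_script(channels=4, section_size=None):
    """Build the RMAN run block for an active standby duplicate.

    section_size=None leaves the SECTION SIZE clause out, which is the
    workaround that gave the correct file locations.
    """
    lines = ["run", "{"]
    for i in range(1, channels + 1):
        lines.append(f"  allocate channel target{i:02d} type disk;")
    for i in range(1, channels + 1):
        lines.append(f"  allocate auxiliary channel aux{i:02d} type disk;")
    command = "  duplicate target database for standby from active database password file"
    if section_size:
        command += f" section size {section_size}"
    lines.append(command + ";")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Workaround variant: no SECTION SIZE clause at all.
    print(build_duplicate_script())
```

Calling build_duplicate_script(section_size="2000M") reproduces the original command, which is the variant that put the PDB datafiles in the wrong directory.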

Thursday, April 12, 2018

QFSDP JANUARY 2018 on Exadata with OVM with more than 8 VMs: watch out

Last month we installed the OS-related, firmware, ... part of the QFSDP JANUARY 2018 (12.2.1.1.6, to be more specific) on our test and dev system at my customer.

GI and DB still need to be patched.

After our last patching experience http://pfierens.blogspot.be/2017/06/applying-april-2017-qfsdp-12102-exadata.html this could only go better.


Well, to cut a very long story short: be cautious. We have now run into 4 different bugs, causing instability of the RAC clusters, GI that refused to start up, loss of InfiniBand connectivity ...


So the Monday after the patching we were hit by instability of our Exadata OVM infrastructure for Test, Dev and Qualification. Dom0 rebooting ...

There also seemed to be an issue with the IB interfaces in the domU. Unfortunately we didn't have a crash dump, so Support couldn't really do anything.


The only way to get GI and the DBs up again was to reboot the VM. crsctl stop crs followed by crsctl start crs didn't really work; the logs showed IB issues.


Last time (I forgot to blog about that) we ran into the gnttab_max_frames issue, and we had set that parameter to 512. After this patching it was set to 256, so we thought that might have been the reason, also because this release introduced another parameter in grub.conf. There are now two:



gnttab_max_maptrack_frames
gnttab_max_frames

The relation between the two was difficult to find, but in the end this seems not to have been the right diagnosis.

If you want some more information about gnttab_max_frames, please read this.
In short: each virtual disk and each networking operation needs a number of granted frames to communicate. If this is not set correctly, you have issues ...
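Since patching silently changed one of these values for us, comparing them before and after might be worth scripting. The following is a minimal Python sketch, assuming both options appear as name=value on a line in grub.conf; the function names are hypothetical and the sample lines in the usage part are invented, only the 512 and 256 values come from our system.

```python
import re

# Matches gnttab_max_frames=<n> and gnttab_max_maptrack_frames=<n>.
GNTTAB_OPTION = re.compile(r"\b(gnttab_max(?:_maptrack)?_frames)=(\d+)")


def gnttab_settings(grub_text):
    """Return {option: value} for the gnttab options found in grub.conf text.

    Commented-out lines are skipped; if an option appears more than once,
    the last occurrence wins.
    """
    settings = {}
    for line in grub_text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        for name, value in GNTTAB_OPTION.findall(line):
            settings[name] = int(value)
    return settings


def changed_settings(before, after):
    """Return {option: (old, new)} for options whose value differs."""
    return {
        name: (before.get(name), after.get(name))
        for name in sorted(set(before) | set(after))
        if before.get(name) != after.get(name)
    }


if __name__ == "__main__":
    # Invented sample lines; the maptrack value is a placeholder.
    old = gnttab_settings("kernel /xen.gz gnttab_max_frames=512\n")
    new = gnttab_settings(
        "kernel /xen.gz gnttab_max_frames=256 gnttab_max_maptrack_frames=1024\n"
    )
    for name, (was, now) in changed_settings(old, new).items():
        print(f"{name}: {was} -> {now}")
```

Keeping a copy of grub.conf from before the patching makes this comparison possible afterwards.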


Luckily, on the Friday of that same week we were in the same situation again, and we decided to let the dom0 crash so that we would have a crash dump.

After we uploaded that crash dump, Support was able to see that the issue was in the Mellanox HCA firmware layer. Between April 2017 and January 2018 there were 4000 changes in that firmware, so it is hard to tell which one, or which combination, caused our issue.



Bottom line: there seems to be an issue with the Mellanox HCA firmware in this patch (which goes from 2.11.1280 to 2.35.5532). You may encounter it if you have more than 8 VMs under one dom0; we had 9 ...



So basically we shut down one VM on each node and had stability again.

When it was confirmed in numerous conf calls that 8 was the magic number, we decided to move the Exadata monitoring VM functionality to another VM and shut down the monitoring VM, to be back at 8 VMs.


We had a stable situation until last Friday, when both IB switches became unresponsive and the second switch did not take over the SM master role. This issue is still under investigation and hopefully not related to the QFSDP JAN 2018 ...



If you have similar symptoms, point Support to these bugs:

  Bug 27724899 - Dom0 crashes with ib_umad_close with large no. of VMs 
  Bug 27691811 
  Bug 27267621 


UPDATE:

There also seems to be a bug in IB switch version 2.2.7-1, solved in 2.2.10 (not released yet). Not everything is solved there: only the logging issue, not the main root cause. Apparently there is a separate ER for that.






Sunday, April 1, 2018

to_dog_year on Exadata

One of the new features that seems to be overlooked in all the publications I saw about 18c is the TO_DOG_YEAR() function. It seems obvious that this was missed, because it is fairly undocumented, as Frank Pachot, Pieter Van Puymbroeck, Brendan Tierney, Martin Berger and Oyvind Isene pointed out.


I wanted to know how it behaved on Exadata, especially on my customer's OVM on Exadata. Tested version: Exadata 18.1.4.0.0. My first tries were not successful and still are not; I see quite some strange behaviour.
desc dogyear

FUNCTION TO_DOG_YEAR RETURNS NUMBER
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 DOB                            DATE                    IN
 FEMALE                         BOOLEAN                 IN     DEFAULT
 NLS_BREAD                      VARCHAR2                IN     DEFAULT
 OWN_SMOKE                      BOOLEAN                 IN     DEFAULT

I tried the following:
select to_dogyear(to_date('01-04-2008','DD-MM-YYYY'), 'LABRADOR') from dual;


and this raised the following:

ORA-07445: exception encountered: core dump [DOG_BREED_VIOLATION] [+34] [PC:0xDEADBEEF] [ADDR:0x14] [UNABLE_TO_BARK] [] 

When choosing another breed it worked, although it gave a pretty bizarre result:
select to_dogyear(to_date('28-03-2013','DD-MM-YYYY'), 'POODLE') a from dual;

a
-------------------------
vulnerability not found

According to the pedigree it should be 50, and not only that: shouldn't it RETURN a NUMBER? WTH. OK, what happens when we try to run the function on cats?
select to_dogyear(to_date('01-04-2008','DD-MM-YYYY'), 'GARFIELD') a from dual;


a
------------------------
is this a dog ?

Oracle, you have some work to do. I would expect a number to be returned, not a string. Does anybody else with an Exadata see this behaviour, preferably on Bare Metal? Cloud?


Update 2-APR-2018: Before you think that PIO is Pets In Oracle, a little update for those who didn't realize this was posted on the 1st of April. It was an April Fool, a common idea from some Oracle Community buddies on the post-UKOUG_TECH17 trip. And what remains true all year is how this community is full of awesome people. And special thanks to Connor, who added great ideas here :)