evmd not starting in Oracle Restart

Two weeks ago on Monday I patched the DEV Super Cluster system to GI 12.1.0.2 BP10 together with DB 11.2.0.4 BP17. To save time I created a separate Oracle database home and upgraded that one to BP17. After that it was just a matter of stopping the databases, switching them to the new home and running catbundle exa apply. So far so good. Then came the time to upgrade GI, and everything went pretty smoothly:
opatchauto -oh $GI_HOME -ocmrf /export/home/grid/ocm.rsp ....
And there we go. However, the very last post-patch step took a long, long time. Looking at the traces, evmd didn't want to start. I remembered that the last time I had the same issue, cleaning up /var/tmp/.oracle solved it, so I interrupted this step, disabled automatic HAS start, cleaned up /var/tmp/.oracle and rebooted the zone. Everything looked fine, but relaunching the step still didn't help: evmd still didn't want to start.
grid 3493 566 0 15:18:41 pts/17 0:00 grep d.bin 
grid 1380 27879 0 15:11:57 ? 0:07 /u01/app/grid/product/12.1.0/grid/bin/ohasd.bin reboot 
grid 1708 27879 0 15:12:10 ? 0:07 /u01/app/grid/product/12.1.0/grid/bin/oraagent.bin 
root 1355 1161 0 15:11:57 pts/19 0:02 /u01/app/grid/product/12.1.0/grid/bin/crsctl.bin start has 
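
For reference, the cleanup I attempted boils down to something like the sketch below. The function name clean_oracle_sockets is only my illustration, and /var/tmp/.oracle is the socket directory on this Solaris zone (other platforms may use a different location). Only run something like this as root while the GI stack is completely down, since removing the sockets under a running stack will break it.

```shell
# Sketch of the cleanup steps described above (run as root, stack stopped):
#   1. crsctl disable has      # keep HAS from starting automatically at boot
#   2. clean_oracle_sockets    # remove stale socket files (function below)
#   3. reboot the zone, then   crsctl enable has   once things look healthy

# Remove everything inside the Clusterware socket directory.
# The directory itself is kept; only its contents are deleted.
# $1 is optional and defaults to /var/tmp/.oracle (assumption: Solaris layout).
clean_oracle_sockets() {
    dir=${1:-/var/tmp/.oracle}
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    # The socket files here do not start with a dot, so a plain glob is enough.
    rm -rf "$dir"/*
}
```

As the rest of this post shows, this cleanup did not fix the problem this time, so treat it as one thing to rule out rather than a cure.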

I also saw plenty of errors regarding evmd in ohasd_oraagent_grid.trc:
"2015-09-01 15:19:59.298545 :GIPCXCPT:13:  gipcInternalConnectSync: failed sync request, ret gipcretConnectionRefused (29)

2015-09-01 15:19:59.298700 :GIPCXCPT:13:  gipcConnectSyncF [EvmConConnect : evmgipcio.c : 205]: EXCEPTION[ ret gipcretConnectionRefused (29) ]  failed sync connect endp 102d76690 [00000000000050c4] { gipcEndpoint : localAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=)(GIPCID=00000000-00000000-0))', remoteAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth)(GIPCID=00000000-00000000-0))', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0, ready 0, wobj 1030d1d40, sendp 102d76190 status 13flags 0xa008871a, flags-2 0x1, usrFlags 0x30020 }, addr 1033b1290 [00000000000050cb] { gipcAddress : name 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth)(GIPCID=00000000-00000000-0))', objFlags 0x0, addrFlags 0x4 }, flags 0x8000000

2015-09-01 15:19:59.299375 : CLSCEVT:13: (:CLSCE0017:)clsce_subscribe 10212ee50 EvmConnCreate failed with status = 13

2015-09-01 15:19:59.299798 :  CRSEVT:13: {0:0:2} ClusterPubSub::subscribe clsce_subscribe failed [4]

2015-09-01 15:19:59.299917 : USRTHRD:13: {0:0:2} LsnrAgentSub-LISTENER_CLONE ClusterReconnectingSubscriber::subscribe Exception ClusterConnectException : CRS-10203: (:CLSCE0017:)  Could not connect to the Event Manager daemon

2015-09-01 15:19:59.300001 : CLSCEVT:13: (:CLSCE0028:)clsce_unsubscribe 10212ee50 successfully unsubscribed : 0

2015-09-01 15:20:00.301266 : CLSCEVT:13: clsce_subscribe 10226cad0 filter='^CRS_RESOURCE_PROFILE_CHANGE.*NAME='ora\.(scan|ssc02dbdat05z01\.vip).*RESOURCE_CLASS='(scan_vip|vip)'', flags=1, handler=100b26978, arg=102f928e0

2015-09-01 15:20:00.303161 :GIPCXCPT:13:  gipcInternalConnectSync: failed sync request, ret gipcretConnectionRefused (29)

2015-09-01 15:20:00.303300 :GIPCXCPT:13:  gipcConnectSyncF [EvmConConnect : evmgipcio.c : 205]: EXCEPTION[ ret gipcretConnectionRefused (29) ]  failed sync connect endp 102d76690 [00000000000050d5] { gipcEndpoint : localAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=)(GIPCID=00000000-00000000-0))', remoteAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth)(GIPCID=00000000-00000000-0))', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0, ready 0, wobj 1030d1d40, sendp 102d76190 status 13flags 0xa008871a, flags-2 0x1, usrFlags 0x30020 }, addr 1033b0990 [00000000000050dc] { gipcAddress : name 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth)(GIPCID=00000000-00000000-0))', objFlags 0x0, addrFlags 0x4 }, flags 0x8000000

2015-09-01 15:20:00.303791 : CLSCEVT:13: (:CLSCE0017:)clsce_subscribe 10226cad0 EvmConnCreate failed with status = 13

2015-09-01 15:20:00.304118 :  CRSEVT:13: {0:0:2} ClusterPubSub::subscribe clsce_subscribe failed [4]

2015-09-01 15:20:00.304204 : USRTHRD:13: {0:0:2} LsnrAgentSub-LISTENER ClusterReconnectingSubscriber::subscribe Exception ClusterConnectException : CRS-10203: (:CLSCE0017:)  Could not connect to the Event Manager daemon

2015-09-01 15:20:00.304257 : CLSCEVT:13: (:CLSCE0028:)clsce_unsubscribe 10226cad0 successfully unsubscribed : 0

2015-09-01 15:20:00.304304 : CLSCEVT:13: clsce_subscribe 10212ee50 filter='^CRS_RESOURCE_PROFILE_CHANGE.*NAME='ora\.(scan|ssc02dbdat05z01\.vip).*RESOURCE_CLASS='(scan_vip|vip)'', flags=1, handler=100b26978, arg=1033406a0

2015-09-01 15:20:00.305574 :GIPCXCPT:13:  gipcInternalConnectSync: failed sync request, ret gipcretConnectionRefused (29)

2015-09-01 15:20:00.305675 :GIPCXCPT:13:  gipcConnectSyncF [EvmConConnect : evmgipcio.c : 205]: EXCEPTION[ ret gipcretConnectionRefused (29) ]  failed sync connect endp 102d76690 [00000000000050e6] { gipcEndpoint : localAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=)(GIPCID=00000000-00000000-0))', remoteAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth)(GIPCID=00000000-00000000-0))', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0, ready 0, wobj 1030d1d40, sendp 102d76190 status 13flags 0xa008871a, flags-2 0x1, usrFlags 0x30020 }, addr 1033b1290 [00000000000050ed] { gipcAddress : name 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth)(GIPCID=00000000-00000000-0))', objFlags 0x0, addrFlags 0x4 }, flags 0x8000000

2015-09-01 15:20:00.306108 : CLSCEVT:13: (:CLSCE0017:)clsce_subscribe 10212ee50 EvmConnCreate failed with status = 13

2015-09-01 15:20:00.306376 :  CRSEVT:13: {0:0:2} ClusterPubSub::subscribe clsce_subscribe failed [4]

2015-09-01 15:20:00.306470 : USRTHRD:13: {0:0:2} LsnrAgentSub-LISTENER_CLONE ClusterReconnectingSubscriber::subscribe Exception ClusterConnectException : CRS-10203: (:CLSCE0017:)  Could not connect to the Event Manager daemon

2015-09-01 15:20:00.306752 : CLSCEVT:13: (:CLSCE0028:)clsce_unsubscribe 10212ee50 successfully unsubscribed : 0

2015-09-01 15:20:01.308000 : CLSCEVT:13: clsce_subscribe 10226cad0 filter='^CRS_RESOURCE_PROFILE_CHANGE.*NAME='ora\.(scan|ssc02dbdat05z01\.vip).*RESOURCE_CLASS='(scan_vip|vip)'', flags=1, handler=100b26978, arg=102f928e0

2015-09-01 15:20:01.309869 :GIPCXCPT:13:  gipcInternalConnectSync: failed sync request, ret gipcretConnectionRefused (29)

2015-09-01 15:20:01.309994 :GIPCXCPT:13:  gipcConnectSyncF [EvmConConnect : evmgipcio.c : 205]: EXCEPTION[ ret gipcretConnectionRefused (29) ]  failed sync connect endp 102d76690 [00000000000050f7] { gipcEndpoint : localAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=)(GIPCID=00000000-00000000-0))', remoteAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth)(GIPCID=00000000-00000000-0))', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef 0, ready 0, wobj 1030d1d40, sendp 102d76190 status 13flags 0xa008871a, flags-2 0x1, usrFlags 0x30020 }, addr 1033b0990 [00000000000050fe] { gipcAddress : name 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth)(GIPCID=00000000-00000000-0))', objFlags 0x0, addrFlags 0x4 }, flags 0x8000000

2015-09-01 15:20:01.310445 : CLSCEVT:13: (:CLSCE0017:)clsce_subscribe 10226cad0 EvmConnCreate failed with status = 13

2015-09-01 15:20:01.310779 :  CRSEVT:13: {0:0:2} ClusterPubSub::subscribe clsce_subscribe failed [4]”
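
A quick way to see how hard the agent is struggling is to count the CRS-10203 lines in its trace. This is a minimal sketch: the function name count_evm_failures is mine, and the path you pass in is whatever your oraagent trace file is on your system.

```shell
# Count the "Could not connect to the Event Manager daemon" errors
# (CRS-10203) in the trace file given as $1 and print the number.
count_evm_failures() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps this usable under "set -e".
    grep -c 'CRS-10203' "$1" || true
}

# Example (the path is a placeholder, adjust to your trace directory):
#   count_evm_failures /path/to/trace/oraagent_trace_file.trc
```

If the number keeps climbing every second or so, as in the excerpt above, evmd is simply not listening yet.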

I opened an SR, and after killing the ohasd and oraagent.bin processes, GI and evmd came up. Later, after the SR was transferred to my time zone, support came back suspecting a couple of bugs:
“

1. Unpublished  BUG 21484367 12.1.0.2 SIHA UPGRADE HANG INDEFINITELY IF MORE SERVICES REGISTERED 
2.  BUG 20620033 AIX ISSUES WITH GI 12.1.0.2 UPGRADE FINE, IF DON'T CONFIGURE > 34 OR 35 SERVICES 
-> not AIX specific 

“
On another system I could reproduce the issue. In my case the problem indeed appeared once 35 services in total had been created on the machine, that is, services you add with
srvctl add service
At that point GI didn't come up. One way to work around it is to set the services to MANUAL, but with more than 80 services that is not really a solution for us; the other is to set the db to MANUAL. Dev is currently working on a patch. Hope this helps when you get these errors.
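
If you want to check how close a machine is to that number before patching, something like this sketch can help. It assumes the default NAME=/TYPE= output format of crsctl stat res, and the threshold of 35 is only what I observed here, not a documented limit. The function names are mine.

```shell
# Count service resources in "crsctl stat res" output read from stdin.
# Each resource in the default output starts with a NAME= line.
count_services() {
    grep -c '^NAME=' || true
}

# Print a warning when the count ($1) reaches the threshold ($2, default 35).
# 35 is the number at which the hang showed up in this post (an observation).
check_service_count() {
    count=$1
    limit=${2:-35}
    if [ "$count" -ge "$limit" ]; then
        echo "WARNING: $count services registered (limit observed: $limit)"
    else
        echo "OK: $count services registered"
    fi
}

# Example, run as the grid owner:
#   n=$(crsctl stat res -w "TYPE = ora.service.type" | count_services)
#   check_service_count "$n"
```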
