Monday, February 3, 2020

datapatch execution takes time and fails with ORA-04021 as AQ jobs are running while PSU Patching on 12c

It's now quite common on 12cR2 database, that your execution of datapatch step on PSU patching may wait for a long time and eventually fail with "ORA-04021: timeout occurred while waiting to lock object SYS.DBMS_AQ"

This happened to me during last 2 patching Oct'19 and Jan'20 PSU patching for RAC environment and had to struggle a lot to have it fixed.

I have outlined the solution below for you so you do not have to go through the same pain I have gone through it during critical patching activity.

After completing GI and RDBMS patching for both the nodes of the RAC as you know last step is to apply datapatch as below. (If you are not clear about applying PSU patching on RAC I will post shortly on how to apply patch on 12cR2 very soon)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

oracle@PRODHOSTDB02:[/u01/app/oracle/product/12.2.0.1/dbhome_1/OPatch]$./datapatch -verbose
SQL Patching tool version 12.2.0.1.0 Production on Sun Feb  2 15:21:24 2020
Copyright (c) 2012, 2020, Oracle.  All rights reserved.

Log file for this invocation: /u01/app/oracle/cfgtoollogs/sqlpatch/sqlpatch_34800046_2020_02_02_15_21_24/sqlpatch_invocation.log

Connecting to database...OK
Bootstrapping registry and package to current versions...done
Determining current state...done

Current state of SQL patches:
Bundle series DBRU:
  ID 200114 in the binary registry and not installed in the SQL registry

Adding patches to installation queue and performing prereq checks...
Installation queue:
  Nothing to roll back
  The following patches will be applied:
    30593149 (DATABASE JAN 2020 RELEASE UPDATE 12.2.0.1.200114)

Installing patches...

<=========AT this point it waited for more than 2 hours and failed with below error and automatically tried to rerun the same patch

Patch installation complete.  Total patches installed: 1

Validating logfiles...
Patch 30593149 apply: WITH ERRORS (RETRYABLE)
  logfile: /u01/app/oracle/cfgtoollogs/sqlpatch/30593149/23339232/30593149_apply_DBPROD_2020Feb02_15_22_57.log (errors)
    Error at line 10007: ORA-04021: timeout occurred while waiting to lock object SYS.DBMS_AQ
    Error at line 10051: ORA-04021: timeout occurred while waiting to lock object SYS.DBMS_AQADM_VAR
    Error at line 10057: ORA-04021: timeout occurred while waiting to lock object SYS.DBMS_AQADM_SYS

Adding patches to retry installation queue...
Retry installation queue:
  Nothing to roll back
  The following patches will be applied:
    30593149 (DATABASE JAN 2020 RELEASE UPDATE 12.2.0.1.200114)

<=========AT this point I new what was the issue as same issue was encounted during Oct'19 PSU run and implemented below workaround to fix the issue which resulted in successfully execution of datapatch.

Solution/Workaround:

==> Check if any AQ job are running while execution of datapatch is going on:

SQL> set pagesize 500
SQL> col JOB_NAME for a40
SQL>  select JOB_NAME, STATE, LAST_START_DATE from all_scheduler_jobs where job_name like 'AQ%';


JOB_NAME                                 STATE                LAST_START_DATE
---------------------------------------- -------------------- ---------------------------------------------------------------------------
AQ$_PLSQL_NTFN_3566554670                RUNNING              30-JAN-20 01.03.35.623482 PM ASIA/KUWAIT


run below SQL to generate script to execute:

set pagesize 0
set linesize 200
set echo off
set heading off
select 'EXEC DBMS_SCHEDULER.STOP_JOB ('''|| job_name ||''', force => TRUE);' from dba_scheduler_jobs where job_name like 'AQ$_PLSQL_NTFN_%';

Which generated below SQLs which needs to be executed to stop and drop the AQ jobs. first check if just stopping the job solves the issue if not then run the drop job. 

EXEC DBMS_SCHEDULER.STOP_JOB ('AQ$_PLSQL_NTFN_3566554670', force => TRUE);

EXEC DBMS_SCHEDULER.DROP_JOB ('AQ$_PLSQL_NTFN_3566554670');


SQL> EXEC DBMS_SCHEDULER.STOP_JOB ('AQ$_PLSQL_NTFN_3566554670', force => TRUE);

PL/SQL procedure successfully completed.

SQL>  EXEC DBMS_SCHEDULER.DROP_JOB ('AQ$_PLSQL_NTFN_3566554670');
BEGIN DBMS_SCHEDULER.DROP_JOB ('AQ$_PLSQL_NTFN_3566554670'); END;

*
ERROR at line 1:
ORA-27475: unknown job "SYS"."AQ$_PLSQL_NTFN_3566554670"
ORA-06512: at "SYS.DBMS_ISCHED", line 274
ORA-06512: at "SYS.DBMS_SCHEDULER", line 753
ORA-06512: at line 1

Note: Probably you may also get error for dropping job, but that should be fine. 

Post execution of above work around datapatch completed successfully within minute. 

--below are the remaining execution output of the datapatch. 
---------
Installing patches...


Patch installation complete.  Total patches installed: 1

Validating logfiles...
Patch 30593149 apply: SUCCESS
  logfile: /u01/app/oracle/cfgtoollogs/sqlpatch/30593149/23339232/30593149_apply_DBPROD_2020Feb02_16_11_06.log (no errors)
SQL Patching tool complete on Sun Feb  2 16:21:59 2020
oracle@PRODHOSTDB02:[/u01/app/oracle/product/12.2.0.1/dbhome_1/OPatch]$

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please note that post execution of datapatch you may get one row of failed output as below, but that should be fine as subsequent row depicts successful  execution. 


set line 500
col patch_id for 9999999999
col patch_uid for 9999999999
col version for a10
col action for a10
col install_id  for 99
col action_time for a30
col description for a60
col bundle_series for a20
col status for a30

select patch_id, patch_uid, version, action, install_id, action_time, description, bundle_series,status  from dba_registry_sqlpatch;


Note: After stopping/dropping AQ running job , we did not find any issue with the database. AQ$_PLSQL_NTFN jobs deals with the PLSQL notification engine. 

No comments:

Post a Comment