Thursday, December 24, 2009

Database appears hanging but generates huge amount of redo and archived redo logs


Problem Description
——————-

Your database is generating huge amount of redo logs with the archive processes  creating new archived logs very often (for example, the 50MB size redo logs are fullfiled and archived every 10 sec),  but  no one  is doing anything on the  database.

The queries run  against the  database  are not completing,  jobs seem to be hanging, although simple queries against the data dictionary can be run ok.

Shutdown immediate hangs,   after shutdown  abort and  restart in normal or restricted mode,  the situation   is the same.   Rebooting the box does not make  any difference in what has been observed.


The major applications can not be run and your database has not  been operational possibly  for  many hours already.

In the output from ‘ps -ef|grep  ’  you can see  one or more  processes  (like ora_p001_,  ora_p002_, …),  consuming   a lot of CPU time.
The CPU time consumed by SMON at the same time,  however,  is not that  significant.

In addition,  the size of the rollback segments  (from RDA output or other source)  have grown to the size, which is times greater than their initial size configured  (this may vary  depending on the rollback segments settings,  ‘optimal’ in particular).


Solution Description
——————–

With the above symptoms you have to leave the database running  and wait until the frequency of generating of the redo logs (and archived redo logs) comes back to normal.


Solution Explanation
——————–

You likely have had large application batch jobs cancelled, sometimes after they were running for up to 6-12 hours.

It is  rolling back activity  after  cancelled long running batch jobs that generates this huge amount of the redo logs.  It always generates  redo logs and nothing can be done to stop it.

You may notice that restarting the database with ‘parallel_max_servers’ parameter set to  0 (if it was,  say,  5)  will reduce the frequency of fullfilling  the redo logs,  but that also increases the time for completion of the  rolling back activity.

Try to avoid cancelling long running jobs in the future.

No comments: