Simple Oracle, MS SQL, Postgres, AWS and more: Database appears hanging but generates huge amount of redo and archived redo logs

Problem Description
——————-

Your database is generating huge amount of redo logs with the archive processes creating new archived logs very often (for example, the 50MB size redo logs are fullfiled and archived every 10 sec), but no one is doing anything on the database.

The queries run against the database are not completing, jobs seem to be hanging, although simple queries against the data dictionary can be run ok.

Shutdown immediate hangs, after shutdown abort and restart in normal or restricted mode, the situation is the same. Rebooting the box does not make any difference in what has been observed.

The major applications can not be run and your database has not been operational possibly for many hours already.

In the output from ‘ps -ef|grep ’ you can see one or more processes (like ora_p001_, ora_p002_, …), consuming a lot of CPU time.
The CPU time consumed by SMON at the same time, however, is not that significant.

In addition, the size of the rollback segments (from RDA output or other source) have grown to the size, which is times greater than their initial size configured (this may vary depending on the rollback segments settings, ‘optimal’ in particular).

Solution Description
——————–

With the above symptoms you have to leave the database running and wait until the frequency of generating of the redo logs (and archived redo logs) comes back to normal.

Solution Explanation
——————–

You likely have had large application batch jobs cancelled, sometimes after they were running for up to 6-12 hours.

It is rolling back activity after cancelled long running batch jobs that generates this huge amount of the redo logs. It always generates redo logs and nothing can be done to stop it.

You may notice that restarting the database with ‘parallel_max_servers’ parameter set to 0 (if it was, say, 5) will reduce the frequency of fullfilling the redo logs, but that also increases the time for completion of the rolling back activity.

Try to avoid cancelling long running jobs in the future.

Simple Oracle, MS SQL, Postgres, AWS and more

Thursday, December 24, 2009

Database appears hanging but generates huge amount of redo and archived redo logs

No comments: