Tanel's snapper script is very handy for trouble shooting if the issue is currently going on.
I would like to add few points if in case you need to trouble shooting this type of issue which happened in the past.
1) AWR can nail downÂ 'redo size per second' toÂ service levelÂ (Â DBA_HIST_SERVICE_STAT)
That roughly tells a direction where the user/process comes from.
2)Â check for objects which went through the highest change rate.
This can be done from AWR âTop Segment by block changeâ (pay attention to the '% of capture' for accuracy), or from logminer
select TABLE_NAME, OPERATION, count(1)
group by TABLE_NAME, OPERATION
order byÂ TABLE_NAME, OPERATION;
3) Â From the object, we can check DBA_HIST_SQL_PLAN and DBA_HIST_SQLSTAT to see if we can dig out the sql statement.
Or, sometime it is just some background knowledge of what application or process could land on those objects.
WITH x AS (
Â Â Â SELECT DISTINCT DBID, SQL_ID, PLAN_HASH_VALUE
Â Â Â FROM DBA_HIST_SQL_PLAN
Â Â Â WHERE DBID=? AND OBJECT_NAME=? AND OBJECT_OWNER=?
Â y AS (
Â Â Â SELECT DBID, INSTANCE_NUMBER,
Â Â Â LAG(SNAP_ID, 1) OVER (PARTITION BY DBID,INSTANCE_NUMBER ORDER BY SNAP_ID) begin_snap_id,
Â Â Â SNAP_ID end_snap_id, END_INTERVAL_TIME end_snap_time
Â Â Â FROM DBA_HIST_SNAPSHOT
Â Â Â WHERE DBID=? AND INSTANCE_NUMBER=? AND END_INTERVAL_TIME BETWEEN ? and ?
Â SELECT /*+ NO_MERGE(x) ORDERED */
Â y.begin_snap_id||'-'||y.end_snap_id snap_id, y.end_snap_time, z.INSTANCE_NUMBER inst#,
Â z.SQL_ID, z.PLAN_HASH_VALUE, ROUND(z.ELAPSED_TIME_DELTA/1000000) elapsed_seconds,
Â ROUND(z.CPU_TIME_DELTA/1000000) cpu_seconds,
Â Â Â z.ROWS_PROCESSED_DELTA rows_processed, z.BUFFER_GETS_DELTA buffer_gets, z.DISK_READS_DELTAÂ disk_reads, z.EXECUTIONS_DELTAÂ executions, z.PARSE_CALLS_DELTA parses
Â FROM DBA_HIST_SQLSTAT z JOIN y ON (z.DBID=y.dbid AND z.INSTANCE_NUMBER=y.instance_number AND z.SNAP_ID=y.end_snap_id)
Â JOIN x ON (z.DBID=x.dbid AND z.SQL_ID =x.sql_id AND z.PLAN_HASH_VALUE=x.plan_hash_value)
Â WHERE y.begin_snap_id IS NOT NULL
Â ORDER BY y.begin_snap_id, y.end_snap_id, z.SQL_ID, z.PLAN_HASH_VALUE
From: Michael Dinh <[email protected]
To: "'[email protected]
'" <[email protected]
>; "[email protected]
" <[email protected]
Cc: ORACLE-L <[email protected]
Sent: Friday, February 17, 2012 9:15 AM
Subject: RE: To find out the Top process or Top User sessions generating highest number of archive logs
A little late?
But wanted to share.http://www.freelists.org/post/oracle-l/archived-log-switch-detecting-whos-causing-excessive-redo-generation
Disparity Breaks Automation (DBA)
Great minds discuss ideas; average minds discuss events; small minds discuss people - Eleanor RooseveltÂ
Confidence comes not from always being right but from not fearing to be wrong - Peter T Mcintyre
From: [email protected]
On Behalf Of Jeremiah Wilton
Sent: Thursday, February 16, 2012 11:13 AM
To: [email protected]
Subject: Re: To find out the Top process or Top User sessions generating highest number of archive logs
On Feb 16, 2012, at 11:27 AM, Powell, Mark wrote:
I note that the query below always seems to return smon as the largest redo generator on the couple of systems where I checked so if you are interested in only user sessions you might want to add "and s.username is not null" as a condition to filter out the Oracle rdbms background sessions.
I don't see a good reason to exclude background processes, especially if one choses to sample and delta the redo size over a number of samples. What if one of them is somehow generating the large amount of redo that the I am trying to track down?
If you do choose to exclude background processes, there's a column for that in gv$session.Â Just select where type != 'BACKGROUND'.
Another trick to avoid doing deltas (why are we avoiding that when we have Tanel's snapper handy?) might be to divide the redo size by the length of time the session has been logged in.Â That way you get sort of redo size per unit time figure, instead of the grand total for the lifetime of the session (which is what is making SMON look so large).