I've asked the users to let me know which processes need to be evaluated
but they haven't been able to tell me thus far. Application speed is an
issue with some users, but not others. The only SLA I've been given is
99.98% uptime on a very buggy application (sigh).
I normally use 10046 traces on individual, longer running jobs and I
know this isn't the technically correct way to do this, but the only
question I'm being asked at the moment is 'Is there a communication
problem between the app server and the db?'
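For reference, a 10046 trace at level 8 (which includes the wait events shown below) is typically switched on either via DBMS_MONITOR (10g and later) or by setting the event directly; the SID and serial# values here are placeholders:

```sql
-- Enable a level-8 (waits, no binds) 10046 trace in another session;
-- 123 and 456 are placeholder SID and serial# values.
EXEC DBMS_MONITOR.SESSION_TRACE_ENABLE(session_id => 123, serial_num => 456, waits => TRUE, binds => FALSE);

-- Or, in your own session:
ALTER SESSION SET EVENTS '10046 trace name context forever, level 8';

-- And to turn tracing off again:
EXEC DBMS_MONITOR.SESSION_TRACE_DISABLE(session_id => 123, serial_num => 456);
ALTER SESSION SET EVENTS '10046 trace name context off';
```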
Summary of the trace file info looks like this:
call        count     cpu   elapsed     disk     query   current     rows
Parse        9388    2.29      2.75        0        10         0        0
Execute     21338   14.84     18.42       50     46799     12979     2733
Fetch       18272   56.16     61.66    62591   4929531      4158   217312
Total       48998   73.29     82.83    62641   4976340     17137   220045
Event                            times waited   max wait   total waited
db file scattered read                   6792       0.02           2.49
db file sequential read                  2537       0.03           5.65
latch free                                 47       0.02           0.12
log file sync                               5       0.03           0.06
SQL*Net message from client             14414      28.86         804.08  *
SQL*Net message to client               14415       0.01           0.05  *
SQL*Net more data from client              50       0.01           0.01
SQL*Net more data to client              4296       0.01           0.23
These numbers are from pooled connection traces and there is lots of
user think time that would be included. That's my real question: can I use
a 10046 trace to evaluate pooled client connections? Since the database
appears to be waiting on client input, can I ignore the 'from client'
waits or could they be hiding other problems?
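One way to frame that question is to split the trace's accounted-for time into CPU, non-idle waits, and the inter-call 'from client' time. Below is a minimal Python sketch using the totals from the summary above, treating 'SQL*Net message from client' as idle think time, which is only safe if each traced period is scoped to a single user action:

```python
# Wait-event totals (seconds) copied from the trace summary in the post.
waits = {
    "db file scattered read": 2.49,
    "db file sequential read": 5.65,
    "latch free": 0.12,
    "log file sync": 0.06,
    "SQL*Net message from client": 804.08,
    "SQL*Net message to client": 0.05,
    "SQL*Net more data from client": 0.01,
    "SQL*Net more data to client": 0.23,
}
cpu_total = 73.29  # total CPU from the call summary line

# 'SQL*Net message from client' is conventionally treated as an idle
# event, since it includes user think time between calls.
idle_events = {"SQL*Net message from client"}

busy_wait = sum(t for e, t in waits.items() if e not in idle_events)
idle_wait = sum(t for e, t in waits.items() if e in idle_events)
response_time = cpu_total + busy_wait  # time actually servicing calls

print(f"CPU: {cpu_total:.2f}s, non-idle waits: {busy_wait:.2f}s, "
      f"idle 'from client': {idle_wait:.2f}s")
```

If the non-idle portion is small, then the ~804 seconds of 'from client' time is where pooled-connection idle time and any genuine app-server/network delay are mixed together, and only the per-call lines in the raw trace (not the summary) can tell them apart.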
No problems have been noted on the database or the os. The servers are
all sized beyond the vendor specs; client hardware is another story.
If there's a better way to monitor for connection problems between an app
server and the database, please let me know ...
From: Bobak, Mark
Sent: Tuesday, March 16, 2004 4:32 PM
Subject: RE: SQL*Net message waits
OK, so out of a 30-minute window, 0.05 seconds total was attributed to SQL*Net
message waits. How many total SQL*Net waits were in the trace?
I'm a little concerned because the scoping error you admit is present in your
tracing is going to obfuscate the solution a bit.
I'm also a little concerned that "the users can't tell me exactly which
actions they believe are slow....". That statement tells me
your system has no real measurable SLA. Without that, how will you know
when you've succeeded? Ok, ok, I know, it wasn't your idea
but you're stuck with it. Which business process is the most business-critical
and the worst-performing? Start there. Try to get
an accurately scoped trace of that process. This will reveal much in
determining where the problem is.
Now, going back to the trace you already got:
What does the profile of the other 29 minutes 59.95 seconds look like?
How many database calls are in the trace file?
Database calls are PARSE, EXEC and FETCH. Also, how much CPU time is
represented in that 30 minutes? How much wait time?
What events were you waiting on?
Also, consider what's going on in the O/S. Are the CPUs hammered?
Are there lots of processes waiting on I/O?
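Mark's questions amount to a response-time accounting exercise: compare CPU plus wait time against the wall-clock window and see what is left unaccounted for. A rough Python sketch with the numbers from the original post (the 30-minute window is Mark's figure; any gap would be time the sessions were idle with no cursor open, or time the trace simply did not capture):

```python
window = 30 * 60.0  # seconds in the 30-minute window Mark refers to

cpu = 73.29  # total CPU time from the call summary
# All wait-event totals from the summary, including the idle
# 'SQL*Net message from client' time:
wait_total = 2.49 + 5.65 + 0.12 + 0.06 + 804.08 + 0.05 + 0.01 + 0.23

accounted = cpu + wait_total
unaccounted = window - accounted
print(f"accounted for {accounted:.2f}s of {window:.0f}s; "
      f"unaccounted: {unaccounted:.2f}s")
```

A large unaccounted gap like this is itself a scoping clue: with pooled connections, much of the window belongs to sessions that were not traced or were between cursors, which is exactly why an accurately scoped trace of one business process is the better starting point.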
Hope that helps,
Please see the official ORACLE-L FAQ: http://www.orafaq.com
Archives are at http://www.freelists.org/archives/oracle-l/
FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html