FAQ
Greetings

HP-UX 11.11, 4X4 Server

Oracle version:

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
PL/SQL Release 10.2.0.1.0 - Production
CORE 10.2.0.1.0 Production
TNS for HPUX: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

PARAMETER VALUE
------------------------- ---------------
_pga_large_extent_size 1048576
_pga_max_size 209715200

pga_aggregate_target big integer 100M

We do see warnigns in AWR reports to bump pga. we haven't done that.
Could smaller pga cause this memory leak?

We started with this version, plans are there to go to the Latest Patch Set.
But for now we are stuck with this

DEV folks did a release last Friday, its a simple change in a package

Problem:

One process at a time goes upto 715 MB
anothher process starts goes up to 715 MB
another process start goes upto 715 MB
. . . . . .

Until system Crashes

We Backeduout the code, we are back to normal.

We have almost simialr load in one of our DEV server (4X4), We tried same
code, we are not able to reproduce the issue.

Only other option for us is to disable PGA and go for manual memory
allocation by define sort_* values

We have an SR open, we gave all the dumps oralce asked before we backed out
the code in prod.
Now oracle wants HEAP dumps, but managment doesn't want this faulty code in
prod until we fix the issue.

The same code works fine in DEV. Server.

Any Ideas and suggestions are appreciated

Regards
BN

Search Discussions

  • Alvaro Jose Fernandez at Sep 19, 2007 at 8:32 pm
    Hello Bn,



    Problem:



    One process at a time goes upto 715 MB

    anothher process starts goes up to 715 MB

    another process start goes upto 715 MB

    . . . . . .



    Until system Crashes



    We Backeduout the code, we are back to normal.



    We have almost simialr load in one of our DEV server (4X4), We tried
    same code, we are not able to reproduce the issue.



    similar load? well, but the generated plan for the query (in the
    package), is the same in DEV as in prod?



    Only other option for us is to disable PGA and go for manual memory
    allocation by define sort_* values



    We have an SR open, we gave all the dumps oralce asked before we backed
    out the code in prod.

    Now oracle wants HEAP dumps, but managment doesn't want this faulty
    code in prod until we fix the issue.



    The same code works fine in DEV. Server.

    , yeah, but, does it generate the same optimizer plan in both db's?



    Any Ideas and suggestions are appreciated



    put the plan in Prod here (look in v$sql_plan and v$sql_plan_statistics,
    for the "hash" value of the problematic query (if it is localized in 1
    query)



    Regards

    BN
  • BN at Sep 19, 2007 at 9:30 pm
    Greetings

    I took 10046 Level 12 trace of all the Problem Processes, we see nothing
    that can cause this.

    I don't see any SQL, Package/proc accessing huge amount of data in the
    tkprofd output.
    The MAx number of rows touched is around 100k (10519)

    Sicne these connections are always there, I wouldn't know if I have missed
    the real issue,
    But I did trace 5 sessions and none of them have any thing thats noticable

    We submitted all the 10046 Level 12 trace files to SR, they came back with
    nothing.

    thats why I am looking for ideas to

    where to look and ways to reproduce the issues.

    BN
    On 9/19/07, Alvaro Jose Fernandez wrote:

    Hello Bn,



    Problem:



    One process at a time goes upto 715 MB

    anothher process starts goes up to 715 MB

    another process start goes upto 715 MB

    . . . . . .



    Until system Crashes



    We Backeduout the code, we are back to normal.



    We have almost simialr load in one of our DEV server (4X4), We tried same
    code, we are not able to reproduce the issue.



    similar load? well, but the generated plan for the query (in the package),
    is the same in DEV as in prod?



    Only other option for us is to disable PGA and go for manual memory
    allocation by define sort_* values



    We have an SR open, we gave all the dumps oralce asked before we backed
    out the code in prod.

    Now oracle wants HEAP dumps, but managment doesn't want this faulty code
    in prod until we fix the issue.



    The same code works fine in DEV. Server.



    , yeah, but, does it generate the same optimizer plan in both db's?



    Any Ideas and suggestions are appreciated



    put the plan in Prod here (look in v$sql_plan and v$sql_plan_statistics,
    for the "hash" value of the problematic query (if it is localized in 1
    query)



    Regards

    BN





    --
    Regards & Thanks
    BN
    --
    Regards & Thanks
    BN

    --
    http://www.freelists.org/webpage/oracle-l
  • Alvaro Jose Fernandez at Sep 19, 2007 at 9:47 pm
    Ok Sarma,



    This would be a case of bug "5947623 - PGA MUCH OVERALLOCATED THAN
    PGA_AGGREGATE_TARGET". Best is to have the SR guy check for this, so
    that they can rule out this possibility also.



    alvaro



    We submitted all the 10046 Level 12 trace files to SR, they came back
    with nothing.





    thats why I am looking for ideas to



    where to look and ways to reproduce the issues.



    BN
  • LS Cheng at Sep 19, 2007 at 9:50 pm
    Hi

    Do you know what is the simple change? Also do you see any ORA- error?

    Thanks

    --
    LSC
    On 9/19/07, BN wrote:

    Greetings

    HP-UX 11.11, 4X4 Server



    Oracle version:

    Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
    PL/SQL Release 10.2.0.1.0 - Production
    CORE 10.2.0.1.0 Production
    TNS for HPUX: Version 10.2.0.1.0 - Production
    NLSRTL Version 10.2.0.1.0 - Production


    PARAMETER VALUE
    ------------------------- ---------------
    _pga_large_extent_size 1048576
    _pga_max_size 209715200

    pga_aggregate_target big integer 100M


    We do see warnigns in AWR reports to bump pga. we haven't done that.
    Could smaller pga cause this memory leak?


    We started with this version, plans are there to go to the Latest Patch
    Set. But for now we are stuck with this

    DEV folks did a release last Friday, its a simple change in a package

    Problem:

    One process at a time goes upto 715 MB
    anothher process starts goes up to 715 MB
    another process start goes upto 715 MB
    . . . . . .

    Until system Crashes

    We Backeduout the code, we are back to normal.

    We have almost simialr load in one of our DEV server (4X4), We tried same
    code, we are not able to reproduce the issue.

    Only other option for us is to disable PGA and go for manual memory
    allocation by define sort_* values

    We have an SR open, we gave all the dumps oralce asked before we backed
    out the code in prod.
    Now oracle wants HEAP dumps, but managment doesn't want this faulty code
    in prod until we fix the issue.

    The same code works fine in DEV. Server.

    Any Ideas and suggestions are appreciated

    Regards
    BN



    --
    Regards & Thanks
    BN
    --
    http://www.freelists.org/webpage/oracle-l
  • BN at Sep 19, 2007 at 10:13 pm
    Greetings Cheng

    According to DEV Team and DEV DBA Team, its an additional UPDATE and a new
    trigger

    BN
    On 9/19/07, LS Cheng wrote:

    Hi

    Do you know what is the simple change? Also do you see any ORA- error?

    Thanks

    --
    LSC

    On 9/19/07, BN wrote:

    Greetings

    HP-UX 11.11, 4X4 Server



    Oracle version:

    Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
    PL/SQL Release 10.2.0.1.0 - Production
    CORE 10.2.0.1.0 Production
    TNS for HPUX: Version 10.2.0.1.0 - Production
    NLSRTL Version 10.2.0.1.0 - Production


    PARAMETER VALUE
    ------------------------- ---------------
    _pga_large_extent_size 1048576
    _pga_max_size 209715200

    pga_aggregate_target big integer 100M


    We do see warnigns in AWR reports to bump pga. we haven't done that.
    Could smaller pga cause this memory leak?


    We started with this version, plans are there to go to the Latest Patch
    Set. But for now we are stuck with this

    DEV folks did a release last Friday, its a simple change in a package

    Problem:

    One process at a time goes upto 715 MB
    anothher process starts goes up to 715 MB
    another process start goes upto 715 MB
    . . . . . .

    Until system Crashes

    We Backeduout the code, we are back to normal.

    We have almost simialr load in one of our DEV server (4X4), We tried
    same code, we are not able to reproduce the issue.

    Only other option for us is to disable PGA and go for manual memory
    allocation by define sort_* values

    We have an SR open, we gave all the dumps oralce asked before we backed
    out the code in prod.
    Now oracle wants HEAP dumps, but managment doesn't want this faulty
    code in prod until we fix the issue.

    The same code works fine in DEV. Server.

    Any Ideas and suggestions are appreciated

    Regards
    BN



    --
    Regards & Thanks
    BN
    --
    Regards & Thanks
    BN

    --
    http://www.freelists.org/webpage/oracle-l
  • Alvaro Jose Fernandez at Sep 19, 2007 at 10:07 pm
    Sicne these connections are always there, I wouldn't know if I have
    missed the real issue,

    But I did trace 5 sessions and none of them have any thing thats
    noticable



    one question here. You mean by "connections are always here", that the
    problematic connections are long-lived or something? have you restarted
    the instance or flushed the shared pool

    since the problem started?





    We submitted all the 10046 Level 12 trace files to SR, they came back
    with nothing.



    One thing. are you using OEM/db control or AWR (I'm sorry I'm on 9i
    yet), and maybe you could extract from here some info about the
    sessions' PGA usage, which are accelerating the leakage?





    thats why I am looking for ideas to



    where to look and ways to reproduce the issues.



    BN
  • BN at Sep 19, 2007 at 10:16 pm
    Greetings

    Its a sort of connection pooling implemented by app.

    App opens connections to the database, and keeps them open to service the
    requests.

    We have never flushed the shared_pool, We didn't think they are connected
    and we didn't see any issues towards SP .

    Servers got rebooted for running out of Memory because of this issue, in
    that process
    DB got restarted.

    BN
    On 9/19/07, Alvaro Jose Fernandez wrote:

    Sicne these connections are always there, I wouldn't know if I have
    missed the real issue,

    But I did trace 5 sessions and none of them have any thing thats noticable



    one question here. You mean by "connections are always here", that the
    problematic connections are long-lived or something? have you restarted the
    instance or flushed the shared pool

    since the problem started?





    We submitted all the 10046 Level 12 trace files to SR, they came back with
    nothing.



    One thing. are you using OEM/db control or AWR (I'm sorry I'm on 9i yet),
    and maybe you could extract from here some info about the sessions' PGA
    usage, which are accelerating the leakage?





    thats why I am looking for ideas to



    where to look and ways to reproduce the issues.



    BN
    --
    Regards & Thanks
    BN

    --
    http://www.freelists.org/webpage/oracle-l
  • Alvaro Jose Fernandez at Sep 19, 2007 at 10:30 pm
    Hi Sarma,



    What is puzzling me about the problem, is that if you have DEV with the
    same configured parameters, system statistics, and with a similar load
    (tables), the faulty code don't reproduce the problem on DEV.

    Apart from the bug cited, subtle plan variations (discarded as per you
    investigation), and other hidden things, honestly I can only now
    propose:



    to "load" DEV with the most similar load as prod, and submit
    at "representative" number of connections.

    thorougly check DEV kernel settings (compare the output of a
    "sysdef" on both HP-UX machines). Compare also the output of a glance
    or "top" for similar stack/resident memory usage for the processes
    involved in the "faulty" code. Check differences also at the HP-UX level
    between DEV and prod .

    regards.
  • Gerald Cunningham at Sep 19, 2007 at 10:35 pm
    Are you by any chance parsing XML in the package that was changed?

    From: oracle-l-bounce_at_freelists.org on behalf of BN
    Sent: Wed 9/19/2007 4:13 PM
    To: oracle_L_list
    Subject: PGA Memory Leak - crashing Server

    Greetings


    HP-UX 11.11, 4X4 Server




    Oracle version:


    Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
    PL/SQL Release 10.2.0.1.0 - Production
    CORE 10.2.0.1.0 Production
    TNS for HPUX: Version 10.2.0.1.0 - Production
    NLSRTL Version 10.2.0.1.0 - Production



    PARAMETER VALUE
    ------------------------- ---------------
    _pga_large_extent_size 1048576
    _pga_max_size 209715200

    pga_aggregate_target big integer 100M

    We do see warnigns in AWR reports to bump pga. we haven't done that.
    Could smaller pga cause this memory leak?



    We started with this version, plans are there to go to the Latest Patch Set. But for now we are stuck with this


    DEV folks did a release last Friday, its a simple change in a package


    Problem:


    One process at a time goes upto 715 MB
    anothher process starts goes up to 715 MB
    another process start goes upto 715 MB
    . . . . . .


    Until system Crashes


    We Backeduout the code, we are back to normal.


    We have almost simialr load in one of our DEV server (4X4), We tried same code, we are not able to reproduce the issue.


    Only other option for us is to disable PGA and go for manual memory allocation by define sort_* values


    We have an SR open, we gave all the dumps oralce asked before we backed out the code in prod.
    Now oracle wants HEAP dumps, but managment doesn't want this faulty code in prod until we fix the issue.


    The same code works fine in DEV. Server.


    Any Ideas and suggestions are appreciated


    Regards
    BN
  • Nigel Thomas at Sep 20, 2007 at 7:46 am
    Original Message ----
    From: BN bnsarma_at_gmail.com

    Its a sort of connection pooling implemented by app.
    App opens connections to the database, and keeps them open to service the requests.

    ---
    BN

    It sounds like there may be a "problem" (design feature/fault) in your code. When a particular piece of code is executed, it causes the PGA to grow. Because you have a connection pool, that piece of code can be (more or less randomly) executed by any of the open sessions. So eventually it will be executed by all of them. From your description, it's a one off "leak" - ie the PGA inflates just once for each session, then stays there, rather than continuing to grow.

    First you need to identify what application operation is causing this to happen (and obviously your recent dev change should point you in the right direction). Use your application logging - it is instrumented, right? - to track down the root cause. Do you have large operations on XML (as another poster suggested)? Large PL/SQL collections (eg any kind of in-process caching)?

    Once you've identified the problem, consider isolating that operation into a separate pool of connections. Then instead of having 100 sessions which ALL go up to 715Mb, you can have a pool of say 10 which do, and 90 which don't. You'll need to identify which operations need access to the 'family size' connection pool, and which can use the regular pool. So for example if you are building a cache, anything which depends on the cache should use the new pool.

    Regards Nigel
  • VIVEK_SHARMA at Sep 20, 2007 at 10:15 am
    Folks

    In order to take a Decision on doing a BIG Benchmark on HP Servers with Oracle RAC.

    Qs 1 What are the pros & cons of using HP Serviceguard Cluster File System for Oracle RAC versus ASM-RAC?

    Number of Years in Production.
    Performance comparison

    Qs 2 With ASM, when using External Redundancy, does Striping from ASM anyways happen when presenting Multiple Disks from the External Storage?

    Hardware Config:-

    DB Servers HP-Itanium Dual Core Montecito - 80 CPUs
    Storage Box Model - XP24000
    Oracle 10gR2

    Cheers & Thanks

    CAUTION - Disclaimer *****************
    This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely for the use of the addressee(s). If you are not the intended recipient, please notify the sender by e-mail and delete the original message. Further, you are not to copy, disclose, or distribute this e-mail or its contents to any other person and any such actions are unlawful. This e-mail may contain viruses. Infosys has taken every reasonable precaution to minimize this risk, but is not liable for any damage you may sustain as a result of any virus in this e-mail. You should carry out your own virus checks before opening the e-mail or attachment. Infosys reserves the right to monitor and review the content of all messages sent to or from this e-mail address. Messages sent to or from this e-mail address may be stored on the Infosys e-mail system.
    ***INFOSYS******** End of Disclaimer ********INFOSYS***
  • Greg Rahn at Sep 20, 2007 at 7:29 pm
    BIG in all caps must really be BIG! :)

    Personally I have a hard time finding a reason NOT to use ASM these
    days. ASM is an Oracle database filesystem written by Oracle database
    people for Oracle databases. The performance benefit of ASM comes
    from its striping. It is very difficult, and very time consuming to
    manually lay out datafiles and get the performance that ASM does.
    Before ASM, for each benchmark I used to spend one week with a storage
    admin, configuring luns, building filesystems and figuring out how to
    best lay out datafiles. Now with ASM, I spend half of a day and I'm
    done. That gives me 4.5 days to work on tuning SQL statements: a
    place where orders of magnatude of performance could exist.

    WRT Question #2: If you use hardware RAID with external ASM
    redundancy, ASM will stripe over every lun in the diskgroup. IMO,
    this is the best configuration.
    On 9/20/07, VIVEK_SHARMA wrote:
    In order to take a Decision on doing a BIG Benchmark on HP Servers
    with Oracle RAC.
    Qs 1 What are the pros & cons of using HP Serviceguard Cluster File System for Oracle RAC versus ASM-RAC?

    Number of Years in Production.
    Performance comparison

    Qs 2 With ASM, when using External Redundancy, does Striping from ASM anyways happen when presenting Multiple Disks from the External Storage?

    Hardware Config:-

    DB Servers HP-Itanium Dual Core Montecito � 80 CPUs

    Storage Box Model - XP24000

    Oracle 10gR2
    --
    Regards,

    Greg Rahn
    http://structureddata.org
    --
    http://www.freelists.org/webpage/oracle-l

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouporacle-l @
categoriesoracle
postedSep 19, '07 at 8:13p
activeSep 20, '07 at 7:29p
posts13
users7
websiteoracle.com

People

Translate

site design / logo © 2022 Grokbase