FAQ
I've run across a memory leak in a long running process which I can't
determine if its my issue or if its the logger.

The long and short is I'm doing load testing on an application server
which spawns handlers threads which in turn each spawn a single
application thread. A graphic representation would be One Server ->
(to many pairs of) [ Handler <-> Application ].

Each application thread gets a logger instance in it's init() method
via:

self.logger = logging.getLogger('ivr-'+str(self.rand))

where self.rand is a suitably large random number to avoid collisions
of the log file's name. Until the log file gets created I attach am
memory handler

self.memhandler =
logging.handlers.MemoryHandler(1000)
self.memhandler.setLevel(10)
formatter = logging.Formatter('%
(levelname)s %(message)s')
self.memhandler.setFormatter(formatter)
self.logger.addHandler(self.memhandler)

when the application thread formally starts with the run() method I
create the log file and terminate the memory handler

filehandler = logging.FileHandler(logfilename)
filehandler.setLevel(10)
formatter = logging.Formatter('%(levelname)s %(message)s')
filehandler.setFormatter(formatter)

self.memhandler.setTarget(filehandler)
self.memhandler.close()
self.logger.removeHandler(self.memhandler)
self.logger.addHandler(filehandler)


finally the last statements in the run() method are:

filehandler.close()
self.logger.removeHandler(filehandler)
del self.logger #this was added to try and force a clean up of
the logger instances.

Using the objgraph to look at the objects in memory I find the number
of logger instances equal to the total number of threads to have lived
despite the fact that either a) there are only the standard load
testing number of threads alive, 35 or b) that there no threads
running nor are there any stale waiting for the GC.
From objgraph a selection of the most prevalent objects in memory are
(this is with the system idle post-run):

list 256730
dict 128933
Logger 128164 # total
application threads executed running load testing.
function 2356
wrapper_descriptor 1028
builtin_function_or_method 702
method_descriptor 648
tuple 643
weakref 629
getset_descriptor 304
type 252
set 224
member_descriptor 209
module 128
WeakSet 102


The only references to self.logger other than those listed are wrapper
methods defined in the application thread to wrap up the log / debug
methods. Any help or direction would be much appreciated.

Search Discussions

  • Chris Torek at Jun 18, 2011 at 10:28 pm
    In article <ebafe7b6-aa93-4847-81d6-12d396a4ff3c at j28g2000vbp.googlegroups.com>
    foobar wrote:
    I've run across a memory leak in a long running process which I can't
    determine if its my issue or if its the logger.
    You do not say what version of python you are using, but on the
    other hand I do not know how much the logger code has evolved
    over time anyway. :-)
    Each application thread gets a logger instance in it's init() method
    via:

    self.logger = logging.getLogger('ivr-'+str(self.rand))

    where self.rand is a suitably large random number to avoid collisions
    of the log file's name.
    This instance will "live forever" (since the thread shares the
    main logging manager with all other threads).
    ---------
    class Manager:
    """
    There is [under normal circumstances] just one Manager instance, which
    holds the hierarchy of loggers.
    """
    def __init__(self, rootnode):
    """
    Initialize the manager with the root node of the logger hierarchy.
    """
    [snip]
    self.loggerDict = {}

    def getLogger(self, name):
    """
    Get a logger with the specified name (channel name), creating it
    if it doesn't yet exist. This name is a dot-separated hierarchical
    name, such as "a", "a.b", "a.b.c" or similar.

    If a PlaceHolder existed for the specified name [i.e. the logger
    didn't exist but a child of it did], replace it with the created
    logger and fix up the parent/child references which pointed to the
    placeholder to now point to the logger.
    """
    [snip]
    self.loggerDict[name] = rv
    [snip]
    [snip]
    Logger.manager = Manager(Logger.root)
    ---------

    So you will find all the various ivr-* loggers in
    logging.Logger.manager.loggerDict[].
    finally the last statements in the run() method are:

    filehandler.close()
    self.logger.removeHandler(filehandler)
    del self.logger #this was added to try and force a clean up of
    the logger instances.
    There appears to be no __del__ handler and nothing that allows
    removing a logger instance from the manager's loggerDict. Of
    course you could do this "manually", e.g.:

    ...
    self.logger.removeHandler(filehandler)
    del logging.Logger.manager.loggerDict[self.logger.name]
    del self.logger # optional

    I am curious as to why you create a new logger for each thread.
    The logging module has thread synchronization in it, so that you
    can share one log (or several logs) amongst all threads, which is
    more typically what one wants.
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40?39.22'N, 111?50.29'W) +1 801 277 2603
    email: gmail (figure it out) http://web.torek.net/torek/index.html
  • Vinay Sajip at Jun 19, 2011 at 2:29 pm

    foobar <wjshipman <at> gmail.com> writes:

    I've run across a memory leak in a long running process which I can't
    determine if its my issue or if its the logger.
    As Chris Torek said, it's not a good idea to create a logger for each thread. A
    logger name represents a place in your application; typically, a module, or
    perhaps some part of a module. If you want to include information in the log to
    see what different threads are doing, do that using the information provided
    here:

    http://docs.python.org/howto/logging-cookbook.html#adding-contextual-information-to-your-logging-output

    Regards,

    Vinay Sajip
  • Vinay Sajip at Jun 19, 2011 at 2:42 pm
    foobar <wjshipman <at> gmail.com> writes:
    I've run across a memory leak in a long running process which I can't
    determine if its my issue or if its the logger.
    BTW did you also ask this question on Stack Overflow? I've answered there, too.

    http://stackoverflow.com/questions/6388514/

    Regards,

    Vinay Sajip
  • Foobar at Jun 20, 2011 at 2:50 pm
    Yes, I asked it on stack overflow first and didn't see an quick
    reply. I'm trying to tighten up this code as much as possible in a
    final pre-production push; I apologize for being overly antsy about
    this. This is my pet project to upgrade our core systems from an
    ancient IBM language that Moses might have used.

    Currently I'm using python 3.1.2 (sorry for the obvious omission).

    Regarding adding a new logger for each thread - each thread represents
    a telephone call in a data collection system. I need to be able to
    cleanly provided call-logging for debugging to my programmers as well
    as data logging and verification; having a single log file is somewhat
    impractical. To use the logging filtering then I would have to be
    dynamically adding to the filtering hierarchy continuously, no?

    Thanks!
    Bill


    On Jun 19, 10:42?am, Vinay Sajip wrote:
    foobar <wjshipman <at> gmail.com> writes:


    I've run across a memory leak in a long running process which I can't
    determine if its my issue or if its the logger.
    BTW did you also ask this question on Stack Overflow? I've answered there, too.

    http://stackoverflow.com/questions/6388514/

    Regards,

    Vinay Sajip
  • Vinay Sajip at Jun 21, 2011 at 12:19 am

    On Jun 20, 3:50?pm, foobar wrote:

    Regarding adding a new logger for each thread - each thread represents
    a telephone call in a data collection system. I need to be able to
    cleanly provided call-loggingfor debugging to my programmers as well
    as dataloggingand verification; having a single log file is somewhat
    impractical. ?To use theloggingfiltering then I would have to be
    dynamically adding to the filtering hierarchy continuously, no?
    You could, for example, have a different *handler* for each thread.
    There are a number of possibilities according to exactly what you want
    to do, but there's certainly no need to create one *logger* per
    thread.

    Regards,

    Vinay Sajip

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppython-list @
categoriespython
postedJun 18, '11 at 8:02p
activeJun 21, '11 at 12:19a
posts6
users3
websitepython.org

People

Translate

site design / logo © 2023 Grokbase