FAQ
Hi,

I am currently building a process management system that forks child
processes (fork()) and keeps them alive (waitpid()).

    if pid in self.current_workers:
        os.waitpid(pid, 0)

If a child process dies, it should trigger a SIGCHLD signal, and a handler is
installed to catch the signal and start a new child process. The code is
nothing special; it is the kind you can find in any Python tutorial on the
net.

    signal.signal(signal.SIGCHLD, self.restart_child_process)
    signal.signal(signal.SIGHUP, self.handle)  # reload
    signal.signal(signal.SIGINT, self.handle)
    signal.signal(signal.SIGTERM, self.handle)
    signal.signal(signal.SIGQUIT, self.handle)

However, this code does not always work as expected. Most of the time it
works: when a child process exits, the master process receives a SIGCHLD and
the restart_child_process() method is invoked automatically to start a new
child process. But sometimes I can see from the log file that a child process
has exited due to an unexpected exception, yet the master process does not
seem to know about it. No SIGCHLD arrives, so restart_child_process() is not
triggered and no new child process is forked.

Could you please tell me why this happens? Is there any special code that
needs to be installed to ensure that the master is notified of every dead
child?

Mac OS X 10.6
Python 2.6.6

Thanks


Dinh

  • Dan Stromberg at Feb 15, 2011 at 6:28 pm

    Hi Dinh.

    I've done no Mac OS X programming, but I've done a fair amount with
    Python and *ix signals, so I'm going to try to help you, though it'll be
    something of a stab in the dark.

    *ix signals have historically been rather unreliable and troublesome
    when used heavily.

    There are BSD signals, SysV signals, and POSIX signals - they all try
    to solve the problems in different ways. Oh, and Linux has a way of
    doing signals using file descriptors that apparently helps quite a
    bit. I'm guessing your Mac will have available BSD and maybe POSIX
    signals, but you might check on that.

    You might try using ktrace on your Mac to see if any SIGCHLD signals
    are getting lost (it definitely happens in some scenarios), and
    hopefully, which kind of (C level) signal API CPython is using on your
    Mac also.

    You might also make sure your SIGCHLD signal handler is not just
    waitpid'ing once per invocation, but rather doing a nonblocking
    waitpid in a loop until no process is found, in case signals are lost
    (especially if/when signals occur during signal handler processing).
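
    A minimal sketch of such a reaping loop, assuming the handler may reap any
    child of the process (hence pid -1). The names reap_children,
    make_sigchld_handler and on_child_exit are made up for illustration and do
    not come from the original code:

```python
import errno
import os
import signal


def reap_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, status) tuples, possibly empty.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except OSError as exc:
            if exc.errno == errno.ECHILD:   # no children exist at all
                break
            if exc.errno == errno.EINTR:    # interrupted by a signal, retry
                continue
            raise
        if pid == 0:                        # children exist, none has exited
            break
        reaped.append((pid, status))
    return reaped


def make_sigchld_handler(on_child_exit):
    """Build a SIGCHLD handler that reports every exited child.

    on_child_exit is a hypothetical callback taking (pid, status).
    """
    def handler(signum, frame):
        for pid, status in reap_children():
            on_child_exit(pid, status)
    return handler
```

    It would be installed with something like
    signal.signal(signal.SIGCHLD, make_sigchld_handler(callback)). Because the
    loop keeps calling waitpid until nothing is left, one delivered SIGCHLD is
    enough to collect several children that exited close together.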

    If the loop in your signal handler doesn't help (enough), you could
    also try using a nonblocking waitpid in a SIGALRM handler in addition
    to your SIGCHLD handler.
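
    One way this safety net might look, as a sketch only: a repeating interval
    timer delivers SIGALRM, and the handler sweeps for exited children. The
    names sweep, start_sweeper, stop_sweeper and the exited list are
    assumptions for illustration; signal.setitimer is available from Python 2.6.

```python
import os
import signal

exited = []  # (pid, status) pairs collected by the periodic sweep


def sweep(signum=None, frame=None):
    """Reap all exited children without blocking."""
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except OSError:
            return          # e.g. ECHILD: nothing to wait for right now
        if pid == 0:
            return          # children exist, but none has exited yet
        exited.append((pid, status))


def start_sweeper(interval=1.0):
    """Run sweep() every `interval` seconds via SIGALRM."""
    signal.signal(signal.SIGALRM, sweep)
    signal.setitimer(signal.ITIMER_REAL, interval, interval)


def stop_sweeper():
    """Cancel the repeating timer; the handler stays installed."""
    signal.setitimer(signal.ITIMER_REAL, 0)
```

    The same sweep function can also be registered for SIGCHLD, so the timer
    only has to catch the exits that the SIGCHLD path missed.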

    Some signal APIs want you to reenable the signal as your first action
    in your signal handler to shorten a race window. Hopefully Mac OS X
    doesn't need this, but you might check on it.

    BTW, CPython signals and CPython threads don't play very nicely
    together; if you're combining them, you might want to study up on
    this.

    Oh, also, signals in CPython will tend to cause system calls to return
    without completing, with errno set to EINTR, and not all CPython
    modules will understand what to do with that. :( Sadly, many
    application programmers tend to ignore the EINTR possibility.
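
    A common way to cope with EINTR is to retry the interrupted call. The
    helper below is a sketch with a made-up name (retry_on_eintr). It matters
    mainly on Python 2 and early Python 3, since Python 3.5 and later retry
    most interrupted system calls automatically.

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func(*args, **kwargs), retrying while it fails with EINTR."""
    while True:
        try:
            return func(*args, **kwargs)
        except (OSError, IOError) as exc:
            if exc.errno != errno.EINTR:
                raise   # a real error, not just an interruption
```

    For example, retry_on_eintr(os.waitpid, pid, 0) keeps waiting for the
    child even if another signal arrives in the meantime.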

    HTH
  • Adam Skutt at Feb 16, 2011 at 5:22 am

    On Feb 15, 1:28 pm, Dan Stromberg wrote:
    > *ix signals have historically been rather unreliable and troublesome
    > when used heavily.
    >
    > There are BSD signals, SysV signals, and POSIX signals - they all try
    > to solve the problems in different ways.
    No, there are just signals[1]. There are several different APIs for
    handling signals, depending on the situation, but they're all driving
    the same functionality underneath the covers. These days, only
    sigaction(2) is remotely usable (in C) for installing handlers and all
    the other APIs should normally be ignored.
    > You might also make sure your SIGCHLD signal handler is not just
    > waitpid'ing once per invocation, but rather doing a nonblocking
    > waitpid in a loop until no process is found, in case signals are lost
    > (especially if/when signals occur during signal handler processing).
    This is most likely the issue. Multiple pending instances of the same
    signal are coalesced automatically, so several children exiting close
    together may produce only one SIGCHLD.

    It would also help to make sure the signal handler just sets a flag;
    the application's main loop should then check that flag and respond
    appropriately. Running anything substantial inside a signal handler is
    a recipe for disaster.
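
    A rough sketch of the flag approach. Everything here is an assumption for
    illustration: the names child_exited, on_sigchld and run_once, the
    workers set of live child pids, and the spawn callable that forks one
    worker and returns its pid.

```python
import os
import signal

child_exited = False    # set by the handler, cleared by the main loop


def on_sigchld(signum, frame):
    """Signal handler: only record that some child has exited."""
    global child_exited
    child_exited = True


def run_once(spawn, workers):
    """One main-loop step: reap dead workers and start replacements."""
    global child_exited
    if not child_exited:
        return
    # Clear the flag before reaping so a signal that arrives during the
    # loop sets it again and is handled on the next step.
    child_exited = False
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except OSError:
            break           # e.g. ECHILD: no children left
        if pid == 0:
            break           # nothing else has exited yet
        if pid in workers:
            workers.discard(pid)
            workers.add(spawn())
```

    The main loop would install on_sigchld for signal.SIGCHLD once and then
    call run_once(spawn, workers) on every iteration, so all forking happens
    in ordinary code rather than inside the handler.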

    Also, SIGCHLD handlers may not get reinstalled on some operating
    systems (even in Python), so the application code needs to reinstall
    them. If this is not done within the signal handler, it can cause
    signals to get "lost".

    That being said, I'd just spawn a thread and wait there and avoid
    SIGCHLD altogether. It's typically not worth the hassle.
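
    A sketch of the thread-based alternative, assuming one watcher thread per
    child. The names watch_child and on_exit are hypothetical; on_exit takes
    (pid, status) and runs in the watcher thread.

```python
import os
import threading


def watch_child(pid, on_exit):
    """Wait for `pid` in a background thread, then call on_exit(pid, status)."""
    def waiter():
        _pid, status = os.waitpid(pid, 0)   # blocks only this thread
        on_exit(pid, status)

    thread = threading.Thread(target=waiter)
    thread.daemon = True    # do not keep the process alive just to wait
    thread.start()
    return thread
```

    Since on_exit runs outside the main thread, it is probably safest for it
    to hand the pid to the main loop (for example through a queue) and let the
    main loop do the fork for the replacement worker.
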
    > Oh, also, signals in CPython will tend to cause system calls to return
    > without completing, and giving an EINTR in errno, and not all CPython
    > modules will understand what to do with that. :( Sadly, many
    > application programmers tend to ignore the EINTR possibility.
    This can be disabled with signal.siginterrupt(signalnum, False).
    Regardless, the signal handling facilities provided by Python are
    rather poor.
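
    As a small sketch (install_restartable is a made-up helper name), the
    call has to come after signal.signal(), because installing a handler
    resets the signal to interrupting behaviour:

```python
import signal


def install_restartable(signum, handler):
    """Install `handler` and ask for interrupted system calls to be restarted."""
    signal.signal(signum, handler)
    # False: system calls interrupted by this signal are restarted
    # instead of failing with EINTR.
    signal.siginterrupt(signum, False)
```

    Not every system call is restartable, so EINTR handling in the
    application may still be needed on older Python versions.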

    Adam

    [1] Ok, I lied, there's regular signals and realtime signals, which
    have a few minor differences.

Discussion Overview
group: python-list @
categories: python
posted: Feb 15, '11 at 10:57a
active: Feb 16, '11 at 5:22a
posts: 3
users: 3
website: python.org
