Hello,

For several weeks my Mailman instance has occasionally just stopped
delivering messages. Restarting Mailman seems to fix it, and messages that
were in the queue do get sent eventually.

Today I got a hint as to what might be happening ...
As Mailman seemed to be stuck again, I ran "mailmanctl stop" and watched
all the log files.
When I ran "mailmanctl stop", the following line suddenly appeared in
the "smtp" log file:

Apr 24 07:39:11 2005 (26501)
<BAY104-F2054D7459D5E9CBDBFD56ACC2D0 at phx.gbl> smtp for 155 recips,
completed in 159722.267 seconds

Wow! So it seems that Mailman was stuck for about 44 hours trying to
send this message, and that stopping it made it give up on that message and
move on to the next ones (after I restarted it).
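
A minimal sketch of how the smtp log could be scanned for stalls like this one. The function name `slow_deliveries` and the one-hour threshold are illustrative assumptions, and the pattern is based only on the single log entry quoted above, so other Mailman versions may format the line differently.

```python
import re

# Pattern for the "completed in N seconds" summary seen in the smtp log above.
SUMMARY = re.compile(r"smtp for (\d+) recips,\s+completed in ([\d.]+) seconds")

def slow_deliveries(log_text, threshold_seconds=3600.0):
    """Return (recipients, seconds) pairs for deliveries slower than the threshold."""
    # Entries may be wrapped across lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    results = []
    for match in SUMMARY.finditer(flat):
        recipients, seconds = int(match.group(1)), float(match.group(2))
        if seconds > threshold_seconds:
            results.append((recipients, seconds))
    return results

sample = """Apr 24 07:39:11 2005 (26501)
<BAY104-F2054D7459D5E9CBDBFD56ACC2D0 at phx.gbl> smtp for 155 recips,
completed in 159722.267 seconds"""

for recipients, seconds in slow_deliveries(sample):
    print(f"{recipients} recipients took {seconds / 3600:.1f} hours")
# prints: 155 recipients took 44.4 hours
```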

I also saw a lot of these appear in smtp-failure:

smtp-failure:Apr 24 07:39:11 2005 (26501) delivery to
<removed-email-address> failed with code -1: please run connect() first


Does anyone have any idea as to why Mailman gets stuck trying to send a
message sometimes? If a resource (DNS, remote mail server, ...) is
unavailable, shouldn't it time out after a while?

I'm currently running version 2.1.4. Could this be fixed by upgrading to
the latest version?

Thanks,

Remi.
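
On the timeout question, here is a minimal sketch of how a client can bound the wait on an unresponsive SMTP server. It uses present-day Python 3 `smtplib` and is not Mailman's delivery code. The helper name `probe_smtp` and the 0.5 second timeout are illustrative assumptions. The demo points the helper at a local socket that accepts connections but never sends a greeting.

```python
import smtplib
import socket
import time

def probe_smtp(host, port, timeout_seconds):
    """Try to get an SMTP greeting within the timeout; return (ok, detail)."""
    try:
        # The timeout applies to the connect and to later blocking reads.
        conn = smtplib.SMTP(host, port, timeout=timeout_seconds)
    except (OSError, smtplib.SMTPException) as exc:
        return False, str(exc)
    conn.close()
    return True, "greeting received"

# Demo: a listening socket that never answers, standing in for a wedged server.
silent = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
silent.bind(("127.0.0.1", 0))
silent.listen(1)
port = silent.getsockname()[1]

started = time.monotonic()
ok, detail = probe_smtp("127.0.0.1", port, 0.5)
elapsed = time.monotonic() - started
silent.close()

print(ok, detail, f"{elapsed:.1f}s")
```

Without a timeout, the same call would wait for as long as the peer stays silent, which is consistent with the kind of stall described above, though the thread does not establish that this is what happened.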


  • Michael Loftis at Apr 24, 2005 at 6:37 pm
    --On Sunday, April 24, 2005 7:28 PM +0100 Remi Delon wrote:

    > Does anyone have any idea as to why Mailman gets stuck trying to send a
    > message sometimes? If a resource (DNS, remote mail server, ...) is
    > unavailable, shouldn't it time out after a while?

    Basically, with *any* mailing list software, the mail server needs to be
    configured to just take the mail from the mailing list app and ask
    questions later. The same rule goes for LSoft ListServ, majordomo, etc.
    The problem you're running into could be almost anything, but I'd bet it
    has something to do with bad addresses in the mailing list rather than
    your Mailman version. Find out what the address is, log in to the list
    admin interface using your site password, and unsubscribe the junk address.

    > I'm currently running version 2.1.4. Could this be fixed by upgrading to
    > the latest version?
    >
    > Thanks,
    >
    > Remi.
    --
    Undocumented Features quote of the moment...
    "It's not the one bullet with your name on it that you
    have to worry about; it's the twenty thousand-odd rounds
    labeled `occupant.'"
    --Murphy's Laws of Combat
  • Remi Delon at Apr 27, 2005 at 10:14 am

    >> Does anyone have any idea as to why Mailman gets stuck trying to send a
    >> message sometimes? If a resource (DNS, remote mail server, ...) is
    >> unavailable, shouldn't it time out after a while?
    >
    > Basically, with *any* mailing list software, the mail server needs to be
    > configured to just take the mail from the mailing list app and ask
    > questions later. The same rule goes for LSoft ListServ, majordomo, etc.
    > The problem you're running into could be almost anything, but I'd bet it
    > has something to do with bad addresses in the mailing list rather than
    > your Mailman version. Find out what the address is, log in to the list
    > admin interface using your site password, and unsubscribe the junk address.

    Thanks for the insight, Michael. If a junk address is indeed all it
    takes to cause Mailman to hang, then it's quite disappointing ...
    How would I go about finding that junk address (there are more than 200
    in that list)? Isn't it true that all addresses must have been valid at
    some point (otherwise people wouldn't have been able to confirm their
    subscription)? That makes it hard to see which ones are junk ...

    Thanks,

    Remi.
  • Stephen J. Turnbull at Apr 27, 2005 at 11:55 am
    "Remi" == Remi Delon <remi at cherrypy.org> writes:
    Remi> Thanks for the insight Michael... If a junk address is
    Remi> indeed all it takes to cause Mailman to hang then it's quite
    Remi> disappointing ...

    It's not. I've never seen Mailman hang like that in more than 4
    years, with versions 2.0.13 and 2.1.5.

    Remi> How would I go about finding that junk address (there are
    Remi> more than 200 in that list) ?

    It's probably the one listed in smtp-failure, see below.

    Remi> Isn't it true that all addresses must have been valid at
    Remi> some point ? (otherwise people wouldn't have been able to
    Remi> confirm their subscription).

    True. But that's not the kind of "junk" that causes problems for
    Mailman; Mailman will disable delivery to or unsubscribe addresses that
    become invalid. The "junk" addresses that cause Mailman to blow up are
    syntactically invalid (e.g., they contain illegal characters) and must
    have been introduced by the list admin using "mass subscribe" or a
    command line tool. If you've never done that, i.e., all your users
    subscribed themselves, then you will have no problematic junk addresses,
    just undeliverable ones that will be automatically weeded out.
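
A rough sketch of how a membership list could be screened for syntactically suspicious addresses of the kind described above. The pattern is a deliberately conservative heuristic, not full RFC 5322 parsing and not Mailman's own validation; the function name and the sample addresses are made up for illustration. The input could be any plain list of addresses, for instance the output of Mailman's `list_members` script if it is available on the installation.

```python
import re

# Conservative heuristic: a local part of common ASCII characters, exactly
# one "@", and a dotted domain made of letters, digits, and hyphens.
PLAUSIBLE = re.compile(
    r"^[A-Za-z0-9.!#$%&'*+/=?^_`{|}~-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$"
)

def suspicious_addresses(addresses):
    """Return the addresses that fail the conservative syntax check."""
    return [addr for addr in addresses if not PLAUSIBLE.match(addr.strip())]

# Hypothetical sample data, not taken from the list discussed in the thread.
members = [
    "alice@example.com",
    "bob smith@example.com",   # space in the local part
    "carol@example",           # no dot in the domain
    "dave@@example.org",       # doubled "@"
    "erin@example.org,",       # stray trailing comma
]

print(suspicious_addresses(members))
```

Anything this flags still needs a human look, since a heuristic this strict can reject unusual but legal addresses.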

    In your earlier message you provided this log:

    smtp-failure:Apr 24 07:39:11 2005 (26501) delivery to
    <removed-email-address> failed with code -1: please run connect() first

    While it's possible that there's a connect() function in Mailman that
    Mailman is failing to run, that seems very unlikely ... we would have
    heard about it by now. There is a connect() function in many OSes,
    for the "socket" interface layered over raw Internet connections. It
    seems quite possible to me that your OS or network is hosed, resulting
    in socket connections appearing to be internally broken, and Mailman
    being unable to connect to a remote host for long periods of time. In
    that case, it might very well hang, assuming that it should keep trying
    until the network admin fixes whatever the problem is.
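
One data point that can be checked directly: in current Python 3, the standard `smtplib` module raises `SMTPServerDisconnected` with exactly the wording from the smtp-failure log when a command is sent on an SMTP object that has no open socket. The sketch below reproduces that without touching the network. Whether this is the code path that the Mailman 2.1.4 installation in question actually hit is an assumption, not something the thread establishes.

```python
import smtplib

def command_without_connect():
    """Send NOOP on an SMTP object that was never connected; return the error text."""
    conn = smtplib.SMTP()  # no host given, so no socket is opened
    try:
        conn.noop()
    except smtplib.SMTPServerDisconnected as exc:
        return str(exc)
    return None

print(command_without_connect())
# prints: please run connect() first
```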

    As far as I can tell, Mac OS X sometimes wedges itself in this way.
    An SSH connection will simply go away ... until I connect to the same
    host in a separate connection. Then the shell will come back up, the
    X tunnel will start working again, etc, etc.

    Remi> I'm currently running version 2.1.4. Could this be fixed by
    Remi> upgrading to the latest version ?

    Waiting a bit and upgrading to 2.1.6 (due out real soon) might (or
    might not) be a good idea.

    However, if my guess that the smtp-failure message is coming from the
    OS is correct, upgrading Mailman is not very likely to fix it.

    --
    School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
    University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
    Ask not how you can "do" free software business;
    ask what your business can "do for" free software.
  • Stephen J. Turnbull at Apr 28, 2005 at 8:58 am
    Forwarding to mailman-users, I can't do anything about this....
    "Remi" == Remi Delon <remi at cherrypy.org> writes:
    Remi> Actually, you have [heard about it] ...
    Remi> http://mail.python.org/pipermail/mailman-users/2004-October/040344.html

    Remi> Looks like I'm not the only one with that problem ...

    I said "maybe it's a system issue?" and Remi said:

    Remi> Hmm ... Seems a bit unlikely to me ... This is a production
    Remi> machine running linux RedHat with sendmail as the SMTP
    Remi> server and there are about 100 people using the machine and
    Remi> sendmail on it all the time ...

    So is the other guy. That's not enough to apply the law of large
    numbers, but I wouldn't ignore it, either.

    Are you using a RedHat-supplied Python? Did you build Mailman yourself?


    --
    School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
    University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
    Ask not how you can "do" free software business;
    ask what your business can "do for" free software.

Discussion Overview
group: mailman-users
category: python
posted: Apr 24, '05 at 6:28p
active: Apr 28, '05 at 8:58a
posts: 5
users: 3
website: list.org
