FAQ
We are running Mailman v2.1.13 on Solaris 9 Sparc. We have a list with about 13,000 members that we currently have "emergency moderation" turned on because of previous problems with reply storms.

Today somebody sent a message to the list, which (of course) was held for moderation. The list owner approved it, but then it disappeared. Here is what was in the "vette" log:

Oct 04 10:35:12 2010 (21271) XXXXX_Cohorts post from michael.xxxx at xxxx.edu held, message-id=<2034674BF9D4714BAE35B22BF073EEF402E70996 at MS-XXXX.xxxx.edu>: Post to moderated list
Oct 04 10:38:26 2010 (5774) held message approved, message-id: <2034674BF9D4714BAE35B22BF073EEF402E70996 at MS-XXXX.xxxx.edu>
Oct 04 10:38:28 2010 (21271) Message discarded, msgid: 2034674BF9D4714BAE35B22BF073EEF402E70996 at MS-XXXX.xxxx.edu

What caused the message to be discarded after it was approved?

Thanks,

Larry M. Rosenbaum
Oak Ridge National Laboratory

Search Discussions

  • Ralf Hildebrandt at Oct 4, 2010 at 8:06 pm

    * Rosenbaum, Larry M. <rosenbaumlm at ornl.gov>:
    We are running Mailman v2.1.13 on Solaris 9 Sparc. We have a list with about 13,000 members that we currently have "emergency moderation" turned on because of previous problems with reply storms.

    Today somebody sent a message to the list, which (of course) was held for moderation. The list owner approved it, but then it disappeared. Here is what was in the "vette" log:

    Oct 04 10:35:12 2010 (21271) XXXXX_Cohorts post from michael.xxxx at xxxx.edu held, message-id=<2034674BF9D4714BAE35B22BF073EEF402E70996 at MS-XXXX.xxxx.edu>: Post to moderated list
    Oct 04 10:38:26 2010 (5774) held message approved, message-id: <2034674BF9D4714BAE35B22BF073EEF402E70996 at MS-XXXX.xxxx.edu>
    Oct 04 10:38:28 2010 (21271) Message discarded, msgid: 2034674BF9D4714BAE35B22BF073EEF402E70996 at MS-XXXX.xxxx.edu
    The first and third line seem to match, but why is the size for the
    second line only a 5k instead of 20k? (if the number in () is actually
    the size)

    --
    Ralf Hildebrandt
    Gesch?ftsbereich IT | Abteilung Netzwerk
    Charit? - Universit?tsmedizin Berlin
    Campus Benjamin Franklin
    Hindenburgdamm 30 | D-12203 Berlin
    Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962
    ralf.hildebrandt at charite.de | http://www.charite.de
  • Mark Sapiro at Oct 4, 2010 at 8:21 pm

    On 10/4/2010 1:06 PM, Ralf Hildebrandt wrote:
    * Rosenbaum, Larry M. <rosenbaumlm at ornl.gov>:
    Oct 04 10:35:12 2010 (21271) XXXXX_Cohorts post from michael.xxxx at xxxx.edu held, message-id=<2034674BF9D4714BAE35B22BF073EEF402E70996 at MS-XXXX.xxxx.edu>: Post to moderated list
    Oct 04 10:38:26 2010 (5774) held message approved, message-id: <2034674BF9D4714BAE35B22BF073EEF402E70996 at MS-XXXX.xxxx.edu>
    Oct 04 10:38:28 2010 (21271) Message discarded, msgid: 2034674BF9D4714BAE35B22BF073EEF402E70996 at MS-XXXX.xxxx.edu
    The first and third line seem to match, but why is the size for the
    second line only a 5k instead of 20k? (if the number in () is actually
    the size)

    The number in parens is thi PID of the process that made the log entry.
    In the first and third cases, this is IncomingRunner and in the second
    it is the admindb CGI. It has nothing to do with message size.


    --
    Mark Sapiro <mark at msapiro.net> The highway is for gamblers,
    San Francisco Bay Area, California better use your sense - B. Dylan
  • Mark Sapiro at Oct 4, 2010 at 8:24 pm

    On 10/4/2010 11:51 AM, Rosenbaum, Larry M. wrote:
    What caused the message to be discarded after it was approved?

    Most likely, Content filtering -> filter_action is Discard, and there
    was nothing left of the message after content filtering. E.g. the
    message was text/html or maybe multipart/related and the list doesn't
    accept that content type.

    --
    Mark Sapiro <mark at msapiro.net> The highway is for gamblers,
    San Francisco Bay Area, California better use your sense - B. Dylan
  • Mark Sapiro at Oct 30, 2010 at 4:02 am

    On 10/29/2010 8:11 AM, Andr? van Kan wrote:
    Hi Larry,

    Although we have the same problem, ours is even weird in
    some aspects. I made an test-list - groep3 - with two members.
    If mail is send by a non-member, all messages are discarded
    after approval:

    Oct 29 15:59:09 2010 (20460) groep3: Discarded posting:
    From: acvankan at gmail.com
    Subject: test after restart
    Oct 29 16:04:30 2010 (21937) groep3: Discarded posting:
    From: acvankan at gmail.com
    Subject: last one

    Please post to the list, not to me personally.

    However, the above messages, presumably from Mailman's vette log, come
    from two different PIDs. Thus, I'm guessing that you are approving these
    messages via the admindb web interface, and something is going wrong in
    that process resulting in the message being discarded by the admindb CGI
    itself.

    BUT if messages are sent by senders as mentioned in [Sender Filters] /
    "List of non-member addresses whose postings should be automatically accepted"
    THEN messages are delivered (of course without the necessity of approval).

    Since the vette-contents is rather concise, I editted
    IncomingRunner.py'sErrors.DiscardMessage as follows, according
    a earlier posting about this subject:
    syslog('vette', 'Message discarded, msgid: %s, handler: %s',
    msg.get('message-id', 'n/a'), handler)
    but to no avail. Nothing changes in the output-format.

    Because the message is not discarded by IncomingRunner. It is discarded
    by the HandleRequest method in Mailman/ListAdmin.py called from
    Mailman/Cgi/admindb.py.

    Since all our lists suffer the same problem, a generic
    configuration error may be assumed, but I've no idea
    how to fix it. Any hint would be gratly appreciated.

    Have you modified admindb.py in any way?

    Does the web admin interface work?

    What Mailman version is this?

    How exactly are you approving messages?

    --
    Mark Sapiro <mark at msapiro.net> The highway is for gamblers,
    San Francisco Bay Area, California better use your sense - B. Dylan
  • André van Kan at Nov 3, 2010 at 4:55 pm
    Hi Mark,

    Thank you for your reply.
    Please post to the list, not to me personally.
    Point taken. And sure, you are Mark, not Larry, sorry ;-)
    However, the above messages, presumably from Mailman's vette log, come
    from two different PIDs. Thus, I'm guessing that you are approving these
    messages via the admindb web interface, and something is going wrong in
    that process resulting in the message being discarded by the admindb CGI
    itself.
    That is right, these are vette-log entries and the approving attempt is done
    via the admindb web interface.

    I found out that as soon as the list-administrator just selects the link in
    the ".... requires approval" message as sent by mailman, ("Reason: Post
    by non-member to a members-only list") the message is discarded at that
    very moment! (vette log update detected)

    After that the following is displayed:
    Action to take on all these held messages:
    Defer Accept Reject Discard
    but *no* pending message is there to Accept.
    Have you modified admindb.py in any way? No!
    Does the web admin interface work?
    Except for the murdering of pending messages the interface works
    flawlessly.
    What Mailman version is this?
    mailman-2.1.9-4.el5 (RPM)
    How exactly are you approving messages?
    As said before: by selecting the link in the mailman-sent message.



    Best regards and thank you for your help,


    Andr? van Kan

    -------------------------------------------------------
    * Computer Technology Group * Nikhef, Amsterdam *
    -------------------------------------------------------
  • Mark Sapiro at Nov 3, 2010 at 11:47 pm

    Andr? van Kan wrote:
    I found out that as soon as the list-administrator just selects the link in
    the ".... requires approval" message as sent by mailman, ("Reason: Post
    by non-member to a members-only list") the message is discarded at that
    very moment! (vette log update detected)

    After that the following is displayed:
    Action to take on all these held messages:
    Defer Accept Reject Discard
    but *no* pending message is there to Accept.

    This is very unusual. Is there an http proxy server involved here?

    Can you send me off list a copy of Mailman/Cgi/admindb.py from your
    system?

    --
    Mark Sapiro <mark at msapiro.net> The highway is for gamblers,
    San Francisco Bay Area, California better use your sense - B. Dylan
  • Mark Sapiro at Nov 4, 2010 at 2:12 pm

    Mark Sapiro wrote:

    Can you send me off list a copy of Mailman/Cgi/admindb.py from your
    system?
    I received that. Thank you. Unfortunately, there's nothing there that
    helps explain this issue. I can work up a patch to provide debug
    logging, but before I do that, is there anything in Mailman's 'error'
    log from either the time of the visit to the admindb page or from the
    time the held post arrived? If so, what's in the error log. If not, is
    it possible that some non-mailman process has removed any of Mailman's
    data/heldmsg* files between the holding of the post and the visit to the
    admindb page?

    Also, it wouldn't hurt to run Mailman's bin/check_perms to see if there
    might be permission issues on Mailman's data/ directory.

    --
    Mark Sapiro <mark at msapiro.net> The highway is for gamblers,
    San Francisco Bay Area, California better use your sense - B. Dylan
  • Mark Sapiro at Nov 4, 2010 at 6:06 pm

    Andre van Kan wrote:
    On Thu, 2010-11-04 at 07:12 -0700, Mark Sapiro wrote:
    Mark Sapiro wrote:
    Can you send me off list a copy of Mailman/Cgi/admindb.py from your
    system?
    I received that. Thank you. Unfortunately, there's nothing there that
    helps explain this issue. I can work up a patch to provide debug
    logging, but before I do that, is there anything in Mailman's 'error'
    log from either the time of the visit to the admindb page or from the
    time the held post arrived?
    Not a single entry in any of the log files.

    If so, what's in the error log. If not, is
    it possible that some non-mailman process has removed any of Mailman's
    data/heldmsg* files between the holding of the post and the visit to the
    admindb page?
    I don't think so. The message is removed at the
    very moment that the link in Mailman's notification
    message is activated.

    I understand that, but the data/heldmsg* file could have been removed
    external to Mailman before that.
    Also, it wouldn't hurt to run Mailman's bin/check_perms to see if there
    might be permission issues on Mailman's data/ directory.
    See attachment. Mail restarted, just in case ...
    I sent a message to the list. Unfortunately there was no message
    waiting for me to be accepted and NO entry was in vette.
    I think that it would be great if the logging became a bit
    more verbose. Thanks in advance.

    The attached admindb.patch.txt file contains a patch that will log more
    information to Mailman's 'debug' log.

    Try applying the patch and then the next time the problem occurs, post
    the log - note the log may contain passwords; be careful to remove
    them before posting.

    --
    Mark Sapiro <mark at msapiro.net> The highway is for gamblers,
    San Francisco Bay Area, California better use your sense - B. Dylan

    -------------- next part --------------
    An embedded and charset-unspecified text was scrubbed...
    Name: admindb.patch.txt
    URL: <http://mail.python.org/pipermail/mailman-users/attachments/20101104/b4b28165/attachment.txt>
  • André van Kan at Nov 5, 2010 at 1:44 pm

    I understand that, but the data/heldmsg* file could have been removed
    external to Mailman before that.
    I never found any heldmsg* on the system, nor any complaint
    that such a file could not be written. Maybe could an strace of
    the appropriate qrunner be of some help?
    The attached admindb.patch.txt file contains a patch that will log more
    information to Mailman's 'debug' log.
    See attachment.

    Best regards,

    Andr? van Kan
    -------------- next part --------------
    An embedded and charset-unspecified text was scrubbed...
    Name: debug
    URL: <http://mail.python.org/pipermail/mailman-users/attachments/20101105/4d9a253d/attachment.ksh>
  • Mark Sapiro at Nov 5, 2010 at 2:48 pm

    On 11/5/2010 6:44 AM, Andr? van Kan wrote:
    I understand that, but the data/heldmsg* file could have been removed
    external to Mailman before that.
    I never found any heldmsg* on the system, nor any complaint
    that such a file could not be written. Maybe could an strace of
    the appropriate qrunner be of some help?
    The attached admindb.patch.txt file contains a patch that will log more
    information to Mailman's 'debug' log.
    See attachment.


    This excerpt from your log shows one interaction with the admindb interface.

    Nov 05 13:20:14 2010 (7482) cgidata = FieldStorage(None, None, [])
    The above is logged at the beginning of the interaction and says there
    was no POST data which is expected.

    Nov 05 13:20:14 2010 (7482) heldmsg file non-existent, info = (1288958492.854609, 'acvankan at gmail.com', 'test na patch', 'Post by non-member to a members-only list', 'heldmsg-groep3-18.pck', {'lang': 'en', 'listname': 'groep3', 'received_time': 1288958491.90411, '_parsemsg': True, 'tolist': 1, 'pipeline': ['Hold', 'MimeDel', 'Scrubber', 'Emergency', 'Tagger', 'CalcRecips', 'AvoidDuplicates', 'Cleanse', 'CleanseDKIM', 'CookHeaders', 'ToDigest', 'ToArchive', 'ToUsenet', 'AfterDelivery', 'Acknowledge', 'ToOutgoing'], 'version': 3, 'rejection_notice': 'Non-members are not allowed to post messages to this list.'})
    Nov 05 13:20:14 2010 (7482) heldmsg file non-existent, info = (1288958518.006875, 'acvankan at gmail.com', 'test na patch', 'Post by non-member to a members-only list', 'heldmsg-groep3-19.pck', {'lang': 'en', 'listname': 'groep3', 'received_time': 1288958517.782223, '_parsemsg': True, 'tolist': 1, 'pipeline': ['Hold', 'MimeDel', 'Scrubber', 'Emergency', 'Tagger', 'CalcRecips', 'AvoidDuplicates', 'Cleanse', 'CleanseDKIM', 'CookHeaders', 'ToDigest', 'ToArchive', 'ToUsenet', 'AfterDelivery', 'Acknowledge', 'ToOutgoing'], 'version': 3, 'rejection_notice': 'Non-members are not allowed to post messages to this list.'})

    Then these two say that there were two held messages, both held as
    non-member posts to the groep3 list, and in this case, both posted from
    the same address with subject 'test na patch',

    The first was received at Fri Nov 5 13:01:31 2010 +0100 and held at Fri
    Nov 5 13:01:32 2010 +0100. At the time it was held, the message itself
    should have been saved in Mailman's data/ directory in a file named
    'heldmsg-groep3-18.pck'.

    Likewise, the second was received at Fri Nov 5 13:01:57 2010 +0100 and
    held at Fri Nov 5 13:01:58 2010 +0100, and the time it was held, the
    message itself should have been saved in Mailman's data/ directory in a
    file named 'heldmsg-groep3-19.pck'.

    The problem is that when the admindb interface CGI attempted to retrieve
    these message files at Nov 05 13:20:14 2010, the OS said the two files
    didn't exist. At that point, admindb.py logged the two messages and
    "discarded" the post in order to clean up the list's request database.
    This is what caused the vette log entries for these posts, but the real
    issue is that the posts had already been lost or possibly (highly
    unlikely) that the OS said the files didn't exist even though they did.

    So, either the OS lied (unlikely), or the files were removed from
    Mailman's data/ directory sometime in the 19 or so minutes between the
    time they were created and the time the admindb.py script ran, or for
    some reason they weren't successfully created in the first place. I
    actually think this latter explanation is the most likely, except that
    this should have resulted in an exception and error log entries at the
    time the post was held, and this exception should have occurred before
    the request database was written, so the end result would be that the
    post was logged and shunted and not held at all.

    So assuming the OS works this seems to say that either the files were
    removed in that 19 minute window, or possibly there are two Mailman
    installations, one used by mailmanctl and the qrunners and one used by
    the web interface and these two installations both point to the same
    lists/ directory, but different data/ directories so the the heldmsg-*
    files are created by IncomingRunner in one directory and looked for by
    admindb.py in a different directory. Again, this doesn't seem likely.

    You could look for a data/heldmsg* file immediately after sending a post
    from a non-member and before going to the admindb interface, or you
    could strace IncomingRunner when it is processing a non-member post and
    see if it writes the file.

    You could also try patching Mailman/ListAdmin.py by finding these lines

    try:
    fp = open(os.path.join(mm_cfg.DATA_DIR, filename), 'w')
    try:
    if mm_cfg.HOLD_MESSAGES_AS_PICKLES:
    cPickle.dump(msg, fp, 1)
    else:
    g = Generator(fp)
    g.flatten(msg, 1)
    fp.flush()
    os.fsync(fp.fileno())
    finally:
    fp.close()
    finally:
    os.umask(omask)

    in the definition of the HoldMessage method and adding the lines

    syslog('debug', 'HoldMessage: %s written to %s',
    filename, mm_cfg.DATA_DIR)

    to make it

    try:
    fp = open(os.path.join(mm_cfg.DATA_DIR, filename), 'w')
    try:
    if mm_cfg.HOLD_MESSAGES_AS_PICKLES:
    cPickle.dump(msg, fp, 1)
    else:
    g = Generator(fp)
    g.flatten(msg, 1)
    fp.flush()
    os.fsync(fp.fileno())
    syslog('debug', 'HoldMessage: %s written to %s',
    filename, mm_cfg.DATA_DIR)
    finally:
    fp.close()
    finally:
    os.umask(omask)

    This will write an entry to the debug log confirming that the file was
    written.

    --
    Mark Sapiro <mark at msapiro.net> The highway is for gamblers,
    San Francisco Bay Area, California better use your sense - B. Dylan
  • Andre van Kan at Nov 13, 2010 at 12:12 pm

    On Fri, 2010-11-05 at 07:48 -0700, Mark Sapiro wrote:

    I understand that, but the data/heldmsg* file could have been removed
    external to Mailman before that.
    I never found any heldmsg* on the system, nor any complaint
    that such a file could not be written. Maybe could an strace of
    the appropriate qrunner be of some help?
    I should have firstly explained the consistency of our recently
    installed mailsystem: three separate IP-clusters (!) cooperate to
    handle our e-mail and except for MM they perform very well.
    (btw, it was designed and installed by an external party)

    - cluster 1: runs the MTA (Postfix) and MM
    - cluster 2: runs Horde and the MM-webinterface
    - cluster 3: runs Cyrus-imapd

    Cluster 1 and 2 share - via NFS - the MM-lists etc. I found out however
    that data/heldms* was only available on cluster 1.
    From the moment that I shared this directory between cluster 1 and 2 the
    approved messages were visible with the admin webinterface and approving
    resulted in a transfer of the messages to cluster2:/var/spool/mailman/in
    and are waiting there for the things to come.
    So the problem is partly solved, albeit that the messages still are not
    delivered, supposedly because cluster 1 should be able to transfer the
    messages to cluster 3, the IMAP-server (?).
    So I want to share the directory cluster2:/var/spool/mailman/in with
    cluster 1 since this cluster is configured to send the messages to the
    IMAP-server.
    Does this sound reasonable to you, or should cluster 2 itself better be
    configured to send the approved messages to the IMAP-server directly?

    Thank you for your time and useful hints &c.

    Regards,

    Andre van Kan
  • Adam McGreggor at Nov 13, 2010 at 12:46 pm

    On Sat, Nov 13, 2010 at 01:12:18PM +0100, Andre van Kan wrote:
    - cluster 1: runs the MTA (Postfix) and MM
    - cluster 2: runs Horde and the MM-webinterface
    - cluster 3: runs Cyrus-imapd

    Cluster 1 and 2 share - via NFS - the MM-lists etc. I found out however
    that data/heldms* was only available on cluster 1.
    From the moment that I shared this directory between cluster 1 and 2 the
    approved messages were visible with the admin webinterface and approving
    resulted in a transfer of the messages to cluster2:/var/spool/mailman/in
    and are waiting there for the things to come.
    So the problem is partly solved, albeit that the messages still are not
    delivered, supposedly because cluster 1 should be able to transfer the
    messages to cluster 3, the IMAP-server (?).
    Why on earth are you adding in IMAP at this stage? Mailman is designed
    to deliver via a pipe, to the scripts. I believe that's the case, even
    with Postfix.
    So I want to share the directory cluster2:/var/spool/mailman/in with
    cluster 1 since this cluster is configured to send the messages to the
    IMAP-server.
    I suspect that that's a completely wrong approach.
    Does this sound reasonable to you, or should cluster 2 itself better be 's/2/1/'
    configured to send the approved messages to the IMAP-server directly?
    's/to the IMAP-server//'

    --
    "I would nationalise Elizabeth Hurley and allow each of us to claim
    our share."
    -- JG Ballard ('lost' NS interview, mid 1990s)
  • Mark Sapiro at Nov 13, 2010 at 4:15 pm

    On 11/13/2010 4:12 AM, Andre van Kan wrote:
    I should have firstly explained the consistency of our recently
    installed mailsystem: three separate IP-clusters (!) cooperate to
    handle our e-mail and except for MM they perform very well.
    (btw, it was designed and installed by an external party)

    - cluster 1: runs the MTA (Postfix) and MM
    - cluster 2: runs Horde and the MM-webinterface
    - cluster 3: runs Cyrus-imapd

    Cluster 1 and 2 share - via NFS - the MM-lists etc. I found out however
    that data/heldms* was only available on cluster 1.
    From the moment that I shared this directory between cluster 1 and 2 the
    approved messages were visible with the admin webinterface and approving
    resulted in a transfer of the messages to cluster2:/var/spool/mailman/in
    and are waiting there for the things to come.
    So the problem is partly solved, albeit that the messages still are not
    delivered, supposedly because cluster 1 should be able to transfer the
    messages to cluster 3, the IMAP-server (?).

    The approved message is requeued in cluster 2 /var/spool/mailman/in, but
    no qrunners are running on cluster 2 so there it sits. This directory
    (in fact all of the /var/spool/mailman/ directory) must be shared with
    cluster 1.

    In fact, all of Mailman's mutable directories except maybe logs should
    be shared between clusters 1 and 2. The rest of Mailman's directories
    can be either shared or duplicated as you wish. Assuming this is
    RedHat's Mailman as indicated by the queues being in /var/spool/mailman,
    the directories that must be shared are

    /var/lib/mailman/archives
    /var/lib/mailman/data
    /var/lib/mailman/lists
    /var/lock/mailman
    /var/spool/mailman

    Sharing of /var/log/mailman is optional.

    So I want to share the directory cluster2:/var/spool/mailman/in with
    cluster 1 since this cluster is configured to send the messages to the
    IMAP-server.

    I'm not sure what the IMAP server has to do with it except for the
    ultimate delivery of "local user" messages from a mailbox or maildir to
    the user's MUA.

    Mailman's OutgoingRunner on cluster 1 delivers to Postfix, and delivery
    from there is handled by Postfix.

    Does this sound reasonable to you, or should cluster 2 itself better be
    configured to send the approved messages to the IMAP-server directly?

    No. Don't try to configure cluster 2 to deliver mail from Mailman. This
    will result in much unnecessary duplication. Just share the directories
    I indicate above between clusters 1 and 2. Note that it is critically
    important to share the locks directory (/var/lock/mailman).

    --
    Mark Sapiro <mark at msapiro.net> The highway is for gamblers,
    San Francisco Bay Area, California better use your sense - B. Dylan

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupmailman-users @
categoriespython
postedOct 4, '10 at 6:51p
activeNov 13, '10 at 4:15p
posts14
users5
websitelist.org

People

Translate

site design / logo © 2022 Grokbase