FAQ
Hello,

I am in the process of trying to automate our Mailman Archive
maintenance before it gets unruly. I looked in the FAQ and wiki for
information and found some about rebuilding the archives (which will be
handy) but nothing about automating it.

The assumptions I am working under:
1. the html files for the archives are located in <some
prefix>/<listname> (to be called DIR-A)
2. the mbox file used to rebuild the archive html files is in the directory
<some prefix>/<listname>.mbox (to be called DIR-B)
3. our automated process will process the mbox files in DIR-B and delete
completely or mark for deletion any messages older than a given timeframe.

Now, the questions:
1. If I run bin/arch --wipe <listname> to rebuild the archives for
<listname>, do I have to delete the files in DIR-A first or will
bin/arch do it?

2. When a message is added to the mbox file in DIR-B, is it appended to
the file or does it get added through some interface?

3. When a message is added to the mbox file in DIR-B, are any existing
messages that are marked for deletion removed or is the message just
added to the mbox file?

4. When bin/arch is run and builds the html files, does it ignore
messages marked for deletion or does it add the message to the html
files no matter how it is marked?

5. Should Mailman be shut down prior to running my automated process,
which includes running bin/arch, or can I leave Mailman running?

6. In our installation, the public archives directory for each list is a
link to the private archives directory for that list. Is that the
standard, or should I be prepared to see some archives in the public area
and others in the private area, depending on the particular list's settings?

7. Is there any other gotcha I should watch out for when using an
automated process?

Thanks in advance,
Chris

  • Mark Sapiro at May 21, 2011 at 3:44 am

    C Nulk wrote:
    > I am in the process of trying to automate our Mailman Archive
    > maintenance before it gets unruly. I looked in the FAQ and wiki for
    > information and found some about rebuilding the archives (which will be
    > handy) but nothing about automating it.
    >
    > The assumptions I am working under:
    > 1. the html files for the archives are located in <some
    > prefix>/<listname> (to be called DIR-A)
    > 2. the mbox file used to rebuild the archive html files is in the directory
    > <some prefix>/<listname>.mbox (to be called DIR-B)
    > 3. our automated process will process the mbox files in DIR-B and delete
    > completely or mark for deletion any messages older than a given timeframe.
    >
    > Now, the questions:
    > 1. If I run bin/arch --wipe <listname> to rebuild the archives for
    > <listname>, do I have to delete the files in DIR-A first or will
    > bin/arch do it?

    You do not have to delete any DIR-A files. That's what the --wipe
    option does.
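
A minimal sketch of the rebuild step, for readers who want to script it. The only facts taken from the thread are the `bin/arch --wipe <listname>` invocation; the install prefix `/usr/local/mailman`, the function names, and the `dry_run` switch are assumptions made for illustration.

```python
import os
import subprocess

# Assumption: Mailman's install prefix. Adjust to match your site.
MAILMAN_PREFIX = "/usr/local/mailman"


def arch_wipe_command(listname, prefix=MAILMAN_PREFIX):
    """Build the argument list for `bin/arch --wipe <listname>`."""
    return [os.path.join(prefix, "bin", "arch"), "--wipe", listname]


def rebuild_archives(listnames, prefix=MAILMAN_PREFIX, dry_run=True):
    """Rebuild the HTML archives of each list from its mbox file.

    With dry_run=True the commands are only returned, not executed.
    """
    commands = [arch_wipe_command(name, prefix) for name in listnames]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if bin/arch fails.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `rebuild_archives(["staff", "announce"])` returns the two command lines without running them, which makes it easy to review what a scheduled job would do before enabling it.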

    > 2. When a message is added to the mbox file in DIR-B, is it appended to
    > the file or does it get added through some interface?

    It is appended.

    > 3. When a message is added to the mbox file in DIR-B, are any existing
    > messages that are marked for deletion removed or is the message just
    > added to the mbox file?

    It is just appended by a file open and append operation. The process
    does not in any way emulate an MDA or any IMAP or other mail access
    type process.

    > 4. When bin/arch is run and builds the html files, does it ignore
    > messages marked for deletion or does it add the message to the html
    > files no matter how it is marked?

    It totally ignores any message status type headers.
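
Because bin/arch ignores status headers, a message that is only marked for deletion would presumably still appear in the rebuilt HTML, so marked messages have to be physically removed before the rebuild. The sketch below shows one way to do that with Python's standard `mailbox` module. It assumes the marking tool records deletion as a `D` flag in the `Status` or `X-Status` header; the function name `expunge_deleted` is hypothetical, and the lock taken here is the `mailbox` module's own file lock, not Mailman's archive lock.

```python
import mailbox


def expunge_deleted(path):
    """Physically remove messages flagged 'D' from the mbox at `path`.

    Returns the number of messages removed.
    """
    box = mailbox.mbox(path)
    removed = 0
    box.lock()
    try:
        # Copy the keys first so removal does not disturb iteration.
        for key in list(box.keys()):
            # get_flags() reads the Status and X-Status headers.
            if "D" in box[key].get_flags():
                box.remove(key)
                removed += 1
        box.flush()  # rewrite the file without the removed messages
    finally:
        box.unlock()
        box.close()
    return removed
```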

    > 5. Should Mailman be shut down prior to running my automated process,
    > which includes running bin/arch, or can I leave Mailman running?

    It's OK for Mailman to be running. There are archive locks that will
    prevent concurrent updates.

    > 6. In our installation, the public archives directory for each list is a
    > link to the private archives directory for that list. Is that the
    > standard, or should I be prepared to see some archives in the public area
    > and others in the private area, depending on the particular list's settings?

    All archive data is in archives/private/. archives/public/ contains
    only symlinks.
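
An automated job can therefore always work on archives/private/ and, if it needs to know how a list is exposed, check for the symlink. The sketch below is illustrative only: the function name `archive_visibility` and the returned labels are hypothetical, and it assumes that a list whose archives are private simply has no entry under archives/public/.

```python
import os


def archive_visibility(archives_dir, listname):
    """Classify a list's archive as 'public', 'private', or 'missing'.

    `archives_dir` is the directory holding the private/ and public/
    subdirectories described above.
    """
    private_dir = os.path.join(archives_dir, "private", listname)
    public_link = os.path.join(archives_dir, "public", listname)
    if not os.path.isdir(private_dir):
        return "missing"   # no archive data for this list at all
    if os.path.islink(public_link):
        return "public"    # data in private/, exposed through a symlink
    return "private"       # data in private/, no public symlink
```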

    > 7. Is there any other gotcha I should watch out for when using an
    > automated process?

    I don't think so.

    --
    Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
    San Francisco Bay Area, California    better use your sense - B. Dylan
  • C Nulk at May 23, 2011 at 4:44 pm
    Thank you very much, Mark. I appreciate the information and help.

    I have written a small php app which uses the imap libraries to open and
    process the mbox files. Given a time frame, the app will mark for
    deletion or remove any messages older than the time frame. Now to write
    a script to call bin/arch for each of the lists I process with my php
    app. I think I will still shut down Mailman when I run the php app since
    I don't have any way to check on any locks. Afterwards, I can restart
    Mailman, then call the script to rebuild the archives.

    The combination of the two apps/scripts will allow us to automate
    removing old archives: run once a month to mark old messages, then
    once a year to remove the marked messages. We can then keep a running
    2 - 3 year archive going.

    Thanks again,
    Chris
  • Mark Sapiro at May 23, 2011 at 10:32 pm

    C Nulk wrote:
    > I have written a small php app which uses the imap libraries to open and
    > process the mbox files. Given a time frame, the app will mark for
    > deletion or remove any messages older than the time frame. Now to write
    > a script to call bin/arch for each of the lists I process with my php
    > app. I think I will still shut down Mailman when I run the php app since
    > I don't have any way to check on any locks. Afterwards, I can restart
    > Mailman, then call the script to rebuild the archives.

    That should work fine, but if you wrote a Python script, it could use
    Mailman list methods to manage the archive locking and use either
    Python's imaplib or mailbox modules to actually manipulate messages in
    the .mbox.
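
As a rough illustration of the `mailbox` half of this suggestion, the sketch below removes messages older than a given number of days from an mbox file. It is written as a standalone Python 3 script and does not use any Mailman modules, so the Mailman list methods for archive locking mentioned above are not shown; the lock taken here is only the `mailbox` module's own file lock. The function name `prune_mbox`, its parameters, and the choice to skip messages with a missing or unparseable Date header are all assumptions.

```python
import mailbox
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def prune_mbox(path, max_age_days, now=None):
    """Remove messages older than `max_age_days` from the mbox at `path`.

    Returns the number of messages removed.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    box = mailbox.mbox(path)
    removed = 0
    box.lock()
    try:
        for key in list(box.keys()):
            raw_date = box[key].get("Date")
            if raw_date is None:
                continue  # assumption: keep messages with no Date header
            try:
                sent = parsedate_to_datetime(raw_date)
            except (TypeError, ValueError):
                continue  # assumption: keep messages with a bad Date header
            if sent.tzinfo is None:
                # Treat dates without a timezone as UTC for comparison.
                sent = sent.replace(tzinfo=timezone.utc)
            if sent < cutoff:
                box.remove(key)
                removed += 1
        box.flush()
    finally:
        box.unlock()
        box.close()
    return removed
```

For the running 2 - 3 year archive described earlier in the thread, a call such as `prune_mbox(path, 3 * 365)` followed by a bin/arch rebuild would be one way to combine the two steps.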

    --
    Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
    San Francisco Bay Area, California    better use your sense - B. Dylan
  • C Nulk at May 24, 2011 at 9:50 pm
    I agree with you, Mark. Unfortunately, as you may know, my Python
    skills aren't the best. I am working on little bits of improvements for
    us here as time permits so I use the tools with which I am most familiar.

    Since we are using Mailman v2.1.9, we are a bit behind where Mailman
    currently sits. Your gracious help assisted us with the LDAP, and I
    have made some custom modifications: one you helped with was for "Special
    Posters", and another is a set of changes to add more logging information
    to the log files. At some point, we will be migrating to at least v2.1.12
    or later, so I need to look at incorporating my mods into the later code,
    and I have very little free time to do it.

    Thanks again for your help,
    Chris

    P.S. While my automation php app isn't the best written thing in the
    world, if anyone wants a copy to use as a starting point for conversion
    to a Python script, let me know.

Discussion Overview
group: mailman-users
posted: May 20, 2011 at 8:57 pm
active: May 24, 2011 at 9:50 pm