Hi All

In the archives, the URL to attachments is getting mangled.

Somehow an "=" gets inserted into the URL (see snippet below), and the
link returns "Private archive file not found".

If I remove the "=" from the URL, all is well.

Can anyone shed any light on this, and how I might correct things so the
URL appears correctly (without the "=") in the archive?

BTW. The attachments are travelling okay with the messages. It's just in
the archive that there is a problem.

cheers
Mark Dale


-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.pdf
Type: application/pdf
Size: 5929 bytes
Desc: not available
Url :
http://myDomain.com/cgi-bin/mailman/private/myListname/attachments/200=
81119/d1fa80ed/test.pdf
-----

But this works:

http://myDomain.com/cgi-bin/mailman/private/myListname/attachments/20081119/d1fa80ed/test.pdf


  • Brad Knowles at Nov 19, 2008 at 4:53 am

    on 11/18/08 6:52 PM, Mark Dale said:

    > Somehow an "=" gets inserted into the URL (see snippet below), and the
    > link returns "Private archive file not found".

    That's URL-encoded line folding that is not being properly interpreted
    by the recipient.

    > If I remove the "=" from the URL, all is well.
    >
    > Can anyone shed any light on this, and how I might correct things so the
    > URL appears correctly (without the "=") in the archive?

    Good question. I'm not sure that this problem can be fixed, short of
    fixing the clients to properly understand URL-encoded line folding.

    But I do hold out hope that Mark Sapiro or one of the other core
    developers can prove me wrong.

    --
    Brad Knowles <brad at shub-internet.org>
    LinkedIn Profile: <http://tinyurl.com/y8kpxu>
  • Mark Sapiro at Nov 19, 2008 at 4:58 am

    Mark Dale wrote:
    In the archives, the URL to attachments is getting mangled.

    Somehow an "=" gets inserted into the URL (see snippet below), and the
    link returns "Private archive file not found".

    If I remove the "=" from the URL, all is well.

    Can anyone shed any light on this, and how I might correct things so the
    URL appears correctly (without the "=") in the archive?

    BTW. The attachments are travelling okay with the messages. It's just in
    the archive that there is a problem.

    cheers
    Mark Dale


    -------------- next part --------------
    A non-text attachment was scrubbed...
    Name: test.pdf
    Type: application/pdf
    Size: 5929 bytes
    Desc: not available
    Url :
    http://myDomain.com/cgi-bin/mailman/private/myListname/attachments/200=
    81119/d1fa80ed/test.pdf

    This appears to be a problem with encoding/decoding of quoted-printable.

    What Mailman version is this?

    If Mailman is recent enough to have it, is Non-digest options ->
    scrub-nondigest Yes or No? (your "The attachments are travelling okay
    with the messages" implies no, but just checking.)

    Is your list digestable and if so, are the links in the "plain" digest
    OK or are they like the archive?

    Does this happen with every attachment, or only in some messages?

    --
    Mark Sapiro <mark at msapiro.net> The highway is for gamblers,
    San Francisco Bay Area, California better use your sense - B. Dylan
  • Mark Dale at Nov 19, 2008 at 5:56 am
    Hi Mark

    In answer to your questions ...

    I am using Mailman version: 2.1.5

    Subscribers can choose digest if they want.

    The link is broken in the digest messages, just like in the archive.

    The broken link happens for every PDF file, even though the PDF arrives
    with the email. (Doesn't arrive in the digest, just a broken link)

    The link is almost okay, it's just the "=" sign that gets inserted that
    messes things up.

    If it is a Word.doc that gets attached, no URL appears at all in the
    archive, nor is there even a "scrubbed" message. Also, Word.doc files
    don't even arrive with the email.

    cheers

    Mark Dale


    --------------------------------------
    Mark Dale
    GeniusMoon
    Tel: 02 6100 3131
    Fax: 02 6103 9130
    Mob: 0403 831 748
    email: mdale at geniusmoon.com.au
    http://www.geniusmoon.com.au
    --------------------------------------
  • Mark Sapiro at Nov 19, 2008 at 3:28 pm

    Mark Dale wrote:
    > In answer to your questions ...
    >
    > I am using Mailman version: 2.1.5
    >
    > Subscribers can choose digest if they want.
    >
    > The link is broken in the digest messages, just like in the archive.
    >
    > The broken link happens for every PDF file, even though the PDF arrives
    > with the email. (Doesn't arrive in the digest, just a broken link)
    >
    > The link is almost okay, it's just the "=" sign that gets inserted that
    > messes things up.

    There have been many changes in Scrubber.py (the module which does
    this) since 2.1.5. I will look into it, but I don't know if I will be
    able to duplicate the problem.

    > If it is a Word.doc that gets attached, no URL appears at all in the
    > archive, nor is there even a "scrubbed" message. Also, Word.doc files
    > don't even arrive with the email.

    Because these attachments are removed by content filtering before the
    message is archived and delivered.

    --
    Mark Sapiro <mark at msapiro.net> The highway is for gamblers,
    San Francisco Bay Area, California better use your sense - B. Dylan
  • Mark Dale at Nov 19, 2008 at 9:05 pm

    > There have been many changes in Scrubber.py (the module which does
    > this) since 2.1.5. I will look into it, but I don't know if I will be
    > able to duplicate the problem.

    For what it's worth Mark, I copied a Scrubber.py from version 2.1.9 on
    the wild chance that it might do some good - but as was to be expected,
    no magic occurred.

    Here's hoping you do find a way to duplicate the problem. Upgrading this
    installation would be like changing a flat tyre on a moving car.

    >> If it is a Word.doc that gets attached, no URL appears at all in the
    >> archive, nor is there even a "scrubbed" message. Also, Word.doc files
    >> don't even arrive with the email.
    >
    > Because these attachments are removed by content filtering before the
    > message is archived and delivered.

    Understood. Thanks.



    cheers
    Mark Dale
    --
  • Mark Sapiro at Nov 20, 2008 at 5:09 pm

    Mark Dale wrote:
    >> There have been many changes in Scrubber.py (the module which does
    >> this) since 2.1.5. I will look into it, but I don't know if I will be
    >> able to duplicate the problem.
    >
    > For what it's worth Mark, I copied a Scrubber.py from version 2.1.9 on
    > the wild chance that it might do some good - but as was to be expected,
    > no magic occurred.

    Did you remember to restart Mailman after replacing Scrubber.py?

    --
    Mark Sapiro <mark at msapiro.net> The highway is for gamblers,
    San Francisco Bay Area, California better use your sense - B. Dylan
  • Mark Dale at Nov 23, 2008 at 3:24 am
    Hi Mark
    (Apologies for late reply, have been away)

    Yes, I sure did. And also rebooted the server (and checked that the
    qrunner was starting up on boot)
    I also noticed that with the new Scrubber.py, emails with attachments
    were neither delivered nor archived. Emails without attachments were okay.

    Once I returned the original Scrubber.py and restarted, emails with
    attachments got delivered and also sent to archive. (Broken link to
    attachment still in archived message though).

    Mark Dale



    Mark Sapiro wrote:
    Did you remember to restart Mailman after replacing Scrubber.py?
  • Mark Sapiro at Nov 23, 2008 at 4:50 am

    Mark Dale wrote:
    > I also noticed that with the new Scrubber.py, emails with attachments
    > were neither delivered nor archived. Emails without attachments were okay.

    There is probably some incompatibility between the 2.1.9 Scrubber.py
    and the rest of your installation that threw an exception on mail with
    attachments. Check Mailman's error log and shunt queue.

    > Once I returned the original Scrubber.py and restarted, emails with
    > attachments got delivered and also sent to archive. (Broken link to
    > attachment still in archived message though).

    I have been able to duplicate the problem with the 2.1.5 version of
    Scrubber.py. I'll try to come up with a simple patch that you can
    apply to fix it. It is fixed in recent Scrubber.py versions.

    The issue is as I thought. The scrubbed message is quoted-printable
    encoded, but the Content-Transfer-Encoding: header says 8bit and not
    quoted-printable so the message is not properly decoded for the
    archive.

    In my case, I could see other symptoms of this besides just the URL of
    the scrubbed attachment. For example, the "-- " separator before my
    signature was rendered as "-- =" followed by a blank line.
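
A rough way to spot this kind of mislabelled message, offered only as an illustration and not as anything Mailman ships: parse the message with Python's email package and check whether the header claims 7bit or 8bit while the body still contains quoted-printable soft line breaks (an "=" at the end of a line). The function name `looks_mislabelled` and the sample message are made up for this sketch, and the check is a heuristic that can give false positives on text that legitimately ends a line with "=".

```python
import re
from email import message_from_string

# Hypothetical sample, modelled on the snippet quoted in this thread.
SAMPLE = (
    "Content-Type: text/plain; charset=iso-8859-1\n"
    "Content-Transfer-Encoding: 8bit\n"
    "\n"
    "Url :\n"
    "http://myDomain.com/cgi-bin/mailman/private/myListname/attachments/200=\n"
    "81119/d1fa80ed/test.pdf\n"
)

# A quoted-printable soft line break is "=" immediately followed by a newline.
SOFT_BREAK = re.compile(r"=\r?\n")


def looks_mislabelled(raw_message):
    """Heuristic: header says 7bit/8bit but the body has QP soft breaks."""
    msg = message_from_string(raw_message)
    if msg.is_multipart():
        return False  # this sketch only looks at single-part messages
    cte = str(msg.get("Content-Transfer-Encoding", "7bit")).strip().lower()
    body = msg.get_payload()
    return cte in ("7bit", "8bit") and SOFT_BREAK.search(body) is not None
```

Calling `looks_mislabelled(SAMPLE)` flags the sample, while the same body labelled `quoted-printable` is left alone.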

    --
    Mark Sapiro <mark at msapiro.net> The highway is for gamblers,
    San Francisco Bay Area, California better use your sense - B. Dylan
  • Mark Sapiro at Nov 23, 2008 at 5:01 am

    Mark Sapiro wrote:
    > I have been able to duplicate the problem with the 2.1.5 version of
    > Scrubber.py. I'll try to come up with a simple patch that you can
    > apply to fix it. It is fixed in recent Scrubber.py versions.

    I think you'd be better off upgrading your Mailman, but if you want to
    try a patch to the 2.1.5 Scrubber.py, I think this should do it. It is
    only lightly tested, but I think it's OK.

    --- Scrubber.py 2008-11-22 20:21:38.375000000 -0800
    +++ Scrubberx.py 2008-11-22 20:54:47.250000000 -0800
    @@ -326,9 +326,8 @@
    # Now join the text and set the payload
    sep = _('-------------- next part --------------\n')
    del msg['content-type']
    - msg.set_payload(sep.join(text), charset)
    del msg['content-transfer-encoding']
    - msg.add_header('Content-Transfer-Encoding', '8bit')
    + msg.set_payload(sep.join(text), charset)
    return msg



    --
    Mark Sapiro <mark at msapiro.net> The highway is for gamblers,
    San Francisco Bay Area, California better use your sense - B. Dylan
  • Mark Dale at Nov 23, 2008 at 9:48 pm
    Hi Mark

    Your patch has worked a treat. Thank you very much for your patience and
    generosity. Many of us would be a long way up the proverbial creek if it
    weren't for your support.

    In summary, for those interested:

    Using Mailman 2.1.5 - in the Archives, the URL to attachments was being
    malformed by it getting wrapped and an "=" sign inserted at the line break.

    This was fixed (albeit lightly tested so far) by replacing, in the
    Scrubber.py file, as follows:

    --------
    OLD CODE
    --------
    # Now join the text and set the payload
    sep = _('-------------- next part --------------\n')
    del msg['content-type']
    msg.set_payload(sep.join(text), charset)
    del msg['content-transfer-encoding']
    msg.add_header('Content-Transfer-Encoding', '8bit')
    return msg

    --------
    NEW CODE
    --------
    # Now join the text and set the payload
    sep = _('-------------- next part --------------\n')
    del msg['content-type']
    del msg['content-transfer-encoding']
    msg.set_payload(sep.join(text), charset)
    return msg


    cheers
    Mark



  • Mark Dale at Nov 23, 2008 at 10:34 am

    Mark Sapiro wrote:
    > There is probably some incompatibility between the 2.1.9 Scrubber.py
    > and the rest of your installation that threw an exception on mail with
    > attachments. Check Mailman's error log and shunt queue.

    Yes, it was a shot in the dark replacing the 2.1.5 Scrubber with a 2.1.9
    one. There are a number of errors in the logfile, e.g.

    Nov 23 03:05:27 2008 (1800) SHUNTING:
    1227409525.3357961+31e6f6226a3b5cbc4a8dadaf3f4dde064947822e

    Nov 23 03:07:07 2008 (1800) uncaught archiver exception at filepos: 0

    Nov 23 03:07:07 2008 (1800) Uncaught runner exception: 'module' object
    has no attribute 'SCRUBBER_USE_ATTACHMENT_FILENAME_EXTENSION'

    > I have been able to duplicate the problem with the 2.1.5 version of
    > Scrubber.py. I'll try to come up with a simple patch that you can
    > apply to fix it.

    That would be brilliant!

    > The issue is as I thought. The scrubbed message is quoted-printable
    > encoded, but the Content-Transfer-Encoding: header says 8bit and not
    > quoted-printable so the message is not properly decoded for the
    > archive.

    Yes. Grant Taylor mentioned this. He made reference to format=flowed being
    the go, rather than quoted-printable. I confess it's something I don't
    understand. I had thought f=f was something the email client decided.

    > In my case, I could see other symptoms of this besides just the URL of
    > the scrubbed attachment. For example, the "-- " separator before my
    > signature was rendered as "-- =" followed by a blank line.

    Yes. I saw that also, I was pretending not to. ;-)

    cheers
    Mark Dale
  • Mark Sapiro at Nov 23, 2008 at 5:12 pm

    Mark Dale wrote:
    > Mark Sapiro wrote:
    >> I have been able to duplicate the problem with the 2.1.5 version of
    >> Scrubber.py. I'll try to come up with a simple patch that you can
    >> apply to fix it.
    >
    > That would be brilliant!

    It is not clear if you saw my followup post with the patch. It's at
    <http://mail.python.org/pipermail/mailman-users/2008-November/064117.html>
    in case you missed it.

    >> The issue is as I thought. The scrubbed message is quoted-printable
    >> encoded, but the Content-Transfer-Encoding: header says 8bit and not
    >> quoted-printable so the message is not properly decoded for the
    >> archive.
    >
    > Yes. Grant Taylor mentioned this. He made reference to format=flowed being
    > the go, rather than quoted-printable. I confess it's something I don't
    > understand. I had thought f=f was something the email client decided.

    Format=flowed and/or quoted-printable encoding are all things decided
    by a mail client or in this case by the Python email library.
    Scrubber.py is building a new, text/plain message containing the
    text/plain parts of the original messages and the notes and URLs for
    scrubbed parts. It creates the body by concatenating these parts
    separated by the '-------------- next part --------------\n' separator.

    It then calls an email message method to set this as the message body,
    and the email library decides on the appropriate
    Content-Transfer-Encoding and sets the appropriate header. The bug in
    Scrubber is that it removed that header and replaced it with
    "Content-Transfer-Encoding: 8bit".

    The current email library will create quoted-printable encoded parts
    where appropriate, but won't create format=flowed which is a good
    thing because the pipermail archiver doesn't understand it.
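
The header mix-up described in this post can be reproduced in a few lines. This is only a sketch under stated assumptions: it uses the Python 3 `email` package, not the Python 2 library that Mailman 2.1 runs on, and the `iso-8859-1` charset is chosen because the package picks quoted-printable for it. The helper names `old_order`, `new_order` and `archived_text` are invented for illustration and are not part of Scrubber.py.

```python
from email.message import Message

URL = ("http://myDomain.com/cgi-bin/mailman/private/myListname/"
       "attachments/20081119/d1fa80ed/test.pdf")
TEXT = "Url : " + URL + "\n"
CHARSET = "iso-8859-1"  # assumption: a charset for which email picks QP


def old_order():
    """Mimic the 2.1.5 sequence: encode, then overwrite the header."""
    msg = Message()
    msg.set_payload(TEXT, CHARSET)        # body is QP-encoded, header set
    del msg["content-transfer-encoding"]  # the correct header is discarded
    msg.add_header("Content-Transfer-Encoding", "8bit")
    return msg


def new_order():
    """Mimic the patched sequence: clear the header, then encode."""
    msg = Message()
    del msg["content-transfer-encoding"]  # no-op if the header is absent
    msg.set_payload(TEXT, CHARSET)        # library sets the matching header
    return msg


def archived_text(msg):
    """Decode the body the way a consumer trusting the header would."""
    return msg.get_payload(decode=True).decode(CHARSET)
```

With `old_order()` the decoded text still contains the "=" plus newline in the middle of the URL, because the 8bit label tells the reader not to undo the quoted-printable wrapping. With `new_order()` the header reads quoted-printable and the URL comes back in one piece.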

    --
    Mark Sapiro <mark at msapiro.net> The highway is for gamblers,
    San Francisco Bay Area, California better use your sense - B. Dylan
  • Grant Taylor at Nov 20, 2008 at 7:04 am

    On 11/18/2008 06:52 PM, Mark Dale wrote:
    > Somehow an "=" gets inserted into the URL (see snippet below), and
    > the link returns "Private archive file not found".

    *nod*

    If I recall correctly, when quoted-printable encoding is used, an "="
    by itself at the end of a line tells the receiving MUA that the line is
    supposed to be unwrapped, i.e. remove the "=" from the end of the
    "...attachments/200" line and append the "81119..." line to it.

    However, if the quoted-printable MIME encoding is broken, things like
    this will happen.
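
As a small illustration of that unwrapping, Python's standard `quopri` module can decode the two mangled lines from the original post. The constant `MANGLED` and the helper name `unfold` are made up for this sketch.

```python
import quopri

# The two lines exactly as they appeared in the archive snippet.
MANGLED = (
    b"http://myDomain.com/cgi-bin/mailman/private/myListname/attachments/200=\n"
    b"81119/d1fa80ed/test.pdf\n"
)


def unfold(qp_bytes):
    """Decode quoted-printable; soft line breaks ("=" + newline) vanish."""
    return quopri.decodestring(qp_bytes)


print(unfold(MANGLED).decode("ascii"))
```

The decoded result is the single-line URL that Mark Dale reported as working.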

    Will you please forward me one of the problem messages (as an attachment
    because I need to see the message source) and I will verify this.
    > If I remove the "=" from the URL, all is well.

    *nod* This is as I would expect.

    Actually I bet that you are removing the "=" and adding the subsequent
    line to the end of the URL, which is what quoted-printable is supposed to
    do for you.

    > Can anyone shed any light on this, and how I might correct things so
    > the URL appears correctly (without the "=") in the archive?

    I think this is probably another small bug somewhere in Mailman in how
    it handles / folds lines of text. Format=flowed (my preference) suffers
    equally in older versions of Mailman.
    *nod*



    Grant. . . .

Discussion Overview
group: mailman-users
categories: python
posted: Nov 19, '08 at 12:52a
active: Nov 23, '08 at 9:48p
posts: 14
users: 4
website: list.org
