FAQ
I just checked the python site documentation on marshal and pickle and I
consider them to be irresponsibly and dangerously misleading.

For example. Suppose Mercurial is implemented using pickle.load (I sure
hope it isn't -- is it?).

1) I send someone a "patch" for their software claiming it makes their
package run faster.

2) That person uses mercurial to "unpack" the patch and mercurial uses
pickle.load.

BAM! That person's filesystem is GONE! AND I'M NOT ASSUMING
THAT THERE IS ANY BUG IN MERCURIAL!

Now: suppose Mercurial is implemented using marshal: no such scenario is
possible
unless there is a security bug in mercurial where they explicitly execute
something.

RESOLVED: pickle should come with a large red label:

WARNING: LARK'S VOMIT --
NEVER USE PICKLE TO IMPLEMENT UNTRUSTED ARCHIVING OF ANY KIND.

It doesn't have one.

Marshal needs no such label: but it has one:

*Warning:* The marshal module is not intended to be secure against erroneous
or maliciously constructed data. Never unmarshal data received from an
untrusted or unauthenticated source.

This is bullshit.

Sorry, for the french and the caps, but this is REALLY IMPORTANT.

-- Aaron Watters
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-ideas/attachments/20080305/2b718355/attachment.htm

Search Discussions

  • George Sakkis at Mar 5, 2008 at 3:25 pm

    On Wed, Mar 5, 2008 at 10:11 AM, Aaron Watters wrote:

    I just checked the python site documentation on marshal and pickle and I
    consider them to be irresponsibly and dangerously misleading.
    RESOLVED: pickle should come with a large red label:

    WARNING: LARK'S VOMIT --
    NEVER USE PICKLE TO IMPLEMENT UNTRUSTED ARCHIVING OF ANY KIND.

    It doesn't have one.
    So what is this [1] ?

    '''
    Warning: The pickle module is not intended to be secure against
    erroneous or maliciously constructed data. Never unpickle data
    received from an untrusted or unauthenticated source.
    '''

    You may want to check your facts better next time you go on a rampage.

    George

    [1] http://docs.python.org/lib/node314.html
  • Oleg Broytmann at Mar 5, 2008 at 3:27 pm

    On Wed, Mar 05, 2008 at 10:11:48AM -0500, Aaron Watters wrote:
    RESOLVED: pickle should come with a large red label:

    WARNING: LARK'S VOMIT --
    NEVER USE PICKLE TO IMPLEMENT UNTRUSTED ARCHIVING OF ANY KIND.
    http://docs.python.org/lib/node314.html

    "Warning: The pickle module is not intended to be secure against
    erroneous or maliciously constructed data. Never unpickle data received
    from an untrusted or unauthenticated source."

    Enough for me, though it is not as big or as red...

    Oleg.
    --
    Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru
    Programmers don't die, they just GOSUB without RETURN.
  • Aaron Watters at Mar 5, 2008 at 4:11 pm
    In response to Oleg and George.

    Yes apparently there is an acknowledgement in some subordinate page
    somewhere that there might be some problem with security and pickle. This
    should be on the first page in bold face like the unneeded one for marshal.
    I missed it just now because I just looked at the first page for marshal and
    pickle, like most people probably would, sorry.

    Also this line from the marshal doc has got to go:

    "For general persistence and transfer of Python objects through RPC calls,
    see the modules pickle <http://docs.python.org/lib/module-pickle.html> and
    shelve <http://docs.python.org/lib/module-shelve.html>. "
    http://docs.python.org/lib/module-marshal.html

    which should read
    "For RPC calls never use pickle."

    And the security warning for marshal benieth it should be removed because it
    is nonsense.

    The implication of the current documentation is that most of my public
    projects contain serious security holes when they don't.
    And if you don't read the documentation carefully (like the implementers of
    Plone apparently didn't) the docs seem to suggest
    that pickle is somehow "safer" when it is about as unsafe as it could be.

    -- Aaron Watters
    -------------- next part --------------
    An HTML attachment was scrubbed...
    URL: http://mail.python.org/pipermail/python-ideas/attachments/20080305/b49dd673/attachment.htm
  • Guido van Rossum at Mar 5, 2008 at 5:36 pm
    I'm assuming that someone confronted you with this security issue
    somehow? Otherwise I don't understand why you'd be so upset about it.

    BTW the warning for marshal is legit -- the C code that unpacks
    marshal data has not been carefully analyzed against buffer overflows
    and so on. Remember the first time someone broke into a system through
    a malicious JPEG? The same could happen with marshal. Seriously.

    I agree that the pickle module's warning needs to be moved to a more
    prominent place (Georg has probably aready done this by the time I'm
    finished typing this message :-). But I see no reason to get so upset
    about it as to use all caps.

    --Guido
    On Wed, Mar 5, 2008 at 8:11 AM, Aaron Watters wrote:
    In response to Oleg and George.

    Yes apparently there is an acknowledgement in some subordinate page
    somewhere that there might be some problem with security and pickle. This
    should be on the first page in bold face like the unneeded one for marshal.
    I missed it just now because I just looked at the first page for marshal and
    pickle, like most people probably would, sorry.

    Also this line from the marshal doc has got to go:

    "For general persistence and transfer of Python objects through RPC calls,
    see the modules pickle and shelve. "
    http://docs.python.org/lib/module-marshal.html

    which should read
    "For RPC calls never use pickle."

    And the security warning for marshal benieth it should be removed because it
    is nonsense.

    The implication of the current documentation is that most of my public
    projects contain serious security holes when they don't.
    And if you don't read the documentation carefully (like the implementers of
    Plone apparently didn't) the docs seem to suggest
    that pickle is somehow "safer" when it is about as unsafe as it could be.

    -- Aaron Watters


    _______________________________________________
    Python-ideas mailing list
    Python-ideas at python.org
    http://mail.python.org/mailman/listinfo/python-ideas


    --
    --Guido van Rossum (home page: http://www.python.org/~guido/)
  • Georg Brandl at Mar 5, 2008 at 7:02 pm

    Guido van Rossum schrieb:
    I'm assuming that someone confronted you with this security issue
    somehow? Otherwise I don't understand why you'd be so upset about it.

    BTW the warning for marshal is legit -- the C code that unpacks
    marshal data has not been carefully analyzed against buffer overflows
    and so on. Remember the first time someone broke into a system through
    a malicious JPEG? The same could happen with marshal. Seriously.

    I agree that the pickle module's warning needs to be moved to a more
    prominent place (Georg has probably aready done this by the time I'm
    finished typing this message :-). But I see no reason to get so upset
    about it as to use all caps.
    I used the time machine :)

    Though the warning is at the same location in
    <http://docs.python.org/dev/library/pickle>, since all pickle docs are
    on the same page it's visible enough in my opinion.

    cheers,
    Georg
  • Arne Babenhauserheide at Mar 5, 2008 at 7:46 pm
    I'd also agree, that the warning should be really prominent (especially since
    I just saw someone saying "for game states: Just pickle them", which could
    result in people getting problems when they get a mail saying "hey, look, I
    got to the 14th level"), but I don't think the warning was irresponsibly
    small.

    At least I saw it, when I began to learn python (but I had forgotten it until
    now).

    Maybe it could be replaced by yaml at some point, though, which offers a mode
    that doesn't execute everything (safe_load):

    http://pyyaml.org/wiki/PyYAMLDocumentation#LoadingYAML

    "safe_load(stream) parses the given stream and returns a Python object
    constructed from for the first document in the stream. If there are no
    documents in the stream, it returns None. safe_load recognizes only standard
    YAML tags and cannot construct an arbitrary Python object."

    And there's also a C implementation: http://pyyaml.org/browser/libyaml/trunk
    Which can be relicensed under the Python License:
    http://pyyaml.org/browser/libyaml/trunk/LICENSE

    Or pickle could get a safe_load function itself (if it doesn't yet have it).

    Best wishes,
    Arne


    El Wednesday, 5 de March de 2008 18:36:56 Guido van Rossum escribi?:
    I'm assuming that someone confronted you with this security issue
    somehow? Otherwise I don't understand why you'd be so upset about it.

    BTW the warning for marshal is legit -- the C code that unpacks
    marshal data has not been carefully analyzed against buffer overflows
    and so on. Remember the first time someone broke into a system through
    a malicious JPEG? The same could happen with marshal. Seriously.

    I agree that the pickle module's warning needs to be moved to a more
    prominent place (Georg has probably aready done this by the time I'm
    finished typing this message :-). But I see no reason to get so upset
    about it as to use all caps.

    --Guido
    On Wed, Mar 5, 2008 at 8:11 AM, Aaron Watters wrote:
    In response to Oleg and George.

    Yes apparently there is an acknowledgement in some subordinate page
    somewhere that there might be some problem with security and pickle.
    This should be on the first page in bold face like the unneeded one for
    marshal. I missed it just now because I just looked at the first page for
    marshal and pickle, like most people probably would, sorry.

    Also this line from the marshal doc has got to go:

    "For general persistence and transfer of Python objects through RPC
    calls, see the modules pickle and shelve. "
    http://docs.python.org/lib/module-marshal.html

    which should read
    "For RPC calls never use pickle."

    And the security warning for marshal benieth it should be removed because
    it is nonsense.

    The implication of the current documentation is that most of my public
    projects contain serious security holes when they don't.
    And if you don't read the documentation carefully (like the implementers
    of Plone apparently didn't) the docs seem to suggest
    that pickle is somehow "safer" when it is about as unsafe as it could be.

    -- Aaron Watters


    _______________________________________________
    Python-ideas mailing list
    Python-ideas at python.org
    http://mail.python.org/mailman/listinfo/python-ideas


    --
    Unpolitisch sein
    Hei?t politisch sein
    Ohne es zu merken.
    - Arne Babenhauserheide ( http://draketo.de )
    -- Weblog: http://blog.draketo.de

    -- Mein ?ffentlicher Schl?ssel (PGP/GnuPG):
    http://draketo.de/inhalt/ich/pubkey.txt
    -------------- next part --------------
    A non-text attachment was scrubbed...
    Name: not available
    Type: application/pgp-signature
    Size: 189 bytes
    Desc: This is a digitally signed message part.
    Url : http://mail.python.org/pipermail/python-ideas/attachments/20080305/56b93a26/attachment.pgp
  • Greg Ewing at Mar 5, 2008 at 10:24 pm

    Guido van Rossum wrote:
    BTW the warning for marshal is legit -- the C code that unpacks
    marshal data has not been carefully analyzed against buffer overflows
    and so on.
    I thought the main issue with marshal is that it's happy
    to create code objects, which pickle doesn't do -- ostensibly
    for security reasons.

    But if pickle is inherently insecure anyway, does the
    exclusion of code objects really make much difference?

    BTW, I only consider pickle suitable for quick and dirty
    uses anyway, because it ties the external representation very
    closely to internal details of your program, which can make
    it difficult to evolve the program without invalidating
    previously written files.

    For long-term use, it's better to invest time in a
    properly-thought-out external format for the task, designed
    with extensibility in mind.

    --
    Greg
  • Aahz at Mar 6, 2008 at 8:25 pm

    On Thu, Mar 06, 2008, Greg Ewing wrote:
    BTW, I only consider pickle suitable for quick and dirty uses anyway,
    because it ties the external representation very closely to internal
    details of your program, which can make it difficult to evolve the
    program without invalidating previously written files.

    For long-term use, it's better to invest time in a
    properly-thought-out external format for the task, designed with
    extensibility in mind.
    Maybe so, but my company has been using pickle as a primary long-term
    storage mechanism for more than a decade. We only rarely have problems
    with code changes causing pickle problems (less than once per year).
    OTOH, we mostly only have a growing internal format -- we almost never
    change the internal format.
    --
    Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

    "All problems in computer science can be solved by another level of
    indirection." --Butler Lampson
  • Leonardo Santagada at Mar 6, 2008 at 9:00 pm

    On 06/03/2008, at 17:25, Aahz wrote:

    Maybe so, but my company has been using pickle as a primary long-term
    storage mechanism for more than a decade. We only rarely have
    problems
    with code changes causing pickle problems (less than once per year).
    OTOH, we mostly only have a growing internal format -- we almost never
    change the internal format.

    And then tons of companies are using ZODB wich uses pickle... so no
    biggie... but using pickle directly can be a problem if you have lots
    of data.

    --
    Leonardo Santagada
  • Leonardo Santagada at Mar 5, 2008 at 6:12 pm

    On 05/03/2008, at 13:11, Aaron Watters wrote:

    like the implementers of Plone apparently didn't

    I know this is just a strange angry rant, but why do you say this? Do
    you mean that ZODB and ZRPC should be implemented using marshal? Can
    this even be done?

    Now if you want to do secure pickles, just sign them with a cripto
    method (completely secure). Simple and I think, the only way to do
    this being secure. Or using a secure transport layer like ssl to
    transfer things with signatures to identify your peers. Not a reason
    to rant, but to install the crypto package.

    --
    Leonardo Santagada
  • Aaron Watters at Mar 5, 2008 at 7:03 pm
    What follows is a brief summary of offline discussions with Guido and
    Leonardo (I hope represented correctly, please complain if not):

    Guido pointed out that previous versions of marshal could crash python.

    I replied that that is a bug and all known instances have been fixed.
    Pickle executes arbitrary code by design -- which is much worse than just
    crashing a program.

    Leonardo mentioned that pickle security concerns could be addressed using
    crypto tricks.

    I replied that I would be comfortable unmarshalling a file from a known
    hostile party -- no crypto verification required, because the worst that
    could happen is that it would crash the interpreter. With pickle I'd be
    handing my keyboard to a villian.

    In summary: I think marshal.loads(s) is just as safe as unicode(s) or
    file.read(). pickle.loads(s) is morally equivalant to __import__(s) or
    eval(s).
    I think the security warning for marshal and the implied recommendation that
    pickle is okay for RPC should be removed.

    alright already, 'nuff said. whatever. -- Aaron Watters
    -------------- next part --------------
    An HTML attachment was scrubbed...
    URL: http://mail.python.org/pipermail/python-ideas/attachments/20080305/bc3aa705/attachment-0001.htm
  • Greg Ewing at Mar 5, 2008 at 10:29 pm

    Aaron Watters wrote:

    In summary: I think marshal.loads(s) is just as safe as unicode(s) or
    file.read(). pickle.loads(s) is morally equivalant to __import__(s) or
    eval(s).
    According to the docs, you can use a customised unpickler
    to restrict the set of things it can use as constructors.
    It might be worth mentioning that in a prominent place near
    the security warning as well.

    --
    Greg
  • Leonardo Santagada at Mar 6, 2008 at 1:33 am

    On 05/03/2008, at 16:03, Aaron Watters wrote:
    Guido pointed out that previous versions of marshal could crash
    python.

    I replied that that is a bug and all known instances have been
    fixed. Pickle executes arbitrary code by design -- which is much
    worse than just crashing a program.
    Just read carefully what Guido said, if there is a bug it can not just
    crash your program, it can execute any kind of code, as bad or even
    worse than pickle... that is what is called a buffer overflow

    Talking about it the pypy project has a directory somewhere with lots
    of snippets of ways to crash cpython... Not just the set recursion
    limit and overflow the stack one.
    Leonardo mentioned that pickle security concerns could be addressed
    using crypto tricks.
    For some uses, for others some modified version of pure python pickle
    could be used, so you have a controled and almost safe pickle.
    I replied that I would be comfortable unmarshalling a file from a
    known hostile party -- no crypto verification required, because the
    worst that could happen is that it would crash the interpreter.
    With pickle I'd be handing my keyboard to a villian.

    In summary: I think marshal.loads(s) is just as safe as unicode(s)
    or file.read(). pickle.loads(s) is morally equivalant to
    __import__(s) or eval(s).
    No marshall load do lots of stuff in pure unverified C code...
    anything could happen, as guido pointed out.
    I think the security warning for marshal and the implied
    recommendation that pickle is okay for RPC should be removed.

    No, AFAIK marshal can only load ints and simple objects... and that
    will give you a very poor rpc (for example it could never be used to
    replace pickle as it is used in ZODB and ZRPC).

    --
    Leonardo Santagada
  • Aaron Watters at Mar 6, 2008 at 5:40 pm
    On Wed, Mar 5, 2008 at 8:33 PM, Leonardo Santagada wrote:
    On 05/03/2008, at 16:03, Aaron Watters wrote:
    Guido pointed out that previous versions of marshal could crash
    python.

    I replied that that is a bug and all known instances have been
    fixed. Pickle executes arbitrary code by design -- which is much
    worse than just crashing a program.
    Just read carefully what Guido said, if there is a bug it can not just
    crash your program, it can execute any kind of code, as bad or even
    worse than pickle... that is what is called a buffer overflow

    I'd like to know the actual number of successful
    buffer overflow attacks that have ever happened on the planet in the wild.
    Maybe one? Okay, according to Wikipedia there have been 4. I don't really
    know but I think an overflowing buffer in marshal is not very likely to be
    somewhere
    near where a code segment could jump to because almost everything
    in marshal is dynamically
    allocated. The known attacks have been where the arrays were in static
    locations,
    I believe.

    And it's not worse than pickle because pickle is perfectly capable of
    compiling and
    loading an assembly language component without you knowing anything about
    it.
    Pickle can do anything that the computer can do.

    Also it's not worse than pickle because you have to be a highly experienced
    and
    perverted assembly language programmer to construct
    an overflow attack and there has to be a bug in
    marshal to allow it. To abuse pickle requires almost no skill at all, and
    you
    don't have to be perverted, you just have to be stupid. In fact pickle is
    designed
    to execute arbitrary code, and even documented.

    For all I know it's just as feasible to stage buffer overflow attacks in
    many other
    places in python as it is in marshal -- like maybe
    unicode.join or anyplace else where an array
    is constructed. Which is to say it's not very feasible in those places
    either.

    I was clearly off my medication to start this discussion. I suppose
    misleading
    people into thinking marshal is dangerous is better than suggesting pickle
    is safe.
    Peace and love everyone. bye now.

    -- Aaron Watters
    -------------- next part --------------
    An HTML attachment was scrubbed...
    URL: http://mail.python.org/pipermail/python-ideas/attachments/20080306/a3ca741d/attachment.htm
  • Christian Heimes at Mar 6, 2008 at 5:59 pm

    Leonardo Santagada wrote:
    I replied that that is a bug and all known instances have been
    fixed. Pickle executes arbitrary code by design -- which is much
    worse than just crashing a program.
    Just read carefully what Guido said, if there is a bug it can not just
    crash your program, it can execute any kind of code, as bad or even
    worse than pickle... that is what is called a buffer overflow
    marshal is *ONLY* designed to store and load trusted pyc files. It's not
    desinged for anything else. It *CAN* be used for simple stuff, too. But
    it doesn't support fancy stuff and it can easily be broken. IIRC it
    doesn't support nested structured like a list containing a reference to
    itself. Use it on your own risk.

    Christian
  • Boris Borcic at Mar 6, 2008 at 12:45 pm
    Aaron Watters wrote:
    [...]
    Sorry, for the french and the caps, but this is REALLY IMPORTANT.
    I see nothing french in your post.

    BB
  • Christian Heimes at Mar 6, 2008 at 5:56 pm

    Boris Borcic wrote:
    Aaron Watters wrote:
    [...]
    Sorry, for the french and the caps, but this is REALLY IMPORTANT.
    I see nothing french in your post.
    http://en.wikipedia.org/wiki/Pardon_my_French

    HTH

    Christian
  • Neil Toronto at Mar 6, 2008 at 7:17 pm

    Boris Borcic wrote:
    Aaron Watters wrote:
    [...]
    Sorry, for the french and the caps, but this is REALLY IMPORTANT.
    I see nothing french in your post.
    I could have sworn I read something like "I fart in your general direction".

    Neil

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppython-ideas @
categoriespython
postedMar 5, '08 at 3:11p
activeMar 6, '08 at 9:00p
posts19
users12
websitepython.org

People

Translate

site design / logo © 2019 Grokbase