Here's an interesting situation I am faced with as I port Mailman 3 to Python
3. I haven't seen any other discussion of it, so I thought I'd post here for
posterity.


Let's say you have a pickle created in Python 2 that is to be read in Python
3. In Mailman 2, persistent mailing list data is stored in pickles.


It seems like both Python 2 types (unicode and str/bytes) get unpickled as
Python 3 str types. It makes sense because:


* Python 2 unicode should unpickle as Python 3 str
* In Python 2, bytes is just an alias for str
* There's no way to know the intent of whether Python 2 "bytes" should be
   unpickled as Python 3 bytes or str.


Code:


-----put.py-----
from six.moves.cPickle import dump


d = {
     b'b': b'b',
     u'u': u'u',
     's': 's',
     }


with open('/tmp/foo.pck', 'wb') as fp:
     dump(d, fp)
-----put.py-----


-----get.py-----
from pprint import pprint
from six.moves.cPickle import load


with open('/tmp/foo.pck', 'rb') as fp:
     d = load(fp)


pprint(d)
-----get.py-----


$ python2 /tmp/put.py
$ python3 /tmp/get.py
{'b': 'b', 's': 's', 'u': 'u'}
$ python2 /tmp/get.py
{'b': 'b', 's': 's', u'u': u'u'}


Cheers,
-Barry

  • Benjamin Peterson at Dec 17, 2014 at 1:32 am

    On Tue, Dec 16, 2014, at 20:26, Barry Warsaw wrote:
    > Here's an interesting situation I am faced with as I port Mailman 3 to
    > Python 3. I haven't seen any other discussion of it, so I thought I'd
    > post here for posterity.
    >
    > Let's say you have a pickle created in Python 2 that is to be read in
    > Python 3. In Mailman 2, persistent mailing list data is stored in
    > pickles.
    >
    > It seems like both Python 2 types (unicode and str/bytes) get unpickled
    > as Python 3 str types.

    You can change this by passing encoding="bytes" to pickle.loads. See
    https://docs.python.org/3/library/pickle.html#pickle.load

    > * Python 2 unicode should unpickle as Python 3 str
    > * In Python 2, bytes is just an alias for str
    > * There's no way to know the intent of whether Python 2 "bytes" should
    >   be unpickled as Python 3 bytes or str.

    Yeah, this is problematic. Probably the best you can do is unpickle
    everything with bytes then manually decode actual string things into
    str.
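
A minimal sketch of that approach, for Python 3 only. The pickle bytes are hand-written in protocol 0 to stand in for Python 2 output, and the helper name `load_py2_pickle`, the `TEXT_KEYS` set, and the choice of UTF-8 are all assumptions made for illustration; real data would need its own list of text fields and its own encoding.

```python
import pickle

# Hand-written protocol 0 pickle standing in for what Python 2 would write for
# {'name': 'test-list', 'salt': '\xff\x00', u'owner': u'anne'}.
PY2_PICKLE = (
    b"(dS'name'\nS'test-list'\nsS'salt'\nS'\\xff\\x00'\nsVowner\nVanne\ns."
)

# Hypothetical: keys whose values are known to be text, not binary data.
TEXT_KEYS = {"name", "owner"}


def load_py2_pickle(data, text_keys, encoding="utf-8"):
    """Load with encoding="bytes", then decode only what is known to be text."""
    raw = pickle.loads(data, encoding="bytes")
    result = {}
    for key, value in raw.items():
        # Python 2 str keys arrive as bytes; assume keys are always text.
        if isinstance(key, bytes):
            key = key.decode(encoding)
        # Decode the value only when the key is listed as a text field.
        if key in text_keys and isinstance(value, bytes):
            value = value.decode(encoding)
        result[key] = value
    return result


restored = load_py2_pickle(PY2_PICKLE, TEXT_KEYS)
print(restored)
# {'name': 'test-list', 'salt': b'\xff\x00', 'owner': 'anne'}
```

Python 2 unicode objects come back as str either way, so only the former Python 2 str values need a decision.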
  • Gregory P. Smith at Dec 22, 2014 at 4:50 pm
    Or update your Python 2 code to save its data in a proper, well-defined,
    language-agnostic, portable data format instead of a pickle. Pickles are
    definitely a problem. My own advice is to use them only for things you
    don't mind throwing away (e.g. simple caches).


  • Neal Becker at Dec 23, 2014 at 4:39 pm

    Gregory P. Smith wrote:


    > Or update your Python 2 code to save its data in a proper, well-defined,
    > language-agnostic, portable data format instead of a pickle. Pickles are
    > definitely a problem. My own advice is to use them only for things you
    > don't mind throwing away (e.g. simple caches).

    Such as?
  • Gregory P. Smith at Dec 23, 2014 at 8:12 pm
    JSON is popular and well supported by virtually everyone. Myself, I prefer
    data formats with an explicit schema, but those are much heavier weight
    (Protocol Buffers, Thrift, etc).


    You can bolt a schema onto JSON. Any code that serializes to JSON
    effectively already has one: unless the structure, field names, and types
    are declared somewhere, the schema is just the implementation code, which
    tends to change over time without anyone thinking about the schema,
    leading to problems. That is why I like having a well-defined schema for
    storage. :)
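
One way to picture "bolting a schema onto JSON" is to declare the field names and types in one place and check every record against that declaration. This is only a sketch: `LIST_SCHEMA`, its field names, and `validate` are hypothetical and are not taken from Mailman or from any schema library.

```python
import json

# Hypothetical schema: field name -> type expected after json.loads().
LIST_SCHEMA = {"name": str, "max_message_size": int, "owners": list}


def validate(record, schema):
    """Return record unchanged if it matches schema, else raise."""
    missing = schema.keys() - record.keys()
    if missing:
        raise ValueError("missing fields: %s" % sorted(missing))
    for field, expected in schema.items():
        if not isinstance(record[field], expected):
            raise TypeError("%s should be %s" % (field, expected.__name__))
    return record


wire = json.dumps(
    {"name": "test-list", "max_message_size": 40, "owners": ["anne"]}
)
record = validate(json.loads(wire), LIST_SCHEMA)
```

With the declaration kept separate from the serializing code, a change to the stored structure has to touch the schema explicitly.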


  • Barry Warsaw at Dec 23, 2014 at 8:52 pm
    On Dec 23, 2014, at 08:12 PM, Gregory P. Smith wrote:

    > JSON is popular and well supported by virtually everyone.

    AFAICT, JSON won't preserve the differences between bytes and unicodes in
    Python 2, so it also can't restore the equivalent types to Python 3.


    -Barry
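
To make the limitation concrete: JSON has a single string type, and in Python 3 `json.dumps` raises TypeError when handed a bytes object. If the distinction must survive, the writer has to record it explicitly. The sketch below tags each value with its type and base64-encodes binary data; the tag names `"bytes"` and `"text"` and both helper functions are made up for illustration and are not part of any standard.

```python
import base64
import json


def encode_value(value):
    """Wrap a leaf value in a one-key object that names its type."""
    if isinstance(value, bytes):
        return {"bytes": base64.b64encode(value).decode("ascii")}
    if isinstance(value, str):
        return {"text": value}
    raise TypeError("unsupported type: %r" % type(value))


def decode_value(tagged):
    """Reverse encode_value using the type tag."""
    if "bytes" in tagged:
        return base64.b64decode(tagged["bytes"])
    return tagged["text"]


original = {"name": "test-list", "salt": b"\xff\x00"}
wire = json.dumps({k: encode_value(v) for k, v in original.items()})
restored = {k: decode_value(v) for k, v in json.loads(wire).items()}
```

This only helps when the writing side already knows which values are binary, which is exactly the information a Python 2 str does not carry.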
