
[Clojure] ANN: deep-freeze serialization library

Timothy Baldridge
Dec 30, 2011 at 6:00 am
A few months back I released 1.0 of deep-freeze, a binary
serialization library for Clojure. Due to recent additions by Peter
Taoussanis I thought it would be about time to let some more people
know about this project, and bump the version number to 1.2.

deep-freeze is a simple serialization library that aims to be fast,
generate concise data, and support as many Clojure structures as
possible. Currently it outperforms read-string/print-str by quite a
margin, supports (optionally) the "Snappy" Google compression library,
and supports atoms, refs, and the standard Clojure structures. Support
for deftype and defrecord is not in yet, but is on the list.

The data structure should not be considered stable by any means, as we
will continue to optimize it as needed. But the actual API calls have
stayed the same since version 1.0, and to be honest, it's such a
simple lib, it can probably be copied and pasted into any lib anyone
chooses.

At any rate I thought it might come in useful for those experimenting
with 0MQ, distributed computing, etc.

Timothy Baldridge

--
“One of the main causes of the fall of the Roman Empire was
that–lacking zero–they had no way to indicate successful termination
of their C programs.”
(Robert Firth)

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+unsubscribe@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

13 responses

  • James Reeves at Dec 30, 2011 at 2:38 pm

    On 30 December 2011 06:00, Timothy Baldridge wrote:
    > A few months back I released 1.0 of deep-freeze, a binary
    > serialization library for Clojure.

    You might want to provide a link to the project home page :)

    > deep-freeze is a simple serialization library that aims to be fast,
    > generate concise data, and support as many Clojure structures as
    > possible. Currently it outperforms read-string/print-str by quite a
    > margin, supports (optionally) the "Snappy" Google compression library,
    > and supports atoms, refs, and the standard Clojure structures. Support
    > for deftype and defrecord is not in yet, but is on the list.

    How are atoms/refs supported?

    It would be nice if there was an option to make deep-freeze
    interchangeable with read/pr.

    - James

  • Timothy Baldridge at Dec 30, 2011 at 2:59 pm

    On Fri, Dec 30, 2011 at 8:38 AM, James Reeves wrote:
    > On 30 December 2011 06:00, Timothy Baldridge wrote:
    >> A few months back I released 1.0 of deep-freeze, a binary
    >> serialization library for Clojure.
    >
    > You might want to provide a link to the project home page :)

    Wow, I can't believe I forgot that. Anyway, here's the link. And it's
    also on Clojars.

    https://github.com/halgari/deep-freeze

    Timothy

  • Peter Taoussanis at Dec 30, 2011 at 5:42 pm
    > How are atoms/refs supported?

    Very simply: they're just dereferenced during freezing and that value
    is reinserted into an atom/ref/whatever during thawing. Any metadata
    is also retained.

    > It would be nice if there was an option to make deep-freeze
    > interchangeable with read/pr.

    Hmm- I don't follow. Interchangeable how? Could you explain what use
    you have in mind?

    --
    Peter

  • James Reeves at Dec 30, 2011 at 6:31 pm

    On 30 December 2011 17:42, Peter Taoussanis wrote:
    >> It would be nice if there was an option to make deep-freeze
    >> interchangeable with read/pr.
    >
    > Hmm- I don't follow. Interchangeable how? Could you explain what use
    > you have in mind?

    Well, currently it's not directly interchangeable because deep-freeze
    dereferences atoms and refs, whilst the core read and pr functions do
    not.

    I think it might be more useful if freeze/thaw worked on the same
    domain as read/pr by default. That way I can swap in and out different
    serialization functions without altering the behaviour of the
    application.

    I'm also not really sold on serializing atoms and refs. If I serialize
    a data structure like [1 2 3], then when I deserialize it I get back a
    structure that can be substituted for the original. But if I serialize
    an atom, then when I deserialize it I get a completely different atom
    containing the same data, one which cannot be substituted for the
    original.

    I guess what I'd like is a mode where I can say "act like read/pr" and
    for deep-freeze to ignore metadata, refs and atoms.

    - James
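
A minimal sketch of the distinction James draws between values and identities. The helpers `freeze-atom` and `thaw-atom` are hypothetical stand-ins for "dereference on freeze, re-wrap on thaw"; they are not deep-freeze's API, and the text round trip uses `pr-str`/`read-string` purely for illustration.

```clojure
;; Values: a round trip through the printer and reader yields an equal value,
;; so the result can stand in for the original.
(def original [1 2 3])
(def restored (read-string (pr-str original)))

(assert (= original restored))

;; Identities: hypothetical helpers that freeze an atom by dereferencing it
;; and thaw it by wrapping the value in a brand-new atom.
(defn freeze-atom [a] (pr-str @a))
(defn thaw-atom [s] (atom (read-string s)))

(def counter (atom {:hits 0}))
(def counter-copy (thaw-atom (freeze-atom counter)))

;; Same contents, but a different identity.
(assert (= @counter @counter-copy))
(assert (not (identical? counter counter-copy)))

;; Updates to the original are invisible through the copy.
(swap! counter update-in [:hits] inc)
(assert (= {:hits 1} @counter))
(assert (= {:hits 0} @counter-copy))
```

The vector can be swapped for its deserialized twin without the program noticing; the atom cannot, because code holding the original atom never sees changes made through the copy, and vice versa.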

  • Timothy Baldridge at Dec 30, 2011 at 6:54 pm

    > I think it might be more useful if freeze/thaw worked on the same
    > domain as read/pr by default. That way I can swap in and out different
    > serialization functions without altering the behaviour of the
    > application.

    Well, from my view, read-string/print-str is completely broken when it
    comes to refs:

    user=>(read-string (print-str (atom "foo")))
    RuntimeException Unreadable form clojure.lang.Util.runtimeException
    (Util.java:156)

    deep-freeze attempts to make all data structures round-trip for uses
    of writing to disk, writing to the network, etc. Making deep-freeze
    behave like read-string/print-str is not really the goal here. If, at
    some point, Clojure rewrites its read/print functions to round-trip,
    then we could look at better compatibility. I'm open to suggestions
    here, but simply spitting out "#<Atom@10987197: foo>" is not an
    acceptable answer.

    Timothy
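
A small sketch of the round-trip property under discussion. The predicate `round-trips?` is a hypothetical helper, and it uses `pr-str` rather than `print-str` because `print-str` drops the quotes around strings, so even plain strings would not read back as strings. The exact exception type and message for an atom depend on the Clojure version, so the sketch catches any `Exception`.

```clojure
(defn round-trips?
  "True when x survives print-then-read as an equal value; false when the
  reader rejects the printed form or the result is not equal to x."
  [x]
  (try
    (= x (read-string (pr-str x)))
    (catch Exception _ false)))

;; Ordinary Clojure data round-trips through the printer and reader.
(assert (round-trips? {:a [1 2 3] :b "foo"}))

;; An atom does not: its printed form is not readable.
(assert (not (round-trips? (atom "foo"))))

;; The value inside the atom round-trips once it is dereferenced explicitly.
(assert (round-trips? @(atom "foo")))
```

Whether the second case should be an error (James's position) or should be handled by dereferencing automatically (deep-freeze's approach at the time) is exactly the point being debated.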

  • James Reeves at Dec 30, 2011 at 9:49 pm

    On 30 December 2011 18:54, Timothy Baldridge wrote:
    >> I think it might be more useful if freeze/thaw worked on the same
    >> domain as read/pr by default. That way I can swap in and out different
    >> serialization functions without altering the behaviour of the
    >> application.
    >
    > Well, from my view, read-string/print-str is completely broken when it
    > comes to refs:
    >
    > user=>(read-string (print-str (atom "foo")))
    > RuntimeException Unreadable form  clojure.lang.Util.runtimeException
    > (Util.java:156)

    See, I kinda think that's exactly the behaviour it should have: if you
    cannot serialize something, then throw an exception.

    However, apart from that, deep-freeze looks like a really good
    library, and does almost everything right. I'm just nit-picking over a
    few details.

    - James

  • Peter Taoussanis at Dec 31, 2011 at 4:36 am
    > I guess what I'd like is a mode where I can say "act like read/pr" and
    > for deep-freeze to ignore metadata, refs and atoms.

    I'm still not sure I'm getting this argument though. In its current
    form, deep-freeze makes an attempt to preserve as much information as
    it can. In the specific case of STM/metadata it happens to preserve a
    little more information than read/pr.

    Now I can understand the desire for a drop-in replacement, but why
    would you be calling read/pr on refs and atoms in the first place if
    it'd be throwing an exception? Presumably you're thinking here in
    terms of processing Clojure data outside of your control?

    I just worry that attempting to keep parity with read/pr will require
    undue effort in the long-term. For example, the ref/atom handling is
    obviously different to read/pr now: but maybe other types are
    different in subtle ways too? It'd be necessary to introduce a
    consistency test between the two, and that would require maintenance
    down the line. And what would happen when performance concerns
    conflict with consistency ones?

    Sure, it'd be possible- but I'm just wondering how widespread the
    desire is that serialization be one-to-one with read/pr? Will people
    often be switching back and forth between the two? Why? (Sincere
    question- maybe I'm missing something!)

    (Just my 2c BTW: it's Timothy's call)

    --
    Peter

  • Meikel Brandmeyer at Dec 31, 2011 at 8:08 am
    Hi,

    On 31.12.2011 at 05:36, Peter Taoussanis wrote:
    >> I guess what I'd like is a mode where I can say "act like read/pr" and
    >> for deep-freeze to ignore metadata, refs and atoms.
    >
    > I'm still not sure I'm getting this argument though. In its current
    > form, deep-freeze makes an attempt to preserve as much information as
    > it can. In the specific case of STM/metadata it happens to preserve a
    > little more information than read/pr.

    I think what James wants to say is: serialising reference types is non-trivial. Reference types are identities. So the instance itself (as in identical?) carries information. When you have references to a ref you have to make sure that they all refer to the same ref again after thawing. Let's say you have a data structure like this:

    (let [left-and-right (ref :state)]
      {:left left-and-right :right left-and-right})

    What is the state of this map after thawing? Does it refer to the same ref? Or different ones? If it's different ones, your program is broken now. If left-and-right was a value, the program would now need more memory, but it would still be OK.

    Looking at the source of deep-freeze, your program will be broken after a round trip. You'll need to add a marker for the identity and kind of memoize the thaw function for refs on this marker, so that it returns the same ref again. Ironically, pr provides this information already. So I would rather say that pr preserves more information than deep-freeze in this situation.

    BTW: Do you wrap everything in a dosync when freezing a data structure? If not, you cannot guarantee that refs are frozen in a consistent state.

    Many, many pitfalls. Freezing and thawing of reference types play in a different league than freezing and thawing of values. More thought needs to go into that. (And the fact that read/pr doesn't support reference types (yet) should be a hint that this is not a simple puzzle for an afternoon.)

    Just my 0.02€.

    Sincerely
    Meikel
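
Meikel's example can be made concrete with a deliberately naive freeze/thaw pair. `naive-freeze` and `naive-thaw` are hypothetical helpers written only for this sketch, assuming a flat map whose values may be refs; they mimic "dereference on freeze, re-wrap on thaw" and are not taken from the deep-freeze source. The `:ref-value` marker key is likewise an assumption.

```clojure
(defn naive-freeze
  "Print a flat map, replacing each ref value with a tagged copy of its
  current contents. The derefs run inside one dosync so that all refs are
  read from a single consistent snapshot."
  [m]
  (pr-str
    (dosync
      (into {} (for [[k v] m]
                 [k (if (instance? clojure.lang.Ref v) {:ref-value @v} v)])))))

(defn naive-thaw
  "Read the map back, wrapping every tagged value in a fresh ref."
  [s]
  (into {} (for [[k v] (read-string s)]
             [k (if (and (map? v) (contains? v :ref-value))
                  (ref (:ref-value v))
                  v)])))

(def original
  (let [left-and-right (ref :state)]
    {:left left-and-right :right left-and-right}))

(def thawed (naive-thaw (naive-freeze original)))

;; Before the round trip both keys share one ref; afterwards they do not.
(assert (identical? (:left original) (:right original)))
(assert (not (identical? (:left thawed) (:right thawed))))

;; In the original, a change made via :left is visible via :right.
(dosync (ref-set (:left original) :changed))
(assert (= :changed @(:right original)))

;; In the thawed copy, the same change no longer propagates.
(dosync (ref-set (:left thawed) :changed))
(assert (= :state @(:right thawed)))
```

The contents survive the round trip, but the sharing does not, which is the breakage described above.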

  • Peter Taoussanis at Dec 31, 2011 at 9:29 am
    > I think what James wants to say is: serialising reference types is non-trivial. Reference types are identities. So the instance itself (as in identical?) carries information. When you have references to a ref you have to make sure that they all refer to the same ref again after thawing.

    Okay, gotcha! Actually your example was perfect: yes, this is clearly
    a problem. Indeed I hadn't put much thought into it since I personally
    have no use for de/serialization of STM objects.

    I think the identity issue should be solvable but before opening that
    door: is this something anyone _wants_ solved? Is there a good
    use-case for trying to freeze STM objects?

    What do you think, Timothy? Maybe best to disable the STM stuff for
    now to avoid possible confusion?

    --
    Peter

  • James Reeves at Dec 31, 2011 at 3:32 pm

    On 31 December 2011 08:08, Meikel Brandmeyer wrote:
    > I think what James wants to say is: serialising reference types is non-trivial. Reference types are identities. So the instance itself (as in identical?) carries information. When you have references to a ref you have to make sure that they all refer to the same ref again after thawing. Let's say you have a data structure like this:
    >
    > (let [left-and-right (ref :state)]
    >   {:left left-and-right :right left-and-right})
    >
    > What is the state of this map after thawing? Does it refer to the same ref? Or different ones? If it's different ones, your program is broken now. If left-and-right was a value, the program would now need more memory, but it would still be OK.

    This is pretty much what I was trying to say, but Meikel puts it far
    more clearly than I managed to.

    Serializing refs just seems such a briar patch that I'd prefer an
    exception be raised so I know something is wrong.

    - James

  • Timothy Baldridge at Dec 31, 2011 at 5:44 pm
    First of all, I've updated the README.md to show the license. It's a
    BSD 3 clause license.

    Yeah, I hadn't thought of the issue of the refs, but I don't think
    it's an exceptionally hard problem to solve. What we are discussing is
    basically the same thing that mutable languages deal with every day.
    At work I'm forced to work with a business object library known as
    CSLA.NET and although most of the library is the best example of
    complected code I've ever seen, the serialization isn't all that bad.

    Basically you follow this method:

    Serialize the tree putting each ref into a list. Then you write the
    indexed id of that ref instead of the contents of that ref. When you
    are done serializing the object, you loop through every object in the
    ref list and serialize the contents of the refs. The same pattern is
    followed each time. If the ref is in the list, you write the id (index
    in the list) of the ref. If it's not in the list, you add it to the
    list and write the id.

    This sort of serialization is needed in mutable languages such as C#
    because it's fairly easy to create loops in the data that would cause
    a simpler serializer to enter an infinite loop.

    Does anyone see any issues with this method?

    Timothy
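
The method Timothy describes can be sketched as follows. This is only an illustration under stated assumptions: `freeze-with-ref-table`, `thaw-with-ref-table`, and the `::ref-id` marker are hypothetical names, the "serialized" form is plain Clojure data rather than deep-freeze's binary format, user data is assumed never to contain a map with the `::ref-id` key, and the sketch does not address reading all refs from one consistent snapshot.

```clojure
(require '[clojure.walk :as walk])

(defn- ref? [x] (instance? clojure.lang.Ref x))

(defn freeze-with-ref-table
  "Replace every ref with {::ref-id n}, where n is the ref's index in a table,
  then serialize each table entry's contents the same way."
  [data]
  (let [table (atom {:ids {} :order []})   ; refs hash by identity, so they work as map keys
        mark  (fn [x]
                (if (ref? x)
                  (do (when-not (contains? (:ids @table) x)
                        (swap! table (fn [{:keys [ids order]}]
                                       {:ids   (assoc ids x (count order))
                                        :order (conj order x)})))
                      {::ref-id (get-in @table [:ids x])})
                  x))
        root  (walk/postwalk mark data)]
    ;; The table can grow during this loop: a ref's contents may mention new refs.
    (loop [i 0, contents []]
      (if (< i (count (:order @table)))
        (recur (inc i)
               (conj contents (walk/postwalk mark @(nth (:order @table) i))))
        {:root root :refs contents}))))

(defn thaw-with-ref-table
  "Create one fresh ref per table entry, then fill in contents and root,
  mapping each {::ref-id n} marker back to the nth fresh ref."
  [{:keys [root refs]}]
  (let [fresh   (vec (repeatedly (count refs) #(ref nil)))
        restore (fn [x]
                  (if (and (map? x) (contains? x ::ref-id))
                    (nth fresh (::ref-id x))
                    x))]
    (dosync
      (doseq [[r content] (map vector fresh refs)]
        (ref-set r (walk/postwalk restore content))))
    (walk/postwalk restore root)))

;; Shared ref: sharing survives, even through a text round trip.
(def shared (ref :state))
(def frozen (freeze-with-ref-table {:left shared :right shared}))
(def thawed (thaw-with-ref-table (read-string (pr-str frozen))))
(assert (identical? (:left thawed) (:right thawed)))
(assert (= :state @(:left thawed)))

;; Cycle: a points to b and b points back to a; freezing still terminates.
(def a (ref nil))
(def b (ref {:peer a}))
(dosync (ref-set a {:peer b}))
(def cyc (thaw-with-ref-table (freeze-with-ref-table {:start a})))
(assert (identical? (:start cyc) (:peer @(:peer @(:start cyc)))))
```

Because each ref is written once and referred to by index everywhere else, both shared references and cycles are preserved among the thawed refs. The thawed refs are still new identities, distinct from the originals.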

  • James Reeves at Jan 1, 2012 at 2:09 pm

    On 31 December 2011 17:44, Timothy Baldridge wrote:
    > Yeah, I hadn't thought of the issue of the refs, but I don't think
    > it's an exceptionally hard problem to solve.

    It's not really possible to solve completely.

    If I serialize, then deserialize an immutable data structure, then for
    all intents and purposes I'm left with something equivalent to the
    original.

    If I serialize and deserialize a ref, then what I have is a completely
    different ref that happens to contain the same data structure. I
    cannot use this new ref as a substitute for the original, as I could
    with a normal data structure.

    This is not to say that serializing a ref is not useful, just that we
    can't put it in the same category as a data structure. I'd like a way
    to distinguish between data that can be serialized, like maps or
    vectors, and data that can only be "sort-of" serialized, like refs.

    - James

  • Glen Stampoultzis at Dec 31, 2011 at 5:46 am

    On 30 December 2011 17:00, Timothy Baldridge wrote:

    > A few months back I released 1.0 of deep-freeze, a binary
    > serialization library for Clojure. Due to recent additions by Peter
    > Taoussanis I thought it would be about time to let some more people
    > know about this project, and bump the version number to 1.2.

    I was just wondering what the license for this project is. I couldn't find
    anything on the GitHub page.

