FAQ
Given that attachments are seemingly stored as key/value pairs within a
document, does that mean that each revision of a document contains the
attachments as well? Or are they stored independently?

For instance, given a 5 KB document with a 100 MB attachment that has 10 revs
(where the attachment was added in rev 1), will the total storage
requirements be 5 KB * 10 + 100 MB or (5 KB + 100 MB) * 10?

Thanks,

Eric


  • Brian Mitchell at Oct 1, 2014 at 8:02 pm
    The attachment will be stored once and each revision will retain a reference to that attachment (including when it was added,
    called revpos, so replication should be efficient too). Compaction will copy the attachments over and should retain a single
    copy for each unique attachment.
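
To put numbers on the two cases from the question, here is a rough back-of-the-envelope sketch. It assumes binary units (1 KB = 1024 bytes) and ignores file overhead, compression, and the fact that compaction also drops old revision bodies; the variable names are only illustrative.

```python
KB = 1024
MB = 1024 * KB

doc_size = 5 * KB          # size of the JSON body per revision
attachment_size = 100 * MB
revisions = 10

# Attachment stored once and referenced by every revision
# (the behaviour described in the reply above).
shared = doc_size * revisions + attachment_size

# Attachment copied into every revision (the other case raised in the question).
copied = (doc_size + attachment_size) * revisions

print(f"stored once:      {shared / MB:.2f} MB")
print(f"copied per rev:   {copied / MB:.2f} MB")
```

Under these assumptions the "stored once" case comes to roughly 100 MB and the "copied per revision" case to roughly 1000 MB.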

    Attachments are identified by name and can be replaced without mutating old references to documents with attachments of
    the same name. If you pass the _attachments section and leave out stubs for any existing attachments, that is interpreted as
    a delete.

    Brian.
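
As an illustration of the stub behaviour described above, here is a minimal sketch that only builds the JSON body for an update; it does not talk to a server. The document id, revision, digests, and the helper name `attachments_for_update` are made up for the example.

```python
def attachments_for_update(existing, keep):
    """Build an _attachments section that carries stubs only for the names in `keep`.

    Any existing attachment whose stub is left out would be treated as deleted
    when the updated document is saved.
    """
    return {
        name: {**meta, "stub": True}
        for name, meta in existing.items()
        if name in keep
    }


# Hypothetical document as it might look when fetched without attachment bodies.
current = {
    "_id": "profile-eric",
    "_rev": "2-abc",
    "_attachments": {
        "photo.jpg": {"content_type": "image/jpeg", "revpos": 1,
                      "digest": "md5-1234", "length": 2048, "stub": True},
        "cv.pdf": {"content_type": "application/pdf", "revpos": 2,
                   "digest": "md5-5678", "length": 4096, "stub": True},
    },
}

# Keep photo.jpg, drop cv.pdf by leaving its stub out of the update body.
update = {
    **current,
    "_attachments": attachments_for_update(current["_attachments"], {"photo.jpg"}),
}
```

Saving `update` as the next revision would, per the explanation above, keep photo.jpg (still at revpos 1) and delete cv.pdf.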
  • Eric Benzacar at Oct 1, 2014 at 8:17 pm

    On Wed, Oct 1, 2014 at 4:02 PM, Brian Mitchell wrote:

    > The attachment will be stored once and each revision will retain a
    > reference to that attachment (including when it was added,
    > called revpos, so replication should be efficient too). Compaction will
    > copy the attachments over and should retain a single
    > copy for each unique attachment.

    Thanks for the confirmation. That's what I was suspecting, but wasn't sure.

    > Attachments are identified by name and can be replaced without mutating
    > old references to documents with attachments of
    > the same name.

    This is where you lose me a little. How can I have multiple references to
    the same attachment? Am I not able to have 2 documents with 2 distinct
    attachments with the same name? For example, if each user uploads a
    "photo.jpg" that is attached to their profile?

    Or are you referring to the ability to retrieve an older rev of the
    document and retrieve the older rev of the attachment? For example, in
    rev1 of a doc I attach photo.jpg and in rev2 I update the photo.jpg. Do
    you mean I can still retrieve rev1 and the original photo.jpg?

    Thanks,

    Eric


  • Brian Mitchell at Oct 1, 2014 at 8:50 pm

    On Oct 1, 2014, at 4:16 PM, Eric B wrote:
    >> Attachments are identified by name and can be replaced without mutating
    >> old references to documents with attachments of
    >> the same name.
    > This is where you lose me a little. How can I have multiple references to
    > the same attachment? Am I not able to have 2 documents with 2 distinct
    > attachments with the same name? For example, if each user uploads a
    > "photo.jpg" that is attached to their profile?
    >
    > Or are you referring to the ability to retrieve an older rev of the
    > document and retrieve the older rev of the attachment? For example, in
    > rev1 of a doc I attach photo.jpg and in rev2 I update the photo.jpg. Do
    > you mean I can still retrieve rev1 and the original photo.jpg?

    Yes. I just wanted to make it clear that you have a version of each attachment
    at a given revpos. This means that you can replace or delete the attachment,
    but old revisions will keep referencing the earlier version until they are
    culled by compaction. If you have conflicting revisions, each of them can keep
    a different attachment under the same name.

    DocA-1 + cat.gif (md5-hash 1234, revpos 1)
    DocA-2 + cat.gif (md5-hash 1234, revpos 1)
    [at this point we only have one copy]
    DocA-3 + cat.gif (md5-hash 7890, revpos 3)
    [now I have two different gifs but I can get both until I compact]
    Compact [DocA-3 is the only leaf revision with no conflicts]
    [now I should only have one gif (7890)]
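
The timeline above can be mimicked with a toy in-memory model. This is only a sketch of the bookkeeping as described in this thread (linear history, no conflicts), not how the database is actually implemented; the class `ToyDoc` and its methods are invented for illustration.

```python
class ToyDoc:
    """Toy model: each revision holds (digest, revpos) references; blobs are kept once per digest."""

    def __init__(self):
        self.blobs = {}      # digest -> attachment bytes
        self.revisions = []  # index i holds revision i+1 as {name: (digest, revpos)}

    def update(self, attachments=None):
        """Add a revision. `attachments` maps name -> (digest, data) for new or replaced files."""
        refs = dict(self.revisions[-1]) if self.revisions else {}
        revpos = len(self.revisions) + 1
        for name, (digest, data) in (attachments or {}).items():
            self.blobs.setdefault(digest, data)  # store the bytes only once per digest
            refs[name] = (digest, revpos)        # remember when this version was added
        self.revisions.append(refs)

    def compact(self):
        """Keep only the blobs still referenced by the latest (leaf) revision."""
        live = {digest for digest, _ in self.revisions[-1].values()}
        self.blobs = {d: b for d, b in self.blobs.items() if d in live}


doc = ToyDoc()
doc.update({"cat.gif": ("md5-1234", b"old gif")})  # DocA-1
doc.update()                                       # DocA-2, attachment untouched
copies_after_rev2 = len(doc.blobs)
doc.update({"cat.gif": ("md5-7890", b"new gif")})  # DocA-3, attachment replaced
copies_after_rev3 = len(doc.blobs)
doc.compact()
copies_after_compact = len(doc.blobs)
```

In this model there is one stored copy after DocA-2, two after DocA-3, and one again after compaction, matching the bracketed notes in the timeline.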

    If we had conflicts, then we’d possibly have more images. You can play around
    with this more by using the atts_since query parameter, which takes a list of
    revision ids. Keep in mind that content is not currently deduplicated between
    different documents, so that is where the application can do some work to
    ensure that only one copy of anything is stored, by using a digest like SHA1
    or similar as the document id. This artificially restricts you to one
    attachment per doc, but I find things work a bit better when you avoid having
    huge numbers of attachments to manage per document.

    Brian.
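
The digest-as-document-id idea could look something like the sketch below. The `store` dict stands in for a database, and the `sha1-` prefix, field names, and helper names are assumptions made for the example rather than anything prescribed in the thread.

```python
import hashlib

store = {}  # stand-in for a database: document id -> document


def content_doc_id(data: bytes) -> str:
    """Derive the document id from the attachment bytes, so equal content maps to one doc."""
    return "sha1-" + hashlib.sha1(data).hexdigest()


def put_once(data: bytes, content_type: str) -> str:
    """Create a one-attachment document for `data` unless one already exists."""
    doc_id = content_doc_id(data)
    if doc_id not in store:
        store[doc_id] = {"_id": doc_id, "content_type": content_type, "length": len(data)}
    return doc_id


# Two users upload identical bytes, a third uploads something different.
id_a = put_once(b"same picture bytes", "image/jpeg")
id_b = put_once(b"same picture bytes", "image/jpeg")
id_c = put_once(b"another picture", "image/jpeg")
```

Each profile document would then hold the returned id instead of carrying its own copy of the file, so identical uploads end up in a single document.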

Discussion Overview
group: user @
category: couchdb
posted: Oct 1, '14 at 7:54p
active: Oct 1, '14 at 8:50p
posts: 5
users: 3
website: couchdb.apache.org
irc: #couchdb
