Hello,

I wish to create a tool which is capable of creating deterministic Tar
archives so that two separately created archives can be compared using a
cryptographic hash of each. I can do most of this already with the tar
Header/Writer types: I sort all of the entries by name before writing them
to the archive, and I explicitly zero out atime/mtime/ctime as I don't deem
them important for this tool. Normal header fields are all written in a
particular, deterministic order in a tar header, but extended attributes
are not - they are stored in memory as a map and written in whatever order
the map happens to be iterated in. As the article "Go maps in action" states
(https://blog.golang.org/go-maps-in-action#TOC_7.):
"When iterating over a map with a range loop, the iteration order is not
specified and is not guaranteed to be the same from one iteration to the
next."

This makes the tool I'm planning to write unreliable for tar entries which
have more than 1 xattr as I currently cannot ensure the order in which
these will be written.
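
For illustration, here is a minimal sketch of the part that already works: entries sorted by name, timestamps pinned to a fixed value, and the result compared by SHA-256. The helper names `buildArchive` and `archiveDigest` and the sample file contents are made up for this example, and it deliberately leaves out xattrs, which is where the ordering problem described above appears.

```go
package main

import (
	"archive/tar"
	"bytes"
	"crypto/sha256"
	"fmt"
	"sort"
	"time"
)

// buildArchive (hypothetical helper) writes the given files into an
// in-memory tar archive, sorted by name and with a fixed modification time.
func buildArchive(files map[string]string) ([]byte, error) {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names) // fixed entry order, independent of map iteration

	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, name := range names {
		body := files[name]
		hdr := &tar.Header{
			Name:     name,
			Mode:     0644,
			Size:     int64(len(body)),
			Typeflag: tar.TypeReg,
			ModTime:  time.Unix(0, 0), // pinned timestamp instead of the real mtime
		}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write([]byte(body)); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// archiveDigest (hypothetical helper) returns the hex SHA-256 of the archive.
func archiveDigest(files map[string]string) (string, error) {
	data, err := buildArchive(files)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%x", sha256.Sum256(data)), nil
}

func main() {
	files := map[string]string{"b.txt": "second", "a.txt": "first"}
	first, err := archiveDigest(files)
	if err != nil {
		panic(err)
	}
	second, err := archiveDigest(files)
	if err != nil {
		panic(err)
	}
	fmt.Println("digests match:", first == second)
}
```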

I've never contributed to the Go project before and I'd like this to be my
first contribution. My plan is to simply use the method described in the
"Go maps in action" article linked above to change the way in which the method
`archive/tar.(*Writer).writePAXHeader` iterates over and writes paxHeaders -
by first sorting the keys of the `paxHeaders` map and then writing
them in that order.
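
The sorted-keys approach from the article could look roughly like this. It is only a sketch of the technique, not the actual patch: `sortedKeys` is a hypothetical helper, and the record names in `main` are illustrative stand-ins for whatever ends up in the `paxHeaders` map.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys (hypothetical helper) returns the keys of m in ascending order,
// so callers can iterate over the map in a reproducible sequence.
func sortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Illustrative records only; the real map is built inside archive/tar.
	paxHeaders := map[string]string{
		"SCHILY.xattr.user.b": "2",
		"SCHILY.xattr.user.a": "1",
		"path":                "some/long/name",
	}
	// Range over the sorted keys instead of ranging over the map directly.
	for _, k := range sortedKeys(paxHeaders) {
		fmt.Printf("%s=%s\n", k, paxHeaders[k])
	}
}
```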

Since this is my first contribution I'm not entirely sure if this is the
correct place to start. I started with the contribution guidelines
(https://golang.org/doc/contribute.html#Design) which say to start by
discussing my design on the mailing list. The article linked to the go-nuts
mailing list but I get the feeling that the golang-dev mailing list is more
appropriate. Let me know if I'm mistaken and I'll go ahead and cross post
this on that list instead.

Thanks!

- Josh

--
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


  • Ian Lance Taylor at Dec 24, 2014 at 4:49 pm

    On Wed, Dec 24, 2014 at 12:10 AM, wrote:
    > I've never contributed to the Go project before and I'd like this to be my
    > first contribution. My plan is to simply use the method described in the "Go
    > maps in action" article linked above to change the way in which the method
    > `archive/tar.(*Writer).writePAXHeader` iterates over and writes paxHeaders -
    > by first sorting the keys of the `paxHeaders` map and then writing
    > them in that order.

    Sounds like a good plan.

    Ian

  • Josh Hawn at Dec 24, 2014 at 6:05 pm

    On Wednesday, December 24, 2014 8:49:34 AM UTC-8, Ian Lance Taylor wrote:
    > Sounds like a good plan.
    >
    > Ian

    It looks like there is yet another issue that prevents me from
    creating a deterministic archive. The `writePAXHeader` method also includes
    the current pid as part of the name field of the pax header. There is a
    comment stating:
    The spec asks that we namespace our pseudo files with the current pid.
    I'm not sure where the spec is, but I've found a description of this field
    in the GNU Tar Manual
    (http://www.gnu.org/software/tar/manual/html_section/tar_71.html#SEC146).
    Scroll down from that point just a bit to find the description for the
    option which specifies the format for writing pax header names:
    exthdr.name=string
    It then goes on to define format specifiers for the file dirname (%d),
    basename (%f), and tar process pid (%p). GNU Tar allows the user to set the
    format but gives `%d/PaxHeaders.%p/%f` as a default. It seems pretty
    arbitrary, but `archive/tar.(*Writer).writePAXHeader` includes the PID.
    Would you guys be agreeable to omitting it?

    - Josh
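
To make the effect of `%p` concrete, here is a small sketch that approximates the `%d/PaxHeaders.%p/%f` pattern. `paxHeaderName` and its `marker` parameter are hypothetical and are not the library's code. One known difference: for a name without a directory part this version produces no leading `./`.

```go
package main

import (
	"fmt"
	"path"
	"strconv"
)

// paxHeaderName (hypothetical helper) builds a pseudo-file name in the shape
// of %d/PaxHeaders.%p/%f, with marker standing in for %p.
func paxHeaderName(name string, marker int) string {
	dir, file := path.Split(name)
	return path.Join(dir, "PaxHeaders."+strconv.Itoa(marker), file)
}

func main() {
	// With a real pid the name changes from run to run.
	fmt.Println(paxHeaderName("dir/file.txt", 4242)) // dir/PaxHeaders.4242/file.txt
	// With a fixed value the name is the same on every run.
	fmt.Println(paxHeaderName("dir/file.txt", 0)) // dir/PaxHeaders.0/file.txt
}
```

Passing the result of `os.Getpid()` as the marker gives the run-dependent behavior described above; any constant gives identical bytes across runs.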

  • Alan Donovan at Aug 27, 2015 at 1:44 pm

    I just ran into the same problem. Including the pid seems like a bad
    default and an even worse hard-wired behavior, POSIX be damned.
    I've filed https://github.com/golang/go/issues/12358.

  • Vincent Batts at Aug 27, 2015 at 2:55 pm
    This seems less like a POSIX requirement and more like an available
    formatting option. It seems fine to not actually use `os.Getpid()` but a
    generic numerical marker instead. Omitting the value entirely would not
    conform to the pattern other archive extractors expect.
    http://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html
    (search for "PaxHeaders")
  • Shane Hansen at Sep 14, 2015 at 7:30 pm
    Maybe we can use a deterministic method for filling in the "pid" field, such
    as a hash of the file name.
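
A rough sketch of that idea, assuming FNV-1a as the hash (the thread does not name one) and a hypothetical helper `nameMarker`:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// nameMarker (hypothetical helper) derives a stable number from a file name
// using 32-bit FNV-1a. The choice of hash is an assumption for this sketch.
func nameMarker(name string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(name)) // Write on a hash.Hash never returns an error
	return h.Sum32()
}

func main() {
	name := "dir/file.txt"
	// The same name always yields the same marker, unlike a process id.
	fmt.Printf("PaxHeaders.%d\n", nameMarker(name))
	fmt.Println("stable:", nameMarker(name) == nameMarker(name))
}
```

The value depends only on the entry name, so two runs over the same input would produce the same pseudo-file names.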

Discussion Overview
group: golang-dev
categories: go
posted: Dec 24, '14 at 4:34p
active: Sep 14, '15 at 7:30p
posts: 6
users: 5
website: golang.org
