Hi all..

I've collaborated with a number of people who have been in the packaging
world for quite a while and came up with something that would probably
be called XARv2.

More details can be found here [1], but I wanted to share some quick
numbers.

1.3M /usr/portage/packages/app-editors/vim-7.2.021.oxe
1.5M /usr/portage/packages/app-editors/vim-7.2.021.tbz2


time xar -xf /usr/portage/packages/app-editors/vim-7.2.021.oxe
0.15u 0.01s 0:00.18 88.8%

time gtar -xf /usr/portage/packages/app-editors/vim-7.2.021.tbz2
0.24u 0.02s 0:00.26 100.0%

# Someone correct me if I'm doing this wrong
time /usr/bin/star -x -bz < /usr/portage/packages/app-editors/vim-7.2.021.tbz2
0.24u 0.03s 0:00.35 77.1%

(When I get a chance I'll do it on a larger package..
java/sunstudioexpress..)

I ran it more than once and overall it seemed pretty consistent. So
what exactly is under the hood?

.tbz2 == bzip2-compressed data (block size = 900k) with xpak tagged on
after EOF so that metadata can be easily extracted.
.oxe == xar archive (version 1), xz (lzma2) compressed, using xar's
built-in TOC to store metadata (gzip).
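
To make the "xpak tagged on after EOF" idea concrete, here is a minimal sketch of appending and reading such a trailer. The layout used (XPAKPACK, index length, data length, index, data, XPAKSTOP, then a 4-byte length and "STOP") is an assumption based on the format described in Gentoo's xpak(5) man page, and the helper names `build_xpak`, `append_xpak` and `read_xpak` are made up for illustration. It is not Portage's own code.

```python
import struct


def build_xpak(entries):
    """Build an xpak blob from a {name: bytes} mapping (layout assumed from xpak(5))."""
    index = b""
    data = b""
    for name, value in entries.items():
        encoded = name.encode("utf-8")
        # Index entry: name length, name, offset into data, length of data.
        index += struct.pack(">I", len(encoded)) + encoded
        index += struct.pack(">II", len(data), len(value))
        data += value
    header = b"XPAKPACK" + struct.pack(">II", len(index), len(data))
    return header + index + data + b"XPAKSTOP"


def append_xpak(payload, entries):
    """Tag an xpak blob plus its length and a STOP marker onto a compressed payload."""
    xpak = build_xpak(entries)
    return payload + xpak + struct.pack(">I", len(xpak)) + b"STOP"


def read_xpak(blob):
    """Read the metadata back using only the trailer, without touching the payload."""
    if blob[-4:] != b"STOP":
        raise ValueError("no xpak trailer")
    (xpak_len,) = struct.unpack(">I", blob[-8:-4])
    xpak = blob[-8 - xpak_len:-8]
    if xpak[:8] != b"XPAKPACK" or xpak[-8:] != b"XPAKSTOP":
        raise ValueError("bad xpak markers")
    index_len, data_len = struct.unpack(">II", xpak[8:16])
    index = xpak[16:16 + index_len]
    data = xpak[16 + index_len:16 + index_len + data_len]
    result = {}
    pos = 0
    while pos < len(index):
        (name_len,) = struct.unpack(">I", index[pos:pos + 4])
        pos += 4
        name = index[pos:pos + name_len].decode("utf-8")
        pos += name_len
        offset, length = struct.unpack(">II", index[pos:pos + 8])
        pos += 8
        result[name] = data[offset:offset + length]
    return result
```

The point of the design is that the metadata can be located from the last 8 bytes of the file, so no decompression tool is needed to read it.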

Who else is using something similar?

tbz2 is used by Gentoo.. (anyone else?)
xar was/is used by MacPorts (and Apple?) and also by rpm5.

I'll be ironing out metadata details a bit later today, but I'm open to
suggestions. I'm initially including a qalevel in the metadata so that
administrators can set a policy to filter out packages not
meeting a certain level of assurance/testing. I have other ideas of how
to build in tools so that customizing different aspects of the OS is
policy based instead of requiring a reinstall. The bottom line for
administrators will be subscribing to "channels", which will offer the
choice between things like kde vs gnome vs E vs something else, or mysql
vs postgresql. So if you have a custom postfix setup that's backed by
postgresql, you'll have a full stack compiled just for that instead of
using the everything approach. Of course this won't fit everyone, but
making and maintaining custom channels should be fairly easy.
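
As a rough illustration of the qalevel and channel policy idea, a filter could look something like the sketch below. Everything in it is hypothetical: the `Package` fields, the integer scale for qalevel, the channel names and the `allowed` helper are assumptions for the example, not the actual metadata schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    # Hypothetical metadata fields; the real schema is still being worked out.
    name: str
    channel: str
    qalevel: int  # assumed: higher means more assurance/testing


def allowed(packages, min_qalevel, channels):
    """Keep packages from subscribed channels that meet the minimum qalevel."""
    return [
        pkg for pkg in packages
        if pkg.channel in channels and pkg.qalevel >= min_qalevel
    ]


if __name__ == "__main__":
    available = [
        Package("postfix", "postgresql-stack", 3),
        Package("postfix", "mysql-stack", 3),
        Package("vim", "postgresql-stack", 1),
    ]
    # An administrator subscribed to one channel, requiring qalevel >= 2.
    for pkg in allowed(available, 2, {"postgresql-stack"}):
        print(pkg.name, pkg.channel, pkg.qalevel)
```

Keeping the policy as data (a minimum level and a set of channels) is what would let an administrator change it without reinstalling anything.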

If you're good with python or C code I could use help to speed this up.
Once the items on the packaging TODO [1] are done I'll push for more
public testing and eventually ask for more community mirrors.

Thanks

./Christopher

#ospkg @ irc.freenode.net
http://www.osunix.org

[1] http://www.osunix.org/docs/DOC-1012

  • Fabian Groffen at Jan 20, 2009 at 11:41 am

    On 20-01-2009 11:55:55 +0100, "C. Bergström" wrote:
    More details can be found here [1], but wanted to share some quick
    numbers..

    1.3M /usr/portage/packages/app-editors/vim-7.2.021.oxe
    1.5M /usr/portage/packages/app-editors/vim-7.2.021.tbz2


    time xar -xf /usr/portage/packages/app-editors/vim-7.2.021.oxe
    0.15u 0.01s 0:00.18 88.8%

    time gtar -xf /usr/portage/packages/app-editors/vim-7.2.021.tbz2
    0.24u 0.02s 0:00.26 100.0%
    I ran it more than once and overall it seemed pretty consistent.. So
    what exactly is under the hood
    .tbz2 == bzip2 compressed data, block size = 900k + xpak tagged on
    after EOF so that metadata can be easily extracted..
    .oxe == xar archive - version 1 + xz (lzma2) compressed and using xar's
    built-in toc to store metadata (gzip)
    I think what you're really measuring here is bzip2 vs lzma compression.
    Gentoo's xpak can be done with lzma or gzip as well IMO. It's not
    really an argument; rather, I think you should focus your decision
    on the metadata part: things like how easily it can be extracted
    (with/without compression tools?), transparency, etc.
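
One way to check the "bzip2 vs lzma" point in isolation is to run both compressors over the same bytes with no archive container involved. The sketch below uses Python's standard `bz2` and `lzma` modules; the `compare_codecs` helper and the sample payload are assumptions for illustration, and the timings it prints say nothing about xar, gtar or star themselves.

```python
import bz2
import lzma
import time


def compare_codecs(payload, repeats=5):
    """Return compressed size and best decompression time per codec."""
    codecs = {
        "bzip2": (lambda d: bz2.compress(d, 9), bz2.decompress),
        "xz": (lambda d: lzma.compress(d, format=lzma.FORMAT_XZ), lzma.decompress),
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        packed = compress(payload)
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            unpacked = decompress(packed)
            best = min(best, time.perf_counter() - start)
        # Make sure we timed a correct round trip, not a failure.
        if unpacked != payload:
            raise RuntimeError(name + " round trip failed")
        results[name] = {"bytes": len(packed), "seconds": best}
    return results


if __name__ == "__main__":
    sample = b"int main(void) { return 0; }\n" * 50000
    for name, stats in compare_codecs(sample).items():
        print(name, stats["bytes"], "bytes,", round(stats["seconds"], 4), "s")
```

If the size and speed gap shows up here too, it comes from the compressor rather than from the container format.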

    My 0.02 ${currency}.

    I'll be ironing out metadata details a bit later today, but open to
    suggestions. I'm initially including a qalevel in the metadata so that
    a policy by administrators can be used to filter out packages not
    meeting a certain level of assurance/testing. I have other ideas of how
    to build in tools so that customizing different aspects of the OS are
    policy based instead of requiring a reinstall. The bottom line for
    administrators will be subscribing to "channels" which will have the
    choice between things like kde vs gnome vs E vs something else or mysql
    vs postgresql.. So if you have a custom postfix setup that's backed by
    postgresql you'll have a full stack compiled just for that and not just
    using the everything approach. Of course this won't fit everyone, but
    making and maintaining custom channels should be fairly easy.

    If you're good with python or C code I could use help to speed this up.
    Once the items on the packaging TODO [1] are done I'll push for more
    public testing and eventually ask for more community mirrors.

    --
    Fabian Groffen
    Gentoo on a different level
  • Christopher Bergström at Jan 20, 2009 at 11:46 am

    Fabian Groffen wrote:
    I think what you're really measuring here is bzip2 vs lzma compression.
    Gentoo's xpak can be done with lzma or gzip as well IMO. It's not
    really an argument, but more like I think you should focus your decision
    on the metadata part. Things like how easy it can be extracted
    (with/without compression tools?), transparency, etc.

    My 0.02 ${currency}.
    You're right.. bzip2 vs xz (lzma2) I just went with the default.. If I
    really wanted to play unfair I'd toss deb or some rpm variant in there..
    (In fairness suse which uses lzma may not do too badly?)

    However.. both these formats are directly extractable and at a
    functional level quite similar.

    In terms of accessing the metadata

    xar --dump-toc=foo.xml -f /usr/portage/packages/app-editors/vim-7.2.021.oxe

    From there it almost becomes semantics since parsing the very small
    amount of metadata is trivial. In this case I prefer a tool with
    built-in support that I can checksum instead of just tagging xpak on the
    end. Both have advantages depending how you look at it.
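
For a sense of how small the parsing step is once the TOC has been dumped to XML, here is a sketch using Python's standard `xml.etree.ElementTree`. The sample document is hand-written for the example: the nested `file`/`name`/`type` elements follow the general shape of a xar TOC, but the `subdoc` contents and the `qalevel` element are assumptions, since the package metadata layout has not been settled.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; the <package> subdocument and <qalevel> are hypothetical.
SAMPLE_TOC = """<xar>
  <subdoc subdoc_name="package_meta.xml">
    <package><name>vim</name><version>7.2.021</version><qalevel>2</qalevel></package>
  </subdoc>
  <toc>
    <file id="1">
      <name>usr</name>
      <type>directory</type>
      <file id="2">
        <name>bin</name>
        <type>directory</type>
        <file id="3"><name>vim</name><type>file</type></file>
      </file>
    </file>
  </toc>
</xar>
"""


def list_paths(node, prefix=""):
    """Walk nested <file> elements and return the paths of non-directories."""
    paths = []
    for child in node.findall("file"):
        path = prefix + "/" + child.findtext("name", default="")
        if child.findtext("type") != "directory":
            paths.append(path)
        paths.extend(list_paths(child, path))
    return paths


def read_toc(xml_text):
    """Extract the (hypothetical) qalevel and the file list from a TOC dump."""
    root = ET.fromstring(xml_text)
    qalevel = root.findtext(".//qalevel")
    return {
        "qalevel": None if qalevel is None else int(qalevel),
        "files": list_paths(root.find("toc")),
    }


if __name__ == "__main__":
    print(read_toc(SAMPLE_TOC))
```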

    ./C
  • Fabian Groffen at Jan 20, 2009 at 12:01 pm

    On 20-01-2009 12:46:26 +0100, "C. Bergström" wrote:
    However.. both these formats are directly extractable and at a
    functional level quite similar.

    In terms of accessing the metadata

    xar --dump-toc=foo.xml -f /usr/portage/packages/app-editors/vim-7.2.021.oxe

    From there it almost becomes semantics since parsing the very small
    amount of metadata is trivial. In this case I prefer a tool with
    built-in support that I can checksum instead of just tagging xpak on the
    end. Both have advantages depending how you look at it.
    I would make the same conclusion as you just did.


    --
    Fabian Groffen
    Gentoo on a different level
  • Joerg Schilling at Jan 20, 2009 at 12:07 pm

    "C. Bergström" wrote:

    More details can be found here [1], but wanted to share some quick
    numbers..

    1.3M /usr/portage/packages/app-editors/vim-7.2.021.oxe
    1.5M /usr/portage/packages/app-editors/vim-7.2.021.tbz2
    Why is the *.oxe file smaller?
    time xar -xf /usr/portage/packages/app-editors/vim-7.2.021.oxe
    0.15u 0.01s 0:00.18 88.8%
    Why is this extract faster?
    time gtar -xf /usr/portage/packages/app-editors/vim-7.2.021.tbz2
    0.24u 0.02s 0:00.26 100.0%

    # Someone correct me if I'm doing this wrong
    time /usr/bin/star -x -bz < /usr/portage/packages/app-editors/vim-7.2.021.tbz2
    0.24u 0.03s 0:00.35 77.1%
    star should be the fastest in case you have similar constraints.....

    star -xp -time < ../cdrtools-2.01.01.tar.bz2 -no-fsync
    star: WARNING: Archive is 'bzip2' compressed, trying to use the -bz option.
    star: 1021 blocks + 9728 bytes (total of 10464768 bytes = 10219.50k).
    star: Total time 0.508sec (20076 kBytes/sec)
    0.518r 0.480u 0.140s 124% 0M 0+0k 0st 0+0io 0pf+0w

    gtar xjf - < ../cdrtools-2.01.01.tar.bz2
    0.723r 0.490u 0.130s 88% 0M 0+0k 0st 0+0io 0pf+0w

    Star defaults to a mode that guarantees that everything was unpacked
    correctly to disk if the final star exit code is 0.
    Other tar implementations do not support this, so I switched the secure
    mode off in the example above.

    I ran it more than once and overall it seemed pretty consistent.. So
    what exactly is under the hood
    .tbz2 == bzip2 compressed data, block size = 900k + xpak tagged on
    after EOF so that metadata can be easily extracted..
    .oxe == xar archive - version 1 + xz (lzma2) compressed and using xar's
    built-in toc to store metadata (gzip)
    What compression is lzma2 and which program supports it?

    Jörg

    --
    EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
    js@cs.tu-berlin.de (uni)
    joerg.schilling@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
    URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
  • Christopher Bergström at Jan 20, 2009 at 12:39 pm

    Joerg Schilling wrote:
    "C. Bergström" wrote:

    More details can be found here [1], but wanted to share some quick
    numbers..

    1.3M /usr/portage/packages/app-editors/vim-7.2.021.oxe
    1.5M /usr/portage/packages/app-editors/vim-7.2.021.tbz2
    Why is the *.oxe file smaller?
    Not always smaller, but it's a result of xz vs bzip2 compression..
    time xar -xf /usr/portage/packages/app-editors/vim-7.2.021.oxe
    0.15u 0.01s 0:00.18 88.8%
    Why is this extract faster?
    xz decompresses faster than bzip2, and the file is smaller?
    time gtar -xf /usr/portage/packages/app-editors/vim-7.2.021.tbz2
    0.24u 0.02s 0:00.26 100.0%

    # Someone correct me if I'm doing this wrong
    time /usr/bin/star -x -bz < /usr/portage/packages/app-editors/vim-7.2.021.tbz2
    0.24u 0.03s 0:00.35 77.1%
    star should be the fastest in case you have similar constraints.....

    star -xp -time < ../cdrtools-2.01.01.tar.bz2 -no-fsync
    star: WARNING: Archive is 'bzip2' compressed, trying to use the -bz option.
    star: 1021 blocks + 9728 bytes (total of 10464768 bytes = 10219.50k).
    star: Total time 0.508sec (20076 kBytes/sec)
    0.518r 0.480u 0.140s 124% 0M 0+0k 0st 0+0io 0pf+0w

    gtar xjf - < ../cdrtools-2.01.01.tar.bz2
    0.723r 0.490u 0.130s 88% 0M 0+0k 0st 0+0io 0pf+0w

    Star defaults to a mode that guarantees that everything was unpacked
    correctly to disk if the final star exit code is 0.
    Other tar implementations do not support this, so I switched the secure
    mode off in the example above.

    time /usr/bin/star -xp -bz < /usr/portage/packages/app-editors/vim-7.2.021.tbz2 -no-fsync
    /usr/bin/star: WARNING: skipping leading '/' on filenames.
    /usr/bin/star: 416 blocks + 0 bytes (total of 4259840 bytes = 4160.00k).
    0.24u 0.02s 0:00.27 96.2%

    /usr/bin/star -xp -bz -time < /usr/portage/packages/app-editors/vim-7.2.021.tbz2 -no-fsync
    /usr/bin/star: WARNING: skipping leading '/' on filenames.
    /usr/bin/star: 416 blocks + 0 bytes (total of 4259840 bytes = 4160.00k).
    /usr/bin/star: Total time 0.256sec (16186 kBytes/sec)

    Jörg.. what's the short-cut way to have star archive "." and pipe it
    through xz, so it's an apples-to-apples comparison of the compression
    being used?

    Also does star have some functionality like..

    xar -cf filepath --compression=xz -n package_meta.xml -s foo.xml
    xar --dump-toc=foo.xml -f /usr/portage/packages/app-editors/vim-7.2.021.oxe
    I ran it more than once and overall it seemed pretty consistent.. So
    what exactly is under the hood
    .tbz2 == bzip2 compressed data, block size = 900k + xpak tagged on
    after EOF so that metadata can be easily extracted..
    .oxe == xar archive - version 1 + xz (lzma2) compressed and using xar's
    built-in toc to store metadata (gzip)
    What compression is lzma2 and which program supports it?
    http://tukaani.org/lzma/header-format-23.txt

    xz (lzma2) is beta work from Lasse Collin & Igor Pavlov.


    ./C
  • Joerg Schilling at Jan 20, 2009 at 1:16 pm

    "C. Bergström" wrote:

    time /usr/bin/star -xp -bz < /usr/portage/packages/app-editors/vim-7.2.021.tbz2 -no-fsync
    /usr/bin/star: WARNING: skipping leading '/' on filenames.
    /usr/bin/star: 416 blocks + 0 bytes (total of 4259840 bytes = 4160.00k).
    0.24u 0.02s 0:00.27 96.2%

    /usr/bin/star -xp -bz -time < /usr/portage/packages/app-editors/vim-7.2.021.tbz2 -no-fsync
    /usr/bin/star: WARNING: skipping leading '/' on filenames.
    /usr/bin/star: 416 blocks + 0 bytes (total of 4259840 bytes = 4160.00k).
    /usr/bin/star: Total time 0.256sec (16186 kBytes/sec)
    Why do you have absolute path names?
    Jörg.. what's the short-cut way to have star compress "." and pipe it
    through xz so it's an apple vs apple comparison on the compression being
    used.

    Also does star have some functionality like..

    xar -cf filepath --compression=xz -n package_meta.xml -s foo.xml
    xar --dump-toc=foo.xml -f /usr/portage/packages/app-editors/vim-7.2.021.oxe
    See the star man page......

    star has -compress-program=abc
    What compression is lzma2 and which program supports it?
    http://tukaani.org/lzma/header-format-23.txt

    xz (lzma2) is beta work from Lasse Collin & Igor Pavlov.
    I cannot find "xz", but I find a program "lzma" which seems to use 7z
    compression (I already mentioned this one). A problem is: I cannot see
    any useful magic number at the beginning of the compressed file.
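
On the magic-number question, a small sniffing sketch may help show the difference between the formats. It relies on the leading bytes of gzip (1F 8B), bzip2 ("BZh") and the .xz container (FD 37 7A 58 5A 00); the legacy .lzma ("alone") format starts with a properties byte and dictionary size instead of a fixed signature, which is why it falls through to "unknown" here. The `sniff` helper and the `MAGICS` table are names made up for this example.

```python
import bz2
import gzip
import lzma

# Leading signature bytes for each container format.
MAGICS = (
    (b"\xfd7zXZ\x00", "xz"),
    (b"BZh", "bzip2"),
    (b"\x1f\x8b", "gzip"),
)


def sniff(data):
    """Guess the compression format from the first bytes of a stream."""
    for magic, name in MAGICS:
        if data.startswith(magic):
            return name
    return "unknown"


if __name__ == "__main__":
    sample = b"some package payload"
    print(sniff(bz2.compress(sample)))
    print(sniff(gzip.compress(sample)))
    print(sniff(lzma.compress(sample, format=lzma.FORMAT_XZ)))
    # Legacy .lzma stream: no fixed signature to match on.
    print(sniff(lzma.compress(sample, format=lzma.FORMAT_ALONE)))
```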

    Jörg

    --
    EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
    js@cs.tu-berlin.de (uni)
    joerg.schilling@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
    URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily

Discussion Overview
group: osunix-dev @
categories: opensolaris
posted: Jan 20, '09 at 10:55a
active: Jan 20, '09 at 1:16p
posts: 7
users: 3
website: opensolaris.org
