Hi!

I have worked on optimizing the standard library deflate function, and I am
happy to announce revised gzip/zip packages that, on x64, are about *30-50%
faster with slightly improved compression*. They contain no cgo.

Project: https://github.com/klauspost/compress


All packages are drop-in replacements for the standard libraries, so you
can use them by simply changing imports.

The biggest gains are on machines with SSE4.2 instructions, available since
Intel Nehalem (2009) and AMD Bulldozer (2012). The main optimizations are:

* Minimum matches are 4 bytes, which leads to fewer searches and better
compression.
* Stronger hash (iSCSI CRC32) for matches on x64 with SSE 4.2 support. This
leads to fewer hash collisions.
* Literal byte matching using SSE 4.2 for faster long-match comparisons.
* Bulk hashing on matches.
* Much faster dictionary indexing with NewWriterDict()/Reset().
* A faster bit coder that assumes a 64-bit CPU.
* CRC32 optimized for 10x speedup on SSE 4.2. Available
separately: https://github.com/klauspost/crc32

For benchmarks see the project page.

In short, there will be better compression at levels 1 to 4 and about 1.5
times the throughput at higher compression levels.


Furthermore, "pgzip" (multi-CPU gzip for longer streams) has also been
updated to the new deflate/crc32, so if you update the repo you will also
get a "free" speed boost there. See https://github.com/klauspost/pgzip


Comments, questions and other feedback are very welcome!

/Klaus

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


  • Sebastien Binet at Jul 28, 2015 at 11:32 am

    On Tue, Jul 28, 2015 at 1:06 PM, Klaus Post wrote:
    [...]
    here is your feedback: great work!

    do you have an idea of the minimum payload size at which point it's
    beneficial to use pgzip wrt gzip? (I surmise there is some overhead
    going from the latter to the former)

    -s

  • Klaus Post at Jul 28, 2015 at 11:39 am

    On Tuesday, 28 July 2015 13:33:01 UTC+2, Sebastien Binet wrote:
    here is your feedback: great work!
    you are welcome :)

    do you have an idea of the minimum payload size at which point it's
    beneficial to use pgzip wrt gzip? (I surmise there is some overhead
    going from the latter to the former)
    If uncompressed input is below 1MB, just use regular gzip. The default block
    size is 250KB, so a 1MB payload will be split into 4 parallel encodes. Below
    that, any gains will mostly be eaten by communication overhead, and 1MB is
    pretty fast on 1 core anyway.



    /Klaus

  • Klaus Post at Jul 28, 2015 at 2:54 pm
    Hi again!

    Carlos Cobo just notified me that some fixes for inflate had been sent in
    and merged to tip. They mostly relate to
    https://github.com/golang/go/issues/11030

    I have merged these changes, so you get them as a free bonus if you are
    using Go 1.3 or 1.4.

    /Klaus

  • Joetsai at Jul 29, 2015 at 3:41 am
    This is awesome and was something I was thinking about doing myself
    eventually. I may still play around with optimizing flate at the higher
    level with tweaks to the algorithm itself.

    Also, thanks for merging in #11030.

    JT
    On Tuesday, July 28, 2015 at 7:53:56 AM UTC-7, Klaus Post wrote:

    [...]
  • Arne Hormann at Jul 29, 2015 at 10:35 am
    Hi, thank you so much for this, your library is amazing.
    I checked it with https://gist.github.com/arnehormann/65421048f56ac108f6b5
    and love it so far!

    On Tuesday, 28 July 2015 at 13:06:43 UTC+2, Klaus Post wrote:
    [...]
  • Klaus Post at Jul 29, 2015 at 11:32 am

    On Wednesday, 29 July 2015 12:35:38 UTC+2, Arne Hormann wrote:
    Hi, thank you so much for this, your library is amazing.
    I checked it with https://gist.github.com/arnehormann/65421048f56ac108f6b5
    and love it so far!
    Very cool! If you want to improve it further, you could look into real-world
    samples like JSON/HTML/XML/CSS/JS, which are more likely real-world
    candidates.

    Artificial sources tend to skew benchmarks; real sources show the tradeoffs
    best and give the most usable results. Just so you know what it makes sense
    to compare.


    /Klaus

  • Arne Hormann at Jul 29, 2015 at 11:42 am

    On 29.07 13:32, Klaus Post wrote:
    [...]
    That's entirely possible with the program in that gist.
    Just use -r=raw for the input and pipe a file into it.
    I tried it with a tarred directory and compared the result with diff -q
    to check that the unpacked output matches the input.

  • Klaus Post at Jul 29, 2015 at 1:06 pm

    On Wednesday, 29 July 2015 13:42:18 UTC+2, Arne Hormann wrote:
    That's entirely possible with the program in that gist.
    Just use -r=raw for the input and pipe a file into it.
    I tried it with a tared directory and compared the result with diff -q to
    check the unpacked output matches the input.
    Even cooler! I took the liberty of adding an in/out parameter (since pipes
    perform quite badly on Windows), as well as adding pgzip, CSV output of
    stats, and CPU allocation (for pgzip). That makes it perfect for my testing.

    https://gist.github.com/klauspost/00f7c9a19e56581f5ead

    /Klaus

  • Arne Hormann at Jul 29, 2015 at 1:25 pm
    Glad I could help - especially if this allows you to tune it better!

    On Wednesday, 29 July 2015 at 15:06:20 UTC+2, Klaus Post wrote:
    [...]
  • Demetriobenour at Jul 30, 2015 at 5:41 am
    How does performance compare to calling zlib/libzip via cgo?

  • Klaus Post at Jul 30, 2015 at 11:43 am

    On Thursday, 30 July 2015 07:41:59 UTC+2, demetri...@gmail.com wrote:
    How does performance compare to calling zlib/libzip via cgo?
    Never ask that again. Getting it to compile under windows was a nightmare ;)

    I have benchmarked cgzip along with Go standard library, my revised gzip,
    pgzip as well as 7zip and gzip executable:

    https://docs.google.com/spreadsheets/d/1nuNE2nPfuINCZJRMt6wFWhKpToF95I47XjSsc-1rbPQ/edit?usp=sharing

    At level 1, cgzip is about 20% slower than the new gzip and compresses worse.
    At level 9, cgzip is about 20% faster, but compresses worse.

    The test file is a 7GB highly compressible JSON file. I might add Matt
    Mahoney's 10GB corpus - http://mattmahoney.net/dc/10gb.html - as a formal
    test, although JSON and similar formats are probably closer to real-world
    scenarios.

    /Klaus

  • Thebrokentoaster at Jul 30, 2015 at 10:28 pm
    Just wondering, does gzkp allocate more memory resources than gzstd?
    On Thursday, July 30, 2015 at 4:43:03 AM UTC-7, Klaus Post wrote:
    [...]
  • Klaus Post at Jul 30, 2015 at 10:43 pm

    On Friday, 31 July 2015 00:28:55 UTC+2, thebroke...@gmail.com wrote:
    Just wondering, does gzkp allocate more memory resources than gzstd?
    It has an additional 1-2KB array (depending on the size of an int) used
    for bulk hashing. It is allocated when you create a new Writer, but
    otherwise the memory use should be the same.

    /Klaus


    PS. I have added benchmarks for highly compressible input, and levels 1-7 on
    moderately compressible input, on separate sheets. It looks as if something
    could be gained by being able to quickly skip input that is hard to compress
    at the lower compression levels.

  • Klaus Post at Aug 3, 2015 at 2:42 pm

    On Friday, 31 July 2015 00:28:55 UTC+2, thebroke...@gmail.com wrote:
    Just wondering, does gzkp allocate more memory resources than gzstd?
    Just finished eliminating allocations in deflate. The aggregate results
    when using gzip, which appears to do a few allocations by itself:

    * "BenchmarkOld" below are the allocations for the standard library.

    BenchmarkGzipL1       20    77454430 ns/op   64.06 MB/s    40386 B/op      0 allocs/op
    BenchmarkGzipL2       20    84054810 ns/op   59.03 MB/s    40386 B/op      0 allocs/op
    BenchmarkGzipL3       20    86904970 ns/op   57.09 MB/s    40386 B/op      0 allocs/op
    BenchmarkGzipL4       10   118906800 ns/op   41.73 MB/s    80772 B/op      1 allocs/op
    BenchmarkGzipL5       10   148208480 ns/op   33.48 MB/s    80772 B/op      1 allocs/op
    BenchmarkGzipL6       10   148608500 ns/op   33.39 MB/s    80772 B/op      1 allocs/op
    BenchmarkGzipL7        5   200011440 ns/op   24.81 MB/s   161545 B/op      3 allocs/op
    BenchmarkGzipL8        3   396022666 ns/op   12.53 MB/s   269242 B/op      6 allocs/op
    BenchmarkGzipL9        3   403356400 ns/op   12.30 MB/s   269242 B/op      6 allocs/op
    BenchmarkOldGzipL1    10   104305970 ns/op   47.57 MB/s   396857 B/op   4333 allocs/op
    BenchmarkOldGzipL2    10   123907090 ns/op   40.04 MB/s   386457 B/op   4249 allocs/op
    BenchmarkOldGzipL3    10   137007830 ns/op   36.22 MB/s   379177 B/op   4180 allocs/op
    BenchmarkOldGzipL4     5   200011440 ns/op   24.81 MB/s   503955 B/op   3615 allocs/op
    BenchmarkOldGzipL5     5   216612400 ns/op   22.91 MB/s   499715 B/op   3558 allocs/op
    BenchmarkOldGzipL6     5   254814560 ns/op   19.47 MB/s   499475 B/op   3651 allocs/op
    BenchmarkOldGzipL7     5   299817140 ns/op   16.55 MB/s   500675 B/op   3696 allocs/op
    BenchmarkOldGzipL8     2   529030250 ns/op    9.38 MB/s   936048 B/op   3678 allocs/op
    BenchmarkOldGzipL9     2   577033000 ns/op    8.60 MB/s   936128 B/op   3681 allocs/op

  • Luna Duclos at Aug 3, 2015 at 2:43 pm
    Is there any convincing reason not to integrate most of these improvements
    into the standard library once they're entirely finished?
    On Mon, Aug 3, 2015 at 4:41 PM, Klaus Post wrote:
    [...]
  • Klaus Post at Aug 3, 2015 at 2:49 pm

    On Monday, 3 August 2015 16:43:42 UTC+2, Luna Duclos wrote:
    is there any convincing reason to not integrate most of these improvements
    in the std lib once they're entirely finished ?
    No. Initially I thought it would only be x64-specific, but a lot of it
    should make things generally better. I am still regression testing to see if
    there are any cases where it is significantly worse.

    Either way, it will not be in before Go 1.6.


    /Klaus

  • Benoît Amiaux at Aug 3, 2015 at 5:55 pm
    Nice speed!
    Have you seen this article?
    http://fastcompression.blogspot.fr/2015/07/huffman-revisited-part-2-decoder.html


    On Mon, Aug 3, 2015 at 4:49 PM, Klaus Post wrote:
    [...]
  • Klaus Post at Aug 4, 2015 at 10:57 am

    On Monday, 3 August 2015 19:55:34 UTC+2, Benoît Amiaux wrote:
    Nice speed !
    Have you seen this article ?
    http://fastcompression.blogspot.fr/2015/07/huffman-revisited-part-2-decoder.html
    No, I hadn't seen it, but funnily enough I was just experimenting with
    Huffman-only compression as an alternative that avoids the big variance in
    compression speed.

    I have a prototype working, although the speed isn't particularly impressive
    yet - though it is usually faster than level 1. The compression hit is
    rather big, though.

    https://github.com/klauspost/compress/pull/6

    My few initial tests (one CPU used):

    enwik9:
    * Level 1: 40.17MB/s, 365,776,800 bytes
    * Huffman only: 70.82 MB/s, 641,017,571 bytes

    10GB Matt Mahoney corpus:
    * Level 1: 34.59 MB/s, 5,105,308,274 bytes
    * Huffman only: 82.15 MB/s, 6,485,492,430 bytes

    Pure Huffman should give a more predictable speed, meaning that different
    input doesn't make the compression speed "tank" as it can to some degree
    with ordinary deflate.

    My aim is to get the Huffman-only above 150MB/s to justify the size
    tradeoff and to make it a "well, it cannot hurt to add" option.

    Regarding "FSE" described in the article, we cannot use that, since we must
    remain compatible with the deflate format.

    /Klaus

  • Klaus Post at Aug 5, 2015 at 7:11 pm
    Hi!

    I wrote up a summary of my findings when using Gzip for small payloads -
    specifically for web server style loads:

    http://blog.klauspost.com/gzip-performance-for-go-webservers/


    /Klaus

  • Paul Graydon at Aug 6, 2015 at 5:25 am
    If you're building webservers/websites from scratch, you might want to consider precompressing all your static content.

    Nginx will happily host the content directly http://nginx.org/en/docs/http/ngx_http_gzip_static_module.html and even supports the alternative approach of serving only gzip content and uncompressing it for those clients that don't support compression.

    With such easy wins available, it somewhat surprises me that more static site generators don't provide it as a native setting.
    If you really want to get fancy, you can use zopfli, Google's painfully slow, high-compression library, which produces gzipped content at better compression ratios than gzip. You'd never use it to compress on the fly, but it's ideal for precompressing.

  • Frits van Bommel at Aug 6, 2015 at 7:28 am
    Klaus,

    About your example code:

       // Get a Writer from the Pool
       gz := zippers.Get().(*gzip.Writer)

       // We use Reset to set the writer we want to use.
       gz.Reset(w)
       defer gz.Close()

       // When done, put the Writer back in to the Pool
       defer zippers.Put(gz)
    Please note that defers run in reverse order <https://golang.org/ref/spec#Defer_statements>, so you want to defer the Put() before the Close() to avoid calling the latter on a writer that has already been Put() back into the pool and is potentially already being used by another goroutine.
    I'd suggest putting the Put() right after the Get(), unless there's some way for Reset() to fail that would make the writer unsuitable for reuse (in that case it should go between Reset() and the deferred Close()).


    On Wednesday, August 5, 2015 at 9:11:04 PM UTC+2, Klaus Post wrote:

    [...]
  • Klaus Post at Aug 6, 2015 at 11:21 am

    On Thursday, 6 August 2015 09:28:25 UTC+2, Frits van Bommel wrote:
    Klaus,
    [...]


    Thanks for that!

    I updated the sample code!

    /Klaus

  • Darren Hoo at Jul 31, 2015 at 2:26 am
    How about compressing small chunks of data, like 4KB for each message?

    I have used your library, but it could not keep up with the speed messages
    flow in, so I switched back to cgzip.

    On Thursday, July 30, 2015 at 7:43:03 PM UTC+8, Klaus Post wrote:
    [...]
  • Klaus Post at Aug 1, 2015 at 8:44 am

    On Friday, 31 July 2015 04:25:59 UTC+2, Darren Hoo wrote:
    how about compressing small chunks of data, like 4KB for each message?

    I have used your library, but it could not keep up with the speed messages
    flow in, so I switched back to cgzip
    I haven't tested small workloads yet. I will put that next on my to-do
    list. What is your average payload size and payloads per second?

    Meanwhile I have summarized my findings on high-throughput
    workloads: http://blog.klauspost.com/go-gzipdeflate-benchmarks/

    /Klaus

  • Klaus Post at Aug 2, 2015 at 7:26 pm

    On Friday, 31 July 2015 04:25:59 UTC+2, Darren Hoo wrote:
    how about compressing small chunks of data, like 4KB for each message?
    I looked through it, and even got 5-10% additional performance in standard
    deflate.

    The most important thing is to use the Reset() function instead of creating
    a NewWriter for every message. This close to doubles the number of files
    compressed per second, with both the standard library and this package. Use
    a sync.Pool if you need to share writers across goroutines.

    I created a test set with 548 files containing JSON, HTML, JavaScript, SVG
    and CSS. The average size is a little more than 8KB/file. All are compressed
    separately to simulate an HTTP server. ~500 of the files are small JSON
    files.

    With the modified gzip library I get 22 MB/s (2656 files/s), using a
    Reset() between each.
    With cgzip I get about 28 MB/s (3391 files/s), so it has about a 25%
    performance advantage.

    For comparison, the standard library achieves 13 MB/s (1574 files/s) with
    Reset(). Without Reset(), the speed of the modified library is 16 MB/s
    (1926 files/s).


    I will do more tests and see if I can squeeze out more speed.

    /Klaus


Discussion Overview
group: golang-nuts
categories: go
posted: Jul 28, '15 at 11:06a
active: Aug 6, '15 at 11:21a
posts: 26
users: 11
website: golang.org
