Hey all,

github: https://github.com/dsnet/compress
godoc: https://godoc.org/github.com/dsnet/compress/bzip2

I finished writing a bzip2 encoder. I haven't heavily tested it, and
there's still a lot to do in the area of optimization. However, it
performs fairly well compared to the C implementation and is actually
faster at encoding than the standard Go library is at decoding. The
speed is about half that of the C version, but I think I can close the gap
to about 1.5x or even 1x. The compression ratio is about 10% worse since
I haven't implemented a K-means-based Huffman encoder yet. I plan to
eventually add a decoder that will outperform the standard library version.

BENCHMARK: bz2:encRate
benchmark ds MB/s delta cgo MB/s delta
digits.txt:1:1e4 5.44 1.00x 7.55 1.39x
digits.txt:1:1e5 5.57 1.00x 12.51 2.25x
digits.txt:1:1e6 5.76 1.00x 12.88 2.24x
digits.txt:6:1e4 4.86 1.00x 7.47 1.54x
digits.txt:6:1e5 5.54 1.00x 12.46 2.25x
digits.txt:6:1e6 5.39 1.00x 12.46 2.31x
digits.txt:9:1e4 4.66 1.00x 7.54 1.62x
digits.txt:9:1e5 5.48 1.00x 12.60 2.30x
digits.txt:9:1e6 5.02 1.00x 12.11 2.41x
twain.txt:1:1e4 5.26 1.00x 7.89 1.50x
twain.txt:1:1e5 6.02 1.00x 12.89 2.14x
twain.txt:1:1e6 6.19 1.00x 12.83 2.07x
twain.txt:6:1e4 4.75 1.00x 8.07 1.70x
twain.txt:6:1e5 6.02 1.00x 12.93 2.15x
twain.txt:6:1e6 6.09 1.00x 12.51 2.06x
twain.txt:9:1e4 4.54 1.00x 7.93 1.75x
twain.txt:9:1e5 6.00 1.00x 12.50 2.08x
twain.txt:9:1e6 5.73 1.00x 12.13 2.12x
BENCHMARK: bz2:ratio
benchmark ds ratio delta cgo ratio delta
digits.txt:1:1e4 2.20x 1.00x 2.28x 1.04x
digits.txt:1:1e5 2.23x 1.00x 2.32x 1.04x
digits.txt:1:1e6 2.22x 1.00x 2.31x 1.04x
digits.txt:6:1e4 2.20x 1.00x 2.28x 1.04x
digits.txt:6:1e5 2.23x 1.00x 2.32x 1.04x
digits.txt:6:1e6 2.12x 1.00x 2.23x 1.05x
digits.txt:9:1e4 2.20x 1.00x 2.28x 1.04x
digits.txt:9:1e5 2.23x 1.00x 2.32x 1.04x
digits.txt:9:1e6 2.05x 1.00x 2.16x 1.05x
twain.txt:1:1e4 2.07x 1.00x 2.25x 1.09x
twain.txt:1:1e5 2.68x 1.00x 2.90x 1.08x
twain.txt:1:1e6 2.68x 1.00x 2.89x 1.08x
twain.txt:6:1e4 2.07x 1.00x 2.25x 1.09x
twain.txt:6:1e5 2.68x 1.00x 2.90x 1.08x
twain.txt:6:1e6 2.87x 1.00x 3.12x 1.09x
twain.txt:9:1e4 2.07x 1.00x 2.25x 1.09x
twain.txt:9:1e5 2.68x 1.00x 2.90x 1.08x
twain.txt:9:1e6 2.87x 1.00x 3.12x 1.09x

This package is still in development, but I would appreciate if people can
try using it and tell me about any bugs. I'll make a bigger announcement
about it when I feel that the package is stable enough. I eventually plan
on merging it into the standard library, but that definitely won't happen
until go1.7 or later.

JT

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


  • Thebrokentoaster at Dec 9, 2015 at 1:48 am
    Just an update: I have completed a working decoder. It decompresses at
    ~20MB/s, which is about 25% faster than the C library (~16MB/s), and 200%
    faster than the Go standard library (~5MB/s).

    https://godoc.org/github.com/dsnet/compress/bzip2

    Enjoy.

    JT

    On Monday, December 7, 2015 at 11:55:41 PM UTC-8, thebroke...@gmail.com
    wrote:
  • Thebrokentoaster at Dec 9, 2015 at 1:51 am
    Just kidding, it's 400% faster than the standard library ;)

    JT

    On Tuesday, December 8, 2015 at 5:48:53 PM UTC-8, thebroke...@gmail.com
    wrote:
  • Bitstorm at Dec 10, 2015 at 4:21 pm
    I wanted to try this code and compare it, speed-wise, with my own
    implementation of similar compression algorithms
    (https://github.com/flanglet/kanzi), but I cannot compile it:

    bash build.bash
    /tmp/tmp.OEZy2bZNpS/src/github.com/dsnet/compress/flate/bit_reader.go:68:
    br.bufRd.Discard undefined (type *bufio.Reader has no field or method
    Discard)

    [...]/go/src/github.com/dsnet/compress/api.go:52: cannot use
    (*bufio.Reader)(nil) (type *bufio.Reader) as type BufferedReader in
    assignment:
    *bufio.Reader does not implement BufferedReader (missing Discard method)

  • Joe Tsai at Dec 10, 2015 at 8:00 am
    Hmmm... it seems I used a new method on bufio.Reader that was introduced in
    go1.5. Are you able to use go1.5 instead?

    Discard was important for getting high performance on the decoder side.

    JT
    On Wed, Dec 9, 2015 at 11:58 PM, bitstorm wrote:



    --
    Joe Tsai
    http://www.Digital-Static.net/

  • Brad Fitzpatrick at Dec 10, 2015 at 4:46 pm
    Is the goal to merge this into the standard library?

  • Thebrokentoaster at Dec 12, 2015 at 1:16 am
    Is the goal to merge this into the standard library?
    Yep, the intent was to fulfill #4828
    <https://github.com/golang/go/issues/4828>, which asks for a bzip2 encoder.
    In an indirect way, it also addresses #6754
    <https://github.com/golang/go/issues/6754>, which is about bzip2
    performance. The specific API I ended up with is slightly different from
    the standard library's, but it should be pretty easy to adapt my
    implementation to match the existing API. I don't know if we want to merge
    both the encoder and decoder at the same time, or just one at a time.


    To answer some questions I have been getting in private messages:
    Q: How is the decoder much faster?
    I did make a mistake in reporting that my implementation was 4x faster.
    The standard library benchmark reports the decompression rate relative to
    the size of the compressed file, rather than the uncompressed size, so the
    standard library appears to perform slower than it really is. With the
    corrected benchmarks, my version is anywhere between 1.5x and 3x faster.
    Here's some data for decoding (ds is my version, std is the standard
    library, cgo is the C library):
    BENCHMARK: bz2:decRate
    benchmark std MB/s delta ds MB/s delta cgo MB/s delta
    digits.txt:1:1e4 11.51 1.00x 22.04 1.91x 28.28 2.46x
    digits.txt:1:1e5 13.85 1.00x 20.00 1.44x 27.02 1.95x
    digits.txt:1:1e6 14.36 1.00x 19.91 1.39x 27.26 1.90x
    digits.txt:6:1e4 5.98 1.00x 21.85 3.65x 28.43 4.75x
    digits.txt:6:1e5 11.92 1.00x 20.62 1.73x 26.35 2.21x
    digits.txt:6:1e6 12.86 1.00x 18.00 1.40x 24.24 1.89x
    digits.txt:9:1e4 5.39 1.00x 21.39 3.97x 28.68 5.32x
    digits.txt:9:1e5 11.36 1.00x 20.14 1.77x 27.21 2.39x
    digits.txt:9:1e6 12.50 1.00x 17.90 1.43x 22.84 1.83x
    twain.txt:1:1e4 10.22 1.00x 22.26 2.18x 27.64 2.70x
    twain.txt:1:1e5 15.91 1.00x 24.17 1.52x 30.59 1.92x
    twain.txt:1:1e6 16.45 1.00x 23.65 1.44x 31.39 1.91x
    twain.txt:6:1e4 5.64 1.00x 22.29 3.95x 27.93 4.95x
    twain.txt:6:1e5 13.40 1.00x 23.76 1.77x 30.49 2.28x
    twain.txt:6:1e6 15.99 1.00x 21.76 1.36x 30.39 1.90x
    twain.txt:9:1e4 5.14 1.00x 21.73 4.22x 27.89 5.42x
    twain.txt:9:1e5 12.99 1.00x 23.64 1.82x 30.34 2.33x
    twain.txt:9:1e6 15.61 1.00x 20.10 1.29x 28.21 1.81x

    As for implementation details, my decoder has two major improvements over
    the standard library version:

        - The Huffman decoder isn't represented as a tree, which requires a
        loop iteration for every bit. Instead, it uses the LUT method that the
        zlib library uses <http://www.gzip.org/algorithm.txt>. For most prefix
        codes, it can decode them in a single table lookup. It also uses
        careful coding to make sure critical functions are inlined by the
        compiler.
        - In decoding, the ReadByte method becomes a bottleneck because it is
        called for every byte. My bit reader avoids ReadByte when the input
        Reader is a bufio.Reader (or something that looks like one), so the
        decoder can grab bytes in batches rather than going at them one at a
        time.

    Q: Why won't the library build?
    The library requires at least Go 1.5.

    JT
    On Thursday, December 10, 2015 at 8:46:30 AM UTC-8, bradfitz wrote:

  • Alb Donizetti at Dec 12, 2015 at 2:23 pm
    It would be great if we could merge the encoder, since the bzip2 package
    currently does not have one!

    I have a few questions about the decoder benchmarks, since I benchmarked
    the standard library one when I mailed CL 13853, that closed issue #6754
    (the one you linked above). What are the sizes of the archives you are
    decoding?

    I see numbers at the end of your benchmarks names, but I'm not sure how to
    read
    them. I'm asking because if my assumption that the benchmarks are ordered by
    size, (i.e. the size is the last number in the name) is correct, it appears
    that your
    approach is much faster (4x) than the current one on very small files, and
    about
    1.4x faster on the biggest files you are benchmarking. I think you'll agree
    with me
    if I say that being faster on a big file (that takes a lot of time to
    decode) is better
    that being faster on a 0.1 millisecond decode operation : )

    So I'm asking how big is the biggest archive you benchmarked, and if you
    tested
    your decoder on a really big archive (let's say 100MB of english text: you
    can
    download for example a 100MB archive of wikipedia articles, that's what I
    did when
    emailed my CL).


    On Saturday, December 12, 2015 at 2:16:49 AM UTC+1, thebroke...@gmail.com
    wrote:
  • Thebrokentoaster at Dec 13, 2015 at 7:40 am
    Hey Alberto,

    The format of the benchmark names (I should have labelled them) is
    <test_file>:<compression_level>:<test_filesize>, as you correctly surmised.
    I haven't really tested many large files yet and plan to do so for both
    the encoder and decoder. Fortunately, other people running benchmarks and
    tests have already reported an out-of-bounds error on the encoder side,
    which I'll look into later.

    As for the decoder, I just ran a benchmark on go1.5.2.linux-amd64.tar,
    which is 270MiB uncompressed and 64MiB compressed. Here are the results:

    BenchmarkStd-4  1  14094699464 ns/op  20.07 MB/s    15858536 B/op  10118 allocs/op
    BenchmarkDS-4   1   8983998803 ns/op  31.48 MB/s  2524556680 B/op  15996 allocs/op


    Currently, my implementation allocates a lot of garbage, since I don't
    cache the slices; it allocates roughly 150x more memory per operation.
    This is relatively low-hanging fruit in terms of optimization, and I plan
    on addressing it soon.

    I also looked at both implementations under pprof. Here are the top
    functions for the standard library implementation:

      flat    flat%   sum%    cum     cum%
    4470ms  29.96%  29.96%  5810ms  38.94%  compress/bzip2.(*huffmanTree).Decode
    4430ms  29.69%  59.65%  4430ms  29.69%  compress/bzip2.(*reader).readFromBlock
    1830ms  12.27%  71.92%  9320ms  62.47%  compress/bzip2.(*reader).readBlock
    1000ms   6.70%  78.62%  1000ms   6.70%  compress/bzip2.updateCRC
     810ms   5.43%  84.05%   810ms   5.43%  compress/bzip2.inverseBWT

    We can see that huffman tree decoding tops the list. The second-largest
    consumer is readFromBlock, probably because the de-scrambling phase of the
    inverse BWT is performed there along with RLE.

    Here's my version under the same load:

    4790ms  50.32%  50.32%  4960ms  52.10%  github.com/dsnet/compress/bzip2.(*burrowsWheelerTransform).Decode
    1060ms  11.13%  79.20%  1060ms  11.13%  github.com/dsnet/compress/bzip2.(*runLengthEncoding).Read
     800ms   8.40%  87.61%  1860ms  19.54%  github.com/dsnet/compress/bzip2.(*moveToFront).Decode
     460ms   4.83%  92.44%  1380ms  14.50%  github.com/dsnet/compress/bzip2.(*Reader).decodePrefix
     110ms   1.16%  95.48%   760ms   7.98%  github.com/dsnet/compress/internal/prefix.(*Reader).ReadSymbol

    Currently, the burrowsWheelerTransform tops the list (which makes sense
    given that it is the heart of bzip2).

    I realize that these two lists are not exactly comparable since the
    implementations differ. The standard library has a single large function,
    readBlock, while mine is broken up into smaller functions, so it's not
    easy to establish a one-to-one mapping between them. A further
    complication is that my BWT decoding occurs in a single function call,
    while the standard library spreads that work across multiple stages,
    making it difficult to track how much CPU time each component actually
    spends.

    JT

    On Saturday, December 12, 2015 at 6:24:01 AM UTC-8, alb.do...@gmail.com
    wrote:
    It would be great if we could merge the encoder, since the bzip2 package
    currently does not have one!

    I have a few questions about the decoder benchmarks, since I benchmarked
    the standard library one when I mailed CL 13853, which closed issue #6754
    (the one you linked above). What are the sizes of the archives you are
    decoding?

    I see numbers at the end of your benchmark names, but I'm not sure how to
    read them. I'm asking because if my assumption that the benchmarks are
    ordered by size (i.e. the size is the last number in the name) is correct,
    it appears that your approach is much faster (4x) than the current one on
    very small files, and about 1.4x faster on the biggest files you are
    benchmarking. I think you'll agree with me that being faster on a big file
    (which takes a lot of time to decode) is better than being faster on a
    0.1 millisecond decode operation : )

    So I'm asking how big the biggest archive you benchmarked is, and whether
    you tested your decoder on a really big archive (say, 100MB of English
    text: you can download, for example, a 100MB archive of Wikipedia
    articles, which is what I did when I mailed my CL).


    On Saturday, December 12, 2015 at 2:16:49 AM UTC+1, thebroke...@gmail.com
    wrote:
    Is the goal to merge this into the standard library?

    Yep, the intent was to fulfill #4828
    <https://github.com/golang/go/issues/4828>, which asks for a bzip2
    encoder. Indirectly, it also addresses #6754
    <https://github.com/golang/go/issues/6754>, which concerns bzip2
    performance. The specific API I ended up with is slightly different from
    the standard library's, but it should be pretty easy to adapt my
    implementation to match the existing API. I don't know if we want to merge
    the encoder and decoder at the same time, or one at a time.


    To answer some questions I have been getting in private messages:
    Q: How is the decoder much faster?
    I did make a mistake in reporting that my implementation was 4x faster.
    The standard library benchmark reports the decompression rate relative to
    the size of the compressed file, rather than the uncompressed size. Thus,
    the standard library appears to perform slower than it really is. With the
    corrected benchmarks, my version is anywhere from 1.5x to 3x faster.
    Here's some data for decoding (ds is my version, std is the standard
    library, cgo is the C library):
    BENCHMARK: bz2:decRate
    benchmark std MB/s delta ds MB/s delta cgo MB/s delta
    digits.txt:1:1e4 11.51 1.00x 22.04 1.91x 28.28 2.46x
    digits.txt:1:1e5 13.85 1.00x 20.00 1.44x 27.02 1.95x
    digits.txt:1:1e6 14.36 1.00x 19.91 1.39x 27.26 1.90x
    digits.txt:6:1e4 5.98 1.00x 21.85 3.65x 28.43 4.75x
    digits.txt:6:1e5 11.92 1.00x 20.62 1.73x 26.35 2.21x
    digits.txt:6:1e6 12.86 1.00x 18.00 1.40x 24.24 1.89x
    digits.txt:9:1e4 5.39 1.00x 21.39 3.97x 28.68 5.32x
    digits.txt:9:1e5 11.36 1.00x 20.14 1.77x 27.21 2.39x
    digits.txt:9:1e6 12.50 1.00x 17.90 1.43x 22.84 1.83x
    twain.txt:1:1e4 10.22 1.00x 22.26 2.18x 27.64 2.70x
    twain.txt:1:1e5 15.91 1.00x 24.17 1.52x 30.59 1.92x
    twain.txt:1:1e6 16.45 1.00x 23.65 1.44x 31.39 1.91x
    twain.txt:6:1e4 5.64 1.00x 22.29 3.95x 27.93 4.95x
    twain.txt:6:1e5 13.40 1.00x 23.76 1.77x 30.49 2.28x
    twain.txt:6:1e6 15.99 1.00x 21.76 1.36x 30.39 1.90x
    twain.txt:9:1e4 5.14 1.00x 21.73 4.22x 27.89 5.42x
    twain.txt:9:1e5 12.99 1.00x 23.64 1.82x 30.34 2.33x
    twain.txt:9:1e6 15.61 1.00x 20.10 1.29x 28.21 1.81x

    As for implementation details, my decoder has two major improvements over
    the standard library version:

    - The huffman decoder isn't represented as a tree, which would require a
    loop iteration for every bit. Instead, it uses the LUT method that
    the zlib library uses <http://www.gzip.org/algorithm.txt>. For most
    prefix codes, it can decode a symbol in a single table lookup. It also
    uses careful coding to make sure critical functions are inlined by the
    compiler.
    - In decoding, the ReadByte method becomes a bottleneck because it is
    called for every byte. My bit reader avoids ReadByte if the input Reader
    is a bufio.Reader (or something that looks like it), so the decoder can
    grab bytes in batches rather than one at a time.

    Q: Why won't the library build?
    The library requires at least Go 1.5.

    JT
    On Thursday, December 10, 2015 at 8:46:30 AM UTC-8, bradfitz wrote:

    Is the goal to merge this into the standard library?

    On Tue, Dec 8, 2015 at 5:48 PM, wrote:

    Just an update: I have completed a working decoder. It decompresses at
    ~20MB/s, which is about 25% faster than the C library (~16MB/s) and about
    4x as fast as the Go standard library (~5MB/s).

    https://godoc.org/github.com/dsnet/compress/bzip2

    Enjoy.

    JT

  • Damian Gryski at Dec 13, 2015 at 9:08 am
    Don't forget to test with invalid inputs and of course https://github.com/dvyukov/go-fuzz :)

    Damian

  • Alb Donizetti at Dec 13, 2015 at 12:53 pm
    Looks great! I was a little concerned about the decreasing speedup trend,
    but it looks like the decoder is very fast on big files too.


Discussion Overview
group: golang-nuts
category: go
posted: Dec 8, 2015 at 7:55am
active: Dec 13, 2015 at 12:53pm
posts: 11
users: 5
website: golang.org
