FAQ
Reviewers: rsc,

Message:
Hello rsc@golang.org (cc: golang-dev@googlegroups.com),

I'd like you to review this change to
https://code.google.com/p/go


Description:
compress/flate: move the history buffer out of the decompressor struct.

I'm not exactly sure why there's a performance gain, but it seems like
an easy win. Maybe it's a cache line thing. Maybe it's that
unsafe.Sizeof(decompressor{}) drops to below unmappedzero, so that
checkref/checkoffset don't need to insert TESTB instructions. Maybe
it's less noise for the conservative garbage collector. Maybe it's
something else.

compress/flate benchmarks:
BenchmarkDecodeDigitsSpeed1e4 378628 349906 -7.59%
BenchmarkDecodeDigitsSpeed1e5 3481976 3204898 -7.96%
BenchmarkDecodeDigitsSpeed1e6 34419500 31750660 -7.75%
BenchmarkDecodeDigitsDefault1e4 362317 335562 -7.38%
BenchmarkDecodeDigitsDefault1e5 3290032 3107624 -5.54%
BenchmarkDecodeDigitsDefault1e6 30542540 28937480 -5.26%
BenchmarkDecodeDigitsCompress1e4 362803 335158 -7.62%
BenchmarkDecodeDigitsCompress1e5 3294512 3114526 -5.46%
BenchmarkDecodeDigitsCompress1e6 30514940 28927090 -5.20%
BenchmarkDecodeTwainSpeed1e4 412818 389521 -5.64%
BenchmarkDecodeTwainSpeed1e5 3475780 3288908 -5.38%
BenchmarkDecodeTwainSpeed1e6 33629640 31931420 -5.05%
BenchmarkDecodeTwainDefault1e4 369736 348850 -5.65%
BenchmarkDecodeTwainDefault1e5 2861050 2721383 -4.88%
BenchmarkDecodeTwainDefault1e6 27120120 25862050 -4.64%
BenchmarkDecodeTwainCompress1e4 372057 350822 -5.71%
BenchmarkDecodeTwainCompress1e5 2855109 2718664 -4.78%
BenchmarkDecodeTwainCompress1e6 26987010 26336030 -2.41%

image/png benchmarks:
BenchmarkDecodeGray 1841839 1802251 -2.15%
BenchmarkDecodeNRGBAGradient 7115318 6933280 -2.56%
BenchmarkDecodeNRGBAOpaque 6135892 6013284 -2.00%
BenchmarkDecodePaletted 1153313 1114302 -3.38%
BenchmarkDecodeRGB 5619404 5511190 -1.93%

Please review this at http://codereview.appspot.com/6533048/

Affected files:
M src/pkg/compress/flate/inflate.go


Index: src/pkg/compress/flate/inflate.go
===================================================================
--- a/src/pkg/compress/flate/inflate.go
+++ b/src/pkg/compress/flate/inflate.go
@@ -212,7 +212,7 @@
codebits [numCodes]int

// Output history, buffer.
- hist [maxHist]byte
+ hist *[maxHist]byte
hp int // current output position in buffer
hw int // have written hist[0:hw] already
hfull bool // buffer has filled at least once
@@ -693,6 +693,7 @@
func NewReader(r io.Reader) io.ReadCloser {
var f decompressor
f.r = makeReader(r)
+ f.hist = new([maxHist]byte)
f.step = (*decompressor).nextBlock
return &f
}
@@ -704,8 +705,9 @@
// to read data compressed by NewWriterDict.
func NewReaderDict(r io.Reader, dict []byte) io.ReadCloser {
var f decompressor
+ f.r = makeReader(r)
+ f.hist = new([maxHist]byte)
+ f.step = (*decompressor).nextBlock
f.setDict(dict)
- f.r = makeReader(r)
- f.step = (*decompressor).nextBlock
return &f
}

Search Discussions

  • R at Sep 19, 2012 at 3:00 am
  • R at Sep 19, 2012 at 3:06 am
    i'd be curious how it compares if the buffer was the last field of the
    struct

    http://codereview.appspot.com/6533048/
  • Nigel Tao at Sep 19, 2012 at 3:09 am

    On 19 September 2012 13:00, wrote:
    i'd be curious how it compares if the buffer was the last field of the
    struct
    Yeah, I tried that: the speed-up was about half as good (i.e. 2-3%
    instead of 5-7%) as the indirect through a pointer.
  • Nigeltao at Sep 24, 2012 at 7:58 am
    *** Submitted as
    http://code.google.com/p/go/source/detail?r=3d7777fe2e8a ***

    compress/flate: move the history buffer out of the decompressor struct.

    I'm not exactly sure why there's a performance gain, but it seems like
    an easy win. Maybe it's a cache line thing. Maybe it's that
    unsafe.Sizeof(decompressor{}) drops to below unmappedzero, so that
    checkref/checkoffset don't need to insert TESTB instructions. Maybe
    it's less noise for the conservative garbage collector. Maybe it's
    something else.

    compress/flate benchmarks:
    BenchmarkDecodeDigitsSpeed1e4 378628 349906 -7.59%
    BenchmarkDecodeDigitsSpeed1e5 3481976 3204898 -7.96%
    BenchmarkDecodeDigitsSpeed1e6 34419500 31750660 -7.75%
    BenchmarkDecodeDigitsDefault1e4 362317 335562 -7.38%
    BenchmarkDecodeDigitsDefault1e5 3290032 3107624 -5.54%
    BenchmarkDecodeDigitsDefault1e6 30542540 28937480 -5.26%
    BenchmarkDecodeDigitsCompress1e4 362803 335158 -7.62%
    BenchmarkDecodeDigitsCompress1e5 3294512 3114526 -5.46%
    BenchmarkDecodeDigitsCompress1e6 30514940 28927090 -5.20%
    BenchmarkDecodeTwainSpeed1e4 412818 389521 -5.64%
    BenchmarkDecodeTwainSpeed1e5 3475780 3288908 -5.38%
    BenchmarkDecodeTwainSpeed1e6 33629640 31931420 -5.05%
    BenchmarkDecodeTwainDefault1e4 369736 348850 -5.65%
    BenchmarkDecodeTwainDefault1e5 2861050 2721383 -4.88%
    BenchmarkDecodeTwainDefault1e6 27120120 25862050 -4.64%
    BenchmarkDecodeTwainCompress1e4 372057 350822 -5.71%
    BenchmarkDecodeTwainCompress1e5 2855109 2718664 -4.78%
    BenchmarkDecodeTwainCompress1e6 26987010 26336030 -2.41%

    image/png benchmarks:
    BenchmarkDecodeGray 1841839 1802251 -2.15%
    BenchmarkDecodeNRGBAGradient 7115318 6933280 -2.56%
    BenchmarkDecodeNRGBAOpaque 6135892 6013284 -2.00%
    BenchmarkDecodePaletted 1153313 1114302 -3.38%
    BenchmarkDecodeRGB 5619404 5511190 -1.93%

    R=rsc, r
    CC=golang-dev
    http://codereview.appspot.com/6533048


    http://codereview.appspot.com/6533048/

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupgolang-dev @
categoriesgo
postedSep 19, '12 at 2:56a
activeSep 24, '12 at 7:58a
posts5
users2
websitegolang.org

2 users in discussion

Nigeltao: 3 posts R: 2 posts

People

Translate

site design / logo © 2022 Grokbase