I wrote some simple zlib code using compress/zlib, but it's much slower than
the Python version.

I know Python uses a C extension for this.

By comparison, the encoding/json package has roughly the same efficiency as
Python's json module (though it is slower than Python's UltraJSON extension).

So, is there any way to make zlib faster? My project depends heavily on
this functionality.

Here is the test code:

package main

import (
    "bytes"
    "compress/zlib"
    "fmt"
    "io"
    "time"
)

func main() {
    times := 30000

    var in, out bytes.Buffer
    b := []byte(`{"Name":"Wednesday","Age":6,"Parents":["Gomez","Morticia"],"test":{"prop1":1,"prop2":[1,2,3]}}`)

    t1 := time.Now()
    for i := 0; i < times; i++ {
        w := zlib.NewWriter(&in)
        w.Write(b)
        w.Close() // Close (not just Flush) so the reader sees a complete stream

        r, _ := zlib.NewReader(&in)
        io.Copy(&out, r)

        in.Reset()
        out.Reset()
    }
    fmt.Println(time.Since(t1))
}

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


  • Bryanturley at Feb 4, 2013 at 5:47 pm
    Try the benchmark compressing over 20+MB, the overhead to compress that
    handful of bytes might be killing your test.

  • Bryan Turley at Feb 4, 2013 at 6:38 pm
    This went off-list; let me put it back on for context.
    On Mon, Feb 4, 2013 at 11:58 AM, davy zhang wrote:

    Thanks for the advice.
    I don't know whether sizes adding up to 20+MB would produce a better
    result, but this test reflects my project's situation: I use zlib to
    compress network packets, not files. Packets are seldom larger than 1MB.
    C zlib with Python does a good job under these conditions.
    Most real network packets are smaller than 1500 bytes, actually; what you
    should do is run zlib (or another compressor) over a stream of data, not
    over individual packets.

    20+MB is not a magic number; it's just that compressing only 20-40 bytes
    is not going to gain you as much as a longer stream.


    On Tuesday, February 5, 2013 at 1:47:48 AM UTC+8, bryanturley wrote:
    Try the benchmark compressing over 20+MB, the overhead to compress that
    handful of bytes might be killing your test.
  • Steve wang at Feb 4, 2013 at 6:32 pm
    io.Copy uses a fixed 32KB buffer, which may slow your program down when
    the data to be compressed is bulky. I think you can use bufio to improve
    your program's performance in this case.
    I can't tell more without seeing your real code and its Python
    counterpart.
  • Dave Cheney at Feb 4, 2013 at 8:17 pm
    Please use the standard benchmark idiom when benchmarking. You can find
    examples in the compress/* packages themselves.

    Please provide the usual details about your hardware platform, OS, and Go
    version.

    Please provide the Python version for comparison so that others can
    reproduce your benchmark.

    Cheers

    Dave
  • Davy zhang at Feb 5, 2013 at 4:59 pm
    I did this test on my MacBook Pro (dual-core i7, OS X 10.8).
    The comparable Python version is brain-dead simple:

    import time
    import zlib

    times = 30000
    s = '{"Name":"Wednesday","Age":6,"Parents":["Gomez","Morticia"],"test":{"prop1":1,"prop2":[1,2,3]}}'

    st = time.time()
    for i in xrange(times):
        zlib.decompress(zlib.compress(s))
    et = time.time()

    print "zlib:", et - st

    Python 2.7 and Go 1.0.3.

    Thanks for any advice.
  • Steve wang at Feb 5, 2013 at 6:34 pm
    My profiling suggests that zlib's performance needs improvement in
    memory management.
  • Davy zhang at Feb 6, 2013 at 5:12 am
    Yep, after I posted this thread I tried wrapping zlib directly using
    cgo. I found that binary write/read is *very* expensive in Go.

    Code like this:

    buf := new(bytes.Buffer)
    // buf := bytes.NewBuffer(rawBytes) // this improves things a little

    binary.Write(buf, binary.BigEndian, uint32(dstLen))
    binary.Write(buf, binary.BigEndian, rawBytes)

    will slow things down by about 30% in my test.

    If I use a pure cgo call to zlib compress/uncompress with a thin wrapper,
    the performance is almost the same as Python's zlib. But when I try to
    store the original length of rawBytes (needed to size the uncompress
    buffer) using the binary package, the code is significantly slower than
    before: the pure-cgo version takes 438.452ms, the Python version 431ms,
    and the binary header read/write version 4694.922ms.

    So I guess memory management or the GC performs badly under pressure.

    Here is the sample code; note the commented-out lines, which I disabled
    to improve performance:

    func maxZipLen(nLenSrc int) int {
        n16kBlocks := (nLenSrc + 16383) / 16384 // round up any fraction of a block
        return nLenSrc + 6 + (n16kBlocks * 5)
    }

    // zcompress and zuncompress are C helper wrappers around zlib's
    // compress/uncompress, defined in the cgo preamble (not shown).

    func Zip(src *[]byte) []byte {
        srcLen := len(*src)
        raw := unsafe.Pointer(&((*src)[0])) // change []byte to Pointer

        memLen := C.size_t(maxZipLen(srcLen))
        // fmt.Println("mem length is ", memLen)
        dst := C.calloc(memLen, 1)
        defer C.free(dst)
        dstLen := C.ulong(memLen)
        C.zcompress(dst, &dstLen, raw, C.ulong(srcLen))

        // write the compressed length
        rawBytes := C.GoBytes(dst, C.int(dstLen))
        // buf := new(bytes.Buffer)
        // buf := bytes.NewBuffer(rawBytes)
        // binary.Write(buf, binary.BigEndian, uint32(dstLen))
        // binary.Write(buf, binary.BigEndian, rawBytes)
        // fmt.Printf("%02x\n", buf.Bytes())
        // return buf.Bytes()
        return rawBytes
    }

    func UnZip(src *[]byte, oriLen uint32) []byte {
        srcLen := len(*src)

        buf := new(bytes.Buffer)
        buf.Write(*src)
        // binary.Read(buf, binary.BigEndian, &oriLen)
        // fmt.Println("original size found ", oriLen)

        // rawBytes := make([]byte, oriLen)
        // binary.Read(buf, binary.BigEndian, &rawBytes)
        // ioutil.WriteFile("/tmp/go_compressed_inter", rawBytes, 0644)
        // raw := unsafe.Pointer(&((rawBytes)[0])) // change []byte to Pointer
        raw := unsafe.Pointer(&((*src)[0])) // change []byte to Pointer
        // fmt.Println("mem length is ", oriLen)
        dst := C.calloc(C.size_t(oriLen), 1)
        defer C.free(dst)
        dstLen := C.ulong(oriLen)
        C.zuncompress(dst, &dstLen, raw, C.ulong(srcLen))
        // fmt.Println("origLen after uncompressed", dstLen)

        // fmt.Printf("%02x\n", buf.Bytes())
        return C.GoBytes(dst, C.int(dstLen))
    }

  • Dave Cheney at Feb 6, 2013 at 5:14 am

    > yep, after I posted this thread, I tried to wrap the zlib directly using
    > CGO. I found the binary write/read is "very" expensive in golang.

    cgo transitions are expensive.

    > if I use the pure cgo call to zlib compress/uncompress with some wrap, the
    > performance is almost the same as python zlib

    That would make sense: both are having to translate from the Go/Python
    environment to C.

    > So I guess the memory management or gc is bad performance under pressure

    Don't guess, measure. GOGCTRACE=1 may be useful here.

    Dave

  • Sugu Sougoumarane at Feb 6, 2013 at 6:08 pm
    We ran into the same issues with vitess. We have a cgo wrapper that's
    within 2% of the C library's performance for our tests. YMMV:
    http://code.google.com/p/vitess/source/browse/#hg%2Fgo%2Fcgzip

  • Msolomon at Feb 6, 2013 at 6:53 pm
    cgo transitions are not expensive enough to be an issue here.

    We hit the exact same performance issues in Vitess, so we have the cgzip
    module, which works just as you describe. Its performance is within ~2%
    of the C version, IIRC. I've dropped a link to the module below if you
    want to just use it.

    There are several issues contributing to the inefficiency of pure-Go
    zlib. They are all fixable, but if linking via cgo is an option, I would
    take that road for now.

    http://code.google.com/p/vitess/source/browse/#hg%2Fgo%2Fcgzip
  • Hamish Ogilvy at Feb 9, 2013 at 12:17 am
    Interesting. How about memory usage, have you done any measurements? I'm
    currently suspecting zlib leaves a huge memory footprint behind if lots of
    smaller files are compressed quickly, still analysing before posting some
    data though.

    I'll have a look at cgzip though. Thanks for the tip.

    Regards,
    Hamish

  • Hamish Ogilvy at Feb 12, 2013 at 3:07 am
    Just a quick note to say I was wrong: the memory footprint for zlib is
    fine. Don't listen to me...

  • Chengw at Feb 20, 2013 at 2:30 am
    Hi,

    Step 1
    Could you please change
    "binary.Write(buf, binary.BigEndian, rawBytes)"
    to
    "buf.Write(rawBytes)"
    to avoid reflection?

    Step 2
    encoding/binary provides fast paths for basic types like uint32, but it
    still switches on data.(type). We can avoid that by explicitly writing
    the length in big-endian byte order (note that bytes.Buffer takes single
    bytes via WriteByte, not Write):

    buf.WriteByte(byte(dstLen >> 24))
    buf.WriteByte(byte((dstLen >> 16) & 0xff))
    buf.WriteByte(byte((dstLen >> 8) & 0xff))
    buf.WriteByte(byte(dstLen & 0xff))
    buf.Write(rawBytes)

    Does this further improve the speed?

    If possible, could you rerun the tests with Step 1 and Step 1+2 and tell us
    the results? Thanks a lot!
  • Nigel Tao at Feb 7, 2013 at 12:39 am

    On Mon, Feb 4, 2013 at 8:37 PM, davy zhang wrote:
    > I write a simple zlib code using compress/zlib, but it's way too slow than
    > the python version

    If you're on the stable release (Go 1.0.3), the upcoming Go 1.1 should
    have a faster zlib. https://codereview.appspot.com/6872063/ suggests a
    1.5x improvement in decompression on a MacBook Pro, for decent-sized
    workloads.


Discussion Overview
group: golang-nuts
categories: go
posted: Feb 4, '13 at 4:22p
active: Feb 20, '13 at 2:30a
posts: 15
users: 9
website: golang.org
