Reran vtocc benchmarks, around 10M queries using 100 clients.

Run 1: go version currently used on production 0a3866d6cc6b (Sep 24):
qps: 5832 StackSys: 86MB

Run 2: go @tip d0d76b7fb219 (Jan 3):
qps: 5543 StackSys: 77MB

Run 3: Using CL 6997052:
qps: 5673 StackSys: 3MB

Run 4: Using CL 7029044:
qps: 5699 StackSys: 15MB

Conclusion: marginal difference in performance between the two CLs. The
older CL uses less memory. Maybe the difference would be more pronounced
if you passed large objects by value to functions?
The runtime @tip is slower than the one from September, but that's an
unrelated observation.

This is just a summary. I can send you more detailed stats and pprof
captures if needed.
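
To make the "large objects passed by value" idea above concrete, here is a
minimal, hypothetical Go sketch (not from the CL or from vtocc; the names
big/consume are made up): every call frame copies a ~4KB argument, so a deep
call chain grows the goroutine stack much faster than it would with pointer
arguments, which would show up in StackSys.

    // Hypothetical illustration: a large by-value argument inflates each
    // stack frame, so deep recursion forces repeated stack growth.
    package main

    import "fmt"

    type big struct {
        payload [4096]byte // a deliberately large (4KB) argument
    }

    func consume(b big, depth int) int {
        if depth == 0 {
            return int(b.payload[0])
        }
        return consume(b, depth-1) // each frame holds its own 4KB copy
    }

    func main() {
        fmt.Println(consume(big{}, 100))
    }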

https://codereview.appspot.com/7029044/


  • Dvyukov at Jan 4, 2013 at 6:05 am

    On 2013/01/03 23:55:50, sougou wrote:
    Reran vtocc benchmarks, around 10M queries using 100 clients.
    Run 1: go version currently used on production 0a3866d6cc6b (Sep 24):
    qps: 5832 StackSys: 86MB
    Run 2: go @tip d0d76b7fb219 (Jan 3):
    qps: 5543 StackSys: 77MB
    Run 3: Using CL 6997052:
    qps: 5673 StackSys: 3MB
    Run 4: Using CL 7029044:
    qps: 5699 StackSys: 15MB
    Conclusion: marginal difference in performance between the two CLs. The
    older CL uses less memory. Maybe the difference would be more pronounced
    if you passed large objects by value to functions?
    The runtime @tip is slower than the one from September, but that's an
    unrelated observation.
    This is just a summary. I can send you more detailed stats and pprof
    captures if
    needed.
    Can you please test with varying values for
    StackCacheSize/StackCacheBatch in src/pkg/runtime/runtime.h?
    Currently they are set to 128/32. CL 6997052 uses 16/8. I am
    inclined towards 32/16 for now (my synthetic tests still show minimal
    memory consumption and good performance). Another possible point is
    64/32.
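
    For a rough sense of scale (my assumption: StackCacheSize counts 4KB stack
    segments cached per thread, which matches the ~512KB-per-thread bound
    mentioned later in the thread), the candidate settings translate to roughly:
    128/32 -> ~512KB of cached stack memory per thread
    64/32  -> ~256KB per thread
    32/16  -> ~128KB per thread
    16/8   -> ~64KB per thread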



    https://codereview.appspot.com/7029044/
  • No Smile Face at Jan 4, 2013 at 7:19 am

    On 2013/01/03 15:29:31, dvyukov wrote:
    Hello golang-dev@googlegroups.com,
    I'd like you to review this change to
    https://dvyukov%40google.com@code.google.com/p/go/
    For my ray tracer stuff the behaviour is very similar to what I had with
    previous patch.

    Machine: amd64/linux on i5-3470 CPU.

    Before (tip d0d76b7fb219):
    Rendering took 2m 34.5s
    Sys: 85928184
    StackInuse: 102400
    StackSys: 31981568
    After (tip d0d76b7fb219 + issue7029044_6007.diff):
    Rendering took 2m 35.2s
    Sys: 55128064
    StackInuse: 12885004288
    StackSys: 2621440

    This time the actual runtime.MemStats numbers instead of staring at
    process' RES (resident memory size). Rendering time is similar (less
    than 1% difference is not statistically significant). Oh and this time I
    was using 50 rays per pixel instead of 100, just to make tests quicker.
    Also note the anomalously high StackInuse number in your patch. Is it a
    miscalculation?
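
    For reference, those numbers come straight from runtime.MemStats; a minimal
    measurement sketch using only the standard library (not the actual ray
    tracer code) looks like:

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        // ... run the workload here ...
        var m runtime.MemStats
        runtime.ReadMemStats(&m)
        fmt.Println("Sys:", m.Sys)
        fmt.Println("StackInuse:", m.StackInuse)
        fmt.Println("StackSys:", m.StackSys)
    }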


    https://codereview.appspot.com/7029044/
  • Dvyukov at Jan 4, 2013 at 7:37 am

    On 2013/01/04 07:19:15, nsf wrote:
    For my ray tracer stuff the behaviour is very similar to what I had with
    previous patch.
    Machine: amd64/linux on i5-3470 CPU.
    Before (tip d0d76b7fb219):
    Rendering took 2m 34.5s
    Sys: 85928184
    StackInuse: 102400
    StackSys: 31981568
    After (tip d0d76b7fb219 + issue7029044_6007.diff):
    Rendering took 2m 35.2s
    Sys: 55128064
    StackInuse: 12885004288
    StackSys: 2621440
    This time the actual runtime.MemStats numbers instead of staring at
    process' RES
    (resident memory size). Rendering time is similar (less than 1%
    difference is
    not statistically significant). Oh and this time I was using 50 rays per pixel
    instead of 100, just to make tests quicker. Also note the anomalously high
    StackInuse number in your patch. Is it a miscalculation?
    Thanks! This is actually a bug. Fixed:
    https://codereview.appspot.com/7029044/diff2/6007:13007/src/pkg/runtime/runtime.h
    https://codereview.appspot.com/7029044/diff2/6007:13007/src/pkg/runtime/stack_test.go


    https://codereview.appspot.com/7029044/
  • Dvyukov at Jan 4, 2013 at 7:39 am

    On 2013/01/04 07:36:59, dvyukov wrote:
    On 2013/01/04 07:19:15, nsf wrote:
    For my ray tracer stuff the behaviour is very similar to what I had
    with
    previous patch.

    Machine: amd64/linux on i5-3470 CPU.

    Before (tip d0d76b7fb219):
    Rendering took 2m 34.5s
    Sys: 85928184
    StackInuse: 102400
    StackSys: 31981568
    After (tip d0d76b7fb219 + issue7029044_6007.diff):
    Rendering took 2m 35.2s
    Sys: 55128064
    StackInuse: 12885004288
    StackSys: 2621440

    This time the actual runtime.MemStats numbers instead of staring at
    process'
    RES
    (resident memory size).
    Are you missing a part of the sentence?
    Rendering time is similar (less than 1% difference is
    not statistically significant).
    The results look fine, right?
    Oh and this time I was using 50 rays per pixel
    instead of 100, just to make tests quicker. Also note the anomalously
    high
    StackInuse number in your patch. Is it a miscalculation?
    Thanks! This is actually a bug. Fixed:
    https://codereview.appspot.com/7029044/diff2/6007:13007/src/pkg/runtime/runtime.h

    https://codereview.appspot.com/7029044/diff2/6007:13007/src/pkg/runtime/stack_test.go


    https://codereview.appspot.com/7029044/
  • Dvyukov at Jan 4, 2013 at 7:48 am

    On 2013/01/04 06:04:59, dvyukov wrote:
    Can you please test with varying values for
    StackCacheSize/StackCacheBatch in src/pkg/runtime/runtime.h?
    Currently they are set to 128/32. CL 6997052 uses 16/8. I am inclined
    towards 32/16 for now (my synthetic tests still show minimal memory
    consumption and good performance). Another possible point is 64/32.
    Or perhaps it's already fine?
    15 vs 79-80MB is a good win already. More importantly, StackSys should no
    longer grow over time; it's bounded by 512KB per thread (whereas currently
    it slowly grows without bound).

    Well, actually not that slowly. I've run the following funny test --
    each line is StackSys *increase*.

    Current behavior:
    $ go test -run=StackMem -v
    -cpu=1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
    2>&1 | grep "for stack mem"
    stack_test.go:1569: Consumed 106MB for stack mem
    stack_test.go:1569: Consumed 48MB for stack mem
    stack_test.go:1569: Consumed 52MB for stack mem
    stack_test.go:1569: Consumed 71MB for stack mem
    stack_test.go:1569: Consumed 71MB for stack mem
    stack_test.go:1569: Consumed 53MB for stack mem
    stack_test.go:1569: Consumed 35MB for stack mem
    stack_test.go:1569: Consumed 27MB for stack mem
    stack_test.go:1569: Consumed 39MB for stack mem
    stack_test.go:1569: Consumed 43MB for stack mem
    stack_test.go:1569: Consumed 49MB for stack mem
    stack_test.go:1569: Consumed 54MB for stack mem
    stack_test.go:1569: Consumed 44MB for stack mem
    stack_test.go:1569: Consumed 35MB for stack mem
    stack_test.go:1569: Consumed 41MB for stack mem
    stack_test.go:1569: Consumed 32MB for stack mem
    stack_test.go:1569: Consumed 27MB for stack mem
    stack_test.go:1569: Consumed 20MB for stack mem
    stack_test.go:1569: Consumed 36MB for stack mem
    stack_test.go:1569: Consumed 33MB for stack mem
    stack_test.go:1569: Consumed 31MB for stack mem
    stack_test.go:1569: Consumed 45MB for stack mem
    stack_test.go:1569: Consumed 40MB for stack mem
    stack_test.go:1569: Consumed 30MB for stack mem
    stack_test.go:1569: Consumed 39MB for stack mem
    stack_test.go:1569: Consumed 27MB for stack mem
    stack_test.go:1569: Consumed 27MB for stack mem
    stack_test.go:1569: Consumed 37MB for stack mem
    stack_test.go:1569: Consumed 33MB for stack mem
    stack_test.go:1569: Consumed 36MB for stack mem
    stack_test.go:1569: Consumed 34MB for stack mem
    stack_test.go:1569: Consumed 42MB for stack mem
    stack_test.go:1569: Consumed 29MB for stack mem
    stack_test.go:1569: Consumed 29MB for stack mem
    stack_test.go:1569: Consumed 44MB for stack mem
    stack_test.go:1569: Consumed 20MB for stack mem
    stack_test.go:1569: Consumed 31MB for stack mem
    stack_test.go:1569: Consumed 31MB for stack mem
    stack_test.go:1569: Consumed 19MB for stack mem
    stack_test.go:1569: Consumed 25MB for stack mem


    New behavior:
    $ go test -run=StackMem -v
    -cpu=1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
    2>&1 | grep "for stack mem"
    stack_test.go:1569: Consumed 13MB for stack mem
    stack_test.go:1569: Consumed 0MB for stack mem
    stack_test.go:1569: Consumed 0MB for stack mem
    stack_test.go:1569: Consumed 0MB for stack mem
    stack_test.go:1569: Consumed 0MB for stack mem
    stack_test.go:1569: Consumed 0MB for stack mem
    stack_test.go:1569: Consumed 0MB for stack mem
    stack_test.go:1569: Consumed 0MB for stack mem
    stack_test.go:1569: Consumed 0MB for stack mem
    stack_test.go:1569: Consumed 0MB for stack mem
    ...



    Either somebody must come up with a good tuning methodology, or let's
    commit it as-is and tune later.
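
    For anyone who wants to reproduce this outside the runtime test suite, here
    is a rough standalone sketch of the same kind of measurement (my own
    approximation, not the actual stack_test.go code): each iteration spawns
    goroutines that grow their stacks and then reports how much StackSys grew.

    package main

    import (
        "fmt"
        "runtime"
        "sync"
    )

    // grow recurses with a ~1KB frame to force stack growth.
    func grow(n int) {
        var pad [1024]byte
        _ = pad
        if n > 0 {
            grow(n - 1)
        }
    }

    func stackSys() uint64 {
        var m runtime.MemStats
        runtime.ReadMemStats(&m)
        return m.StackSys
    }

    func main() {
        before := stackSys()
        for i := 0; i < 10; i++ {
            var wg sync.WaitGroup
            for g := 0; g < 100; g++ {
                wg.Add(1)
                go func() {
                    defer wg.Done()
                    grow(64)
                }()
            }
            wg.Wait()
            after := stackSys()
            fmt.Printf("Consumed %dMB for stack mem\n", (after-before)>>20)
            before = after
        }
    }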

    https://codereview.appspot.com/7029044/
  • Sugu Sougoumarane at Jan 4, 2013 at 8:13 am
    Sorry, I forgot to mention that StackSys was indefinitely growing in the
    first two runs (without the CLs).
    For run 3 (old CL), it started off at 3MB and stayed there.
    For run 4 (new CL), it started at 14MB and inched up to 15.

    So, both CLs are good enough. I would personally lean towards the old CL,
    but the new CL should also be fine if it offers more balanced performance.
  • Dmitry Vyukov at Jan 4, 2013 at 8:11 am

    On Fri, Jan 4, 2013 at 12:05 PM, Sugu Sougoumarane wrote:
    Sorry, I forgot to mention that StackSys was indefinitely growing in the
    first two runs (without the CLs).
    For run 3 (old CL), it started off at 3MB and stayed there.
    For run 4 (new CL), it started at 14MB and inched up to 15.

    So, both CLs are good enough. I would personally lean towards the old CL,
    but the new CL should also be fine if it offers more balanced performance.
    I'm just afraid that it could badly penalize some other workloads that I
    don't know about.
    You may try new CL with e.g. StackCacheSize=32, StackCacheBatch=16 or
    64/32. On the synthetic tests 32/16 is still good enough, so if it
    reduces StackSys for you, I am happy to change it to 32/16.
  • Mike Solomon at Jan 4, 2013 at 1:49 pm
    From our perspective, capping the growth is a big win and the
    performance tradeoff is worth it. I'll let Sugu confirm that with a
    production test. Once we can reason about good sizes, we can run a few
    more tests, but I suspect it may be workload dependent. In the future
    it might be worth considering an environment variable, but I tend to
    dislike tunables.

    Either way, thanks for putting so much time into this.

    https://codereview.appspot.com/7029044/
  • Dmitry Vyukov at Jan 4, 2013 at 8:13 am

    On Fri, Jan 4, 2013 at 11:55 AM, Mike Solomon wrote:
    From our perspective, capping the growth is a big win and the
    performance tradeoff is worth it. I'll let Sugu confirm that with a
    production test. Once we can reason about good sizes, we can run a few
    more tests, but I suspect it may be workload dependent. In the future
    it might be worth considering an environment variable, but I tend to
    dislike tunables.

    When/if I finally submit the improved scheduler, it will allow for
    per-processor (per-GOMAXPROCS) state, and stack caches along with other
    stuff will move there. That will decrease stack memory further and, I
    believe, eliminate any need for tunables (e.g. now you may have, say, 200
    threads each with its own cache, whereas then you will have only 8 procs
    with caches).
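
    To put rough numbers on that (my arithmetic, using the ~512KB-per-thread
    bound mentioned earlier): 200 threads each holding a full stack cache could
    pin roughly 200 * 512KB = ~100MB, whereas 8 per-proc caches would pin at
    most 8 * 512KB = 4MB.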

    Either way, thanks for putting so much time into this.
    You are welcome.


    https://codereview.appspot.com/7029044/
  • Sougou at Jan 4, 2013 at 8:23 pm
    5M rows using 100 connections:

    128/32:
    qps: 5885 StackSys: 14.7MB

    64/32:
    qps: 5876 StackSys: 8.4MB

    32/16:
    qps: 5850 StackSys: 4.9MB

    16/8:
    qps: 5892 StackSys: 3.15MB

    All other stats were comparable.

    Conclusion: no significant change in performance across the different
    values of StackCacheSize/StackCacheBatch. The StackSys footprint shrinks
    with smaller values, but with diminishing returns below 32/16.

    https://codereview.appspot.com/7029044/
  • Dmitry Vyukov at Jan 4, 2013 at 8:36 pm
    OK, I will replace the constants with 32/16.

    And let's wait for Russ' blessing.


    https://codereview.appspot.com/7029044/
