FAQ
Reviewers: rsc, remyoudompheng,

Message:
Hello rsc, remyoudompheng (cc: golang-dev@googlegroups.com),

I'd like you to review this change to
https://go.googlecode.com/hg/


Description:
runtime: use reflect·call() to enter the function gc()

Garbage collection code (to be merged later) is calling functions
which have many local variables. This increases the probability that
the stack capacity won't be big enough to hold the local variables.
So, start gc() on a bigger stack to eliminate a potentially large number
of calls to runtime·morestack().

Please review this at http://codereview.appspot.com/6846044/

Affected files:
M src/pkg/runtime/mgc0.c
M src/pkg/runtime/proc.c
M src/pkg/runtime/runtime.h

Search Discussions

  • David Symonds at Nov 13, 2012 at 7:40 am
    Do you have performance numbers?
  • 0xe2 0x9a 0x9b at Nov 13, 2012 at 7:56 am

    On 2012/11/13 07:40:56, dsymonds wrote:
    Do you have performance numbers?
    The probability of stack reallocations during GC is low, but it can
    occur in regular Go programs. The worst case I encountered is
    test/bench/garbage/peano.go - the stack reallocations were causing
    performance issues there. I based my decision to send this CL on the
    worst case scenario.

    https://codereview.appspot.com/6846044/
  • David Symonds at Nov 13, 2012 at 7:58 am

    On Tue, Nov 13, 2012 at 6:56 PM, wrote:
    On 2012/11/13 07:40:56, dsymonds wrote:

    Do you have performance numbers?

    The probability of stack reallocations during GC is low, but it can
    occur in regular Go programs. The worst case I encountered is
    test/bench/garbage/peano.go - the stack reallocations were causing
    performance issues there. I based my decision to send this CL on the
    worst case scenario.
    So can you post some numbers for the impact on peano?
  • 0xe2 0x9a 0x9b at Nov 13, 2012 at 9:47 am

    On 2012/11/13 07:58:42, dsymonds wrote:
    On Tue, Nov 13, 2012 at 6:56 PM,
    wrote:
    On 2012/11/13 07:40:56, dsymonds wrote:

    Do you have performance numbers?

    The probability of stack reallocations during GC is low, but it can
    occur in regular Go programs. The worst case I encountered is
    test/bench/garbage/peano.go - the stack reallocations were causing
    performance issues there. I based my decision to send this CL on the
    worst case scenario.
    So can you post some numbers for the impact on peano?
    I am unable to make exact measurements on amd64 when running peano.go
    because I am using a virtualized environment (rhcloud.com).

    On i386, the number of calls to runtime·newstack() per 1 call to gc() is
    low: it ranges from 0 to 1000. So, the stack allocations are not an
    issue on i386. When it is about 1000, the measured time spent in
    runtime·newstack() while in gc() is about 0.04% of total program time.

    However, on amd64, the number of calls to runtime·newstack() per 1 call
    to gc() can be high: from 0 to 100000+. The behavior of the peano
    benchmark is stochastic: the number cab be 0, 1026, 9926, ..., 125470.
    I am unable to determine how this impacts GC performance. I presume that
    100000 stack reallocations lead to a measureable performance
    degradation.

    ---

    Note: I added a new field 'moreframesize_minalloc' to struct M just
    below the extant field 'moreframesize'. However, on amd64 this
    positioning appears to be completely incorrect and I have to move the
    new field lower (for example: below 'racepc').

    https://codereview.appspot.com/6846044/
  • Minux Ma at Nov 13, 2012 at 11:57 am
    My benchmark result on linux/amd64 (I've run both the new and old
    versions 20 times, and select the best/worst case _pair_):
    worst case:
    benchmark old ns/op new ns/op delta
    garbage.BenchmarkPeano 65746437 69322040 +5.44%
    garbage.BenchmarkPeanoLastPause 6062509 6115177 +0.87%
    garbage.BenchmarkPeanoPause 2154311 2532654 +17.56%
    best case:
    benchmark old ns/op new ns/op delta
    garbage.BenchmarkPeano 70536831 65711182 -6.84%
    garbage.BenchmarkPeanoLastPause 6347188 6116066 -3.64%
    garbage.BenchmarkPeanoPause 2812915 2499263 -11.15%
    On 2012/11/13 09:47:43, atom wrote:
    Note: I added a new field 'moreframesize_minalloc' to struct M just below the
    extant field 'moreframesize'. However, on amd64 this positioning
    appears to be
    completely incorrect and I have to move the new field lower (for
    example: below
    'racepc').
    i think you can place the new field just below moreargsize.
    just remember to run "cmd/dist install pkg/runtime" before
    "go install -v std" whenever you changed struct M or G;
    the assembly routines in runtime might use out-of-date offset
    otherwise.

    btw, please 'hg sync' and 'hg upload 6846044' as the runtime.h change
    can't be applied cleanly.


    https://codereview.appspot.com/6846044/
  • 0xe2 0x9a 0x9b at Nov 13, 2012 at 12:58 pm

    On 2012/11/13 11:57:16, minux wrote:
    My benchmark result on linux/amd64 (I've run both the new and old
    versions 20 times, and select the best/worst case _pair_):
    worst case:
    benchmark old ns/op new ns/op delta
    garbage.BenchmarkPeano 65746437 69322040 +5.44%
    garbage.BenchmarkPeanoLastPause 6062509 6115177 +0.87%
    garbage.BenchmarkPeanoPause 2154311 2532654 +17.56%
    best case:
    benchmark old ns/op new ns/op delta
    garbage.BenchmarkPeano 70536831 65711182 -6.84%
    garbage.BenchmarkPeanoLastPause 6347188 6116066 -3.64%
    garbage.BenchmarkPeanoPause 2812915 2499263 -11.15%
    Did you apply CL 6114046? I made some modifications to my local copy of
    CL 6114046, and those are the numbers I reported. I didn't test without
    CL 6114046. I can publish the diff files if you like.

    In my opinion, the average out of the 20 runs would be more useful.
    btw, please 'hg sync' and 'hg upload 6846044' as the runtime.h change
    can't be applied cleanly.
    Done.

    https://codereview.appspot.com/6846044/
  • Iant at Nov 13, 2012 at 2:28 pm
    I can't help but feel that this would be better done via some sort of
    compiler pragma. You are introducing runtime overhead to every call to
    the garbage collector. But the compiler could easily say at the start
    of the function "give me at least this much space on the stack" with
    much less overhead.

    Much better than that would be some sort of runtime heuristic: if we had
    to split the stack several times beneath some function, then the next
    time we enter the function ask for a bigger stack. I have no concrete
    suggestions for how to implement that, though.


    https://codereview.appspot.com/6846044/diff/4002/src/pkg/runtime/mgc0.c
    File src/pkg/runtime/mgc0.c (right):

    https://codereview.appspot.com/6846044/diff/4002/src/pkg/runtime/mgc0.c#newcode1097
    src/pkg/runtime/mgc0.c:1097: // a problem in the past.
    This test should stay where it was, since that is where it matters.

    https://codereview.appspot.com/6846044/
  • 0xe2 0x9a 0x9b at Nov 13, 2012 at 2:36 pm
    Hello rsc@golang.org, remyoudompheng@gmail.com, dsymonds@golang.org,
    minux.ma@gmail.com, iant@golang.org (cc: golang-dev@googlegroups.com),

    Please take another look.


    http://codereview.appspot.com/6846044/
  • 0xe2 0x9a 0x9b at Nov 13, 2012 at 2:37 pm
    https://codereview.appspot.com/6846044/diff/4002/src/pkg/runtime/mgc0.c
    File src/pkg/runtime/mgc0.c (right):

    https://codereview.appspot.com/6846044/diff/4002/src/pkg/runtime/mgc0.c#newcode1097
    src/pkg/runtime/mgc0.c:1097: // a problem in the past.
    On 2012/11/13 14:28:40, iant wrote:
    This test should stay where it was, since that is where it matters.
    Done.

    https://codereview.appspot.com/6846044/
  • 0xe2 0x9a 0x9b at Nov 13, 2012 at 3:18 pm

    On 2012/11/13 14:28:40, iant wrote:
    I can't help but feel that this would be better done via some sort of compiler
    pragma. You are introducing runtime overhead to every call to the garbage
    collector. But the compiler could easily say at the start of the
    function "give
    me at least this much space on the stack" with much less overhead.
    In my opinion it wouldn't be much faster than reflect·call().
    Much better than that would be some sort of runtime heuristic: if we had to
    split the stack several times beneath some function, then the next
    time we enter
    the function ask for a bigger stack. I have no concrete suggestions
    for how to
    implement that, though.
    A problem is that any goroutine can call gc(). The stack of a random
    goroutine cannot be enlarged if there are many goroutines because over
    time it may consume too much memory. Though it would be possible when
    there are few goroutines.

    https://codereview.appspot.com/6846044/
  • Ian Lance Taylor at Nov 13, 2012 at 4:53 pm

    On Tue, Nov 13, 2012 at 7:18 AM, wrote:
    On 2012/11/13 14:28:40, iant wrote:

    I can't help but feel that this would be better done via some sort of compiler
    pragma. You are introducing runtime overhead to every call to the garbage
    collector. But the compiler could easily say at the start of the
    function "give
    me at least this much space on the stack" with much less overhead.

    In my opinion it wouldn't be much faster than reflect·call().
    Perhaps you are right. Hard to know for sure.

    Much better than that would be some sort of runtime heuristic: if we had to
    split the stack several times beneath some function, then the next
    time we enter
    the function ask for a bigger stack. I have no concrete suggestions
    for how to
    implement that, though.

    A problem is that any goroutine can call gc(). The stack of a random
    goroutine cannot be enlarged if there are many goroutines because over
    time it may consume too much memory. Though it would be possible when
    there are few goroutines.
    If it worked well, it would reduce the number of stack splits per GC
    to 1. More importantly, it would help in other cases as well.

    Ian
  • 0xe2 0x9a 0x9b at Nov 16, 2012 at 7:26 am
    Hello rsc@golang.org, remyoudompheng@gmail.com, dsymonds@golang.org,
    minux.ma@gmail.com, iant@golang.org, iant@google.com (cc:
    golang-dev@googlegroups.com),

    Please take another look.


    http://codereview.appspot.com/6846044/
  • 0xe2 0x9a 0x9b at Nov 26, 2012 at 9:31 am
    I would like to obtain information about the progress of merging this
    CL.

    https://codereview.appspot.com/6846044/
  • Russ Cox at Nov 26, 2012 at 2:14 pm

    On Mon, Nov 26, 2012 at 4:31 AM, wrote:
    I would like to obtain information about the progress of merging this
    CL.
    I have 83 gmail threads in my rsc@golang.org todo label, dating back
    to Nov 3. This is one of them. I hope to review many of them this
    week.
  • Rsc at Nov 26, 2012 at 8:40 pm
    This seems fine for now. I share Ian's reservations, but this is easy to
    change if we come up with a better idea. I expect that the cost of the
    reflect.call is tiny compared to a typical gc.



    https://codereview.appspot.com/6846044/diff/5/src/pkg/runtime/mgc0.c
    File src/pkg/runtime/mgc0.c (right):

    https://codereview.appspot.com/6846044/diff/5/src/pkg/runtime/mgc0.c#newcode968
    src/pkg/runtime/mgc0.c:968: static void
    Can you please do

    static void gc(struct gc_args *args);

    void
    runtime.gc(int force)
    {
    ...
    }

    static void
    gc(struct gc_args *args)
    {
    ...
    }

    so that the code reads top to bottom and also to minimize the diff?

    Thanks.

    https://codereview.appspot.com/6846044/diff/5/src/pkg/runtime/mgc0.c#newcode977
    src/pkg/runtime/mgc0.c:977: // The atomic operations are not atomic if
    the uint64s
    This can move into runtime.gc.

    https://codereview.appspot.com/6846044/diff/5/src/pkg/runtime/mgc0.c#newcode1092
    src/pkg/runtime/mgc0.c:1092: runtime·gc(1);
    Maybe this should be done in runtime.gc instead?
    I'm a little weirded out by runtime.gc calling gc on a new stack calling
    runtime.gc calling gc on a new stack.

    https://codereview.appspot.com/6846044/
  • 0xe2 0x9a 0x9b at Nov 27, 2012 at 5:47 am
    https://codereview.appspot.com/6846044/diff/5/src/pkg/runtime/mgc0.c
    File src/pkg/runtime/mgc0.c (right):

    https://codereview.appspot.com/6846044/diff/5/src/pkg/runtime/mgc0.c#newcode968
    src/pkg/runtime/mgc0.c:968: static void
    On 2012/11/26 20:40:19, rsc wrote:
    Can you please do
    static void gc(struct gc_args *args);
    void
    runtime.gc(int force)
    {
    ...
    }
    static void
    gc(struct gc_args *args)
    {
    ...
    }
    so that the code reads top to bottom and also to minimize the diff?
    Thanks.
    Done.

    https://codereview.appspot.com/6846044/diff/5/src/pkg/runtime/mgc0.c#newcode977
    src/pkg/runtime/mgc0.c:977: // The atomic operations are not atomic if
    the uint64s
    On 2012/11/26 20:40:19, rsc wrote:
    This can move into runtime.gc.
    Done.

    https://codereview.appspot.com/6846044/diff/5/src/pkg/runtime/mgc0.c#newcode1092
    src/pkg/runtime/mgc0.c:1092: runtime·gc(1);
    On 2012/11/26 20:40:19, rsc wrote:
    Maybe this should be done in runtime.gc instead?
    I'm a little weirded out by runtime.gc calling gc on a new stack calling
    runtime.gc calling gc on a new stack.
    Done.

    https://codereview.appspot.com/6846044/
  • 0xe2 0x9a 0x9b at Nov 27, 2012 at 5:48 am
    Hello rsc@golang.org, remyoudompheng@gmail.com, dsymonds@golang.org,
    minux.ma@gmail.com, iant@golang.org, iant@google.com (cc:
    golang-dev@googlegroups.com),

    Please take another look.


    http://codereview.appspot.com/6846044/
  • Rsc at Nov 27, 2012 at 3:21 pm
  • Russ Cox at Nov 27, 2012 at 5:45 pm
    I tried to patch this in to my tree to submit it, but when I run
    all.bash the first Go binary (the bootstrap go command) crashes with

    runtime: memory allocated by OS (0x2210517000) not in usable range
    [0x2103a6000,0x2903a6000)
    runtime: out of memory: cannot allocate 585205940224-byte block (1048576 in use)
    throw: out of memory

    goroutine 1 [running]:
    ----- stack segment boundary -----
    regexp/syntax.(*parser).factor(0x210438000, 0x210439080, 0xa, 0x10, 0x0, ...)
    /Users/rsc/g/go/src/pkg/regexp/syntax/parse.go:417 +0x1b74
    regexp/syntax.(*parser).collapse(0x210438000, 0x210439010, 0xa, 0xe,
    0x210415113, ...)
    /Users/rsc/g/go/src/pkg/regexp/syntax/parse.go:347 +0x25d
    regexp/syntax.(*parser).alternate(0x210438000, 0x2103a6701, 0x0, 0x0)
    /Users/rsc/g/go/src/pkg/regexp/syntax/parse.go:302 +0x1e9
    regexp/syntax.(*parser).parseRightParen(0x210438000, 0x6f, 0x6f, 0x290dbe)
    /Users/rsc/g/go/src/pkg/regexp/syntax/parse.go:1166 +0x89
    regexp/syntax.Parse(0x290d80, 0xbb, 0xd4, 0xffffffffffffffff, 0x3721a0, ...)
    /Users/rsc/g/go/src/pkg/regexp/syntax/parse.go:705 +0x5ca
    regexp.compile(0x290d80, 0xbb, 0xd4, 0x4, 0xffffffffffffffff, ...)
    /Users/rsc/g/go/src/pkg/regexp/regexp.go:134 +0x40
    regexp.Compile(0x290d80, 0xbb, 0x53d5c, 0x174948, 0x0, ...)
    /Users/rsc/g/go/src/pkg/regexp/regexp.go:107 +0x3b
    regexp.MustCompile(0x290d80, 0xbb, 0x2000, 0x210427180, 0xf20bd, ...)
    /Users/rsc/g/go/src/pkg/regexp/regexp.go:195 +0x38
    go/doc.init(0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /Users/rsc/g/go/src/pkg/go/doc/synopsis.go:-1878 +0x8b
    main.init()
    /Users/rsc/g/go/src/cmd/go/vet.go:37 +0x54

    goroutine 2 [runnable]:
    created by runtime.main
    /Users/rsc/g/go/src/pkg/runtime/proc.c:225

    Any ideas?

    Russ
  • 0xe2 0x9a 0x9b at Nov 27, 2012 at 5:53 pm
    Hello rsc@golang.org, remyoudompheng@gmail.com, dsymonds@golang.org,
    minux.ma@gmail.com, iant@golang.org, iant@google.com (cc:
    golang-dev@googlegroups.com),

    Please take another look.


    http://codereview.appspot.com/6846044/
  • 0xe2 0x9a 0x9b at Nov 27, 2012 at 5:59 pm
    It appears to be caused by the position of the new field
    'moreframesize_minalloc' in struct M. This affects amd64 only. If I move
    the field lower (below 'racepc') it works fine on amd64.

    I wasn't able to determine the actual cause of the problem.
    On 2012/11/27 17:20:56, rsc wrote:
    I tried to patch this in to my tree to submit it, but when I run
    all.bash the first Go binary (the bootstrap go command) crashes with
    runtime: memory allocated by OS (0x2210517000) not in usable range
    [0x2103a6000,0x2903a6000)
    runtime: out of memory: cannot allocate 585205940224-byte block
    (1048576 in
    use)
    throw: out of memory
    goroutine 1 [running]:
    ----- stack segment boundary -----
    regexp/syntax.(*parser).factor(0x210438000, 0x210439080, 0xa, 0x10, 0x0, ...)
    /Users/rsc/g/go/src/pkg/regexp/syntax/parse.go:417 +0x1b74
    regexp/syntax.(*parser).collapse(0x210438000, 0x210439010, 0xa, 0xe,
    0x210415113, ...)
    /Users/rsc/g/go/src/pkg/regexp/syntax/parse.go:347 +0x25d
    regexp/syntax.(*parser).alternate(0x210438000, 0x2103a6701, 0x0, 0x0)
    /Users/rsc/g/go/src/pkg/regexp/syntax/parse.go:302 +0x1e9
    regexp/syntax.(*parser).parseRightParen(0x210438000, 0x6f, 0x6f, 0x290dbe)
    /Users/rsc/g/go/src/pkg/regexp/syntax/parse.go:1166 +0x89
    regexp/syntax.Parse(0x290d80, 0xbb, 0xd4, 0xffffffffffffffff,
    0x3721a0, ...)
    /Users/rsc/g/go/src/pkg/regexp/syntax/parse.go:705 +0x5ca
    regexp.compile(0x290d80, 0xbb, 0xd4, 0x4, 0xffffffffffffffff, ...)
    /Users/rsc/g/go/src/pkg/regexp/regexp.go:134 +0x40
    regexp.Compile(0x290d80, 0xbb, 0x53d5c, 0x174948, 0x0, ...)
    /Users/rsc/g/go/src/pkg/regexp/regexp.go:107 +0x3b
    regexp.MustCompile(0x290d80, 0xbb, 0x2000, 0x210427180, 0xf20bd, ...)
    /Users/rsc/g/go/src/pkg/regexp/regexp.go:195 +0x38
    go/doc.init(0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /Users/rsc/g/go/src/pkg/go/doc/synopsis.go:-1878 +0x8b
    main.init()
    /Users/rsc/g/go/src/cmd/go/vet.go:37 +0x54
    goroutine 2 [runnable]:
    created by runtime.main
    /Users/rsc/g/go/src/pkg/runtime/proc.c:225
    Any ideas?
    Russ

    https://codereview.appspot.com/6846044/
  • Russ Cox at Nov 27, 2012 at 7:06 pm
    Worked. It's possible that the offset of the tls field is known
    somewhere deep. I don't mind just moving the field and leaving that
    mystery for another day. Thanks for the quick fix.

    Russ
  • Rsc at Nov 27, 2012 at 6:12 pm
    *** Submitted as
    http://code.google.com/p/go/source/detail?r=7646c94159a1 ***

    runtime: use reflect·call() to enter the function gc()

    Garbage collection code (to be merged later) is calling functions
    which have many local variables. This increases the probability that
    the stack capacity won't be big enough to hold the local variables.
    So, start gc() on a bigger stack to eliminate a potentially large number
    of calls to runtime·morestack().

    R=rsc, remyoudompheng, dsymonds, minux.ma, iant, iant
    CC=golang-dev
    http://codereview.appspot.com/6846044

    Committer: Russ Cox <rsc@golang.org>


    http://codereview.appspot.com/6846044/

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupgolang-dev @
categoriesgo
postedNov 13, '12 at 7:18a
activeNov 27, '12 at 7:06p
posts24
users6
websitegolang.org

People

Translate

site design / logo © 2022 Grokbase