FAQ

Search Discussions

  • Rsc at Sep 17, 2012 at 9:08 pm
    Overall seems okay but there are things to clean up.



    http://codereview.appspot.com/6443115/diff/7026/src/cmd/go/test.go
    File src/cmd/go/test.go (right):

    http://codereview.appspot.com/6443115/diff/7026/src/cmd/go/test.go#newcode112
    src/cmd/go/test.go:112: -test.contentionprofile cont.out
    Can we shorten contention to something? We write mem not memory. You
    have to invoke the binary by hand to get these so you're talking about a
    23-byte flag.

    lockprofile?
    blockprofile?

    http://codereview.appspot.com/6443115/diff/7026/src/pkg/runtime/mprof.goc
    File src/pkg/runtime/mprof.goc (right):

    http://codereview.appspot.com/6443115/diff/7026/src/pkg/runtime/mprof.goc#newcode290
    src/pkg/runtime/mprof.goc:290: if(rate > cycles &&
    runtime·fastrand1()%rate > cycles)
    Is this correct sampling?

    http://codereview.appspot.com/6443115/diff/7026/src/pkg/runtime/pprof/sys_linux.go
    File src/pkg/runtime/pprof/sys_linux.go (right):

    http://codereview.appspot.com/6443115/diff/7026/src/pkg/runtime/pprof/sys_linux.go#newcode13
    src/pkg/runtime/pprof/sys_linux.go:13: c = 2e9 // wild guess
    This needs to be fixed before submit.

    http://codereview.appspot.com/6443115/diff/7026/src/pkg/runtime/pprof/sys_other.go
    File src/pkg/runtime/pprof/sys_other.go (right):

    http://codereview.appspot.com/6443115/diff/7026/src/pkg/runtime/pprof/sys_other.go#newcode10
    src/pkg/runtime/pprof/sys_other.go:10: // TODO(dvyukov): not
    implemented.
    This needs to be fixed before submit.
    Maybe it should be cpu-specific code instead of OS-specific code.
    I would expect a cputickfreq function to live next to cputicks so that
    they are in sync.

    http://codereview.appspot.com/6443115/diff/7026/src/pkg/runtime/sema.goc
    File src/pkg/runtime/sema.goc (right):

    http://codereview.appspot.com/6443115/diff/7026/src/pkg/runtime/sema.goc#newcode27
    src/pkg/runtime/sema.goc:27: uint32 volatile* addr;
    You don't have to undo these, but in the future please avoid spacing
    changes like this. They clutter the diffs.

    http://codereview.appspot.com/6443115/
  • Dmitry Vyukov at Sep 18, 2012 at 7:49 pm

    On Mon, Sep 17, 2012 at 2:08 PM, wrote:

    Overall seems okay but there are things to clean up.

    http://codereview.appspot.com/**6443115/diff/7026/src/cmd/go/**test.go<http://codereview.appspot.com/6443115/diff/7026/src/cmd/go/test.go>
    File src/cmd/go/test.go (right):

    http://codereview.appspot.com/**6443115/diff/7026/src/cmd/go/**
    test.go#newcode112<http://codereview.appspot.com/6443115/diff/7026/src/cmd/go/test.go#newcode112>
    src/cmd/go/test.go:112: -test.contentionprofile cont.out
    Can we shorten contention to something? We write mem not memory. You
    have to invoke the binary by hand to get these so you're talking about a
    23-byte flag.

    lockprofile?
    blockprofile?
    I like blockprofile.
    Do you mean I replace contention->block everywhere, or just the flag name?
  • Russ Cox at Sep 18, 2012 at 7:54 pm
    I meant just the flag name but probably the other API names should match.
  • Dave at Sep 18, 2012 at 5:43 pm
    Can you please remail this, it doesn't apply cleanly any more.

    Also, there may be a problem with an additional dependency (io/ioutil),
    which causes pkg/go/build/deps_test.go to fail.


    https://codereview.appspot.com/6443115/
  • Dvyukov at Sep 18, 2012 at 9:13 pm

    On 2012/09/18 17:19:49, dfc wrote:
    Can you please remail this, it doesn't apply cleanly any more.
    Done
    Do you use it? What is your experience?


    http://codereview.appspot.com/6443115/
  • Dave Cheney at Sep 18, 2012 at 9:18 pm
    I haven't had a case to use it apart from the issues that you raised
    for the net and http packages. I would like to try it on the
    go.crypto/ssh package and the tls pacakges, to see if contention (as
    opposed to crypto) is a significant % of the runtime.
    On Wed, Sep 19, 2012 at 7:06 AM, wrote:
    On 2012/09/18 17:19:49, dfc wrote:

    Can you please remail this, it doesn't apply cleanly any more.

    Done
    Do you use it? What is your experience?


    http://codereview.appspot.com/6443115/
  • Dvyukov at Sep 18, 2012 at 9:28 pm
    http://codereview.appspot.com/6443115/diff/27013/src/pkg/runtime/runtime.c
    File src/pkg/runtime/runtime.c (right):

    http://codereview.appspot.com/6443115/diff/27013/src/pkg/runtime/runtime.c#newcode363
    src/pkg/runtime/runtime.c:363:
    runtime/pprof·runtime_cyclesPerSecond(int64 res)
    How about this?

    On my machine where
    /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq says 2395000000,
    this method says:
    2393915609
    2393927695
    2393923912

    It is portable and allows to remove that pprof/sys_* files, and removes
    the additional dependency between runtime/pprof->io/ioutil as well.

    The more precise method would require either to examine CPUID output
    (it's quite tricky, works since Pentium4 and I am not sure about AMD). I
    have no idea how to do it on ARM. Or alternatively, OS-dependent method
    is to read it from system registry on Windows, and I don't know about
    other OSes.

    http://codereview.appspot.com/6443115/
  • Dave at Sep 18, 2012 at 9:33 pm
    I don't know if that will be reliable enough. Even with the performance
    governor enabled, the CPU freq on my i5 shifts between 2.5 and 3.0 Ghz
    depending on thermal load. Additionally for some systems C state
    sleeping may not be optional, which will throw this computation off as
    well.

    http://codereview.appspot.com/6443115/
  • Dmitry Vyukov at Sep 18, 2012 at 9:34 pm
    I would expect TSC to tick with constant freq regardless of TurboBoost and
    C states.

    On Tue, Sep 18, 2012 at 2:33 PM, wrote:

    I don't know if that will be reliable enough. Even with the performance
    governor enabled, the CPU freq on my i5 shifts between 2.5 and 3.0 Ghz
    depending on thermal load. Additionally for some systems C state
    sleeping may not be optional, which will throw this computation off as
    well.

    http://codereview.appspot.com/**6443115/<http://codereview.appspot.com/6443115/>
  • Dmitry Vyukov at Sep 18, 2012 at 9:43 pm
    Otherwise I would prefer to drop this at all, pprof assumes 2GHz by default
    which should be enough.

    On Tue, Sep 18, 2012 at 2:34 PM, Dmitry Vyukov wrote:

    I would expect TSC to tick with constant freq regardless of TurboBoost and
    C states.


    On Tue, Sep 18, 2012 at 2:33 PM, wrote:

    I don't know if that will be reliable enough. Even with the performance
    governor enabled, the CPU freq on my i5 shifts between 2.5 and 3.0 Ghz
    depending on thermal load. Additionally for some systems C state
    sleeping may not be optional, which will throw this computation off as
    well.

    http://codereview.appspot.com/**6443115/<http://codereview.appspot.com/6443115/>
  • Dvyukov at Sep 18, 2012 at 9:34 pm
    http://codereview.appspot.com/6443115/diff/7026/src/pkg/runtime/mprof.goc
    File src/pkg/runtime/mprof.goc (right):

    http://codereview.appspot.com/6443115/diff/7026/src/pkg/runtime/mprof.goc#newcode290
    src/pkg/runtime/mprof.goc:290: if(rate > cycles &&
    runtime·fastrand1()%rate > cycles)
    On 2012/09/17 21:08:07, rsc wrote:
    Is this correct sampling?
    Well, there may be better sampling techniques. But this is simple enough
    and filters proportionally to blocking time.

    http://codereview.appspot.com/6443115/
  • Dvyukov at Sep 18, 2012 at 10:17 pm

    On 2012/09/18 19:54:25, rsc wrote:
    I meant just the flag name but probably the other API names should
    match.

    Renamed Contention->Block everywhere.
    New API diff:
    +pkg net/http/pprof, func BlockProfileRate(http.ResponseWriter,
    *http.Request)
    +pkg runtime, func BlockProfile([]BlockProfileRecord) (int, bool)
    +pkg runtime, method (*BlockProfileRecord) Stack() []uintptr
    +pkg runtime, type BlockProfileRecord struct
    +pkg runtime, type BlockProfileRecord struct, Count int64
    +pkg runtime, type BlockProfileRecord struct, Cycles int64
    +pkg runtime, type BlockProfileRecord struct, embedded StackRecord
    +pkg runtime, var BlockProfileRate int
    PTAL

    http://codereview.appspot.com/6443115/
  • Minux Ma at Sep 19, 2012 at 9:26 am
    http://codereview.appspot.com/6443115/diff/27013/src/pkg/runtime/runtime.c
    File src/pkg/runtime/runtime.c (right):

    http://codereview.appspot.com/6443115/diff/27013/src/pkg/runtime/runtime.c#newcode363
    src/pkg/runtime/runtime.c:363:
    runtime/pprof·runtime_cyclesPerSecond(int64 res)
    On 2012/09/18 21:28:42, dvyukov wrote:
    How about this?
    won't work on ARM.
    Linux/ARM doesn't allow user space access to the cycle counter.

    so runtime.cputicks() on ARM actually returns a random number
    (for seeding our hash map)
    The more precise method would require either to examine CPUID output
    (it's quite
    tricky, works since Pentium4 and I am not sure about AMD). I have no
    idea how to
    do it on ARM. Or alternatively, OS-dependent method is to read it from system
    registry on Windows, and I don't know about other OSes.
    I think the best way is to get it from the OS.

    http://codereview.appspot.com/6443115/
  • Rsc at Sep 20, 2012 at 6:47 pm
    LGTM

    Please wait for r to take a look at the new API + testing flags too.


    http://codereview.appspot.com/6443115/
  • R at Sep 20, 2012 at 6:57 pm
    http://codereview.appspot.com/6443115/diff/36001/src/cmd/go/test.go
    File src/cmd/go/test.go (right):

    http://codereview.appspot.com/6443115/diff/36001/src/cmd/go/test.go#newcode119
    src/cmd/go/test.go:119: aims to sample an average of one blocking event
    per that many
    s/that many/n/
    or else i don't understand this comment

    http://codereview.appspot.com/6443115/diff/36001/src/cmd/go/test.go#newcode120
    src/cmd/go/test.go:120: CPU cycles. By default all blocking events are
    recorded.
    by default, what is the value of n?
    i guess i don't understand this comment.

    http://codereview.appspot.com/6443115/diff/36001/src/pkg/runtime/debug.go
    File src/pkg/runtime/debug.go (right):

    http://codereview.appspot.com/6443115/diff/36001/src/pkg/runtime/debug.go#newcode144
    src/pkg/runtime/debug.go:144: // one blocking event per BlockProfileRate
    CPU cycles.
    i'm still confused. i don't see how blocking events and CPU cycles are
    related. sorry to be so dim.

    http://codereview.appspot.com/6443115/diff/36001/src/pkg/runtime/debug.go#newcode148
    src/pkg/runtime/debug.go:148: var BlockProfileRate int = 0
    an int with the value of 0 or 1 sounds like a bool to me.

    http://codereview.appspot.com/6443115/
  • Russ Cox at Sep 20, 2012 at 7:47 pm
    time spent blocking is measured in cpu cycles (perhaps confusingly)
    if the rate is set to 1000 then it tries to sample one blocking event
    per 1000 cpu cycles spent blocked.

    probably the rate and the output should be in nanoseconds or something
    more portable.
  • Dvyukov at Sep 20, 2012 at 8:01 pm
    https://codereview.appspot.com/6443115/diff/36001/src/cmd/go/test.go
    File src/cmd/go/test.go (right):

    https://codereview.appspot.com/6443115/diff/36001/src/cmd/go/test.go#newcode120
    src/cmd/go/test.go:120: CPU cycles. By default all blocking events are
    recorded.
    On 2012/09/20 18:57:58, r wrote:
    by default, what is the value of n?
    i guess i don't understand this comment.
    The default value of n is 1.

    https://codereview.appspot.com/6443115/diff/36001/src/pkg/runtime/debug.go
    File src/pkg/runtime/debug.go (right):

    https://codereview.appspot.com/6443115/diff/36001/src/pkg/runtime/debug.go#newcode148
    src/pkg/runtime/debug.go:148: var BlockProfileRate int = 0
    On 2012/09/20 18:57:58, r wrote:
    an int with the value of 0 or 1 sounds like a bool to me.
    The intention is the same as with MemProfileRate:

    n<=0 - turn off the profiler
    n=1 - record every blocking event
    n>1 - sample some fraction of events: if we have a blocking event of
    duration x CPU cycles, then it's sampled with probability min(1, x/n).
    So on average it's one blocking event per n CPU cycles of... total
    blocking events.

    I think the process will be faster, if you propose your wording (if I
    succeed in explaining the intention).

    https://codereview.appspot.com/6443115/
  • Dvyukov at Sep 20, 2012 at 8:16 pm

    On 2012/09/20 19:47:15, rsc wrote:
    time spent blocking is measured in cpu cycles (perhaps confusingly)
    if the rate is set to 1000 then it tries to sample one blocking event
    per 1000 cpu cycles spent blocked.
    probably the rate and the output should be in nanoseconds or something
    more portable.
    pprof expects input in cycles, pprof output is in seconds. This part
    looks fine.

    potentially BlockProfileRace may be measured in nanoseconds, but then I
    need to convert ns to cycles somewhere... I think it can be easily done
    in runtime·blockevent() (mprof.goc).


    https://codereview.appspot.com/6443115/
  • Russ Cox at Sep 20, 2012 at 8:12 pm

    potentially BlockProfileRace may be measured in nanoseconds, but then I
    need to convert ns to cycles somewhere... I think it can be easily done
    in runtime·blockevent() (mprof.goc).
    That's fine, but given that we can't even agree on how well to define
    cycles, they should be an internal detail. The API exposed by package
    runtime should talk about nanoseconds.

    Russ
  • Dmitry Vyukov at Sep 20, 2012 at 8:14 pm

    On Thu, Sep 20, 2012 at 1:12 PM, Russ Cox wrote:

    potentially BlockProfileRace may be measured in nanoseconds, but then I
    need to convert ns to cycles somewhere... I think it can be easily done
    in runtime·blockevent() (mprof.goc).
    That's fine, but given that we can't even agree on how well to define
    cycles, they should be an internal detail. The API exposed by package
    runtime should talk about nanoseconds.
    OK, I will change it to ns.
  • Dvyukov at Sep 20, 2012 at 10:31 pm
    PTAL

    Now BlockProfileRate is in ns.
    Fixed comments.
    Added runtime·tickspersecond(void), it caches the value so it is stable
    during execution.



    https://codereview.appspot.com/6443115/
  • R at Sep 21, 2012 at 4:57 am
    Now I'm bothered by the idea of a rate being measure in units of time
    rather than frequency. Sorry. I know it's analogous with the memory one
    but that didn't have units of time, so the rate/time thing didn't stick
    out.

    But I don't have a better idea, so maybe this will stand. Wording will
    help. I'll work on that.


    https://codereview.appspot.com/6443115/
  • R at Sep 21, 2012 at 4:57 am
    https://codereview.appspot.com/6443115/diff/37006/src/cmd/go/test.go
    File src/cmd/go/test.go (right):

    https://codereview.appspot.com/6443115/diff/37006/src/cmd/go/test.go#newcode120
    src/cmd/go/test.go:120: spent blocked. By default all blocking events
    are recorded.
    Control the detail provided in goroutine blocking profiles by setting
    runtime.BlockProfileRate to n. See 'godoc runtime BlockProfileRate'.
    The profiler aims to sample, on average, one blocking event every
    n nanoseconds the program (or is it a goroutine??) spends blocked. By
    default, if -test.blockprofile is set without this flag, all blocking
    events
    are recorded, equivalent to -testblockprofile=1.

    https://codereview.appspot.com/6443115/diff/37006/src/pkg/runtime/debug.go
    File src/pkg/runtime/debug.go (right):

    https://codereview.appspot.com/6443115/diff/37006/src/pkg/runtime/debug.go#newcode142
    src/pkg/runtime/debug.go:142: // that are recorded and reported in the
    blocking profile.
    s/recorded and// (it's redundant)

    https://codereview.appspot.com/6443115/diff/37006/src/pkg/runtime/debug.go#newcode144
    src/pkg/runtime/debug.go:144: // one blocking event per BlockProfileRate
    nanoseconds spent blocked.
    what does blocked mean? is it global, goroutine, mutex, channel...?

    https://codereview.appspot.com/6443115/
  • Russ Cox at Sep 21, 2012 at 5:06 am

    src/pkg/runtime/debug.go:144: // one blocking event per BlockProfileRate
    nanoseconds spent blocked.
    what does blocked mean? is it global, goroutine, mutex, channel...?
    it's a random sampling rate, so it's not per-anything.
    each time something gets blocked there is a chance
    it will be sampled.
  • Dvyukov at Sep 22, 2012 at 3:52 am
    PTAL


    https://codereview.appspot.com/6443115/diff/37006/src/cmd/go/test.go
    File src/cmd/go/test.go (right):

    https://codereview.appspot.com/6443115/diff/37006/src/cmd/go/test.go#newcode120
    src/cmd/go/test.go:120: spent blocked. By default all blocking events
    are recorded.
    On 2012/09/21 04:57:37, r wrote:
    Control the detail provided in goroutine blocking profiles by setting
    runtime.BlockProfileRate to n. See 'godoc runtime BlockProfileRate'.
    The profiler aims to sample, on average, one blocking event every
    n nanoseconds the program (or is it a goroutine??)
    It is per program.
    spends blocked. By
    default, if -test.blockprofile is set without this flag, all blocking events
    are recorded, equivalent to -testblockprofile=1.
    Done.

    https://codereview.appspot.com/6443115/diff/37006/src/pkg/runtime/debug.go
    File src/pkg/runtime/debug.go (right):

    https://codereview.appspot.com/6443115/diff/37006/src/pkg/runtime/debug.go#newcode142
    src/pkg/runtime/debug.go:142: // that are recorded and reported in the
    blocking profile.
    On 2012/09/21 04:57:37, r wrote:
    s/recorded and// (it's redundant)
    Done.

    https://codereview.appspot.com/6443115/diff/37006/src/pkg/runtime/debug.go#newcode144
    src/pkg/runtime/debug.go:144: // one blocking event per BlockProfileRate
    nanoseconds spent blocked.
    On 2012/09/21 04:57:37, r wrote:
    what does blocked mean? is it global, goroutine, mutex, channel...?
    Blocking event is when a goroutine blocks on
    mutex/channel/waitgroup/etc.

    https://codereview.appspot.com/6443115/
  • Dvyukov at Sep 26, 2012 at 9:49 pm
    Is it OK to submit now?
    On 2012/09/22 03:51:57, dvyukov wrote:
    PTAL
    https://codereview.appspot.com/6443115/diff/37006/src/cmd/go/test.go
    File src/cmd/go/test.go (right):

    https://codereview.appspot.com/6443115/diff/37006/src/cmd/go/test.go#newcode120
    src/cmd/go/test.go:120: spent blocked. By default all blocking events are
    recorded.
    On 2012/09/21 04:57:37, r wrote:
    Control the detail provided in goroutine blocking profiles by
    setting
    runtime.BlockProfileRate to n. See 'godoc runtime BlockProfileRate'.
    The profiler aims to sample, on average, one blocking event every
    n nanoseconds the program (or is it a goroutine??)
    It is per program.
    spends blocked. By
    default, if -test.blockprofile is set without this flag, all
    blocking events
    are recorded, equivalent to -testblockprofile=1.
    Done.

    https://codereview.appspot.com/6443115/diff/37006/src/pkg/runtime/debug.go
    File src/pkg/runtime/debug.go (right):

    https://codereview.appspot.com/6443115/diff/37006/src/pkg/runtime/debug.go#newcode142
    src/pkg/runtime/debug.go:142: // that are recorded and reported in the blocking
    profile.
    On 2012/09/21 04:57:37, r wrote:
    s/recorded and// (it's redundant)
    Done.

    https://codereview.appspot.com/6443115/diff/37006/src/pkg/runtime/debug.go#newcode144
    src/pkg/runtime/debug.go:144: // one blocking event per
    BlockProfileRate
    nanoseconds spent blocked.
    On 2012/09/21 04:57:37, r wrote:
    what does blocked mean? is it global, goroutine, mutex, channel...?
    Blocking event is when a goroutine blocks on
    mutex/channel/waitgroup/etc.



    http://codereview.appspot.com/6443115/
  • Minux Ma at Sep 27, 2012 at 9:39 am

    On 2012/09/26 21:49:28, dvyukov wrote:
    Is it OK to submit now?
    The only problem is that this feature is completely unusable
    on ARM due to lack of correct runtime.cputicks().

    I suggest you document this as a known BUG.

    https://codereview.appspot.com/6443115/
  • Dmitry Vyukov at Sep 27, 2012 at 6:53 pm
    Can't we use nanotime() or something like that on ARM? It may be slower,
    but still better than nothing. The profiler does not care too much that the
    cpu ticks are actually cpu ticks, it just needs some relatively precise
    time source.


    On Thu, Sep 27, 2012 at 2:39 AM, wrote:
    On 2012/09/26 21:49:28, dvyukov wrote:

    Is it OK to submit now?
    The only problem is that this feature is completely unusable
    on ARM due to lack of correct runtime.cputicks().

    I suggest you document this as a known BUG.

    https://codereview.appspot.**com/6443115/<https://codereview.appspot.com/6443115/>
  • Dave at Sep 29, 2012 at 4:32 am
    Let's try with nanotime() on arm in the spirit of getting this in.

    http://codereview.appspot.com/6443115/

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupgolang-dev @
categoriesgo
postedSep 11, '12 at 12:50a
activeSep 29, '12 at 4:32a
posts30
users5
websitegolang.org

People

Translate

site design / logo © 2022 Grokbase