Hi all,

I have a problem with goroutines that I'm not sure how to solve. I
have some ALSA bindings that I use to capture audio:

https://github.com/cocoonlife/goalsa/

On top of this I have a loop that looks like:

   for {
      // This C function will block
      buf := C.snd_pcm_readi(..)
      // Write output to a buffered channel
      ch <- buf
   }

However on slower machines this code will occasionally report an ALSA
buffer overrun, which means that the call to snd_pcm_readi did not
happen in time. At this point the system is only lightly loaded and
the time constraints are relatively relaxed (approx. 1/25 of a
second).

Has anybody got any suggestions for how I can improve this situation,
e.g. would LockOSThread() help?

Are there any good ways to debug this type of problem? For example, the
blocking profiler tells me I am blocking mostly in cgo, but that is
what I expect.
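
For reference, a LockOSThread version would look roughly like the sketch
below. This is only a sketch: captureLoop is a hypothetical stand-in for the
read loop above, and whether pinning actually helps is exactly what I'm
asking.

   go func() {
      // Keep this goroutine, and therefore all of the blocking
      // ALSA/cgo calls it makes, on one dedicated OS thread.
      runtime.LockOSThread()
      defer runtime.UnlockOSThread()
      captureLoop(ch)
   }()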

Thanks,

--
Will Newton
Software - Cocoon (http://cocoon.life)


  • John Asmuth at May 18, 2015 at 9:07 pm
    Perhaps whatever is on the other end of ch is not receiving fast enough? Is
    the channel buffered?
  • Will Newton at May 18, 2015 at 9:15 pm
    Yes, that's possible, but the channel is buffered (see the comment in the
    code above), so that would mean a pretty huge scheduling delay. I could
    increase the buffer size, but it feels to me like that would be papering
    over the real problem.
  • John Asmuth at May 18, 2015 at 9:30 pm
    I suggest increasing the buffer size, at least experimentally. If it
    lessens the problem, at least you know the core cause (other end isn't
    reading fast enough) and can address it more directly.
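
    As a rough sketch of that experiment (the element type and sizes here are
    made up), you could combine a much larger buffer with a non-blocking send,
    so a slow consumer shows up as a drop counter instead of back-pressure on
    the capture loop:

        ch := make(chan []int16, 64) // try a much larger capacity

        select {
        case ch <- buf:
        default:
            dropped++ // receiver fell behind; count it rather than block
        }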
  • Klaus Post at May 18, 2015 at 9:51 pm

    GC is a pretty good candidate. With 25 buffers/sec, that leaves an absolute
    maximum of 40ms for a GC sweep. Depending on the speed of the computer and
    the number of allocated objects, that might not be enough.

    Would it be possible to buffer on the C-side of your code, or possibly have
    a bigger buffer? Maybe a thread outside Go? Granted that gives additional
    latency, which may be problematic, but I think a 40ms guarantee is a bit
    much to expect from the GC.
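
    A rough cgo sketch of the "bigger buffer" route, with handle and the frame
    count as placeholders and error handling omitted; this would sit alongside
    the existing hw_params configuration rather than replace it:

        var params *C.snd_pcm_hw_params_t
        C.snd_pcm_hw_params_malloc(&params)
        C.snd_pcm_hw_params_any(handle, params)
        // ... existing access/format/rate/channel settings go here ...
        want := C.snd_pcm_uframes_t(16384) // arbitrary example size
        C.snd_pcm_hw_params_set_buffer_size_near(handle, params, &want)
        C.snd_pcm_hw_params(handle, params)
        C.snd_pcm_hw_params_free(params)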

    /Klaus

  • Dave Cheney at May 19, 2015 at 12:29 am
    Don't guess, profile.

    env GODEBUG=gctrace=1 ./yourprogram

    Use github.com/pkg/profile with the CPU or block profile options.
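
    A minimal sketch of that setup (put it at the top of main; swap
    profile.BlockProfile for profile.CPUProfile as needed):

        import "github.com/pkg/profile"

        func main() {
            defer profile.Start(profile.BlockProfile).Stop()
            // ... run the audio pipeline as usual ...
        }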
  • Will Newton at May 19, 2015 at 9:36 am
    Indeed, good advice. I am having some trouble interpreting the profiles though.

    The CPU profile picks out a few known hotspots in our code (*), but as
    the CPU is only quite lightly loaded (~25% in top), it's not clear how
    I can translate that into missed deadlines:

    Showing top 20 nodes out of 94 (cum >= 180ms)
           flat flat% sum% cum cum%
         1810ms 11.96% 11.96% 1810ms 11.96% runtime.futex
         1360ms 8.99% 20.95% 1360ms 8.99% func1*
         1150ms 7.60% 28.55% 1150ms 7.60% runtime.usleep
          910ms 6.01% 34.57% 2480ms 16.39% func2*
          840ms 5.55% 40.12% 840ms 5.55% ExternalCode
          480ms 3.17% 43.29% 480ms 3.17% scanblock
          460ms 3.04% 46.33% 970ms 6.41% runtime.selectgoImpl
          460ms 3.04% 49.37% 460ms 3.04% stepflt
          410ms 2.71% 52.08% 540ms 3.57% func3*
          330ms 2.18% 54.26% 380ms 2.51% runtime.mallocgc
          270ms 1.78% 56.05% 310ms 2.05% runtime.cgocall_errno
          250ms 1.65% 57.70% 250ms 1.65% findrunnable
          190ms 1.26% 58.96% 190ms 1.26% runtime.nanotime
          190ms 1.26% 60.21% 190ms 1.26% slowdodiv
          180ms 1.19% 61.40% 180ms 1.19% crypto/elliptic.p256ReduceDegree
          170ms 1.12% 62.52% 180ms 1.19% runtime.cas
          160ms 1.06% 63.58% 160ms 1.06% runtime.epollwait
          140ms 0.93% 64.51% 140ms 0.93% runtime.MSpan_Sweep
          140ms 0.93% 65.43% 140ms 0.93% runtime.netpoll
          140ms 0.93% 66.36% 180ms 1.19% runtime.unlock

    Also, nearly 20% of the time is spent in futex and usleep, which is
    rather surprising.

    Looking at the block profile, the output buffer routine on the ALSA
    input seems quite low down:

    7158.65s of 7158.67s total ( 100%)
    Dropped 113 nodes (cum <= 35.79s)
    Showing top 10 nodes out of 39 (cum >= 630.11s)
           flat flat% sum% cum cum%
       5637.64s 78.75% 78.75% 5637.64s 78.75% runtime.selectgo
       1154.66s 16.13% 94.88% 1154.66s 16.13% runtime.chanrecv2
        309.02s 4.32% 99.20% 309.02s 4.32% runtime.chanrecv1
         57.34s 0.8% 100% 57.34s 0.8% runtime.chansend1
              0 0% 100% 327.87s 4.58% subsystem1
              0 0% 100% 330.86s 4.62% subsystem2
              0 0% 100% 340.31s 4.75% subsystem3
              0 0% 100% 325.51s 4.55% AlsaIn.outputBuffer

    And in fact according to "tree" that seems to be blocking on chanrecv,
    so I guess that is the receive side of the buffer waiting for data. Is
    that interpretation plausibly correct? Are there any tricks to
    interpreting the blocking profile?

    Looking at the output from gctrace, I see there is a bit of garbage
    collection going on (although I didn't see that much in the CPU
    profile, so I don't know how significant it is). I guess I care most
    about mark time and stop-the-world time, i.e. the second and third numbers?

    An average-looking trace:

    gc369(2): 164+8993+13325+229 us, 9 -> 29 MB, 56669 (4017784-3961115)
    objects, 52 goroutines, 3248/2023/0 sweeps, 2(61) handoff, 6(107)
    steal, 26/14/1 yields
    gc370(2): 134+45+14261+189 us, 23 -> 46 MB, 58784 (4064232-4005448)
    objects, 51 goroutines, 3248/2459/0 sweeps, 1(9) handoff, 6(107)
    steal, 26/18/1 yields
    gc371(2): 67+52+15610+161 us, 4 -> 9 MB, 35209 (4087154-4051945)
    objects, 51 goroutines, 3248/3210/0 sweeps, 0(0) handoff, 6(107)
    steal, 16/10/2 yields
    gc372(2): 300+8014+23347+162 us, 6 -> 12 MB, 38822 (4113035-4074213)
    objects, 51 goroutines, 3248/1326/0 sweeps, 0(0) handoff, 6(103)
    steal, 16/10/1 yields
    gc373(2): 223+47+15901+163 us, 7 -> 15 MB, 43224 (4142464-4099240)
    objects, 51 goroutines, 3248/1627/0 sweeps, 0(0) handoff, 4(98) steal,
    16/10/1 yields
    gc374(2): 156+40+17592+170 us, 9 -> 29 MB, 46852 (4174458-4127606)
    objects, 52 goroutines, 3248/2027/0 sweeps, 2(18) handoff, 6(107)
    steal, 36/21/2 yields
    gc375(2): 489+46+14025+424 us, 23 -> 46 MB, 67353 (4231717-4164364)
    objects, 51 goroutines, 3256/2411/0 sweeps, 1(3) handoff, 6(109)
    steal, 16/10/2 yields
    gc376(2): 486+44+14404+291 us, 4 -> 9 MB, 34254 (4253767-4219513)
    objects, 51 goroutines, 3256/3222/0 sweeps, 1(3) handoff, 6(109)
    steal, 26/20/2 yields
    gc377(2): 652+44+15105+206 us, 5 -> 11 MB, 38130 (4279075-4240945)
    objects, 51 goroutines, 3256/1297/0 sweeps, 0(0) handoff, 6(107)
    steal, 16/10/4 yields
    gc378(2): 938+50+15465+350 us, 7 -> 14 MB, 42766 (4308216-4265450)
    objects, 51 goroutines, 3256/1573/0 sweeps, 0(0) handoff, 6(106)
    steal, 16/10/2 yields
    gc379(2): 1343+44+17240+163 us, 9 -> 18 MB, 51718 (4346057-4294339)
    objects, 56 goroutines, 3256/1974/0 sweeps, 0(0) handoff, 6(105)
    steal, 16/10/4 yields
    gc380(2): 33+44+7353+136 us, 12 -> 34 MB, 20205 (4348012-4327807)
    objects, 57 goroutines, 3256/2461/0 sweeps, 1(3) handoff, 6(109)
    steal, 16/10/1 yields
    scvg12: inuse: 53, idle: 0, sys: 54, released: 0, consumed: 54 (MB)

    This seems to suggest I will regularly see delays of 20ms, and I guess
    spikes of more. Although I should probably be able to cope with that, is
    there any way I can mitigate it?
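
    One generic mitigation, sketched here purely as an illustration with
    made-up names, would be to allocate less per read so the GC runs less
    often, e.g. by recycling the per-period buffers through a sync.Pool:

        var bufPool = sync.Pool{
            New: func() interface{} { return make([]int16, periodSize) },
        }

        buf := bufPool.Get().([]int16)
        // fill buf in the capture loop, send it on the channel, and have
        // the consumer call bufPool.Put(buf) once it is done with the data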




  • Will Newton at May 19, 2015 at 8:26 am

    On 18 May 2015 at 22:51, Klaus Post wrote:
    Would it be possible to buffer on the C-side of your code, or possibly have
    a bigger buffer? Maybe a thread outside Go? Granted that gives additional
    latency, which may be problematic, but I think a 40ms guarantee is a bit
    much to expect from the GC.
    It's possible. ALSA is quite unpredictable with buffers: some hardware
    allows very large (multi-MB) buffers, while other hardware can only
    provide a few kB. It would be a shame to have to add an extra layer of
    complexity just for that case.

  • Kevin Malachowski at May 19, 2015 at 1:17 am
    Is that reading a sample at a time? I've run into problems where trying to
    send one sample at a time over a channel is too slow (i.e. sending on a
    channel 44,100 times a second doesn't scale to many goroutines). If the
    library doesn't have a function for reading multiple samples at a time, you
    could try manually buffering into a slice and sending slices of samples
    through your channels instead.
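
    A rough sketch of that batching idea (sizes, types and channel names are
    made up):

        const batch = 1024 // roughly 23ms of mono audio at 44.1kHz
        buf := make([]int16, 0, batch)
        for s := range samples { // samples: the per-sample source channel
            buf = append(buf, s)
            if len(buf) == batch {
                out <- buf // one send per batch instead of one per sample
                buf = make([]int16, 0, batch)
            }
        }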
  • Will Newton at May 19, 2015 at 8:50 am
    I'm reading one period of samples at a time. How large that is varies
    with the ALSA hardware parameters, but in this case it is about 1500
    samples.
