Indeed, good advice. I am having some trouble interpreting the profiles though.
The CPU profile picks out a few known hotspots in our code (*), but as
the CPU is only quite lightly loaded (~25% in top), it's not clear how
I can translate that into missed deadlines:
Showing top 20 nodes out of 94 (cum >= 180ms)
    flat  flat%   sum%      cum   cum%
  1810ms 11.96% 11.96%   1810ms 11.96%  runtime.futex
  1360ms  8.99% 20.95%   1360ms  8.99%  func1*
  1150ms  7.60% 28.55%   1150ms  7.60%  runtime.usleep
   910ms  6.01% 34.57%   2480ms 16.39%  func2*
   840ms  5.55% 40.12%    840ms  5.55%  ExternalCode
   480ms  3.17% 43.29%    480ms  3.17%  scanblock
   460ms  3.04% 46.33%    970ms  6.41%  runtime.selectgoImpl
   460ms  3.04% 49.37%    460ms  3.04%  stepflt
   410ms  2.71% 52.08%    540ms  3.57%  func3*
   330ms  2.18% 54.26%    380ms  2.51%  runtime.mallocgc
   270ms  1.78% 56.05%    310ms  2.05%  runtime.cgocall_errno
   250ms  1.65% 57.70%    250ms  1.65%  findrunnable
   190ms  1.26% 58.96%    190ms  1.26%  runtime.nanotime
   190ms  1.26% 60.21%    190ms  1.26%  slowdodiv
   180ms  1.19% 61.40%    180ms  1.19%  crypto/elliptic.p256ReduceDegree
   170ms  1.12% 62.52%    180ms  1.19%  runtime.cas
   160ms  1.06% 63.58%    160ms  1.06%  runtime.epollwait
   140ms  0.93% 64.51%    140ms  0.93%  runtime.MSpan_Sweep
   140ms  0.93% 65.43%    140ms  0.93%  runtime.netpoll
   140ms  0.93% 66.36%    180ms  1.19%  runtime.unlock
Also, nearly 20% of the time is spent in futex and usleep, which is
rather surprising.
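For reference, the CPU profile is being collected roughly like this, assuming
the github.com/pkg/profile approach Dave suggested (a minimal sketch;
runCaptureLoop is just a stand-in for the real ALSA capture and processing
code):

package main

import (
	"time"

	"github.com/pkg/profile"
)

func main() {
	// Start CPU profiling; the profile is written out when Stop runs on exit.
	// Everything else runs as normal underneath.
	defer profile.Start(profile.CPUProfile).Stop()

	runCaptureLoop()
}

// runCaptureLoop stands in for the real capture/processing loop.
func runCaptureLoop() {
	time.Sleep(30 * time.Second)
}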
Looking at the block profile, the output buffer routine on the ALSA
input seems quite low down:
7158.65s of 7158.67s total ( 100%)
Dropped 113 nodes (cum <= 35.79s)
Showing top 10 nodes out of 39 (cum >= 630.11s)
      flat  flat%   sum%       cum   cum%
  5637.64s 78.75% 78.75%  5637.64s 78.75%  runtime.selectgo
  1154.66s 16.13% 94.88%  1154.66s 16.13%  runtime.chanrecv2
   309.02s  4.32% 99.20%   309.02s  4.32%  runtime.chanrecv1
    57.34s   0.8%   100%    57.34s   0.8%  runtime.chansend1
         0     0%   100%   327.87s  4.58%  subsystem1
         0     0%   100%   330.86s  4.62%  subsystem2
         0     0%   100%   340.31s  4.75%  subsystem3
         0     0%   100%   325.51s  4.55%  AlsaIn.outputBuffer
And in fact, according to "tree", that seems to be blocking on chanrecv,
so I guess that is the receive side of the buffer waiting for data. Is
that interpretation plausible? Are there any tricks to interpreting the
blocking profile?
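For what it's worth, my mental model is something like the sketch below (names
made up): a consumer blocked on an empty channel accrues a huge amount of
block-profile time even when everything is healthy, because the block profile
measures time spent waiting, not deadline misses.

package main

import (
	"os"
	"runtime"
	"runtime/pprof"
	"time"
)

func main() {
	runtime.SetBlockProfileRate(1) // record every blocking event

	frames := make(chan []int16, 8) // stand-in for the ALSA buffer channel

	// Producer: one frame every 40ms, like a 25Hz capture.
	go func() {
		for {
			time.Sleep(40 * time.Millisecond)
			frames <- make([]int16, 1024)
		}
	}()

	// Consumer: spends nearly all its time blocked in chanrecv, which is
	// harmless but dominates the block profile totals.
	go func() {
		for range frames {
		}
	}()

	time.Sleep(5 * time.Second)
	f, _ := os.Create("block.pprof")
	pprof.Lookup("block").WriteTo(f, 0)
	f.Close()
}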
Looking at the output from gctrace, I see there is a fair bit of garbage
collection going on (although I didn't see much of it in the CPU
profile, so I don't know how significant it is). I guess I care most
about mark time and stop-the-world time, i.e. the second and third
numbers? An average-looking trace:
gc369(2): 164+8993+13325+229 us, 9 -> 29 MB, 56669 (4017784-3961115) objects, 52 goroutines, 3248/2023/0 sweeps, 2(61) handoff, 6(107) steal, 26/14/1 yields
gc370(2): 134+45+14261+189 us, 23 -> 46 MB, 58784 (4064232-4005448) objects, 51 goroutines, 3248/2459/0 sweeps, 1(9) handoff, 6(107) steal, 26/18/1 yields
gc371(2): 67+52+15610+161 us, 4 -> 9 MB, 35209 (4087154-4051945) objects, 51 goroutines, 3248/3210/0 sweeps, 0(0) handoff, 6(107) steal, 16/10/2 yields
gc372(2): 300+8014+23347+162 us, 6 -> 12 MB, 38822 (4113035-4074213) objects, 51 goroutines, 3248/1326/0 sweeps, 0(0) handoff, 6(103) steal, 16/10/1 yields
gc373(2): 223+47+15901+163 us, 7 -> 15 MB, 43224 (4142464-4099240) objects, 51 goroutines, 3248/1627/0 sweeps, 0(0) handoff, 4(98) steal, 16/10/1 yields
gc374(2): 156+40+17592+170 us, 9 -> 29 MB, 46852 (4174458-4127606) objects, 52 goroutines, 3248/2027/0 sweeps, 2(18) handoff, 6(107) steal, 36/21/2 yields
gc375(2): 489+46+14025+424 us, 23 -> 46 MB, 67353 (4231717-4164364) objects, 51 goroutines, 3256/2411/0 sweeps, 1(3) handoff, 6(109) steal, 16/10/2 yields
gc376(2): 486+44+14404+291 us, 4 -> 9 MB, 34254 (4253767-4219513) objects, 51 goroutines, 3256/3222/0 sweeps, 1(3) handoff, 6(109) steal, 26/20/2 yields
gc377(2): 652+44+15105+206 us, 5 -> 11 MB, 38130 (4279075-4240945) objects, 51 goroutines, 3256/1297/0 sweeps, 0(0) handoff, 6(107) steal, 16/10/4 yields
gc378(2): 938+50+15465+350 us, 7 -> 14 MB, 42766 (4308216-4265450) objects, 51 goroutines, 3256/1573/0 sweeps, 0(0) handoff, 6(106) steal, 16/10/2 yields
gc379(2): 1343+44+17240+163 us, 9 -> 18 MB, 51718 (4346057-4294339) objects, 56 goroutines, 3256/1974/0 sweeps, 0(0) handoff, 6(105) steal, 16/10/4 yields
gc380(2): 33+44+7353+136 us, 12 -> 34 MB, 20205 (4348012-4327807) objects, 57 goroutines, 3256/2461/0 sweeps, 1(3) handoff, 6(109) steal, 16/10/1 yields
scvg12: inuse: 53, idle: 0, sys: 54, released: 0, consumed: 54 (MB)
That seems to suggest I will regularly see delays of around 20ms, and I
guess occasional spikes beyond that. Although I should probably be able
to cope with that, is there any way I can mitigate it?
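One thing I can think of trying (assuming the per-frame allocation churn is
what drives the GC frequency, and besides simply raising GOGC to trade memory
for fewer collections) is reusing the capture buffers instead of allocating a
fresh slice for every read, e.g. with sync.Pool. A rough sketch with made-up
names, not the real code:

package main

import (
	"fmt"
	"sync"
)

// bufPool recycles capture buffers so the steady-state allocation rate,
// and therefore how often the GC has to run, goes down.
var bufPool = sync.Pool{
	New: func() interface{} { return make([]int16, 4096) },
}

// fakeRead stands in for one snd_pcm_readi round trip.
func fakeRead(buf []int16) int { return len(buf) }

func main() {
	frames := make(chan []int16, 16)

	// Producer: take a pooled buffer per read instead of allocating one.
	go func() {
		for i := 0; i < 100; i++ {
			buf := bufPool.Get().([]int16)
			n := fakeRead(buf)
			frames <- buf[:n]
		}
		close(frames)
	}()

	// Consumer: return each buffer to the pool once finished with it.
	for buf := range frames {
		_ = buf // process the frame here
		bufPool.Put(buf[:cap(buf)])
	}
	fmt.Println("done")
}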
On 19 May 2015 at 01:29, Dave Cheney wrote:
Don't guess, profile.
env GODEBUG=gctrace=1 ./yourprogram
github.com/pkg/profile and use the cpu or block profile options.
On Tuesday, 19 May 2015 07:51:20 UTC+10, Klaus Post wrote:
On Monday, May 18, 2015 at 10:25:43 PM UTC+2, Will Newton wrote:
However, on slower machines this code will occasionally report an ALSA
buffer overrun, which means that the call to snd_pcm_readi did not
happen in time. At this point the system is only lightly loaded and
the time constraints are relatively relaxed (approx. 1/25 of a
second).
GC is a pretty good candidate. With 25 buffers/sec, that leaves an absolute
maximum of 40ms for a GC sweep. Depending on the speed of the computer and
the number of allocated objects, that might not be enough.
Would it be possible to buffer on the C side of your code, or possibly
have a bigger buffer? Maybe a thread outside Go? Granted, that gives
additional latency, which may be problematic, but I think a 40ms guarantee
is a bit much to expect from the GC.
/Klaus
--
Will Newton
Software - Cocoon (http://cocoon.life)