I don't see the point of the exercise as far as optimizing Go is concerned. Your experiment just shows that your compiler (GCC?) missed an optimization that would have reduced backend latency.

You may also find that swapping the order of some of the instructions in the loop, such as the second and the third, reduces backend latency further.

I am not on a high-end Intel CPU now, but when I was, I found that with the buffer size adjusted to the L1 cache size (8192 32-bit words, or 32 kilobytes), eratspeed ran on an Intel 2700K @ 3.5 GHz at about 3.5 clock cycles per loop (about 405,000,000 loops for this range).

My current AMD Bulldozer CPU has a severe cache bottleneck and falls short of this speed by a factor of about two.
