Profiling on Windows (where the occupancy problem happens) using the Go-recommended profiler (pprof) requires installing Perl, which I don't really want to do. I am moderately confident that sha1 hashing is more CPU intensive than allocating memory, based solely on my experience implementing sha1 in the past. I do not know how the Go memory allocator works. It's also possible that acquiring the lock for the channel is the source of contention.
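(For reference, a minimal sketch of collecting a CPU profile and a block profile with runtime/pprof alone; only viewing the resulting files needs the external pprof tool. The run function is a made-up stand-in for the program's producer and hashing goroutines.)

package main

import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

// run is a hypothetical stand-in for the producer + hashing goroutines
// of the original program.
func run() {}

func main() {
	// CPU profile: collecting it needs only the standard library; Perl is
	// only needed by the external viewer at analysis time.
	f, err := os.Create("cpu.prof")
	if err != nil {
		log.Fatal(err)
	}
	if err := pprof.StartCPUProfile(f); err != nil {
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile()

	// Block profile: records where goroutines wait (e.g. on channel sends
	// and receives), which is where channel-lock contention would show up.
	runtime.SetBlockProfileRate(1)
	defer func() {
		bf, err := os.Create("block.prof")
		if err != nil {
			log.Fatal(err)
		}
		defer bf.Close()
		pprof.Lookup("block").WriteTo(bf, 0)
	}()

	run()
}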

@Dave Cheney: I did read your answer, but it wasn't related to the question
I asked. Having each thread produce the data it will be hashing is likely
faster / more efficient, but that is beside the point: I am noticing
behavior which looks suspiciously like a scheduler priority issue.

The code posted (which is valid, working code) reliably causes the issue to
occur. Criticizing the code doesn't explore the possibility of there being
a scheduler issue.
On Monday, March 10, 2014 9:23:44 PM UTC-7, egon wrote:
On Tuesday, March 11, 2014 4:31:18 AM UTC+2, shif...@gmail.com wrote:

Please take care to see the intent of my original post. I would totally
agree that there are performance optimizations possible. This was a work in
progress when I posted it. The problem I am seeing is that it is not
saturating the processor, not that it is running slowly.
Please don't dismiss notes that you should profile your code better to
understand why things are not saturating. Once you do, the optimizations
may make more sense.

Also, usually when we look at code we have some comments about where it
could be better, so there's no point in keeping those comments to ourselves.

The hashing calls should be pretty expensive compared to filling in a
slice with about 1-2 bytes (and even then, it should be eating CPU too).
You are assuming, not knowing. I mean, exactly how much more expensive
is it? Why is it more expensive?

What is your suffix production rate & throughput? (measure only one
producer in main; a micro-benchmark like the sketch below would do)
What is your prefix consumption rate & throughput? (measure only one
consumer in main)
What is the main bottleneck in your code? (use pprof)
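Roughly what measuring those first two rates could look like with the testing package (put in a *_test.go file and run "go test -bench ."); nextSuffix is a made-up stand-in for the generator in the linked program, not its actual code:

package main

import (
	"crypto/sha1"
	"testing"
)

// nextSuffix is a hypothetical stand-in for the suffix generator: it
// treats buf as a little-endian counter and advances it by one.
func nextSuffix(buf []byte) {
	for i := range buf {
		buf[i]++
		if buf[i] != 0 {
			return
		}
	}
}

// BenchmarkProducer: how many suffixes one goroutine can generate per second.
func BenchmarkProducer(b *testing.B) {
	buf := make([]byte, 8)
	for i := 0; i < b.N; i++ {
		nextSuffix(buf)
	}
}

// BenchmarkConsumer: how many suffixes one goroutine can hash per second.
func BenchmarkConsumer(b *testing.B) {
	buf := make([]byte, 8)
	h := sha1.New()
	sum := make([]byte, 0, sha1.Size)
	for i := 0; i < b.N; i++ {
		nextSuffix(buf)
		h.Reset()
		h.Write(buf)
		sum = h.Sum(sum[:0])
	}
}

Comparing the two ns/op numbers tells you whether one producer can keep several consumers fed.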

I.e. the main thing we asked you to do is to profile your code better. It
may not be the easy answer you were looking for, but it should give you
the answer you're after, and you'll understand how to figure out these
things on your own.

+ egon

On Monday, March 10, 2014 1:21:00 AM UTC-7, Péter Szilágyi wrote:

Hi all,

Egon, take care, you removed the deep copy of the suffix slice in the
generator, so all subsequent calls will overwrite the previously generated
ones.

Btw, there are some serious optimization possibilities in the suffix
check (numbers based on a Core2); a sketch combining them follows the list.

- Creating a hasher is expensive. Don't do it in every iteration.
Create one at the beginning of checkSuffix and reset it after use (+33%
performance, from ~600K to ~800K hash/sec).
- Allocating a new buffer for the hasher is not needed. A hasher is
already a writer, which accumulates data until Sum is called, so bin the
buf and write directly into the hasher (+25%, from ~800K to ~1M hash/sec).
- In each iteration you convert your byte slice to a string 4 times
just for the prefix check. That is a fantastic performance penalty, and
it is not needed. The Go libs have a package similar to strings (i.e. bytes)
that can do the same prefix check without converting to a string
(+50%, from ~1M to ~1.5M hash/sec).
- Another small gain can be had by placing your full-prefix check
inside the logger's 3-character prefix check. If the first 3 characters
don't match, why bother checking all 8 again?
- Lastly, a hasher Write will *never* return an error
(http://golang.org/pkg/hash/#Hash), hence the two error checks are not
needed (i.e. less branching).
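Roughly, a check loop with all of the above applied could look like this; it is a sketch rather than Péter's actual code, and the channel names, prefix bytes and wiring in main are made up:

package main

import (
	"bytes"
	"crypto/sha1"
	"fmt"
)

// checkSuffixes is a hypothetical consumer combining the points above:
// one hasher reused across iterations, suffixes written straight into it,
// and the prefix compared on raw bytes instead of strings.
func checkSuffixes(prefix []byte, suffixes <-chan []byte, found chan<- string) {
	h := sha1.New()
	sum := make([]byte, 0, sha1.Size)
	for suffix := range suffixes {
		h.Reset()            // reuse the hasher instead of calling sha1.New() every iteration
		h.Write(suffix)      // Write on a hash.Hash never returns an error, so no check
		sum = h.Sum(sum[:0]) // append into a reused buffer, no per-iteration allocation
		if bytes.HasPrefix(sum, prefix) { // bytes.HasPrefix, no []byte -> string conversion
			found <- fmt.Sprintf("sha1(%x) starts with %x", suffix, prefix)
		}
	}
}

func main() {
	// Made-up wiring so the sketch runs on its own.
	suffixes := make(chan []byte, 4)
	found := make(chan string, 4)
	go func() {
		checkSuffixes([]byte{0x20, 0x14}, suffixes, found)
		close(found)
	}()
	for _, s := range []string{"a", "b", "c"} {
		suffixes <- []byte(s)
	}
	close(suffixes)
	for match := range found {
		fmt.Println(match)
	}
}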

This was just a quick half-hour overview, but I already managed to get
you a 250% performance increase :) With the above mods, the buffer starts
dropping below 50% some of the time on my machine too, meaning that one
generator thread won't be enough to keep up with a lot of parallel hashers.
You should try benchmarking just the generation speed, and I'd say you'll
probably need to split it into multiple threads, or even better (if
you don't need ordered suffixes), split the suffix space into N pieces and
have each checker run the generation code too, along the lines of the
sketch below. Less garbage to collect, no sync between the threads.
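A minimal sketch of that partitioning idea, assuming the suffix is just a numeric counter (which is not necessarily what the original generator does): worker i starts at i and advances by the number of workers, so nothing is shared between them. Only a two-byte prefix is used here so a short run actually prints matches.

package main

import (
	"bytes"
	"crypto/sha1"
	"encoding/binary"
	"fmt"
	"runtime"
	"sync"
)

func main() {
	runtime.GOMAXPROCS(runtime.NumCPU())

	prefix := []byte{0x20, 0x14} // shortened prefix for the sketch
	workers := runtime.NumCPU()

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(start uint64) {
			defer wg.Done()
			h := sha1.New()
			suffix := make([]byte, 8)
			sum := make([]byte, 0, sha1.Size)
			// Each worker owns every workers-th suffix: no channel, no
			// shared state, nothing to synchronize or garbage-collect.
			for i := start; i < 1<<24; i += uint64(workers) {
				binary.BigEndian.PutUint64(suffix, i)
				h.Reset()
				h.Write(suffix)
				sum = h.Sum(sum[:0])
				if bytes.HasPrefix(sum, prefix) {
					fmt.Printf("%x -> %x\n", suffix, sum)
				}
			}
		}(uint64(w))
	}
	wg.Wait()
}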

Btw, here's my code: http://play.golang.org/p/-v0Ctfkbkc PS: I've
added a small extra field to the struct to count the number of generated
suffixes (and output it during your 1-sec logging). It's a better
performance metric than the contents of the buffer.

Cheers,
Peter

On Mon, Mar 10, 2014 at 8:08 AM, egon wrote:

Try to make the whole process more deterministic; that way you can
analyse the whole thing better. Also measure the throughput better:
currently it's hard to tell which one is doing better work, i.e. whether
the CPU load is lower because it's handling something better or because
it's constantly blocking somewhere.

Run it through the profiler and see where all the time is spent.
http://blog.golang.org/profiling-go-programs

Also, http://play.golang.org/p/XUriUbzmCK should be better
performance-wise.
It's a bit racy due to the "running" boolean, but assuming you have your
correct answer at that moment, you probably won't care.
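If the race ever matters, the usual replacement for such a boolean is a channel that gets closed; a minimal sketch (not the code behind the link):

package main

import (
	"fmt"
	"time"
)

func main() {
	done := make(chan struct{})

	go func() {
		for {
			select {
			case <-done: // a closed channel is safely visible to every goroutine
				return
			default:
				// stand-in for one unit of hashing work
			}
		}
	}()

	time.Sleep(100 * time.Millisecond)
	close(done) // tells every worker to stop, without a shared boolean
	fmt.Println("stopped")
}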

+ egon
On Monday, March 10, 2014 2:30:56 AM UTC+2, shif...@gmail.com wrote:

Hi Go nuts,

I have a simple program ( http://play.golang.org/p/LKgxe17kfx )
that tries to find a sha1 hash that is prefixed with the current date
(0x20140309). Running on a 4-core Windows machine I get very low throughput,
with Task Manager showing about 15% load. Running on an 8-core Linux
machine I get much higher throughput, with around 50-60% load. Is there a
reason why the Go scheduler on Windows might be treating goroutines
differently than Go on Linux?


Additionally, the channel "dh.suffix" is almost continuously empty on
the Windows machine and about half full on the Linux one, which is what
leads me to suspect a scheduler difference.
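(That fill-level observation is just len() versus cap() on the buffered channel; a stripped-down sketch with a made-up producer and consumers standing in for the real ones:)

package main

import (
	"fmt"
	"time"
)

func main() {
	suffixes := make(chan []byte, 1000) // stand-in for dh.suffix

	// Producer (details elided): keeps the buffer topped up if it can.
	go func() {
		for {
			suffixes <- make([]byte, 8)
		}
	}()

	// Consumers (details elided): drain the buffer as fast as they hash.
	for i := 0; i < 4; i++ {
		go func() {
			for {
				<-suffixes
			}
		}()
	}

	// A buffer that sits near 0 means the consumers outpace the producer;
	// one near capacity means the producer is ahead.
	for i := 0; i < 3; i++ {
		time.Sleep(time.Second)
		fmt.Printf("suffix buffer: %d/%d\n", len(suffixes), cap(suffixes))
	}
}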

I am setting runtime.GOMAXPROCS(runtime.NumCPU()) in the code, and am
running the following Go versions:
go version go1.2 linux/amd64
go version go1.2 windows/amd64