I did some more experimenting today.

I haven't found a way to return both the count and the average bytes
with an API that I like, short of making two separate functions. The
main reason is that there are a dozen call sites that care about the
number of allocations and only one that cares about the number of
bytes, and that one is a regression test for a very specific bug. If I
make the byte size the second return value, every other call site looks
like it is discarding an error; if I make it the first, it seems
strange to always discard the first return value, as it should be the
more important one. For what it's worth, I'd prefer to leave it the way
it is and copy/paste AllocsPerRun, with the requisite changes, into the
one place it's needed; if it turns out to be a common test in the
future, it could be promoted to the testing package.

I also tried getting rid of b.Allocs. This seems like the right way to
go, as benchmem reports the same data that Allocs reports, truncated to
whole allocations:

BenchmarkParser 500 4875246 ns/op 16.03 MB/s 591554 B/op 7995 allocs/op
--- BENCH: BenchmarkParser
    parse_test.go:390: 1 iterations, 8099 mallocs per iteration
    parse_test.go:390: 100 iterations, 8053.72 mallocs per iteration
    parse_test.go:390: 500 iterations, 7995.888 mallocs per iteration
BenchmarkRawLevelTokenizer 5000 734455 ns/op 106.42 MB/s 4957 B/op 12 allocs/op
--- BENCH: BenchmarkRawLevelTokenizer
    token_test.go:675: 1 iterations, 12 mallocs per iteration
    token_test.go:675: 100 iterations, 12.01 mallocs per iteration
    token_test.go:675: 5000 iterations, 12.0296 mallocs per iteration
BenchmarkLowLevelTokenizer 2000 1001182 ns/op 78.07 MB/s 5060 B/op 25 allocs/op
--- BENCH: BenchmarkLowLevelTokenizer
    token_test.go:675: 1 iterations, 25 mallocs per iteration
    token_test.go:675: 100 iterations, 25.01 mallocs per iteration
    token_test.go:675: 2000 iterations, 25.029 mallocs per iteration
BenchmarkHighLevelTokenizer 1000 1636230 ns/op 47.77 MB/s 103299 B/op 3221 allocs/op
--- BENCH: BenchmarkHighLevelTokenizer
    token_test.go:675: 1 iterations, 3223 mallocs per iteration
    token_test.go:675: 100 iterations, 3221.4 mallocs per iteration
    token_test.go:675: 1000 iterations, 3221.277 mallocs per iteration

I think I agree that an EnableMallocReport-like API will be sufficient
for the benchmarks. Only one (the parser benchmark in exp/html) seems
to do significant allocation outside the loop (reading in the text
file), and that can probably be moved into an init() whose result is
then used by the benchmark.
https://codereview.appspot.com/7002055/