On 2012/11/06 19:08:54, rsc wrote:
i&0xf will get you something here.
I am not sure I understand. Since the loop index only ranges over
[0, 15], ANDing with 0xf should be a no-op.
The compiler isn't smart enough to recognize that i is guaranteed to
be in range here, but &0xf will force it to notice that and avoid the
bounds check. It is unlikely to matter much.
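For illustration, a minimal sketch of the masking idea on a 16-word array. The function names sumPlain and sumMasked are hypothetical and not from the CL. Both functions return the same value; the only difference is that the masked index is visibly within [0, 15] without any reasoning about the loop bounds. Whether a given compiler version still emits a bounds check for the plain form is something to measure, not assume.

```go
package main

import "fmt"

// sumPlain indexes w with the loop counter directly. The compiler has to
// prove i < 16 from the loop condition in order to drop the bounds check.
func sumPlain(w *[16]uint32) uint32 {
	var s uint32
	for i := 0; i < 16; i++ {
		s += w[i]
	}
	return s
}

// sumMasked indexes w with i&0xf. The mask leaves i unchanged here, but the
// result of i&0xf is always in [0, 15], so the index is in range by construction.
func sumMasked(w *[16]uint32) uint32 {
	var s uint32
	for i := 0; i < 16; i++ {
		s += w[i&0xf]
	}
	return s
}

func main() {
	var w [16]uint32
	for i := range w {
		w[i] = uint32(i + 1)
	}
	fmt.Println(sumPlain(&w), sumMasked(&w))
}
```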

I added this change and got a small increase in speed.

On 2012/11/06 04:06:47, dfc wrote:
for ; i < 20; i++ { .. }
and so forth
FWIW this is unlikely to matter.
I was hoping that maybe the compiler saw this. I didn't unroll the
loops because the compiler should be doing it and it decreases
readability. In addition, the comments at the top of this file hint
that making this as fast as possible is not a priority.
Dave suggested a single i instead of 5 different i variables. My
comment was only that from a performance standpoint it is unlikely to
matter. The compiler does not unroll the loops, and in general I think
it is unreasonable to expect it to. The big win you can get from the
SHA1 block function is if you unroll the loops 5x and do the variable
renamings in each iteration, so that all the rotating assignments
disappear. If you want to do this it should be done in a separate CL
and the speed gain should justify the increase in code. The last time
I looked at this I think it was not quite enough. crypto/md5/gen.go is
the program that generates the MD5 block routine. You could use
something similar for SHA1.
The fear is that unrolling the loop means potentially more i-cache
misses, so I agree it should be in a future CL. Hoisting i out of the
loop in the meantime showed a little extra gain.
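A rough sketch of the loop structure under discussion: one counter i hoisted out of the five round loops, and a 16-word schedule indexed with i&0xf. This is not the code from the CL. The names block, step, sched and sha1Sum are hypothetical, the closures are there for brevity rather than speed, and math/bits is a later addition to the standard library than this thread. The loops are deliberately not unrolled, so the rotating assignment in step is still executed every round.

```go
package main

import (
	"crypto/sha1"
	"encoding/binary"
	"fmt"
	"math/bits"
)

// block folds one 64-byte chunk p into the state h.
func block(h *[5]uint32, p []byte) {
	var w [16]uint32 // message schedule kept as a 16-word ring buffer
	for j := range w {
		w[j] = binary.BigEndian.Uint32(p[4*j:])
	}
	a, b, c, d, e := h[0], h[1], h[2], h[3], h[4]

	// step is the rotating assignment shared by all 80 rounds.
	step := func(f, k, wi uint32) {
		t := bits.RotateLeft32(a, 5) + f + e + wi + k
		a, b, c, d, e = t, a, bits.RotateLeft32(b, 30), c, d
	}
	// sched computes schedule word i (i >= 16) in place and returns it.
	sched := func(i int) uint32 {
		t := w[(i-3)&0xf] ^ w[(i-8)&0xf] ^ w[(i-14)&0xf] ^ w[i&0xf]
		w[i&0xf] = bits.RotateLeft32(t, 1)
		return w[i&0xf]
	}

	i := 0 // a single counter, hoisted out of the five loops
	for ; i < 16; i++ {
		step(b&c|^b&d, 0x5A827999, w[i&0xf])
	}
	for ; i < 20; i++ {
		step(b&c|^b&d, 0x5A827999, sched(i))
	}
	for ; i < 40; i++ {
		step(b^c^d, 0x6ED9EBA1, sched(i))
	}
	for ; i < 60; i++ {
		step(b&c|b&d|c&d, 0x8F1BBCDC, sched(i))
	}
	for ; i < 80; i++ {
		step(b^c^d, 0xCA62C1D6, sched(i))
	}
	h[0] += a
	h[1] += b
	h[2] += c
	h[3] += d
	h[4] += e
}

// sha1Sum pads msg and runs block over every 64-byte chunk.
func sha1Sum(msg []byte) [20]byte {
	h := [5]uint32{0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0}
	// Padding: 0x80, zeros up to 56 mod 64, then the bit length (big-endian).
	buf := append([]byte{}, msg...)
	buf = append(buf, 0x80)
	for len(buf)%64 != 56 {
		buf = append(buf, 0)
	}
	var lenBits [8]byte
	binary.BigEndian.PutUint64(lenBits[:], uint64(len(msg))*8)
	buf = append(buf, lenBits[:]...)
	for ; len(buf) >= 64; buf = buf[64:] {
		block(&h, buf)
	}
	var out [20]byte
	for j, v := range h {
		binary.BigEndian.PutUint32(out[4*j:], v)
	}
	return out
}

func main() {
	msg := []byte("abc")
	fmt.Printf("%x\n", sha1Sum(msg))
	// Cross-check against the standard library implementation.
	fmt.Println(sha1Sum(msg) == sha1.Sum(msg))
}
```

The 5x unrolling with variable renaming mentioned above would replace step with five copies per iteration in which the roles of a, b, c, d, e are permuted by hand, which is the kind of code crypto/md5/gen.go generates for MD5.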


Discussion Overview
group: golang-dev
posted: Nov 6, '12 at 3:22a
active: Nov 7, '12 at 2:41a


