Looks pretty good to me as well. I'll run some benchmarks on 32-bit
platforms once mine get through building the overnight changes.
File src/pkg/crypto/sha1/sha1block.go (right):
src/pkg/crypto/sha1/sha1block.go:26: j := i * 4
j := i << 2 avoids the imul on Intel. The compiler should be smarter
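
For reference, a minimal sketch of why the two forms are interchangeable here: for the small non-negative indices a block loop uses, i * 4 and i << 2 give the same offset. The helper names offsetMul and offsetShift are hypothetical and not part of sha1block.go, and the loop bound of 16 (the number of 32-bit words in a 64-byte SHA-1 block) is an assumption about the surrounding loop. This only checks equality of the results; it says nothing about which instruction a given compiler version actually emits.

```go
package main

import "fmt"

// offsetMul computes the byte offset of the i-th 32-bit word by
// multiplication. Hypothetical helper, for illustration only.
func offsetMul(i int) int { return i * 4 }

// offsetShift computes the same offset with a left shift by 2.
// Hypothetical helper, for illustration only.
func offsetShift(i int) int { return i << 2 }

func main() {
	// Assumed range: 16 words per 64-byte block.
	for i := 0; i < 16; i++ {
		if offsetMul(i) != offsetShift(i) {
			panic(fmt.Sprintf("mismatch at i=%d", i))
		}
	}
	fmt.Println("i*4 and i<<2 agree for i in [0, 16)")
}
```

Whether the shift is actually faster is a separate question that the benchmarks would need to settle.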