FAQ
Reviewers: golang-dev_googlegroups.com,

Message:
Hello golang-dev@googlegroups.com,

I'd like you to review this change to
https://code.google.com/p/go


Description:
crypto/md5: add arm to md5block special case

benchmark old ns/op new ns/op delta
BenchmarkHash8Bytes 4159 3346 -19.55%
BenchmarkHash1K 24571 11755 -52.16%
BenchmarkHash8K 166339 70581 -57.57%

benchmark old MB/s new MB/s speedup
BenchmarkHash8Bytes 1.92 2.39 1.24x
BenchmarkHash1K 41.67 87.11 2.09x
BenchmarkHash8K 49.25 116.06 2.36x

Please review this at http://codereview.appspot.com/6850043/

Affected files:
M src/pkg/crypto/md5/md5block.go


Index: src/pkg/crypto/md5/md5block.go
===================================================================
--- a/src/pkg/crypto/md5/md5block.go
+++ b/src/pkg/crypto/md5/md5block.go
@@ -16,7 +16,7 @@
aa, bb, cc, dd := a, b, c, d

// This is a constant condition - it is not evaluated on each iteration.
- if runtime.GOARCH == "amd64" || runtime.GOARCH == "386" {
+ if runtime.GOARCH == "amd64" || runtime.GOARCH == "386" ||
runtime.GOARCH == "arm" {
// MD5 was designed so that x86 processors can just iterate
// over the block data directly as uint32s, and we generate
// less code and run 1.3x faster if we take advantage of that.

Search Discussions

  • R at Nov 14, 2012 at 12:06 am
    http://codereview.appspot.com/6850043/diff/5001/src/pkg/crypto/md5/md5block.go
    File src/pkg/crypto/md5/md5block.go (right):

    http://codereview.appspot.com/6850043/diff/5001/src/pkg/crypto/md5/md5block.go#newcode24
    src/pkg/crypto/md5/md5block.go:24: X =
    (*[16]uint32)(unsafe.Pointer(&p[0]))
    can an ARM do unaligned fetches?

    http://codereview.appspot.com/6850043/
  • Dave at Nov 14, 2012 at 12:07 am
    On 2012/11/14 00:06:03, r wrote:

    http://codereview.appspot.com/6850043/diff/5001/src/pkg/crypto/md5/md5block.go
    File src/pkg/crypto/md5/md5block.go (right):

    http://codereview.appspot.com/6850043/diff/5001/src/pkg/crypto/md5/md5block.go#newcode24
    src/pkg/crypto/md5/md5block.go:24: X =
    (*[16]uint32)(unsafe.Pointer(&p[0]))
    can an ARM do unaligned fetches?
    I checked /proc/cpu/alignment and there were no unaligned accesses, but
    this was an arm7 cpu. I will try on arm5

    https://codereview.appspot.com/6850043/
  • Minux at Nov 14, 2012 at 11:44 am

    On Wed, Nov 14, 2012 at 8:07 AM, wrote:
    On 2012/11/14 00:06:03, r wrote:

    can an ARM do unaligned fetches?
    I checked /proc/cpu/alignment and there were no unaligned accesses, but
    this was an arm7 cpu. I will try on arm5
    please apply this cl https://codereview.appspot.com/6782072/
    and re-test on ARM.
  • Dave at Nov 15, 2012 at 1:04 am

    On 2012/11/14 11:44:20, minux wrote:
    On Wed, Nov 14, 2012 at 8:07 AM, wrote:
    On 2012/11/14 00:06:03, r wrote:

    can an ARM do unaligned fetches?
    I checked /proc/cpu/alignment and there were no unaligned accesses,
    but
    this was an arm7 cpu. I will try on arm5
    please apply this cl https://codereview.appspot.com/6782072/
    and re-test on ARM.
    Curiously the test passes with your CL, but I was surprised it did.

    https://codereview.appspot.com/6850043/
  • Minux at Nov 15, 2012 at 3:50 am

    On Thursday, November 15, 2012, wrote:
    On 2012/11/14 11:44:20, minux wrote:

    On Wed, Nov 14, 2012 at 8:07 AM, wrote:
    On 2012/11/14 00:06:03, r wrote:

    can an ARM do unaligned fetches?
    I checked /proc/cpu/alignment and there were no unaligned accesses,
    but
    this was an arm7 cpu. I will try on arm5
    please apply this cl https://codereview.appspot.**com/6782072/<https://codereview.appspot.com/6782072/>
    and re-test on ARM.
    Curiously the test passes with your CL, but I was surprised it did.
    it depends on your /proc/cpu/alignment setting.
    0 (ignore), 1(warn), 2 (fixup), 3 (fix+warn) will work as expected.

    however, 4 (signal) won't.
  • Dave Cheney at Nov 15, 2012 at 4:19 am
    Yeah, i've been checking that, I can't run my pandaboard on setting 4
    (the kernel wont' let me) but I see no alignment failures in user
    mode.
    On Thu, Nov 15, 2012 at 2:50 PM, minux wrote:
    On Thursday, November 15, 2012, wrote:
    On 2012/11/14 11:44:20, minux wrote:

    On Wed, Nov 14, 2012 at 8:07 AM, wrote:
    On 2012/11/14 00:06:03, r wrote:

    can an ARM do unaligned fetches?
    I checked /proc/cpu/alignment and there were no unaligned accesses,
    but
    this was an arm7 cpu. I will try on arm5
    please apply this cl https://codereview.appspot.com/6782072/
    and re-test on ARM.

    Curiously the test passes with your CL, but I was surprised it did.
    it depends on your /proc/cpu/alignment setting.
    0 (ignore), 1(warn), 2 (fixup), 3 (fix+warn) will work as expected.

    however, 4 (signal) won't.
  • Dave at Nov 15, 2012 at 7:26 am
    Thank you, your additional test fails on ARMv5.

    === RUN TestGolden
    --- FAIL: TestGolden (0.00 seconds)
    md5_test.go:73: md5[3](For every action there is an equal and
    opposite government program.) = c2df9afe383d86de9a7f165785085737 want
    637d2fe925c07c113800509964fb0e06
    === RUN: ExampleNew
    --- PASS: ExampleNew (0.00 seconds)



    https://codereview.appspot.com/6850043/
  • Dave at Nov 15, 2012 at 7:27 am
  • Minux at Nov 15, 2012 at 7:36 am
    However, i think you can do something like this to speed up the common case
    where the buffer
    does align to 4-byte boundary:

    diff -r 591fc8a0131a src/pkg/crypto/md5/md5block.go
    --- a/src/pkg/crypto/md5/md5block.go Wed Nov 14 22:04:03 2012 -0800
    +++ b/src/pkg/crypto/md5/md5block.go Thu Nov 15 15:32:51 2012 +0800
    @@ -22,6 +22,8 @@
    // less code and run 1.3x faster if we take
    advantage of that.
    // My apologies.
    X = (*[16]uint32)(unsafe.Pointer(&p[0]))
    + } else if uintptr(unsafe.Pointer(&p[0])) & 3 == 0 {
    + X = (*[16]uint32)(unsafe.Pointer(&p[0]))
    } else {
    X = &xbuf
    j := 0
    diff -r 591fc8a0131a src/pkg/crypto/md5/gen.go
    --- a/src/pkg/crypto/md5/gen.go Wed Nov 14 22:04:03 2012 -0800
    +++ b/src/pkg/crypto/md5/gen.go Thu Nov 15 15:32:51 2012 +0800
    @@ -203,6 +203,8 @@
    // less code and run 1.3x faster if we take
    advantage of that.
    // My apologies.
    X = (*[16]uint32)(unsafe.Pointer(&p[0]))
    + } else if uintptr(unsafe.Pointer(&p[0])) & 3 == 0 {
    + X = (*[16]uint32)(unsafe.Pointer(&p[0]))
    } else {
    X = &xbuf
    j := 0

    Just don't forget to change gen.go.
    I think we can get the same nice benchmark numbers this way
    However, I propose we also add a benchmark for unaligned writes,
    what do you think?
  • Dave Cheney at Nov 15, 2012 at 7:44 am
    SGTM. I think this pattern is also valid for other hashes in the tree.
    On 15/11/2012, at 18:36, minux wrote:

    However, i think you can do something like this to speed up the common case where the buffer
    does align to 4-byte boundary:

    diff -r 591fc8a0131a src/pkg/crypto/md5/md5block.go
    --- a/src/pkg/crypto/md5/md5block.go Wed Nov 14 22:04:03 2012 -0800
    +++ b/src/pkg/crypto/md5/md5block.go Thu Nov 15 15:32:51 2012 +0800
    @@ -22,6 +22,8 @@
    // less code and run 1.3x faster if we take advantage of that.
    // My apologies.
    X = (*[16]uint32)(unsafe.Pointer(&p[0]))
    + } else if uintptr(unsafe.Pointer(&p[0])) & 3 == 0 {
    + X = (*[16]uint32)(unsafe.Pointer(&p[0]))
    } else {
    X = &xbuf
    j := 0
    diff -r 591fc8a0131a src/pkg/crypto/md5/gen.go
    --- a/src/pkg/crypto/md5/gen.go Wed Nov 14 22:04:03 2012 -0800
    +++ b/src/pkg/crypto/md5/gen.go Thu Nov 15 15:32:51 2012 +0800
    @@ -203,6 +203,8 @@
    // less code and run 1.3x faster if we take advantage of that.
    // My apologies.
    X = (*[16]uint32)(unsafe.Pointer(&p[0]))
    + } else if uintptr(unsafe.Pointer(&p[0])) & 3 == 0 {
    + X = (*[16]uint32)(unsafe.Pointer(&p[0]))
    } else {
    X = &xbuf
    j := 0

    Just don't forget to change gen.go.
    I think we can get the same nice benchmark numbers this way
    However, I propose we also add a benchmark for unaligned writes,
    what do you think?

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupgolang-dev @
categoriesgo
postedNov 13, '12 at 11:55p
activeNov 15, '12 at 7:44a
posts11
users3
websitegolang.org

3 users in discussion

Dave Cheney: 7 posts Minux: 3 posts R: 1 post

People

Translate

site design / logo © 2022 Grokbase