Summary: NEVER use the x86 increment instruction.

Details (n-word fixed precision math library code generated automatically
by my AWK script):

// FixedInc: z++, z is a 2-word (128 bit) fixed precision integer
// func FixedInc(z *Integer)
TEXT ·FixedInc(SB),$0-8
MOVQ z+0(FP),AX // make z[] accessible
MOVQ (AX),BP // load z[0]

// Version (A) using Increment
//INCQ BP // add 1

// Version (B) using Add with an immediate constant of 1
ADDQ $1,BP

MOVQ BP,(AX) // store sum in z[0]
MOVQ 8(AX),BP // load z[1]
ADCQ $0,BP // add carry
MOVQ BP,8(AX) // store sum in z[1]
RET
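
For readers who do not read Plan 9 style assembly, the carry chain above can be sketched in portable Go. This is only an illustration, not the generated library code: the `Integer` layout (two 64-bit words, least significant first) and the name `fixedInc` are assumptions, and `math/bits.Add64` is a later addition to the standard library. One detail worth keeping in mind when comparing the two versions: x86 INC leaves the carry flag unchanged, so only the ADDQ form produces the carry that the following ADCQ consumes.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Integer is an assumed layout: two 64-bit words, least significant first.
type Integer [2]uint64

// fixedInc adds 1 to z, propagating the carry from z[0] into z[1].
func fixedInc(z *Integer) {
	var carry uint64
	z[0], carry = bits.Add64(z[0], 1, 0) // like ADDQ $1: produces the carry
	z[1], _ = bits.Add64(z[1], 0, carry) // like ADCQ $0: consumes the carry
}

func main() {
	z := Integer{^uint64(0), 0} // low word is all ones, so the add must carry
	fixedInc(&z)
	fmt.Println(z[0], z[1]) // prints "0 1"
}
```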

Driver, because we can't generate a call to a method in assembler. :-(


// increment: z += 1
func (z *Integer) Inc() {
	FixedInc(z) // Sigh... not possible to generate methods in assembler
}

Test:

func BenchmarkInc(b *testing.B) {
	a := New().SetUint64(0)
	for i := 0; i < b.N; i++ {
		a.Inc()
	}
}

Results:

Version A using INCQ, BenchmarkInc 100000000 10.2 ns/op
Version B using ADDQ, BenchmarkInc 500000000 7.19 ns/op

Summary: NEVER use an x86 increment instruction. This is embarrassing,
because I've known it for >2 decades.

NOTE: 6g also seems to use INC. My suggestion is that we wean ourselves
from this needless expense.

func test() int {
	i := 0
	i++
	return i
}

--- prog list "test" ---
0032 (/Users/mtj/gocode/src/inc/inc.go:7) TEXT test+0(SB),$0-8
0033 (/Users/mtj/gocode/src/inc/inc.go:7) LOCALS ,$0
0034 (/Users/mtj/gocode/src/inc/inc.go:7) TYPE ~anon0+0(FP){int},$8
0035 (/Users/mtj/gocode/src/inc/inc.go:8) MOVQ $0,AX
0036 (/Users/mtj/gocode/src/inc/inc.go:9) INCQ ,AX
0037 (/Users/mtj/gocode/src/inc/inc.go:10) MOVQ AX,~anon0+0(FP)
0038 (/Users/mtj/gocode/src/inc/inc.go:10) RET ,



Michael T. Jones | Chief Technology Advocate | mtj@google.com | +1
650-335-5765

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

  • Alexey Borzenkov at Mar 2, 2013 at 9:32 am
    Hi,

    This could well be a fluke caused by the order of these tests. I tried
    testing inc vs add performance and see something like this:

    BenchmarkIncECX 1000000000 2.51 ns/op
    BenchmarkAddECX 1000000000 2.47 ns/op
    BenchmarkIncEBP 1000000000 2.53 ns/op
    BenchmarkAddEBP 1000000000 2.47 ns/op

    However, if I change the order of the tests, then suddenly it's like this:

    BenchmarkAddECX 1000000000 2.70 ns/op
    BenchmarkIncECX 1000000000 2.46 ns/op
    BenchmarkAddEBP 1000000000 2.46 ns/op
    BenchmarkIncEBP 1000000000 2.46 ns/op

    Besides, these results on my machine (Intel Core i7, 2.3GHz) are very
    unstable, they often change from run to run.


  • Alexey Borzenkov at Mar 2, 2013 at 9:36 am
    And just for completeness, ebp vs ecx also swapped:

    BenchmarkAddEBP 1000000000 2.66 ns/op
    BenchmarkIncEBP 1000000000 2.47 ns/op
    BenchmarkAddECX 1000000000 2.74 ns/op
    BenchmarkIncECX 1000000000 2.47 ns/op

    And inc first:

    BenchmarkIncEBP 1000000000 2.54 ns/op
    BenchmarkAddEBP 1000000000 2.47 ns/op
    BenchmarkIncECX 1000000000 2.52 ns/op
    BenchmarkAddECX 1000000000 2.47 ns/op

    Whoever goes second is the winner... Here's the full code:

    // asm.s
    // func IncECX(z *int64)
    TEXT ·IncECX(SB),$0-8
    MOVQ z+0(FP),AX // make z accessible
    MOVQ (AX),CX // load *z
    INCQ CX // add 1
    MOVQ CX,(AX) // store sum in *z
    RET

    // func IncEBP(z *int64)
    TEXT ·IncEBP(SB),$0-8
    MOVQ z+0(FP),AX // make z accessible
    MOVQ (AX),BP // load *z
    INCQ BP // add 1
    MOVQ BP,(AX) // store sum in *z
    RET

    // func AddECX(z *int64)
    TEXT ·AddECX(SB),$0-8
    MOVQ z+0(FP),AX // make z accessible
    MOVQ (AX),CX // load *z
    ADDQ $1,CX // add 1
    MOVQ CX,(AX) // store sum in *z
    RET

    // func AddEBP(z *int64)
    TEXT ·AddEBP(SB),$0-8
    MOVQ z+0(FP),AX // make z accessible
    MOVQ (AX),BP // load *z
    ADDQ $1,BP // add 1
    MOVQ BP,(AX) // store sum in *z
    RET

    // main.go
    package main

    func IncECX(z *int64)
    func IncEBP(z *int64)
    func AddECX(z *int64)
    func AddEBP(z *int64)

    // main_test.go
    package main

    import (
    	"testing"
    )

    func BenchmarkIncEBP(b *testing.B) {
    	a := new(int64)
    	for i := 0; i < b.N; i++ {
    		IncEBP(a)
    	}
    }

    func BenchmarkAddEBP(b *testing.B) {
    	a := new(int64)
    	for i := 0; i < b.N; i++ {
    		AddEBP(a)
    	}
    }

    func BenchmarkIncECX(b *testing.B) {
    	a := new(int64)
    	for i := 0; i < b.N; i++ {
    		IncECX(a)
    	}
    }

    func BenchmarkAddECX(b *testing.B) {
    	a := new(int64)
    	for i := 0; i < b.N; i++ {
    		AddECX(a)
    	}
    }
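
The four runs quoted in this message and the previous one can be compared by their best and worst times rather than by a single run. The sketch below only tabulates the ns/op figures already posted in this thread; the `minMax` helper and the `samples` table are illustrative names, not part of the benchmark code above.

```go
package main

import "fmt"

// minMax returns the smallest and largest value in xs (xs must be non-empty).
func minMax(xs []float64) (lo, hi float64) {
	lo, hi = xs[0], xs[0]
	for _, x := range xs[1:] {
		if x < lo {
			lo = x
		}
		if x > hi {
			hi = x
		}
	}
	return lo, hi
}

// samples holds the ns/op figures from the four runs, in the order posted.
var samples = map[string][]float64{
	"IncECX": {2.51, 2.46, 2.47, 2.52},
	"AddECX": {2.47, 2.70, 2.74, 2.47},
	"IncEBP": {2.53, 2.46, 2.47, 2.54},
	"AddEBP": {2.47, 2.46, 2.66, 2.47},
}

func main() {
	for _, name := range []string{"IncECX", "AddECX", "IncEBP", "AddEBP"} {
		lo, hi := minMax(samples[name])
		fmt.Printf("%s min=%.2f max=%.2f\n", name, lo, hi)
	}
}
```

Across these four runs the best time for every routine is 2.46 or 2.47 ns/op, while the worst times differ by far more than that, which fits the reading that run order and noise matter more here than the choice of instruction.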

  • Jan Mercl at Mar 2, 2013 at 9:54 am

    On Sat, Mar 2, 2013 at 10:32 AM, Alexey Borzenkov wrote:
    > Besides, these results on my machine (Intel Core i7, 2.3GHz) are very
    > unstable, they often change from run to run.

    Regarding benchmark stability: I think you would benefit, assuming it's
    not that way already, from adjusting the assembly benchmark routine
    code to:

    - Be long enough to become insensitive to its address position relative
    to CPU cache lines. This (hopefully) also removes sensitivity to how
    the tool chain may align or pad the code.

    - Spend the dominant part of its time performing the measured "thing".
    That is now close to noise compared to the call/return overhead.

    The simple way to achieve both of the above is to repeat the `inc reg` or
    `add reg, 1` more times. I would suggest trying 100 at first.

    -j

  • Michael Jones at Mar 2, 2013 at 10:11 am
    All good points, thank you!

    I confirm that making the increment and immediate add versions do 100 extra
    cycles brings the execution times into line. Sorry for the false alarm.

    The real mode of the program has 11-40 word numbers and the times are more
    consistent there too, but I was adding microbenchmarks for completeness and
    managed to fool myself.

    Feeling better that the true performance is not so uneven.
  • Dmitry Vyukov at Mar 2, 2013 at 10:17 am

    On Sat, Mar 2, 2013 at 11:32 AM, Alexey Borzenkov wrote:
    > Besides, these results on my machine (Intel Core i7, 2.3GHz) are very
    > unstable, they often change from run to run.

    Try `go test -benchtime=10s`.


  • Job van der Zwan at Mar 3, 2013 at 11:59 am
    Look what someone just shared with me in a completely unrelated internet
    discussion:

    http://www.emulators.com/docs/nx06_rmw.htm

    "Last week I posed the question of how one would increment an integer in
    memory. This is not a trick question. Obviously it is a simple matter to do
    in C or C++, but what is the actual code that the C or C++ compiler should
    emit? As easy as this problem sounds, the solution touches on many of the
    fundamental performance and reliability issues that affect real world code."


    Includes a very thorough set of microbenchmarks, and some sad news:

    "What is important to note is that for any given piece of compiled code, no
    one single code sequence is optimal on all architectures. I am repeating
    what I have said before, which is that it is impossible for a software
    developer to ship optimal code. You cannot statically compile C or C++ code
    into an executable that will run as fast as possible on all microprocessors
    today or on ones to be released in the future. What was a cool optimization
    trick on a Pentium III could entirely blow on you on a Pentium 4, as did in
    fact happen when the Pentium 4 was released."
  • Michael Jones at Mar 3, 2013 at 3:05 pm
    Extremely interesting reading. Thank you!


Discussion Overview
group: golang-nuts
categories: go
posted: Mar 2, '13 at 5:16a
active: Mar 3, '13 at 3:05p
posts: 8
users: 5
website: golang.org
