FAQ
I was studying the implementation of Once.Do()¹ and noticed that it uses
both a locked mutex to synchronize access to the "once flag" /and/ CAS
to flip the flag. I don't understand why CAS is necessary here.

At line 33, competing goroutines could be reading the flag, outside of a
locked mutex, so in order for their read to be consistent, the lone
writer to the flag must write to it atomically. At line 37, a goroutine
that observed the flag to be false will wait on the mutex. At line 39, a
goroutine holds the lock and can read the flag without any special care.

At line 41, though, that same goroutine still holds the lock when it
goes to change the flag, but the current implementation uses CAS, being
defensive against some other routine having got in there and changed the
flag first. How could that happen? If that did happen, wouldn't that
violate the promised invariant that function "f" will be called no more
than once?

In short, why doesn't line 41 use atomic.StoreUint32() instead?

An alternate design would be to skip the mutex and /just/ use CAS:

,----
func (o *Once) Do(f func()) {
if atomic.LoadUint32(&o.done) == 1 {
return
}
if atomic.CompareAndSwapUint32(&o.done, 0, 1) {
f()
}
}
`----

The first test isn't strictly necessary, but it could reduce contention
against the potentially-more expensive CAS operation.


Footnotes:
¹ http://golang.org/src/pkg/sync/once.go

--
Steven E. Harris

--

Search Discussions

  • Minux at Nov 2, 2012 at 3:04 pm

    On Fri, Nov 2, 2012 at 10:45 PM, Steven E. Harris wrote:

    I was studying the implementation of Once.Do()¹ and noticed that it uses
    both a locked mutex to synchronize access to the "once flag" /and/ CAS
    to flip the flag. I don't understand why CAS is necessary here.
    At line 33, competing goroutines could be reading the flag, outside of a
    locked mutex, so in order for their read to be consistent, the lone
    writer to the flag must write to it atomically. At line 37, a goroutine
    that observed the flag to be false will wait on the mutex. At line 39, a
    goroutine holds the lock and can read the flag without any special care.

    At line 41, though, that same goroutine still holds the lock when it
    goes to change the flag, but the current implementation uses CAS, being
    defensive against some other routine having got in there and changed the
    flag first. How could that happen? If that did happen, wouldn't that
    violate the promised invariant that function "f" will be called no more
    than once?

    In short, why doesn't line 41 use atomic.StoreUint32() instead?
    It already uses that in the tip.
    http://tip.golang.org/src/pkg/sync/once.go

    The original code is written before StoreUint32 is available.
    An alternate design would be to skip the mutex and /just/ use CAS:

    ,----
    func (o *Once) Do(f func()) {
    if atomic.LoadUint32(&o.done) == 1 {
    return
    }
    if atomic.CompareAndSwapUint32(&o.done, 0, 1) {
    you can't set o.done with 1 before f() is executed, because other
    goroutines executing once.Do might observe o.done is 1 and continue before
    f() finishes executing.
    f()
    }
    }
    `----

    The first test isn't strictly necessary, but it could reduce contention
    against the potentially-more expensive CAS operation.


    Footnotes:
    ¹ http://golang.org/src/pkg/sync/once.go

    --
    Steven E. Harris

    --

    --
  • Steven E. Harris at Nov 2, 2012 at 7:10 pm

    minux writes:

    you can't set o.done with 1 before f() is executed, because other
    goroutines executing once.Do might observe o.done is 1 and continue before
    f() finishes executing.
    Ah, I had forgotten about that requirement: that the call to Do() must
    not return until the provided function has finished running, even if in
    the context of a different goroutine.

    Thank you for the clarification.

    --
    Steven E. Harris

    --
  • Anthony Martin at Nov 2, 2012 at 3:05 pm

    "Steven E. Harris" <seh@panix.com> once said:
    In short, why doesn't line 41 use atomic.StoreUint32() instead?
    Here is the explanation from Dmitriy¹:
    Can this be
    o.done = 1
    ?

    If not, why not?
    No, it can't.
    It must be store-release (atomic_store(memory_order_release)) to prevent user
    initialization code (f()) from sinking below store to o.done. Similarly, load of
    o.done on fast path must be load-acquire (that's why it's not plain load).
    It may work on IA-32/Intel64 with plain memory accesses (at least until the
    compiler is sufficiently aggressive), but it won't work no ARM.

    Cheers,
    Anthony

    1. https://codereview.appspot.com/4641066

    --

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupgolang-nuts @
categoriesgo
postedNov 2, '12 at 2:43p
activeNov 2, '12 at 7:10p
posts4
users3
websitegolang.org

People

Translate

site design / logo © 2022 Grokbase