On 1/19/2013 9:37 PM, Ian Lance Taylor wrote:
On Sat, Jan 19, 2013 at 11:46 AM, John Nagle wrote:
On 1/19/2013 10:25 AM, Ian Lance Taylor wrote:
On Jan 19, 2013 8:18 AM, "John Nagle" wrote:
We've thought quite a bit about how we could avoid race conditions.
We have not thought of a way to do it without sacrificing other
essential aspects of the language. If you have suggestions, let us
know. ...
"Do not communicate by sharing memory; instead, share memory by
communicating" is presented (as) a slogan...
OK. Let's take it seriously, as an enforced feature, and see
where that takes us.
There are two main reasons sharing memory across concurrency
boundaries is considered useful - for performance, and for
explicit shared state. Let's look at the performance issue
first.
If you don't need shared state, copying all the data being
sent across concurrency boundaries works fine. The Python
multiprocessing system does that, even on the same CPU.
The overhead is high, but the functionality is usable.
That could be done in Go. Just do a deep copy of
anything sent on a channel, or passed into a goroutine.
This would break some programs, especially ones based
on the examples in "Effective Go". A slightly different
style is required, but it's not harder, just different.
Like this, for the sort example:
    import "sort"

    //
    // Sort array of strings in background
    //
    func sorttask(inchan chan []string, outchan chan []string) {
        list := <-inchan
        sort.Strings(list)
        outchan <- list // This would be a deep copy
    }

    func dosort() []string {
        inchan := make(chan []string)
        outchan := make(chan []string)
        strs := make([]string, 0) // build up a list of strings locally
        strs = append(strs, "Hello")
        strs = append(strs, "to")
        strs = append(strs, "the")
        strs = append(strs, "World")
        go sorttask(inchan, outchan)
        inchan <- strs // This would be a deep copy.
        result := <-outchan
        return result
    }
Now we have soundness, but it's cost us some copying.
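To make that cost concrete, here is roughly what the proposed
implicit copy would amount to if written by hand in today's Go.
This is only a sketch: the helper name cloneStrings is made up for
illustration, and Go does not do any of this copying automatically.
Because Go strings are immutable, copying the slice is already a
deep copy in this case.

```go
package main

import (
	"fmt"
	"sort"
)

// cloneStrings (hypothetical helper) returns a copy of list with its
// own backing array. Strings are immutable, so this is a deep copy.
func cloneStrings(list []string) []string {
	out := make([]string, len(list))
	copy(out, list)
	return out
}

// sortTask receives one list, sorts it, and sends back a copy.
func sortTask(in <-chan []string, out chan<- []string) {
	list := <-in
	sort.Strings(list)
	out <- cloneStrings(list) // hand-written stand-in for the implicit copy
}

// sortInBackground sorts a copy of strs in another goroutine.
// The other goroutine never sees the caller's slice.
func sortInBackground(strs []string) []string {
	in := make(chan []string)
	out := make(chan []string)
	go sortTask(in, out)
	in <- cloneStrings(strs) // copy on send
	return <-out
}

func main() {
	orig := []string{"Hello", "to", "the", "World"}
	sorted := sortInBackground(orig)
	fmt.Println(orig)   // the caller's slice is unchanged
	fmt.Println(sorted) // sorted copy returned by the goroutine
}
```

Both calls to cloneStrings are the copies under discussion.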
Can those copies be optimized out? Yes.
The optimization needed here is like tail recursion.
If the last access to a reference before it goes out of
scope is a send, then it need not be copied. In
the example above, both deep copies can be optimized out.
From the programmer's perspective, this optimization is
invisible. As with tail recursion, it's valuable for
programmers to know this is going on, and it should
be guaranteed to the programmer that this happens in the
simple cases.
This covers one of the major use cases in server-side
programming - a program makes multiple requests to
other servers and databases to collect data to build
up a reply page. Making those requests in parallel is necessary
for performance. The reply is usually much bigger than the
query, so copying it is undesirable.
The concurrent code which services the requests has no further need of
the data once it's been passed back to the requester. So
the last act of each service goroutine should be to send the
big result on the reply channel. It won't have to be copied.
There's no performance penalty for the safe approach.
So that's a way of addressing the performance issue.
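A sketch of that server-side pattern in today's Go follows. The
reply type, fetch, and gather are hypothetical names, and fetch
only pretends to talk to another server. The point is the shape:
each goroutine builds its result locally and its final statement
is the send, so under the proposal the compiler could skip the copy.

```go
package main

import (
	"fmt"
	"strings"
)

// reply is a hypothetical result type standing in for a large response.
type reply struct {
	source string
	body   []byte
}

// fetch stands in for a request to another server or database.
// It builds its result locally and gives it up in its final statement.
func fetch(source string, out chan<- reply) {
	body := []byte(strings.Repeat(source+" ", 3)) // pretend this is large
	r := reply{source: source, body: body}
	out <- r // last use of r and body: no copy would be needed
}

// gather issues all requests in parallel and collects the replies.
func gather(sources []string) map[string]string {
	out := make(chan reply)
	for _, s := range sources {
		go fetch(s, out)
	}
	page := make(map[string]string, len(sources))
	for range sources {
		r := <-out
		page[r.source] = string(r.body)
	}
	return page
}

func main() {
	page := gather([]string{"users", "orders", "ads"})
	fmt.Println(len(page), "replies collected")
}
```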
Accessing shared state is tougher. But Go gives us some
tools for that. Channels are explicitly concurrency-safe,
and it's permitted to pass a channel over a channel.
So you can do proxy-type operations, where a concurrent
operation is passed channels by which it can communicate
with some central state. This is kind of clunky,
but sound. Perhaps it could be packaged in some way
that makes it look more like a function call. Take
a look at the Python multiprocessing module documentation,
section 16.6.1.4, "Sharing state between processes":
http://docs.python.org/2/library/multiprocessing.html#module-multiprocessing
That provides two mechanisms - proxy objects, and atomic maps.
Some programmer-friendly way to do proxy objects, where there
are really two channels but it looks like a function call,
could substitute for shared state and mutexes.
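One possible shape for such a proxy, written by hand in today's
Go, is sketched below. Counter, request, and Add are made-up names
for illustration. The state lives in a single goroutine, and the
Add method hides the request channel and the reply channel behind
what looks like an ordinary call.

```go
package main

import "fmt"

// request carries an amount to add and a private channel for the answer.
type request struct {
	delta int
	reply chan int
}

// Counter is a hypothetical proxy for state owned by a single goroutine.
type Counter struct {
	requests chan request
}

// NewCounter starts the owning goroutine and returns the proxy.
func NewCounter() *Counter {
	c := &Counter{requests: make(chan request)}
	go func() {
		total := 0 // only this goroutine ever touches total
		for req := range c.requests {
			total += req.delta
			req.reply <- total
		}
	}()
	return c
}

// Add looks like a method call but is a send followed by a receive.
func (c *Counter) Add(delta int) int {
	reply := make(chan int)
	c.requests <- request{delta: delta, reply: reply}
	return <-reply
}

func main() {
	c := NewCounter()
	done := make(chan bool)
	for i := 0; i < 10; i++ {
		go func() {
			c.Add(1)
			done <- true
		}()
	}
	for i := 0; i < 10; i++ {
		<-done
	}
	fmt.Println(c.Add(0)) // prints 10
}
```

The callers never see a mutex; the clunky part is the boilerplate
in Add, which is what a language-level packaging could generate.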
Atomic maps are worth having available. Not all maps need to
be atomic, but atomic map types don't have to be deep-copied and
can be shared. So atomic maps become the preferred mechanism
for storing shared state. (However, data put into an atomic
map may have to be deep-copied. This prevents leaking a
reference over a concurrency boundary).
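As a rough illustration, an atomic map could be approximated in
today's Go as below. AtomicMap, Put, and Get are hypothetical
names, not an existing library type. The lock is internal to the
type, so user code has no mutex of its own. Copying on Get as well
as on Put is an assumption of this sketch, made so that no
reference into the map escapes in either direction.

```go
package main

import (
	"fmt"
	"sync"
)

// AtomicMap is a hypothetical map type that is safe to share between
// goroutines. The lock is internal; callers never handle a mutex.
type AtomicMap struct {
	mu sync.Mutex
	m  map[string][]string
}

// NewAtomicMap returns an empty map ready for concurrent use.
func NewAtomicMap() *AtomicMap {
	return &AtomicMap{m: make(map[string][]string)}
}

// Put stores a copy of val, so the caller keeps no reference into the map.
func (a *AtomicMap) Put(key string, val []string) {
	cp := append([]string(nil), val...)
	a.mu.Lock()
	a.m[key] = cp
	a.mu.Unlock()
}

// Get returns a copy of the stored value (an assumption of this sketch),
// so the map's own data never crosses a concurrency boundary.
func (a *AtomicMap) Get(key string) ([]string, bool) {
	a.mu.Lock()
	defer a.mu.Unlock()
	val, ok := a.m[key]
	if !ok {
		return nil, false
	}
	return append([]string(nil), val...), true
}

func main() {
	shared := NewAtomicMap()
	words := []string{"Hello", "World"}
	shared.Put("greeting", words)
	words[0] = "Goodbye" // does not affect the stored copy
	got, _ := shared.Get("greeting")
	fmt.Println(got) // prints [Hello World]
}
```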
This is a true "share memory by communicating" approach.
One with less overhead, no race conditions, and no need
for user mutexes.
Comments?
John Nagle