Hi Matt,

   It's a very fancy solution, but unfortunately it doesn't work, as it
runs into exactly the same problem as all previous suggestions: in Go,
all existing readers and writers are synchronous and blocking. That means
that even if a reader or writer is buffered, the buffer is only
filled/flushed when it's empty/full. The issue with all these solutions is
that you need to place the buffer between the two threads, not between a
thread and the reader/writer endpoint. And this is what no solution apart
from my proposal solves (Jan's solution of course also does it, it just
requires a bit more locking and gc).

   However, given that the exact issue seems a bit hard to grasp, I've
decided to put together a more formal proposal, with exact code snippets
simulating the issue and verifying that it does indeed persist in all of
the short/easy solutions above:

$ go get github.com/karalabe/bufioprop/shootout
$ shootout

Stable input, stable output:
         io.Copy: 3.38504052s 10.666667 mbps.
  [!] bufio.Copy: 3.37012021s 10.666667 mbps.
rogerpeppe.Copy: 3.414476536s 10.666667 mbps.
mattharden.Copy: 6.368713887s 5.333333 mbps.

Stable input, bursty output:
         io.Copy: 6.251177787s 5.333333 mbps.
  [!] bufio.Copy: 3.387935437s 10.666667 mbps.
rogerpeppe.Copy: 5.98428305s 6.400000 mbps.
mattharden.Copy: 6.250739081s 5.333333 mbps.

Bursty input, stable output:
         io.Copy: 6.25889809s 5.333333 mbps.
  [!] bufio.Copy: 3.347354357s 10.666667 mbps.
rogerpeppe.Copy: 5.999921216s 6.400000 mbps.
mattharden.Copy: 3.473998412s 10.666667 mbps.

There are three scenarios:

    - If both reader and writer produce/consume at the same speed, any
    solution works.
    - If the reader is stable, but the writer processes in bursts, then all
    solutions apart from my proposal fail, as they serialize reading and
    writing.
    - If the reader produces in bursts and the writer consumes stably, then
    a buffered input can handle it (i.e. Matt's solution, not because of the
    threads, but because of the bufio.Reader).

I'll also post a formal proposal thread to this mailing list, as it does
indeed seem to be a non-trivial task, even though it might at first appear
trivial.

Input most welcome,
   Peter
On Thu, Jan 29, 2015 at 4:35 AM, Matt Harden wrote:

This one actually compiles http://play.golang.org/p/BivUlAT7_0.

On Wed Jan 28 2015 at 8:27:37 PM Matt Harden wrote:

http://play.golang.org/p/owO96v5See demonstrates what I was saying.

On Wed Jan 28 2015 at 8:08:58 PM Matt Harden <matt.harden@gmail.com>
wrote:
Why not just run two io.Copy() in two goroutines - the first copying
from a bufio.Reader to an io.PipeWriter, and the second copying from an
io.PipeReader to a bufio.Writer? This way you get two buffers so that
reading and writing can occur simultaneously, and data will be copied from
one buffer to the other whenever both buffers need flushing. Seems pretty
simple; what am I missing?

On Wed Jan 28 2015 at 1:04:30 PM Péter Szilágyi <peterke@gmail.com>
wrote:
Just to reply to Donovan, pipes are not buffered. That's the hard part
of the equation, not the threading.
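
A small sketch of what "not buffered" means for io.Pipe: a Write does not return until a Read on the other end has consumed the data. The function name and the 50ms wait are illustrative assumptions, not anything from the thread.

```go
package main

import (
	"fmt"
	"io"
	"time"
)

// writeReturnedBeforeRead (hypothetical helper) reports whether a Write on
// an io.Pipe completed while nobody was reading from the other end.
func writeReturnedBeforeRead() bool {
	pr, pw := io.Pipe()
	done := make(chan struct{})

	go func() {
		pw.Write([]byte("x")) // blocks until the byte has been read
		close(done)
	}()

	select {
	case <-done:
		return true // would mean the pipe absorbed the write by itself
	case <-time.After(50 * time.Millisecond):
		// Still blocked: the pipe holds no data of its own.
	}

	// Consume the byte so the writer goroutine can finish.
	buf := make([]byte, 1)
	pr.Read(buf)
	<-done
	return false
}

func main() {
	fmt.Println("write returned before read:", writeReturnedBeforeRead())
}
```

Since the documented behaviour of io.Pipe is that each Write blocks until it has been fully consumed by Reads, this should report false.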
On Jan 28, 2015 8:39 PM, "Péter Szilágyi" wrote:

An initial attempt at getting a buffered concurrent copy. It's a tad
more than a dozen lines, more like 200 :) I haven't tested it too
extensively, nor cleaned it up too much, just a rather quick hack together
to see if the concept works.

https://gist.github.com/karalabe/6de57007034d972b9ab6

Opinions? :)


On Wed, Jan 28, 2015 at 3:09 PM, Nick Craig-Wood <nick@craig-wood.com>
wrote:
On 28/01/15 08:18, Péter Szilágyi wrote:
I've hit an interesting problem, and I was a bit surprised that there
isn't anything in the standard libs that could have solved it easily. It
isn't too complicated to write, but it isn't trivial either. If by any
chance it's already in the libs, please enlighten me :), otherwise would
anyone be interested in including it?

The thing I was solving is fairly trivial: download a file from the
internet, and stream-upload it somewhere else (Google Cloud Storage
specifically, but it doesn't really matter). The naive solution is
pretty straightforward: wire together the downloader's reader with the
uploader's writer, and voila, magic... until you look at the network
usage: x1 secs download, y1 secs upload, x2 secs download, y2 secs
upload.

I came across exactly the same problem yesterday!

I was reading, gzipping and uploading, but I had exactly the same
problem - 10 seconds of 100% CPU for gzipping into a 64 MB block, then
20 seconds of upload at 0% CPU.

Here was my solution

https://github.com/Memset/snapshot-manager/blob/master/snapshot/snapshot.go#L166

Which is an annoying amount of code. The error handling is tricky too.

bufio.Copy(dst io.Writer, src io.Reader, buffer int) (written int64, err error)

Which essentially does what io.Copy does, but starts up a separate
writer goroutine and passes everything through a user-definable buffer.
Then it could handle both data bursts as well as batching
readers/writers.

I like it!

--
Nick Craig-Wood <nick@craig-wood.com> --
http://www.craig-wood.com/nick
--
You received this message because you are subscribed to the Google
Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to golang-nuts+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Discussion Overview
group: golang-nuts
categories: go
posted: Jan 28, '15 at 8:18a
active: Jan 30, '15 at 1:02p
posts: 22
users: 7
website: golang.org
