I've got an open question regarding my synchronization.

I am using channels to wake sleeping goroutines whenever some notable event
occurs (reader close, writer close, data/space available; example
<https://github.com/karalabe/bufioprop/blob/master/pipe.go#L132>). The
solution is imho simple, yet performs really well (the core concept behind
my implementation is probably the same as anyone else's, so the performance
diff is imho related to the syncing). Yet in some rare cases I see an
increased allocation count (see the GOMAXPROCS = 8 alloc tables), which
takes a minor toll on performance. I'm guessing the culprit is somewhere in
the channel internals, but could someone hint at what might be happening?
Why is my code allocating sporadically?
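
For reference, the wake-up pattern I'm talking about is roughly the
following - a minimal sketch of the idea, not the exact pipe.go code:

// One capacity-1 channel per event (data available, space available, close).
type signal chan struct{}

func newSignal() signal { return make(signal, 1) }

// notify wakes a sleeping waiter; if nobody is waiting, the token stays in
// the channel so the next wait returns immediately. Extra notifications are
// coalesced, so notify never blocks.
func (s signal) notify() {
    select {
    case s <- struct{}{}:
    default:
    }
}

// wait sleeps until someone calls notify (or returns immediately if a token
// is already pending).
func (s signal) wait() { <-s }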

Thanks
On Sat, Jan 31, 2015 at 11:56 AM, Péter Szilágyi wrote:

I also exchanged a few emails with Roger about whether it might be useful to
convert the shootout into a testing-package based solution. The benefit would
be that we wouldn't need to start writing tests all over again when we finally
agree upon a solution. The slight issue is that the testing framework isn't
really compatible with the idea of "shooting out" bad solutions (i.e. if a
test fails for one implementation, the rest shouldn't even run, as we're
trying to filter out the bad ones and only reach the benchmarks with the good
ones). Maybe there's some mechanism in the testing package to "do some
stuff" after a test finishes but before the next one starts, and then we could
filter down to the still-alive solutions. Dunno, if somebody's up for the
challenge, I'd gladly accept PRs :P
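
To sketch the shape I have in mind (all helper names here are made up for
illustration, this is not code from the repo):

package shootout

import (
    "bytes"
    "errors"
    "io"
    "testing"
)

type contender struct {
    name string
    fn   func(dst io.Writer, src io.Reader, buffer int) (int64, error)
}

var contenders = []contender{ /* filled in by the shootout */ }

// alive records which implementations survived the correctness tests; the
// testing package runs tests before benchmarks, so benchmarks can consult it.
var alive = map[string]bool{}

func TestHighThroughput(t *testing.T) {
    for _, c := range contenders {
        if err := runThroughputTest(c.fn); err != nil {
            t.Errorf("%s: %v", c.name, err) // shoot it out, but keep testing the rest
            continue
        }
        alive[c.name] = true
    }
}

func BenchmarkLatency(b *testing.B) {
    for _, c := range contenders {
        if !alive[c.name] {
            continue // skip implementations that already failed a test
        }
        // ... run the latency measurement for c ...
    }
}

// runThroughputTest pushes a blob through fn and checks the output; only a
// stub here, the real shootout would plug in its existing stream generators.
func runThroughputTest(fn func(io.Writer, io.Reader, int) (int64, error)) error {
    src := bytes.NewReader(make([]byte, 1<<20))
    dst := new(bytes.Buffer)
    if _, err := fn(dst, src, 32*1024); err != nil {
        return err
    }
    if dst.Len() != 1<<20 {
        return errors.New("corrupt data on the output")
    }
    return nil
}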

On the same note, I've also pulled in the io.Pipe tests from the standard
library to validate my pipe solution. It's probably nowhere near enough, but
it's way better than nothing. I concur that we should definitely try to pull
in every test from the std libs that might catch nasty bugs, but we need to
converge on the API for that. The previous one was a simple copy, but now
we have a Pipe proposal that not everyone has implemented yet.

On Augusto's proposal of using bufio.Pipe(buffer []byte) instead of
bufio.Pipe(buffer int): I think it's a valid point. It shouldn't pose any
significant inconvenience, but it could allow finer memory management if
someone needs it. Still, it would be nice to have feedback from others
too, maybe we're missing something :)

Cheers,
Peter


On Sat, Jan 31, 2015 at 11:46 AM, wrote:

Interesting -- I was thinking today it would be useful to have some tests
using the helpers from testing/iotest <http://godoc.org/testing/iotest>, for
example OneByteReader (which would have caught Jan's issue, I think),
DataErrReader (which cropped up earlier), and HalfReader. Error handling
should be tested as well, for both the source and the destination.
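
Something along these lines, for instance (the bufioprop.Copy call shape and
the reference blob are my assumptions here, not taken from the repo):

package shootout

import (
    "bytes"
    "testing"
    "testing/iotest"

    "github.com/karalabe/bufioprop"
)

// reference stands in for whatever test blob the shootout already uses.
var reference = bytes.Repeat([]byte("0123456789"), 100000)

func TestOneByteSource(t *testing.T) {
    // OneByteReader hands out a single byte per Read call, which shakes out
    // implementations that silently assume large reads.
    src := iotest.OneByteReader(bytes.NewReader(reference))
    dst := new(bytes.Buffer)
    if _, err := bufioprop.Copy(dst, src, 32*1024); err != nil {
        t.Fatalf("copy failed: %v", err)
    }
    if !bytes.Equal(dst.Bytes(), reference) {
        t.Fatalf("corrupt data on the output")
    }
}

The same body wrapped in iotest.DataErrReader and iotest.HalfReader would
cover the other two cases.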

For more general discussion: I structured my design around accepting an
external buffer rather than a buffer size. I think this is better since it
allows the caller to reuse buffers (e.g. arena-allocated bufs), yet it
doesn't significantly inconvenience the casual user, who can easily pass
"make([]byte, N)".

- Augusto
On Saturday, January 31, 2015 at 1:05:45 AM UTC-8, Péter Szilágyi wrote:

Jan, you're going to hate me (if you don't already) :P I've just shot
out your code as *not solving* the problem again :P

Currently the reason you can copy blazing fast is that you've optimized
memory usage: instead of blindly using the entire buffer like the rest of
us, you work with chunks and reuse the hottest one (i.e. the one still in
the cache). The flaw in your current implementation is that you don't fill
a chunk fully, but rather stuff into it whatever's available and send it
to the reader. However, if the output stream doesn't touch your chunk for a
good while - even though it might be almost completely empty - the empty
space is wasted, and you run out of buffer allowance to accept the input.
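
To put some made-up numbers on it (purely illustrative, not measured from
your code): say the 1MB allowance is split into 256 pages of 4KB each, and
the source delivers ~100 bytes per write while the sink lags behind. Every
write then pins a fresh, nearly empty page, so after 256 writes the whole
1MB budget is spent even though only ~25KB of real data is buffered, and
the copy has no choice but to stall.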

By hitting your copy's input with many, many tiny data chunks, it quickly
uses up all its "pages" and starts idling because it doesn't have any
allowance left. I've managed to show this exact behavior by modifying the
shootout's "stable" streams to use a 10KB/ms throughput instead of 1MB/100ms.
All the other implementations keep working correctly, but yours stalls.

However, I didn't like the idea of trying to fail your code just to show
that it performs badly in some fabricated scenario, so I've actually dropped
your implementation into my production code (streaming download, chunked
upload) to see whether it's just a theoretical issue or it indeed fails.
The result was as I thought: since I have a fast download, it hits the copy
with a lot of small ops, and your implementation ends up behaving exactly
like io.Copy, instantly using up all its pages and then stalling the
download.

To support my claim I've created a small repro (though it requires gsutils,
a Google Cloud account etc. to be configured); it's my full code
<https://gist.github.com/karalabe/024fe1f132c18471d411>, which I ran
with io.Copy, jnml.Copy and bufioprop.Copy. The results can be seen on the
attached chart (sorry for the bad/quick photoshop). The yellow line is the
download, the purple one is the upload. As you can see, whenever gsutils
starts uploading, it blocks accepting new data. As a consequence io.Copy
stalls more or less immediately, your solution makes a few more ticks until
it wastes all its pages, whereas the proposed solution happily keeps
downloading while the upload is running (there is of course some
interference, but that is understandable imho).

Cheers,
Peter

PS: The current stats are:

Manually disabled contenders:
ncw.Copy: deadlock in latency benchmark.
------------------------------------------------

High throughput tests:
io.Copy: test passed.
[!] bufio.Copy: test passed.
rogerpeppe.Copy: test passed.
rogerpeppe.IOCopy: test passed.
mattharden.Copy: test passed.
yiyus.Copy: corrupt data on the output.
egonelbre.Copy: test passed.
jnml.Copy: test passed.
bakulshah.Copy: panic.
augustoroman.Copy: test passed.
------------------------------------------------

Stable input, stable output shootout:
io.Copy: 3.848840481s 8.314192 mbps 6954 allocs 477712 B
[!] bufio.Copy: 3.93977615s 8.122289 mbps 7017 allocs 13033344 B
rogerpeppe.Copy: 3.957739623s 8.085423 mbps 7075 allocs 13035712 B
rogerpeppe.IOCopy: 4.006578497s 7.986865 mbps 7075 allocs 518288 B
mattharden.Copy: 7.587068709s 4.217703 mbps 6686 allocs 25593688 B
egonelbre.Copy: 3.916161761s 8.171266 mbps 7054 allocs 13034368 B
jnml.Copy: 3.996595672s 8.006814 mbps 7061 allocs 13035840 B
augustoroman.Copy: 3.871614938s 8.265285 mbps 6849 allocs 13021312 B

Stable input, bursty output shootout:
io.Copy: 6.836185084s 4.680973 mbps 3444 allocs 254400 B
[!] bufio.Copy: 4.60999474s 6.941440 mbps 3588 allocs 12812608 B
rogerpeppe.Copy: 4.569635021s 7.002747 mbps 3576 allocs 12811776 B
rogerpeppe.IOCopy: 6.923576695s 4.621889 mbps 3439 allocs 285584 B
egonelbre.Copy: 4.612420089s 6.937790 mbps 3754 allocs 12823168 B
jnml.Copy: 6.895554042s 4.640671 mbps 3601 allocs 12814400 B
augustoroman.Copy: 4.611330731s 6.939429 mbps 3521 allocs 12808320 B

Bursty input, stable output shootout:
[!] bufio.Copy: 3.799650855s 8.421826 mbps 3311 allocs 12794880 B
rogerpeppe.Copy: 3.792738473s 8.437175 mbps 3309 allocs 12794688 B
egonelbre.Copy: 3.803217893s 8.413928 mbps 3393 allocs 12800064 B
augustoroman.Copy: 3.797589395s 8.426398 mbps 3306 allocs 12794560 B
------------------------------------------------

Latency benchmarks (GOMAXPROCS = 1):
[!] bufio.Copy: 4.422µs 23 allocs 2288 B.
rogerpeppe.Copy: 4.754µs 21 allocs 2096 B.
egonelbre.Copy: 4.645µs 29 allocs 2368 B.
augustoroman.Copy: 5.034µs 17 allocs 1904 B.

Latency benchmarks (GOMAXPROCS = 8):
[!] bufio.Copy: 4.596µs 673 allocs 43888 B.
rogerpeppe.Copy: 4.934µs 354 allocs 23408 B.
egonelbre.Copy: 4.856µs 370 allocs 24192 B.
augustoroman.Copy: 5.296µs 114 allocs 8112 B.

Throughput (GOMAXPROCS = 1) (256 MB):

+-------------------+--------+---------+---------+---------+----------+
THROUGHPUT | 333 | 4155 | 65359 | 1048559 | 16777301 |
+-------------------+--------+---------+---------+---------+----------+
[!] bufio.Copy | 492.98 | 3011.88 | 4758.31 | 4923.77 | 2057.23 |
rogerpeppe.Copy | 226.69 | 1918.60 | 4495.06 | 4903.63 | 2058.11 |
egonelbre.Copy | 253.95 | 1867.21 | 4484.71 | 4801.99 | 2033.94 |
augustoroman.Copy | 162.78 | 1513.39 | 4324.66 | 4888.18 | 2060.91 |
+-------------------+--------+---------+---------+---------+----------+

+-------------------+--------------+--------------+---------------+------------------+-------------------+
ALLOCS/BYTES | 333 | 4155 | 65359 | 1048559 | 16777301 |
+-------------------+--------------+--------------+---------------+------------------+-------------------+
[!] bufio.Copy | ( 13 / 1024) | ( 13 / 5280) | ( 13 / 66208) | ( 13 / 1049248) | ( 13 / 16786080) |
rogerpeppe.Copy | ( 12 / 896) | ( 12 / 5152) | ( 12 / 66080) | ( 12 / 1049120) | ( 12 / 16785952) |
egonelbre.Copy | ( 21 / 1248) | ( 21 / 5504) | ( 21 / 66432) | ( 21 / 1049472) | ( 21 / 16786304) |
augustoroman.Copy | ( 12 / 960) | ( 12 / 5216) | ( 12 / 66144) | ( 12 / 1049184) | ( 12 / 16786016) |
+-------------------+--------------+--------------+---------------+------------------+-------------------+

Throughput (GOMAXPROCS = 8) (256 MB):

+-------------------+--------+---------+---------+---------+----------+
THROUGHPUT | 333 | 4155 | 65359 | 1048559 | 16777301 |
+-------------------+--------+---------+---------+---------+----------+
[!] bufio.Copy | 499.11 | 2922.10 | 4262.06 | 4589.26 | 2041.15 |
rogerpeppe.Copy | 220.56 | 1698.86 | 3953.09 | 4611.53 | 2048.96 |
egonelbre.Copy | 360.14 | 2439.69 | 3931.45 | 4339.05 | 1907.04 |
augustoroman.Copy | 150.06 | 1392.57 | 3735.92 | 4572.96 | 2052.33 |
+-------------------+--------+---------+---------+---------+----------+

+-------------------+---------------+--------------+----------------+-------------------+-------------------+
ALLOCS/BYTES | 333 | 4155 | 65359 | 1048559 | 16777301 |
+-------------------+---------------+--------------+----------------+-------------------+-------------------+
[!] bufio.Copy | ( 130 / 9792) | ( 46 / 7392) | ( 94 / 71392) | ( 781 / 1098400) | ( 61 / 16789152) |
rogerpeppe.Copy | ( 13 / 960) | ( 14 / 5504) | ( 14 / 66432) | ( 14 / 1049472) | ( 14 / 16786304) |
egonelbre.Copy | ( 93 / 6304) | ( 50 / 7584) | ( 53 / 68928) | ( 34 / 1050528) | ( 34 / 16787360) |
augustoroman.Copy | ( 22 / 1824) | ( 13 / 5504) | ( 14 / 66496) | ( 12 / 1049184) | ( 12 / 16786016) |
+-------------------+---------------+--------------+----------------+-------------------+-------------------+



On Fri, Jan 30, 2015 at 10:14 PM, Péter Szilágyi <pet...@gmail.com>
wrote:
Up till now, this is what I'm aiming to include in bufio. Feedback is
welcome on the API too (though it's just based on Roger's work and the io
package's Copy/Pipe):

http://godoc.org/github.com/karalabe/bufioprop
On Fri, Jan 30, 2015 at 9:45 PM, Egon wrote:


On Friday, 30 January 2015 21:03:02 UTC+2, Péter Szilágyi wrote:

Hi all,

A few modifications went into the benchmarks:

- Roger added a repeating reader so we no longer need a giant
pre-allocated test data blob; it makes startup faster and should also better
reflect the algorithms' performance (a rough sketch of such a reader follows
this list).
- To prevent measuring occasional hiccups, the throughput
benchmarks now use best-out-of-three. Scores are much more stable.
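
A rough sketch of what such a repeating reader can look like (my
illustration, not necessarily Roger's exact code):

// repeatReader hands out the same pattern over and over, so a benchmark can
// stream an arbitrary amount of data from a tiny in-memory fixture.
type repeatReader struct {
    pattern []byte // the block being repeated
    offset  int    // current position within the pattern
}

func (r *repeatReader) Read(p []byte) (int, error) {
    n := copy(p, r.pattern[r.offset:])
    r.offset = (r.offset + n) % len(r.pattern)
    return n, nil // never io.EOF; the benchmark reads a fixed amount and stops
}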

Implementation-wise the updates are:

- Roger sent in a full Pipe based solution, arguing that it's way
more flexible than a simple copy (I agree); a rough sketch of what a copy
built on top of a pipe looks like follows this list.
- Jan sent in a nice optimization that seems to beat the other algos
in the case of very large buffers.
- I've also ported my solution to a pipe version (both coexist for
now). That codebase still needs to be cleaned up; it was just to see whether
it works OK.
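
And roughly what "copy built on top of a pipe" means - the Pipe constructor
and CloseWithError below are assumed to mirror io.Pipe, not a final API:

// Copy feeds src into the buffered pipe from a helper goroutine and drains
// the pipe into dst from the calling one; the buffer decouples the two ends.
func Copy(dst io.Writer, src io.Reader, buffer int) (int64, error) {
    pr, pw := Pipe(buffer)
    go func() {
        _, err := io.Copy(pw, src)
        pw.CloseWithError(err) // a nil error closes the pipe with a plain EOF
    }()
    return io.Copy(dst, pr)
}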

With these, the current standing is:

Latency benchmarks (GOMAXPROCS = 1):
[!] bufio.Copy: 4.322µs 37 allocs 2736 B.
[!] bufio.PipeCopy: 4.396µs 22 allocs 2224 B.
rogerpeppe.Copy: 4.666µs 20 allocs 2032 B.
egonelbre.Copy: 4.681µs 29 allocs 2368 B.
jnml.Copy: 4.964µs 18 allocs 1936 B.

Latency benchmarks (GOMAXPROCS = 8):
[!] bufio.Copy: 4.525µs 398 allocs 25840 B.
[!] bufio.PipeCopy: 4.58µs 632 allocs 41264 B.
rogerpeppe.Copy: 4.918µs 325 allocs 21552 B.
egonelbre.Copy: 4.779µs 518 allocs 33664 B.
jnml.Copy: 5.187µs 321 allocs 21328 B.

Throughput (GOMAXPROCS = 1) (256 MB):

+--------------------+--------+---------+---------+---------+----------+
THROUGHPUT | 333 | 4155 | 65359 | 1048559 | 16777301 |
+--------------------+--------+---------+---------+---------+----------+
[!] bufio.Copy | 524.45 | 3137.31 | 4765.79 | 4922.01 | 2083.84 |
[!] bufio.PipeCopy | 523.99 | 3115.08 | 4767.71 | 4924.59 | 2083.15 |
rogerpeppe.Copy | 231.94 | 1942.33 | 4499.48 | 4906.88 | 2085.31 |
egonelbre.Copy | 252.79 | 1865.77 | 4482.94 | 4832.15 | 2053.85 |
jnml.Copy | 233.75 | 1947.74 | 4500.00 | 4914.93 | 6055.04 |
+--------------------+--------+---------+---------+---------+----------+

+--------------------+--------------+--------------+---------------+------------------+-------------------+
ALLOCS/BYTES | 333 | 4155 | 65359 | 1048559 | 16777301 |
+--------------------+--------------+--------------+---------------+------------------+-------------------+
[!] bufio.Copy | ( 24 / 1280) | ( 24 / 5536) | ( 24 / 66464) | ( 24 / 1049504) | ( 24 / 16786336) |
[!] bufio.PipeCopy | ( 13 / 1024) | ( 13 / 5280) | ( 13 / 66208) | ( 13 / 1049248) | ( 13 / 16786080) |
rogerpeppe.Copy | ( 12 / 896) | ( 12 / 5152) | ( 12 / 66080) | ( 12 / 1049120) | ( 12 / 16785952) |
egonelbre.Copy | ( 21 / 1248) | ( 21 / 5504) | ( 21 / 66432) | ( 21 / 1049472) | ( 21 / 16786304) |
jnml.Copy | ( 13 / 1008) | ( 13 / 5264) | ( 13 / 66192) | ( 13 / 1049232) | ( 13 / 16787424) |
+--------------------+--------------+--------------+---------------+------------------+-------------------+

Throughput (GOMAXPROCS = 8) (256 MB):

+--------------------+--------+---------+---------+---------+----------+
THROUGHPUT | 333 | 4155 | 65359 | 1048559 | 16777301 |
+--------------------+--------+---------+---------+---------+----------+
[!] bufio.Copy | 510.38 | 2935.96 | 3899.06 | 4593.50 | 2067.04 |
[!] bufio.PipeCopy | 503.73 | 2928.34 | 4204.07 | 4602.74 | 2073.62 |
rogerpeppe.Copy | 206.61 | 1809.26 | 3770.79 | 4608.11 | 2069.03 |
egonelbre.Copy | 350.66 | 2439.46 | 3946.51 | 4377.70 | 1917.33 |
jnml.Copy | 214.09 | 1683.43 | 3773.63 | 4621.34 | 5749.54 |
+--------------------+--------+---------+---------+---------+----------+

+--------------------+--------------+--------------+----------------+-------------------+-------------------+
ALLOCS/BYTES | 333 | 4155 | 65359 | 1048559 | 16777301 |
+--------------------+--------------+--------------+----------------+-------------------+-------------------+
[!] bufio.Copy | ( 82 / 4992) | ( 46 / 6944) | ( 108 / 71840) | ( 536 / 1082272) | ( 56 / 16788384) |
[!] bufio.PipeCopy | ( 65 / 4352) | ( 26 / 6336) | ( 66 / 69824) | ( 526 / 1082304) | ( 46 / 16788416) |
rogerpeppe.Copy | ( 14 / 1248) | ( 14 / 5504) | ( 14 / 66432) | ( 14 / 1049472) | ( 14 / 16786304) |
egonelbre.Copy | ( 89 / 5600) | ( 49 / 7744) | ( 51 / 68800) | ( 39 / 1051072) | ( 29 / 16787264) |
jnml.Copy | ( 13 / 1008) | ( 14 / 5552) | ( 13 / 66192) | ( 13 / 1049232) | ( 13 / 16787424) |
+--------------------+--------------+--------------+----------------+-------------------+-------------------+

@Jan, Egon: would you guys agree that a pipe based solution would be
better/more flexible? If so, we could rework the shootout to use pipes as
the underlying implementation plus a pre-baked copy function (i.e. I
blatantly copied mine from Roger).
SGTM

