FAQ
Jan, you're going to hate me (if you don't already) :P I've just shot out
your code as *not solving* the problem again :P

Currently the reason you can copy blazingly fast is that you've optimized
memory usage: instead of blindly using the entire buffer like the rest of us,
you work with chunks and reuse the hottest one (i.e. the one still in the
cache). The flaw in your current implementation is that you don't fill one
chunk fully, but rather stuff into it whatever's available and send it off to
the reader. However, if the output stream doesn't touch your chunk for a
good while - even though it might be almost completely empty - the empty
space is wasted, and you run out of buffer allowance to accept the input.
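To make the failure mode concrete, here's a toy sketch of the scheme I'm describing (all names and sizes are mine, not your actual code): every write claims a whole fixed-size page no matter how little data it carries, so a handful of tiny writes exhausts the page allowance while the buffer is still nearly empty.

```go
package main

import "fmt"

const (
	pageSize = 1024 // bytes per page (hypothetical)
	numPages = 4    // total page allowance (hypothetical)
)

// chunkBuffer models the flaw: each write grabs a whole page, stuffs in
// whatever data is available, and hands the page off; the unused tail of an
// in-flight page is unavailable until the reader drains it.
type chunkBuffer struct {
	inFlight [][]byte // pages handed to the reader but not yet consumed
}

// write returns false once no page is free, even if the in-flight pages are
// almost completely empty.
func (b *chunkBuffer) write(p []byte) bool {
	if len(b.inFlight) == numPages {
		return false // out of allowance: the copy stalls
	}
	page := make([]byte, 0, pageSize)
	page = append(page, p...) // a tiny op leaves the page mostly empty
	b.inFlight = append(b.inFlight, page)
	return true
}

func main() {
	var b chunkBuffer
	accepted := 0
	for i := 0; i < 100; i++ { // many tiny 1-byte ops while the reader idles
		if !b.write([]byte{'x'}) {
			break
		}
		accepted++
	}
	fmt.Printf("accepted %d bytes before stalling, despite %d B of capacity\n",
		accepted, pageSize*numPages)
}
```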

By hitting your copy's input with many, many tiny data chunks, it quickly
uses up all its "pages" and starts idling because it doesn't have any
allowance left. I've managed to show this exact behavior by modifying the
shootout's "stable" streams to use 10KB/ms throughput instead of 1MB/100ms.
All the other implementations work correctly, but yours stalls.

However, I didn't like the idea of trying to fail your code just to show
that it performs badly in some fabricated scenario, so I actually dropped
your implementation into my production code (streaming download, chunked
upload) to see whether this is just a theoretical issue or it indeed fails.
The result was as I'd thought: since I have a fast download, it hits copy
with a lot of small ops, and your implementation behaves exactly like
io.Copy - it instantly uses up all its pages and then stalls the download.

To support my claim I've created a small repro (though it requires gsutils,
a Google Cloud account etc. configured); it's my full code
<https://gist.github.com/karalabe/024fe1f132c18471d411>, which I ran with
io.Copy, jnml.Copy and bufioprop.Copy. The results can be seen on the
attached chart (sorry for the bad/quick photoshop). The yellow line is the
download, whereas the purple one is the upload. As you can see, whenever
gsutils starts uploading, it blocks accepting new data. As a consequence
io.Copy stalls more or less immediately, your solution makes a few more
ticks until it wastes all its pages, whereas the proposed solution happily
downloads while the upload is running (note, there is of course some
interference, but that is understandable imho).

Cheers,
   Peter

PS: The current stats are:

Manually disabled contenders:
             ncw.Copy: deadlock in latency benchmark.
------------------------------------------------

High throughput tests:
              io.Copy: test passed.
       [!] bufio.Copy: test passed.
      rogerpeppe.Copy: test passed.
    rogerpeppe.IOCopy: test passed.
      mattharden.Copy: test passed.
           yiyus.Copy: corrupt data on the output.
       egonelbre.Copy: test passed.
            jnml.Copy: test passed.
       bakulshah.Copy: panic.
    augustoroman.Copy: test passed.
------------------------------------------------

Stable input, stable output shootout:
              io.Copy: 3.848840481s 8.314192 mbps 6954 allocs 477712 B
       [!] bufio.Copy: 3.93977615s 8.122289 mbps 7017 allocs 13033344 B
      rogerpeppe.Copy: 3.957739623s 8.085423 mbps 7075 allocs 13035712 B
    rogerpeppe.IOCopy: 4.006578497s 7.986865 mbps 7075 allocs 518288 B
      mattharden.Copy: 7.587068709s 4.217703 mbps 6686 allocs 25593688 B
       egonelbre.Copy: 3.916161761s 8.171266 mbps 7054 allocs 13034368 B
            jnml.Copy: 3.996595672s 8.006814 mbps 7061 allocs 13035840 B
    augustoroman.Copy: 3.871614938s 8.265285 mbps 6849 allocs 13021312 B

Stable input, bursty output shootout:
              io.Copy: 6.836185084s 4.680973 mbps 3444 allocs 254400 B
       [!] bufio.Copy: 4.60999474s 6.941440 mbps 3588 allocs 12812608 B
      rogerpeppe.Copy: 4.569635021s 7.002747 mbps 3576 allocs 12811776 B
    rogerpeppe.IOCopy: 6.923576695s 4.621889 mbps 3439 allocs 285584 B
       egonelbre.Copy: 4.612420089s 6.937790 mbps 3754 allocs 12823168 B
            jnml.Copy: 6.895554042s 4.640671 mbps 3601 allocs 12814400 B
    augustoroman.Copy: 4.611330731s 6.939429 mbps 3521 allocs 12808320 B

Bursty input, stable output shootout:
       [!] bufio.Copy: 3.799650855s 8.421826 mbps 3311 allocs 12794880 B
      rogerpeppe.Copy: 3.792738473s 8.437175 mbps 3309 allocs 12794688 B
       egonelbre.Copy: 3.803217893s 8.413928 mbps 3393 allocs 12800064 B
    augustoroman.Copy: 3.797589395s 8.426398 mbps 3306 allocs 12794560 B
------------------------------------------------

Latency benchmarks (GOMAXPROCS = 1):
       [!] bufio.Copy: 4.422µs 23 allocs 2288 B.
      rogerpeppe.Copy: 4.754µs 21 allocs 2096 B.
       egonelbre.Copy: 4.645µs 29 allocs 2368 B.
    augustoroman.Copy: 5.034µs 17 allocs 1904 B.

Latency benchmarks (GOMAXPROCS = 8):
       [!] bufio.Copy: 4.596µs 673 allocs 43888 B.
      rogerpeppe.Copy: 4.934µs 354 allocs 23408 B.
       egonelbre.Copy: 4.856µs 370 allocs 24192 B.
    augustoroman.Copy: 5.296µs 114 allocs 8112 B.

Throughput (GOMAXPROCS = 1) (256 MB):

+-------------------+--------+---------+---------+---------+----------+
|        THROUGHPUT |    333 |    4155 |   65359 | 1048559 | 16777301 |
+-------------------+--------+---------+---------+---------+----------+
|    [!] bufio.Copy | 492.98 | 3011.88 | 4758.31 | 4923.77 |  2057.23 |
|   rogerpeppe.Copy | 226.69 | 1918.60 | 4495.06 | 4903.63 |  2058.11 |
|    egonelbre.Copy | 253.95 | 1867.21 | 4484.71 | 4801.99 |  2033.94 |
| augustoroman.Copy | 162.78 | 1513.39 | 4324.66 | 4888.18 |  2060.91 |
+-------------------+--------+---------+---------+---------+----------+

+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
|      ALLOCS/BYTES |               333 |              4155 |             65359 |           1048559 |          16777301 |
+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
|    [!] bufio.Copy | (  13 /     1024) | (  13 /     5280) | (  13 /    66208) | (  13 /  1049248) | (  13 / 16786080) |
|   rogerpeppe.Copy | (  12 /      896) | (  12 /     5152) | (  12 /    66080) | (  12 /  1049120) | (  12 / 16785952) |
|    egonelbre.Copy | (  21 /     1248) | (  21 /     5504) | (  21 /    66432) | (  21 /  1049472) | (  21 / 16786304) |
| augustoroman.Copy | (  12 /      960) | (  12 /     5216) | (  12 /    66144) | (  12 /  1049184) | (  12 / 16786016) |
+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+

Throughput (GOMAXPROCS = 8) (256 MB):

+-------------------+--------+---------+---------+---------+----------+
|        THROUGHPUT |    333 |    4155 |   65359 | 1048559 | 16777301 |
+-------------------+--------+---------+---------+---------+----------+
|    [!] bufio.Copy | 499.11 | 2922.10 | 4262.06 | 4589.26 |  2041.15 |
|   rogerpeppe.Copy | 220.56 | 1698.86 | 3953.09 | 4611.53 |  2048.96 |
|    egonelbre.Copy | 360.14 | 2439.69 | 3931.45 | 4339.05 |  1907.04 |
| augustoroman.Copy | 150.06 | 1392.57 | 3735.92 | 4572.96 |  2052.33 |
+-------------------+--------+---------+---------+---------+----------+

+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
|      ALLOCS/BYTES |               333 |              4155 |             65359 |           1048559 |          16777301 |
+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
|    [!] bufio.Copy | ( 130 /     9792) | (  46 /     7392) | (  94 /    71392) | ( 781 /  1098400) | (  61 / 16789152) |
|   rogerpeppe.Copy | (  13 /      960) | (  14 /     5504) | (  14 /    66432) | (  14 /  1049472) | (  14 / 16786304) |
|    egonelbre.Copy | (  93 /     6304) | (  50 /     7584) | (  53 /    68928) | (  34 /  1050528) | (  34 / 16787360) |
| augustoroman.Copy | (  22 /     1824) | (  13 /     5504) | (  14 /    66496) | (  12 /  1049184) | (  12 / 16786016) |
+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+


On Fri, Jan 30, 2015 at 10:14 PM, Péter Szilágyi wrote:

Up till now, this is what I'm aiming to include in bufio. Feedback is
welcome API-wise too (though it's just based on Roger's work and the io
package's Copy/Pipe):

http://godoc.org/github.com/karalabe/bufioprop
On Fri, Jan 30, 2015 at 9:45 PM, Egon wrote:


On Friday, 30 January 2015 21:03:02 UTC+2, Péter Szilágyi wrote:

Hi all,

A few modifications went into the benchmarks:

- Roger added a repeating reader so we no longer need a giant
pre-allocated test data blob; it makes startup faster and should also
better reflect algo performance.
- To prevent measuring occasional hiccups, the throughput benchmarks now
use best-out-of-three. Scores are much stabler.

Implementation wise the updates are:

- Roger sent in a full Pipe based solution, arguing that it's way more
flexible than a simple copy (I agree).
- Jan sent in a nice optimization that seems to beat the other algos in
the case of very large buffers.
- I've also ported my solution to a pipe version (both coexist for now).
That codebase still needs to be cleaned up; it was just to see if it
works ok.

With these, the current standing is:

Latency benchmarks (GOMAXPROCS = 1):
       [!] bufio.Copy: 4.322µs 37 allocs 2736 B.
   [!] bufio.PipeCopy: 4.396µs 22 allocs 2224 B.
      rogerpeppe.Copy: 4.666µs 20 allocs 2032 B.
       egonelbre.Copy: 4.681µs 29 allocs 2368 B.
            jnml.Copy: 4.964µs 18 allocs 1936 B.

Latency benchmarks (GOMAXPROCS = 8):
       [!] bufio.Copy: 4.525µs 398 allocs 25840 B.
   [!] bufio.PipeCopy: 4.58µs 632 allocs 41264 B.
      rogerpeppe.Copy: 4.918µs 325 allocs 21552 B.
       egonelbre.Copy: 4.779µs 518 allocs 33664 B.
            jnml.Copy: 5.187µs 321 allocs 21328 B.

Throughput (GOMAXPROCS = 1) (256 MB):

+--------------------+--------+---------+---------+---------+----------+
|         THROUGHPUT |    333 |    4155 |   65359 | 1048559 | 16777301 |
+--------------------+--------+---------+---------+---------+----------+
|     [!] bufio.Copy | 524.45 | 3137.31 | 4765.79 | 4922.01 |  2083.84 |
| [!] bufio.PipeCopy | 523.99 | 3115.08 | 4767.71 | 4924.59 |  2083.15 |
|    rogerpeppe.Copy | 231.94 | 1942.33 | 4499.48 | 4906.88 |  2085.31 |
|     egonelbre.Copy | 252.79 | 1865.77 | 4482.94 | 4832.15 |  2053.85 |
|          jnml.Copy | 233.75 | 1947.74 | 4500.00 | 4914.93 |  6055.04 |
+--------------------+--------+---------+---------+---------+----------+

+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
|       ALLOCS/BYTES |               333 |              4155 |             65359 |           1048559 |          16777301 |
+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
|     [!] bufio.Copy | (  24 /     1280) | (  24 /     5536) | (  24 /    66464) | (  24 /  1049504) | (  24 / 16786336) |
| [!] bufio.PipeCopy | (  13 /     1024) | (  13 /     5280) | (  13 /    66208) | (  13 /  1049248) | (  13 / 16786080) |
|    rogerpeppe.Copy | (  12 /      896) | (  12 /     5152) | (  12 /    66080) | (  12 /  1049120) | (  12 / 16785952) |
|     egonelbre.Copy | (  21 /     1248) | (  21 /     5504) | (  21 /    66432) | (  21 /  1049472) | (  21 / 16786304) |
|          jnml.Copy | (  13 /     1008) | (  13 /     5264) | (  13 /    66192) | (  13 /  1049232) | (  13 / 16787424) |
+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+

Throughput (GOMAXPROCS = 8) (256 MB):

+--------------------+--------+---------+---------+---------+----------+
|         THROUGHPUT |    333 |    4155 |   65359 | 1048559 | 16777301 |
+--------------------+--------+---------+---------+---------+----------+
|     [!] bufio.Copy | 510.38 | 2935.96 | 3899.06 | 4593.50 |  2067.04 |
| [!] bufio.PipeCopy | 503.73 | 2928.34 | 4204.07 | 4602.74 |  2073.62 |
|    rogerpeppe.Copy | 206.61 | 1809.26 | 3770.79 | 4608.11 |  2069.03 |
|     egonelbre.Copy | 350.66 | 2439.46 | 3946.51 | 4377.70 |  1917.33 |
|          jnml.Copy | 214.09 | 1683.43 | 3773.63 | 4621.34 |  5749.54 |
+--------------------+--------+---------+---------+---------+----------+

+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
|       ALLOCS/BYTES |               333 |              4155 |             65359 |           1048559 |          16777301 |
+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
|     [!] bufio.Copy | (  82 /     4992) | (  46 /     6944) | ( 108 /    71840) | ( 536 /  1082272) | (  56 / 16788384) |
| [!] bufio.PipeCopy | (  65 /     4352) | (  26 /     6336) | (  66 /    69824) | ( 526 /  1082304) | (  46 / 16788416) |
|    rogerpeppe.Copy | (  14 /     1248) | (  14 /     5504) | (  14 /    66432) | (  14 /  1049472) | (  14 / 16786304) |
|     egonelbre.Copy | (  89 /     5600) | (  49 /     7744) | (  51 /    68800) | (  39 /  1051072) | (  29 / 16787264) |
|          jnml.Copy | (  13 /     1008) | (  14 /     5552) | (  13 /    66192) | (  13 /  1049232) | (  13 / 16787424) |
+--------------------+-------------------+-------------------+-------------------+-------------------+-------------------+

@Jan, Egon: would you guys agree that a pipe based solution would be
better/more flexible? If so, we could rework the shootout to use pipes as
the underlying implementation and a pre-baked copy function (i.e. I
blatantly copied mine from Roger).
SGTM

--
You received this message because you are subscribed to the Google Groups
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to golang-nuts+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Discussion Overview
group: golang-nuts
categories: go
posted: Jan 29, '15 at 11:01a
active: Feb 3, '15 at 11:21a
posts: 50
users: 8
website: golang.org
