On Fri, Feb 26, 2016 at 6:36 PM, Nigel Tao wrote:
> As an experiment, I have rewritten the core of the Snappy decoder (and
> just the decoder, not the encoder, for now) in amd64 asm:

I've rewritten the encoder as well: see the ten or so most recent
commits at https://github.com/golang/snappy/commits/master

Comparing pure-Go "old" with Go-with-asm "new":

name              old speed      new speed       delta
WordsEncode1e1-8  676MB/s ± 0%   677MB/s ± 1%    ~         (p=0.310 n=5+5)
WordsEncode1e2-8  87.5MB/s ± 1%  428.3MB/s ± 0%  +389.71%  (p=0.008 n=5+5)
WordsEncode1e3-8  258MB/s ± 0%   446MB/s ± 1%    +72.67%   (p=0.008 n=5+5)
WordsEncode1e4-8  245MB/s ± 0%   316MB/s ± 0%    +28.94%   (p=0.008 n=5+5)
WordsEncode1e5-8  186MB/s ± 0%   269MB/s ± 0%    +44.86%   (p=0.008 n=5+5)
WordsEncode1e6-8  211MB/s ± 0%   314MB/s ± 1%    +48.84%   (p=0.008 n=5+5)
RandomEncode-8    13.2GB/s ± 1%  14.4GB/s ± 1%   +9.33%    (p=0.008 n=5+5)
_ZFlat0-8         431MB/s ± 0%   792MB/s ± 0%    +83.67%   (p=0.008 n=5+5)
_ZFlat1-8         277MB/s ± 0%   436MB/s ± 1%    +57.46%   (p=0.008 n=5+5)
_ZFlat2-8         13.8GB/s ± 2%  16.2GB/s ± 1%   +17.16%   (p=0.008 n=5+5)
_ZFlat3-8         173MB/s ± 0%   632MB/s ± 1%    +265.85%  (p=0.008 n=5+5)
_ZFlat4-8         3.10GB/s ± 0%  8.00GB/s ± 0%   +157.99%  (p=0.008 n=5+5)
_ZFlat5-8         426MB/s ± 0%   768MB/s ± 0%    +80.06%   (p=0.008 n=5+5)
_ZFlat6-8         190MB/s ± 0%   282MB/s ± 1%    +48.48%   (p=0.008 n=5+5)
_ZFlat7-8         182MB/s ± 0%   264MB/s ± 1%    +44.97%   (p=0.008 n=5+5)
_ZFlat8-8         200MB/s ± 0%   298MB/s ± 0%    +49.45%   (p=0.008 n=5+5)
_ZFlat9-8         175MB/s ± 0%   247MB/s ± 0%    +41.02%   (p=0.008 n=5+5)
_ZFlat10-8        509MB/s ± 0%   1027MB/s ± 0%   +101.72%  (p=0.008 n=5+5)
_ZFlat11-8        275MB/s ± 0%   411MB/s ± 0%    +49.57%   (p=0.008 n=5+5)

One interesting factoid is that the Go-with-asm encoder now appears
faster than the C++ encoder (or, alternatively, I've screwed up
somewhere). My C++ snappy_unittest.log numbers:

BM_ZFlat/0     151725   151761    1306  643.5MB/s  html (22.31 %)
BM_ZFlat/1    1857383  1857878     107  360.4MB/s  urls (47.78 %)
BM_ZFlat/2       8587     8589   22246   13.3GB/s  jpg (99.95 %)
BM_ZFlat/3        482      482  363636  395.3MB/s  jpg_200 (73.00 %)
BM_ZFlat/4      15701    15705   12406    6.1GB/s  pdf (83.30 %)
BM_ZFlat/5     631610   631613     318  618.5MB/s  html4 (22.52 %)
BM_ZFlat/6     639135   639282     311  226.9MB/s  txt1 (57.88 %)
BM_ZFlat/7     543036   543197     360  219.8MB/s  txt2 (61.91 %)
BM_ZFlat/8    1702822  1703093     118  239.0MB/s  txt3 (54.99 %)
BM_ZFlat/9    2224850  2225480     100  206.5MB/s  txt4 (66.26 %)
BM_ZFlat/10    128935   128974    1526  876.9MB/s  pb (19.68 %)
BM_ZFlat/11    499035   499168     398  352.1MB/s  gaviota (37.72 %)

The throughput column (e.g. "643.5MB/s" for BM_ZFlat/0) is the one of
interest, to compare to e.g. Go's "792MB/s" for _ZFlat0.
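
To make the row-by-row comparison explicit, here is a minimal sketch that divides each Go-with-asm "new" throughput by the matching C++ throughput, using only the numbers quoted above. The row type, the speedup helper and the pairing of _ZFlatN with BM_ZFlat/N are my own illustrative assumptions, not part of either benchmark suite.

```go
package main

import "fmt"

// row pairs one Go-with-asm "new" throughput with the C++ figure for the
// same corpus. Values are copied from the two tables in this message.
// The GB/s rows are scaled by 1000 to MB/s; the ratio does not depend on
// that factor because both sides use the same one.
type row struct {
	name    string
	goMBps  float64
	cppMBps float64
}

// speedup is a hypothetical helper: Go throughput divided by C++ throughput.
func speedup(goMBps, cppMBps float64) float64 {
	return goMBps / cppMBps
}

var rows = []row{
	{"html", 792, 643.5},
	{"urls", 436, 360.4},
	{"jpg", 16200, 13300},
	{"jpg_200", 632, 395.3},
	{"pdf", 8000, 6100},
	{"html4", 768, 618.5},
	{"txt1", 282, 226.9},
	{"txt2", 264, 219.8},
	{"txt3", 298, 239.0},
	{"txt4", 247, 206.5},
	{"pb", 1027, 876.9},
	{"gaviota", 411, 352.1},
}

func main() {
	for _, r := range rows {
		fmt.Printf("%-8s Go %8.1f MB/s  C++ %8.1f MB/s  ratio %.2fx\n",
			r.name, r.goMBps, r.cppMBps, speedup(r.goMBps, r.cppMBps))
	}
}
```

A ratio above 1 means the Go number is higher. This compares two separate runs of two different benchmark harnesses, so treat it as a rough indication only.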

Once again, comments welcome, and I'd appreciate a code reviewer.


Discussion Overview
group: golang-dev
categories: go
posted: Feb 26, '16 at 7:36a
active: Apr 24, '16 at 12:39a
posts: 23
users: 7
website: golang.org