The reason I'm actually interested is that it sounded like LZMA compression
at LZO speed from the blog post. I am actively working on some code that
needs both compression and speed, hence it caught my eye.
One innovation that Brotli uses is a static dictionary of about 122 KiB,
generated from web content in various languages and formats (CSS, JS, etc.).
This does, however, mean that Brotli is not technically a "general purpose"
algorithm since it assumes that the data it is compressing is largely
text-based. It's a wonderful idea for web content, but you should try
building the open source program yourself and benchmarking the speeds and
compression ratio *for your datasets*. My own tests (compressing the source
code of Linux kernel) have found that the compression ratio is better than
DEFLATE and LZMA at a similar compression speed. However, for maximum
compressibility, LZMA still beats Brotli. Even if your dataset cant take
advantage of the static dictionary, it may still perform better (ratio
wise) than DEFLATE since Brotli allows for sliding windows larger than
32KiB (Brotli allows for windows up to 16MiB in latest RFC draft).
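For what it's worth, this is the kind of harness I use to measure ratio and
speed on a single file, with the standard library's DEFLATE as the baseline
(a minimal sketch; swap in a Brotli binding to compare on your own data):

    package main

    import (
        "bytes"
        "compress/flate"
        "fmt"
        "io/ioutil"
        "os"
        "time"
    )

    func main() {
        // Read the whole file up front so we time compression, not disk I/O.
        src, err := ioutil.ReadFile(os.Args[1])
        if err != nil {
            panic(err)
        }
        var buf bytes.Buffer
        start := time.Now()
        w, err := flate.NewWriter(&buf, flate.BestCompression)
        if err != nil {
            panic(err)
        }
        if _, err := w.Write(src); err != nil {
            panic(err)
        }
        w.Close()
        elapsed := time.Since(start)
        fmt.Printf("in=%d out=%d ratio=%.3f speed=%.2f MB/s\n",
            len(src), buf.Len(),
            float64(buf.Len())/float64(len(src)),
            float64(len(src))/elapsed.Seconds()/1e6)
    }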
I found this nice interactive benchmark:
Interesting suite of benchmarks. I'm actually surprised to see that DEFLATE
(or zlib) is pretty close to the Pareto frontier, since I thought I had read
somewhere that DEFLATE was now far from it. Personally, I have always felt
that DEFLATE strikes a surprisingly good balance between speed and ratio for
*generic input datasets* for a format designed in the early 1990s.
Some random thoughts about using Brotli:
- It may or may not be worth implementing in pure Go just yet, since the
RFC is still a draft and things are still changing. A cgo/SWIG wrapper may
be the better idea for the time being (see the first sketch after this list).
- I am a little concerned about the use of a static dictionary.
Languages change over time and formats die and are born, which may reduce
the effectiveness of this dictionary.
- The 122 KiB static dictionary obviously needs to be compiled into the
binary, which may be expensive for embedded systems.
- (not a big deal) The current Go compiler is relatively inefficient at
compiling large static byte slices; a Go file with just the dictionary
alone takes 1s to compile on my machine (see the second sketch below for
a workaround).
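For the cgo route, something along these lines should be close. A sketch
only: it assumes the C library's one-shot BrotliEncoderCompress /
BrotliEncoderMaxCompressedSize entry points, which may differ from whatever
the source tree exposes right now, and the library names in LDFLAGS depend
on how you built it:

    package brotli

    /*
    #cgo LDFLAGS: -lbrotlienc -lbrotlicommon
    #include <brotli/encode.h>
    */
    import "C"

    import (
        "errors"
        "unsafe"
    )

    // Compress compresses src in one shot at the given quality (0-11).
    func Compress(src []byte, quality int) ([]byte, error) {
        if len(src) == 0 {
            return nil, nil
        }
        // Worst-case output size as reported by the library.
        max := C.BrotliEncoderMaxCompressedSize(C.size_t(len(src)))
        dst := make([]byte, int(max))
        outLen := max
        ok := C.BrotliEncoderCompress(
            C.int(quality), C.BROTLI_DEFAULT_WINDOW, C.BROTLI_MODE_GENERIC,
            C.size_t(len(src)), (*C.uint8_t)(unsafe.Pointer(&src[0])),
            &outLen, (*C.uint8_t)(unsafe.Pointer(&dst[0])))
        if ok == 0 {
            return nil, errors.New("brotli: encode failed")
        }
        return dst[:outLen], nil
    }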
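On the compile-time point, embedding the dictionary as a string constant
instead of a []byte composite literal sidesteps the slow path in the
compiler, at the cost of one copy at startup (the bytes below are
hypothetical placeholders, not the real dictionary):

    package brotli

    // dictData holds the raw ~122 KiB dictionary. A large string literal
    // compiles quickly and lives in the read-only data segment; a []byte
    // composite literal of the same size is what compiles slowly.
    const dictData = "\x74\x69\x6d\x65..." // truncated placeholder

    // dict is the mutable view used at runtime; the conversion copies
    // the data once at program start.
    var dict = []byte(dictData)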
On Sunday, October 11, 2015 at 6:52:36 AM UTC-7, Klaus Post wrote:
On Sunday, 11 October 2015 10:54:26 UTC+2, Damian Gryski wrote:
The decoder is pure C and looks much simpler to translate than the
compressor.
Yes, but it is still 2k LOC, so it's not done in a day or two ;)
A compressor is really the most useful piece for Go, since Go is aimed at
web servers. Of course, you could deploy behind a reverse proxy and let it
handle the compression. It will take some time for browser support to
trickle out to end users, so luckily we have a bit of time.
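For a server that wants to opt in early, the content negotiation itself is
simple enough. A hypothetical sketch: the precompressed blob is a
placeholder produced elsewhere, and the naive substring check on
Accept-Encoding should really be a proper header parse:

    package main

    import (
        "net/http"
        "strings"
    )

    var plain = []byte("hello, world") // identity payload
    var precompressed []byte           // brotli-compressed payload, produced elsewhere

    // handler serves the brotli body only to clients that advertise "br".
    func handler(w http.ResponseWriter, r *http.Request) {
        if len(precompressed) > 0 &&
            strings.Contains(r.Header.Get("Accept-Encoding"), "br") {
            w.Header().Set("Content-Encoding", "br")
            w.Write(precompressed)
            return
        }
        w.Write(plain)
    }

    func main() {
        http.HandleFunc("/", handler)
        http.ListenAndServe(":8080", nil)
    }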
Also, with a bit of time, we can get an impression of how "safe" it is to
use without SSL. Previous experience with other compression formats shows
that problems can occur with proxies that can only handle deflate/gzip on
plain HTTP.
Damian
/Klaus