Personally, I think the adjustments to levels 1-3 are worth doing, although
I am worried about the amount of added logic. The loss of compression ratio
does not bother me.

In several of the recent CLs you posted, there has been significant amounts
of duplicated logic. I am wondering if we could write one generic function
that can do the logic of all three levels, but with the portions that
differ segregated using a if constValueX {...} block. Then we can write a
code generator that simply outputs another source file that contains three
renamed versions of that generic function and just replaces each instance
of constValueX with true/false and let the compiler deal with getting rid
of the dead code. Thoughts?

On Wednesday, March 23, 2016 at 5:43:19 AM UTC-7, klau...@gmail.com wrote:
On Wednesday, 23 March 2016 12:28:36 UTC+1, Nigel Tao wrote:

I'm OK with this in principle, with a few remarks below.
I'm glad to hear that.

Re the overall pictures at http://imgur.com/a/5sWM4 and at
it's not obvious to me whether the graphs are showing the old or new
algorithms for levels 1, 2 and 3. In fact, I think the best graph
would show both.
I didn't want to go into details about the re-balancing, since we can take
that discussion separately, but the point was to show that:

* There is a big gap from the proposed level 3 to level 4, which will be
* Default level (6) does not represent a "sweet spot" between
CPU/compression ratio.
* Level 7-8 provide no significant compression ratio increase over level 6.

Actual numbers are from my personal repo. With CL 20929 merged, I expect
numbers to be close, but a bit lower because of the slower token slice and
combined lazy/no-lazy code. Russ asked for the numbers I had shown him, and
here they are.

Once we have the pending things merged, I will submit the re-balance
proposal and I will be able to do the "real" numbers.

Flate is normally Huffman + LZ77. In
https://go-review.googlesource.com/#/c/20982/ you are also proposing a
Huffman-only compression level. How does that relate (if at all) to
this proposal?

It is not directly related, so I separated it out in a separate CL, and
didn't mention it here to have the discussion a bit more focused. It is
more a special case, that definitely has its uses, but doesn't provide a
good generic compressor, but can be used on Snappy output, non-artificial
PNG images and content where this proposed level 1 is even is too slow.

Should we also have a LZ77-only compression level?
Unfortunately that is not possible with the deflate format. LZ77
(matching) data must be Huffman compressed. One thing I will investigate is
the possibility to only recalculate the Huffman tree every n deflate blocks
(which are up to 64k each), since tree calculation takes ~50% of the CPU
time in this proposed level 1. If we calculate them every second block,
that would be a 25% speed increase.

It will probably only be added to my personal package, but if it works
well, who knows if it will find a place in the standard library.

Should we have more than one such level? Conversely, when introducing
Snappy style matching, should we replace levels 1, 2 and 3, or just
level 1 aka BestSpeed?
Level 2+3 offer good progression. The "generic" deflater leaves a big gap
to level 1, and hits a performance ceiling. This is what you see in my blog
post, which was written before I wrote the specialized level 2+3.

Please don't do that. We have had problems with this before:
Don't worry, I know what I am doing. Each deflate block is never longer
thank 64KB, so skipping is reset for every 64KB data processed. This works
as intended. If it didn't 10GB.tar would end up considerably worse.

Btw, now that you mention it, you still haven't fixed Snappy which will
completely stop compressing data once it encounters 1MB uncompressible
data: https://github.com/golang/snappy/pull/23#issuecomment-183889455 -
maybe you should add changes as PRs in the future, instead of committing
directly to master - you are welcome to ping me for review, and maybe
others also have some input.

Also, that CL has three separate copy-pasted-and-specialized methods
for levels 1, 2 and 3. Yes, each method has a sentence or two in
comment, but it would be easier to understand the difference between
the three actual algorithms if there was just a single method, with
some "if e.level > 2" style blocks. Sure, copy-paste-and-specialize
can increase performance (at a cost of having more code), but it can
easily be a follow-up CL.
Sure - I will probably go the route of trying to factor out the common
code in smaller functions. That way the compiler can inline it, and we
don't have the branching overhead.

You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Search Discussions

Discussion Posts


Follow ups

Related Discussions

Discussion Navigation
viewthread | post
posts ‹ prev | 12 of 20 | next ›
Discussion Overview
groupgolang-dev @
postedMar 22, '16 at 10:53a
activeApr 10, '16 at 2:06p



site design / logo © 2021 Grokbase