On Saturday, October 13, 2012 2:16:01 AM UTC-7, Florian Weimer wrote:
* Gil Tene:
Starting with a precise and generational stop-the-world
implementation that is robust is a must, and a good launching pad
towards a concurrent compacting collector (which is what a
"pauseless" collector must be in server-scale environments).
I wonder why compaction is required. We've got many long-running
processes which use explicit memory management, and fragmentation does
not seem to be a huge issue there. Does something specific to garbage
collection make matters worse?
Robert has already made the good point of allocation speed: there is
nothing faster than a bump-pointer allocator, and that is only possible
with large contiguous empty regions that come from compaction. But the
other, and probably more important reason for compaction being critical to
generational collection is efficiency.
Generational collection is nothing more than an efficiency trick, but it's
a very powerful one (typically a 20:1 or better improvement in work
expended compared to non-generational collection of the same workload). The
efficiency of generational collection derives primarily from the
*combination* of using of a collector mechanism whose complexity is linear
to the live set, *within* a heap region that is known to have a live set
that is dramatically smaller than the heap size itself.
The weak generational hypothesis gives us the former: "most object die
young", which means that the vast majority of the contents of the young
generation heap is dead as long is we continue to promote long living
object out of it. But the former quality only comes from variants of
copying and mark/compact collectors (basically from moving collectors whose
complexity is linear only to the live set), sweeping will basically kill
it. In general, collectors that keep objects in place are forced to
scan/sweep the entire collected heap for dead matter which is then managed
in free lists (or similar structures), while copying and mark/compact
collector move the entire contents of a "from" space to a "to" space and
thereby avoid ever looking or spending cycles on dead matter.
The complexity on non-moving collectors is linear (at least in large part)
to the collected heap size, so using such collection techniques for young
generation collection would dramatically grow the cost of performing
generational collection. It would probably grow it to the point where
generational collection itself would not be that big a benefit...
There actually are (in academic work at least) some generational non-moving
collectors, but I think it's no accident that we don't tend to find them in
the wild. They end up removing most of the win...