In certain use cases, Java GC pauses pose a real problem; see e.g. [1] [2].
Azul Systems has an interesting solution which sounds really appealing [3]:
I know they sell JVMs whose main benefit is a pauseless GC algorithm.

Is Go considering a similar approach? I have very little understanding of
GC, but it looks like a pauseless GC would be a unique selling point for Go,
not offered by any other open-source language.

Are there plans to support such an algorithm?

[1] http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
[2] http://www.slideshare.net/cloudera/hbase-hug-presentation
[3] http://static.usenix.org/events/vee05/full_papers/p46-click.pdf


  • Ian Lance Taylor at Oct 10, 2012 at 1:12 pm

    On Wed, Oct 10, 2012 at 2:33 AM, Elazar Leibovich wrote:
    In certain use cases, Java GC pauses pose a real problem; see e.g. [1] [2].
    Azul Systems has an interesting solution which sounds really appealing [3]:
    I know they sell JVMs whose main benefit is a pauseless GC algorithm.

    Is Go considering a similar approach? I have very little understanding of
    GC, but it looks like a pauseless GC would be a unique selling point for Go,
    not offered by any other open-source language.

    Are there plans to support such an algorithm?
    I'm not aware of anybody working on this.

    The pauseless GC algorithms I've seen impose a severe performance
    penalty on all code running on ordinary multicore processors. It's
    the right choice for some programs, but not for most.

    Ian
  • Gil at Oct 10, 2012 at 3:37 pm
    [I'm one of the authors of both Pauseless and C4].

    The [common] impression that concurrent collection has to come at some
    significant overhead on either single core or multi-core processors is
    mostly due to misinformation.

    Neither the Pauseless algorithm nor the more recent [generational] C4
    algorithm imposes any multi-core-related performance overhead compared to
    the most efficient collectors out there. There is no added cross-thread
    synchronization of any kind compared to stop-the-world algorithms. In fact,
    the whole point of a concurrent algorithm like C4 is to dramatically reduce
    GC-related synchronization overhead [stop-the-world pauses are dramatic
    synchronization points between GC and mutator threads].

    As for the overhead on individual threads (not multi-core related), and the
    impact on overall throughput achieved, the self-healing Loaded Value
    Barrier (LVB) used by C4 on modern, commodity multi-core processors
    exhibits a very small overhead compared to a non-concurrent collector
    (Parallel GC).
    Mileage obviously varies by workload, but specifically, Figure 5 in the C4
    paper
    (http://dl.acm.org/citation.cfm?id=1993491&dl=ACM&coll=DL&CFID=172051598&CFTOKEN=22910064)
    includes a direct throughput comparison between C4, ParallelGC, and CMS on
    a multicore workload based on SPECjbb. C4 is within 6% of the per-core
    throughput that ParallelGC achieves on the same load, and more than 20%
    better than the throughput CMS achieves.
    On Wednesday, October 10, 2012 6:05:30 AM UTC-7, Ian Lance Taylor wrote:
    On Wed, Oct 10, 2012 at 2:33 AM, Elazar Leibovich wrote:
    In certain use cases, Java GC pauses pose a real problem; see e.g. [1] [2].
    Azul Systems has an interesting solution which sounds really appealing [3]:
    I know they sell JVMs whose main benefit is a pauseless GC algorithm.

    Is Go considering a similar approach? I have very little understanding of
    GC, but it looks like a pauseless GC would be a unique selling point for Go,
    not offered by any other open-source language.

    Are there plans to support such an algorithm?
    I'm not aware of anybody working on this.

    The pauseless GC algorithms I've seen impose a severe performance
    penalty on all code running on ordinary multicore processors. It's
    the right choice for some programs, but not for most.

    Ian
  • Ian Lance Taylor at Oct 10, 2012 at 6:20 pm

    On Wed, Oct 10, 2012 at 7:53 AM, wrote:
    The [common] impression that concurrent collection has to come at some
    significant overhead on either single core or multi-core processors is
    mostly due to misinformation.
    Thanks for the info. Any interest in trying to implement this algorithm for Go?
    As for the overhead on individual threads (not multi-core related), and the
    impact on overall throughput achieved, the self-healing Loaded Value Barrier
    (LVB) used by C4 on modern, commodity multi-core processors exhibits a very
    small overhead compared to non-concurrent (Parallel GC). Mileage obviously
    varies by workload, but specifically, Figure 5 in the C4 paper
    (http://dl.acm.org/citation.cfm?id=1993491&dl=ACM&coll=DL&CFID=172051598&CFTOKEN=22910064)
    includes a direct throughput comparison between C4, ParallelGC, and CMS on a
    multicore workload based on SPECjbb. C4 is within 6% of the per-core
    throughput that ParallelGC achieves on the same load, and more than 20%
    better than the throughput CMS achieves.
    I can't access the paper. What processor were the comparisons run on?

    Ian
  • Gil Tene at Oct 10, 2012 at 4:03 pm
    Thanks for the info. Any interest in trying to implement this algorithm for Go?
    Unfortunately, we're a bit busy working on the Zing JVM and on OpenJDK stuff, and don't have cycles to put into Go... Getting a full-blown concurrent GC implemented in an actual runtime is typically a year+ effort, mostly dominated by runtime-related details and edge conditions (and not by the classic GC'ed heap issues), e.g. class loading/unloading and namespace management, generated-code lifecycle management, lock lifecycle management, weak/soft/phantom reference management, etc.

    The paper is an ISMM/ACM publication, and is freely available to ACM members... You can also find a copy on the Azul site at http://www.azulsystems.com/products/zing/c4-java-garbage-collector-wp (but it will want you to register).

    The tests were on a 2-socket, 12-core Intel X5680 server with 96GB of memory. They spanned heap sizes from 3 to 35GB.

    -- Gil.
    On Oct 10, 2012, at 8:47 AM, Ian Lance Taylor wrote:
    On Wed, Oct 10, 2012 at 7:53 AM, wrote:

    The [common] impression that concurrent collection has to come at some
    significant overhead on either single core or multi-core processors is
    mostly due to misinformation.
    Thanks for the info. Any interest in trying to implement this algorithm for Go?
    As for the overhead on individual threads (not multi-core related), and the
    impact on overall throughput achieved, the self-healing Loaded Value Barrier
    (LVB) used by C4 on modern, commodity multi-core processors exhibits a very
    small overhead compared to non-concurrent (Parallel GC). Mileage obviously
    varies by workload, but specifically, Figure 5 in the C4 paper
    (http://dl.acm.org/citation.cfm?id=1993491&dl=ACM&coll=DL&CFID=172051598&CFTOKEN=22910064)
    includes a direct throughput comparison between C4, ParallelGC, and CMS on a
    multicore workload based on SPECjbb. C4 is within 6% of the per-core
    throughput that ParallelGC achieves on the same load, and more than 20%
    better than the throughput CMS achieves.
    I can't access the paper. What processor were the comparisons run on?

    Ian
  • Adam Langley at Oct 10, 2012 at 4:27 pm

    On Wed, Oct 10, 2012 at 12:02 PM, Gil Tene wrote:
    The paper is an ISMM/ACM publication, and is freely available to ACM members... You can also find a copy on the Azul site at http://www.azulsystems.com/products/zing/c4-java-garbage-collector-wp (but it will want you to register).

    The tests were on a 2-socket, 12-core Intel X5680 server with 96GB of memory. They spanned heap sizes from 3 to 35GB.
    Azul's GC work is seriously impressive. To answer the original
    question, I think a GC of that quality would be very valuable in Go,
    but I don't believe that anyone is currently working on it. As Gil
    pointed out, it is a substantial endeavor.


    Cheers

    AGL
  • Elazar Leibovich at Oct 10, 2012 at 7:28 pm
    I really think that you underestimate the importance of pauseless GC.
    Already today, users of Go are having problems with Go at scale: YouTube's
    vitess pushes memory off-heap to memcached [1] because of GC issues.

    Another example is indexing all of Wikipedia in memory - simply
    impossible even with Java's excellent GC algorithms (5% of requests have a
    delay of ~5 secs) [2].

    Yet another example is HBase's book, which recommends not using more than
    a 16GB heap, because otherwise GC pauses will make your peers think you're
    dead (this is what made me think).

    These issues are inherent to servers, since those typically have hundreds
    of gigabytes of memory, and the situation will probably get "worse" in the
    near future.

    This is not a question of a slightly higher bar in a certain benchmark
    game; this is a question of whether Go will be able to solve some
    significant engineering problems while still offering a fully
    garbage-collected environment. It could be the added value that will allow
    Go to fight the "good enough" solutions.

    PS
    I don't work at Azul's marketing department. Honest.

    [1] http://code.google.com/p/vitess/wiki/ProjectGoals#Why_did_we_choose_go?
    [2]
    http://blog.mikemccandless.com/2012/07/lucene-index-in-ram-with-azuls-zing-jvm.html
    On Wed, Oct 10, 2012 at 6:19 PM, Adam Langley wrote:
    On Wed, Oct 10, 2012 at 12:02 PM, Gil Tene wrote:
    The paper is an ISMM/ACM publication, and is freely available to ACM
    members... You can also find a copy on the Azul site at
    http://www.azulsystems.com/products/zing/c4-java-garbage-collector-wp (but it will want you to register).
    The tests were on a 2-socket, 12-core Intel X5680 server with 96GB of
    memory. They spanned heap sizes from 3 to 35GB.

    Azul's GC work is seriously impressive. To answer the original
    question, I think a GC of that quality would be very valuable in Go,
    but I don't believe that anyone is currently working on it. As Gil
    pointed out, it is a substantial endeavor.


    Cheers

    AGL
  • Rob Pike at Oct 10, 2012 at 8:07 pm

    On Thu, Oct 11, 2012 at 6:28 AM, Elazar Leibovich wrote:
    I really think that you underestimate the importance of pauseless GC.
    What makes you say that? No one here has said it's not a good thing.
    We've wanted it from the beginning, we spent quite a bit of time early
    on thinking about it, but it's very hard and a research-level problem.
    As Gil says, it's a year or more of work. We don't have a year handy
    at the moment and we're a very small team. And what if we spend that
    year and find it doesn't work as well as we'd hoped?

    I'd love to have a pauseless GC, especially if it adds no overhead for
    regular code, but I can't just snap my fingers and have one appear.

    Look at it this way: Why doesn't the standard Java implementation have one?

    -rob
  • Elazar Leibovich at Oct 10, 2012 at 8:18 pm

    On Wed, Oct 10, 2012 at 9:58 PM, Rob Pike wrote:
    On Thu, Oct 11, 2012 at 6:28 AM, Elazar Leibovich wrote:
    I really think that you underestimate the importance of pauseless GC.
    What makes you say that?

    Indeed no one said that. It's bad phrasing on my part, I apologize.

    I was just trying to emphasize that, in my view, pauseless GC is more than
    "let's make Go faster"; it's "let's make certain important problems
    solvable with Go".

    My apologies again if I put words in anyone's mouth.
  • 2paul De at Oct 11, 2012 at 7:13 pm
    I really find the factual discussion in this thread kind of exciting,
    because it seems to me that the quintessence is: it can be done.

    Maybe this is a really blunt question, but could somebody pin a number on
    the project, a dollar amount? With that information, the problem would be
    significantly narrowed down to the level of a fundraising problem.

    Cheers.
    On Wednesday, October 10, 2012 10:12:55 PM UTC+2, Elazar Leibovich wrote:

    On Wed, Oct 10, 2012 at 9:58 PM, Rob Pike wrote:
    On Thu, Oct 11, 2012 at 6:28 AM, Elazar Leibovich wrote:
    I really think that you underestimate the importance of pauseless GC.
    What makes you say that?

    Indeed no one said that. It's bad phrasing on my part, I apologize.

    I was just trying to emphasize that, in my view, pauseless GC is more than
    "let's make Go faster"; it's "let's make certain important problems
    solvable with Go".

    My apologies again if I put words in anyone's mouth.
  • Ian Lance Taylor at Oct 11, 2012 at 4:36 pm

    On Thu, Oct 11, 2012 at 3:23 AM, wrote:
    Maybe this is a really blunt question, but could somebody pin a number on
    the project, a dollar amount? With that information, the problem would be
    significantly narrowed down to the level of a fundraising problem.
    I don't think there is any point to trying to raise funds without
    first identifying the people who would actually do it. This is not a
    project that can be put out to bid. The result has to be acceptable
    to the Go maintainers.

    Ian
  • Gil Tene at Oct 11, 2012 at 6:04 pm
    As an external observer with significant knowledge on what it takes to get there (we at Azul seem to currently be the only ones to have actually done it in a shipping commercial product), I would recommend an incremental approach. It is my understanding (based on offline discussion) that Go currently uses a conservative, non-relocating collector, so there is a lot of work yet to be done to support any form of long-running collector (which almost invariably means supporting a moving collector).

    Starting with a precise and generational stop-the-world implementation that is robust is a must, and a good launching pad towards a concurrent compacting collector (which is what a "pauseless" collector must be in server-scale environments). Each of those qualities (precise, generational) slaps serious requirements on the execution environment and on the compilers (whether they are pre-compilers or JIT compilers doesn't matter): precise collectors require full identification of all references at code safepoints, and also require a robust safepoint mechanism. Code safepoints must be frequent (usually devolve to being at every method entry and loop back edge), and support in non-compiler-generated code (e.g. library and runtime code written in C/C++) usually involves some form of reference handle support around safepoints. Generational collectors require a write barrier (a ref-store barrier to be precise) with full coverage for any heap reference store operations (in compiler-generated code and in all runtime code).
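    The ref-store (write) barrier requirement described above can be sketched in Go. This is a toy illustration with invented names, not Go's runtime or any shipping collector: every reference store into the heap also dirties a card covering the stored-to address, so the collector can find old-to-young references by scanning dirty cards instead of the whole old generation.

```go
package main

import "fmt"

// A minimal, illustrative model of a generational ref-store barrier:
// the heap is simulated as an array of 8-byte slots, with one dirty
// flag per 512-byte block ("card"), as in HotSpot's card table.

const cardShift = 9 // one card per 512-byte block

type heap struct {
	words []uintptr // simulated heap of 8-byte slots
	cards []byte    // one dirty flag per 512-byte block
}

func newHeap(nWords int) *heap {
	return &heap{
		words: make([]uintptr, nWords),
		cards: make([]byte, (nWords*8)>>cardShift+1),
	}
}

// writeRef is the barrier: the store itself plus a shift-and-index
// card mark. No branches, no cross-thread synchronization.
func (h *heap) writeRef(slot int, ref uintptr) {
	h.words[slot] = ref
	h.cards[(slot*8)>>cardShift] = 1
}

// dirtyCards reports which 512-byte blocks must be rescanned for
// references into the young generation.
func (h *heap) dirtyCards() []int {
	var d []int
	for i, c := range h.cards {
		if c != 0 {
			d = append(d, i)
		}
	}
	return d
}

func main() {
	h := newHeap(1024)          // 8 KB simulated heap, 16 cards
	h.writeRef(70, 0x1000)      // byte offset 560 -> card 1
	h.writeRef(600, 0x2000)     // byte offset 4800 -> card 9
	fmt.Println(h.dirtyCards()) // [1 9]
}
```

    The point of the sketch is how cheap the generated-code side is; as noted above, the hard part is applying this barrier consistently in all non-generated runtime and library code.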

    It is my opinion that investing in the above capabilities early in the process (i.e. start now!) is critical. Environments that skip this step for too long and try to live with conservative GC, in order to avoid putting in the work required to support precise collection in the compilers, runtime, and libraries, find themselves heavily invested in compiler code that would need to be completely revamped to move forward. Some (like Mono) get stuck there for many years, and end up with complicated things like mostly-precise-but-still-conservative scanning that reduce the efficiency of generational collection and bump-pointer allocation. This usually happens because the investment in already-existing compilers that do not provide precise information, and the lack of systemic and disciplined support for safepoints and write barriers in the runtime and non-generated code libraries, become too big a thing to do a "from scratch" overhaul of. [Write barrier support in generated code is usually a localized, near-trivial piece of work; it's the coverage of write barriers everywhere in non-generated code that usually takes a long time, and has a long bug tail.]

    C4-like capability would add the need for a full-coverage read barrier (a ref-load barrier to be precise) for any heap reference load operations (in compiler-generated code and in all runtime code), so it would be good to keep that in mind as part of the overall effort.
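    What a self-healing ref-load barrier does can be sketched as follows. This is illustrative logic loosely modeled on the published LVB idea, not Azul's implementation: a loaded reference is checked against the collector's current state, and a stale reference to a relocated object is corrected once and written back to the loaded-from slot, so later loads from that slot pass the check for free.

```go
package main

import "fmt"

// gc models just enough collector state for the sketch: a forwarding
// table from old object addresses to their post-relocation addresses.
type gc struct {
	forwarding map[uintptr]uintptr
}

// loadRef loads a reference through the barrier. A stale value is
// "healed" in place: the slot itself is updated to the new address.
func (g *gc) loadRef(slot *uintptr) uintptr {
	ref := *slot
	if newAddr, moved := g.forwarding[ref]; moved {
		ref = newAddr
		*slot = ref // self-heal: subsequent loads see the fixed value
	}
	return ref
}

func main() {
	g := &gc{forwarding: map[uintptr]uintptr{0x1000: 0x9000}}
	slot := uintptr(0x1000)               // stale reference to a moved object
	fmt.Printf("%#x\n", g.loadRef(&slot)) // 0x9000: corrected on load
	fmt.Printf("%#x\n", slot)             // 0x9000: slot healed as well
}
```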

    -- Gil.
    On Oct 11, 2012, at 9:36 AM, Ian Lance Taylor wrote:
    On Thu, Oct 11, 2012 at 3:23 AM, wrote:

    Maybe this is a really blunt question, but could somebody pin a number on
    the project, a dollar amount? With that information, the problem would be
    significantly narrowed down to the level of a fundraising problem.
    I don't think there is any point to trying to raise funds without
    first identifying the people who would actually do it. This is not a
    project that can be put out to bid. The result has to be acceptable
    to the Go maintainers.

    Ian
  • Florian Weimer at Oct 13, 2012 at 9:22 am

    * Gil Tene:

    Starting with a precise and generational stop-the-world
    implementation that is robust is a must, and a good launching pad
    towards a concurrent compacting collector (which is what a
    "pauseless" collector must be in server-scale environments).
    I wonder why compaction is required. We've got many long-running
    processes which use explicit memory management, and fragmentation does
    not seem to be a huge issue there. Does something specific to garbage
    collection make matters worse?
  • Robert Griesemer at Oct 13, 2012 at 6:41 pm
    Without compaction, one major benefit of a generational collector goes
    away: fast allocation (besides, are there generational collectors w/o
    compaction? - I don't know).

    With compaction, allocation essentially becomes a single test and increment
    of a pointer (by the size of the allocated amount of memory). If there's
    one "eden" or "newspace" (where the memory is allocated from), this
    operation has to be atomic; if there's one eden per core, it doesn't have
    to be. Either way, this form of allocation beats any other scheme in
    performance by some margin (looking at allocation speed alone) and
    compensates to some extent for the extra cost incurred by the read and
    write barriers required in a generational or incremental scheme.
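    That allocation path can be sketched as follows (illustrative only, not Go's actual allocator): a bounds test plus a pointer bump, with per-core edens needing no atomics.

```go
package main

import "fmt"

// eden models a compacted young-generation allocation buffer. With one
// eden per core, alloc needs no synchronization at all.
type eden struct {
	buf []byte
	top int // next free offset
}

// alloc returns the offset of a new object of the given size, or -1 if
// the eden is exhausted (which would trigger a minor collection that
// evacuates survivors and resets top to 0).
func (e *eden) alloc(size int) int {
	if e.top+size > len(e.buf) { // the single test
		return -1
	}
	off := e.top
	e.top += size // the single increment
	return off
}

func main() {
	e := &eden{buf: make([]byte, 128)}
	fmt.Println(e.alloc(64)) // 0
	fmt.Println(e.alloc(64)) // 64
	fmt.Println(e.alloc(1))  // -1: eden exhausted
}
```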

    The edens are usually smallish (a few hundred KB) and can be collected
    and compacted in the order of a few ms. The collection time is proportional
    to the amount of data surviving - another important benefit. Very large
    objects and objects that don't contain pointers usually go elsewhere.

    As has been pointed out before, the main problem with compacting schemes
    for Go is the presence of interior pointers. It certainly can be done at
    the cost of extra memory (a simple approach would be to make every pointer
    a pointer to the base of an object, plus an offset). But perhaps there's a
    smarter approach (along the lines of computing the bases and offsets when
    garbage collecting only, and undoing it again afterwards).

    - gri

    On Sat, Oct 13, 2012 at 2:15 AM, Florian Weimer wrote:

    * Gil Tene:
    Starting with a precise and generational stop-the-world
    implementation that is robust is a must, and a good launching pad
    towards a concurrent compacting collector (which is what a
    "pauseless" collector must be in server-scale environments).
    I wonder why compaction is required. We've got many long-running
    processes which use explicit memory management, and fragmentation does
    not seem to be a huge issue there. Does something specific to garbage
    collection make matters worse?
  • Gil at Oct 14, 2012 at 12:01 am

    On Saturday, October 13, 2012 11:35:23 AM UTC-7, gri wrote:
    ... <stuff I agree with completely about generational GC efficiency>...

    As has been pointed out before, the main problem with compacting schemes
    for Go is the presence of interior pointers. It certainly can be done at
    the cost of extra memory (a simple approach would be to make every pointer
    a pointer to the base of an object, plus an offset). But perhaps there's a
    smarter approach (along the lines of computing the bases and offsets when
    garbage collecting only, and undoing it again afterwards).
    As I noted before, interior pointers are not really a material issue for
    moving collectors. They just add a bit of a wrinkle, and are one of those
    subjects not often discussed in academic GC work (neither are weak/soft
    refs, for that matter). There are "fairly efficient" ways of dealing with
    interior pointers and derived pointers (as long as we know about them).
    Many of those look a lot like the schemes generational collectors already
    use to manage remembered sets when they deal with card marking.

    When generational collectors use card marking to track potential roots into
    the young generation, they often use a single card entry to refer to the
    potential existence of a root reference in a block of heap addresses. E.g.
    in HotSpot, each card mark refers to a 512-byte block that must be scanned
    for references into the young gen. Whenever a reference is stored into the
    heap, an associated card is set (the card address is directly derived from
    the store-to address with a simple shift). "Dirty" cards are effectively
    imprecise interior pointers: locating references within the block they
    refer to requires backtracking to an object boundary and scanning forward
    from there. The exact same technique can be used to deal with actual,
    precise interior pointers provided by languages like Go and C#.
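    The backtrack-and-scan step can be sketched as follows. This is a toy model: real collectors derive object starts from per-block metadata rather than a flat sorted list, but the idea is the same - given an interior pointer, find the start of the object containing it.

```go
package main

import "fmt"

// objectBase returns the start offset of the object containing addr,
// given the sorted start offsets of all objects in the region, or -1
// if addr precedes the first object. A real collector would first back
// up to the 512-byte block boundary and consult per-block metadata for
// the first object start, then scan forward exactly like this loop.
func objectBase(addr int, starts []int) int {
	base := -1
	for _, s := range starts {
		if s > addr {
			break
		}
		base = s // last object start at or before addr
	}
	return base
}

func main() {
	starts := []int{0, 40, 520, 900}      // object start offsets
	fmt.Println(objectBase(560, starts))  // 520: interior ptr into 3rd object
	fmt.Println(objectBase(1000, starts)) // 900: interior ptr into last object
}
```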
  • Florian Weimer at Oct 14, 2012 at 10:28 am

    * Robert Griesemer:

    Without compaction, one major benefit of a generational collector
    goes away: fast allocation (besides, are there generational
    collectors w/o compaction? - I don't know).
    You can have size-specific processor-local allocation lists, which
    could have similar performance characteristics.

    Non-moving generational collectors exist. Lua 5.2 has one, and the
    Boehm-Demers-Weiser collector provides two generational modes (but
    both are used rarely).
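    Such size-specific allocation lists can be sketched as follows (illustrative, not any particular runtime's allocator): allocation pops a block from a per-size-class free list and freeing pushes one back; keeping one instance per processor would avoid locking, at the cost of forgoing compaction.

```go
package main

import "fmt"

// freeLists models a non-moving, size-class-based allocator: each
// power-of-two size class keeps its own list of free block offsets.
type freeLists struct {
	classes map[int][]int // size class -> free block offsets
}

// roundUp picks the next power-of-two size class (minimum 8 bytes).
func roundUp(n int) int {
	c := 8
	for c < n {
		c <<= 1
	}
	return c
}

// alloc pops a free block from the matching class, if any. A real
// allocator would refill an empty class from a central pool here.
func (f *freeLists) alloc(size int) (int, bool) {
	class := roundUp(size)
	list := f.classes[class]
	if len(list) == 0 {
		return 0, false
	}
	off := list[len(list)-1]
	f.classes[class] = list[:len(list)-1]
	return off, true
}

// free returns a block to its size class.
func (f *freeLists) free(off, size int) {
	class := roundUp(size)
	f.classes[class] = append(f.classes[class], off)
}

func main() {
	f := &freeLists{classes: map[int][]int{}}
	f.free(0, 24)          // a block in the 32-byte class, at offset 0
	off, ok := f.alloc(20) // 20 also rounds up to the 32-byte class
	fmt.Println(off, ok)   // 0 true
	_, ok = f.alloc(20)
	fmt.Println(ok) // false: class is now empty
}
```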

    I'm not disputing the potential usefulness of compaction. But I find
    it odd that large, long-running processes are apparently fine without
    it (perhaps because they use explicit memory management). And
    compaction certainly comes with costs: heavy write traffic during
    compactions, and the need to copy or pin objects when they are
    referenced from C.
  • Gil at Oct 15, 2012 at 1:35 am

    On Sunday, October 14, 2012 3:28:23 AM UTC-7, Florian Weimer wrote:

    I'm not disputing the potential usefulness of compaction. But I find
    it odd that large, long-running processes are apparently fine without
    it (perhaps because they use explicit memory management). And
    compaction certainly comes with costs: heavy write traffic during
    compactions, and the need to copy or pin objects when they are
    referenced from C.
    The notion that compaction is expensive or costly is a misconception:

    - For the same live set, compaction tends to be dramatically cheaper than
    marking. Think of it this way: one is a streaming operation, and one is a
    pointer chasing operation, and both visit the same data set. Compaction
    tends to take both fewer cycles and less memory bandwidth (because it tends
    to be very cache efficient). Our experience shows that marking tends to
    take ~5x the CPU work that compaction does for the exact same data set. C4
    is a pure Mark/Compact collector, so these stats stare us in the face every
    day.

    - For sparse heaps (which young generation heaps are, by definition),
    compaction is dramatically cheaper than sweeping would be (as in "copying
    2% of the heap away is way cheaper than sweeping 100% of the heap").

    What I'm pointing out here is that compaction is not only highly desirable
    from a defragmentation perspective - it is also more efficient from a CPU
    perspective compared to non-compacting alternatives. The remaining
    intuitive tendency to delay or avoid compaction comes from the wish to
    avoid large pauses, rather than from the wish to reduce CPU work. This
    intuition only applies to old generation collections, as copying pauses are
    shorter than sweeping pauses in young generation heaps. Concurrent
    compaction takes the reason away completely, for both generations. It's a
    win-win.
  • Minux at Oct 14, 2012 at 10:53 am

    On Sun, Oct 14, 2012 at 2:35 AM, Robert Griesemer wrote:

    As has been pointed out before, the main problem with compacting schemes
    for Go is the presence of interior pointers. It certainly can be done at
    the cost of extra memory (a simple approach would be to make every pointer
    a pointer to the base of an object, plus an offset). But perhaps there's a
    smarter approach (along the lines of computing the bases and offsets when
    garbage collecting only, and undoing it again afterwards).
    Without unsafe, pointers in Go objects can't move backward (decrease in
    value), so I think the interior pointer problem is not that big. (I think
    it only affects finding out the type of value a pointer points to.)

    For example, if I allocate a large []byte to p and then return
    p[someLargeNumber:] (and don't retain any references to the slice), I
    think it would be great if the memory used by p[0:someLargeNumber] could
    be collected. The same applies to strings, too.

    Also, for a pointer to a struct field, if it's the only reference to that
    whole struct, then the memory occupied by all the other fields could be
    collected. Yes, this is odd, but valid (albeit aggressive) GC behavior,
    right?

    I might have gotten something fundamentally wrong here; please correct me
    if that's the case, thank you.
  • Gil at Oct 15, 2012 at 1:19 am
    Thinking about freeing partial objects makes my head hurt. I suspect it may
    make other things hurt too.

    From a liveness semantic point of view, the simple, "makes sense" treatment
    of interior pointers is to think of them as keeping the object they point
    *into* reachable (and not just the field they point to). This keeps the
    problem within the classic heap GC realm, and simply requires a way to
    derive the object address from an interior pointer (something most
    generational collectors can already do due to card marking, see previous
    note).

    Even if there were some specific partial-freeing optimizations possible
    (e.g. through knowing that pointers only move forward, as mentioned below),
    I doubt much real-world gain would be had from applying them, at least not
    in a way that would be worth the complexity and correctness risk...
    On Sunday, October 14, 2012 3:53:40 AM UTC-7, minux wrote:


    On Sun, Oct 14, 2012 at 2:35 AM, Robert Griesemer wrote:
    As has been pointed out before, the main problem with compacting schemes
    for Go is the presence of interior pointers. It certainly can be done at
    the cost of extra memory (a simple approach would be to make every pointer
    a pointer to the base of an object, plus an offset). But perhaps there's a
    smarter approach (along the lines of computing the bases and offsets when
    garbage collecting only, and undoing it again afterwards).
    Without unsafe, pointers in Go objects can't move backward (decrease in
    value), so I think the interior pointer problem is not that big. (I think
    it only affects finding out the type of value a pointer points to.)

    For example, if I allocate a large []byte to p and then return
    p[someLargeNumber:] (and don't retain any references to the slice), I
    think it would be great if the memory used by p[0:someLargeNumber] could
    be collected. The same applies to strings, too.

    Also, for a pointer to a struct field, if it's the only reference to that
    whole struct, then the memory occupied by all the other fields could be
    collected. Yes, this is odd, but valid (albeit aggressive) GC behavior,
    right?

    I might have gotten something fundamentally wrong here; please correct me
    if that's the case, thank you.
  • Dmitry Vyukov at Oct 15, 2012 at 4:17 am

    On Monday, October 15, 2012 5:11:21 AM UTC+4, (unknown) wrote:
    Thinking about freeing partial objects makes my head hurt. I suspect it
    may make other things hurt too.

    From a liveness semantic point of view, the simple, "makes sense"
    treatment of interior pointers is to think of them as keeping the object
    they point *into* reachable (and not just the field they point to). This
    keeps the problem within the classic heap GC realm, and simply requires a
    way to derive the object address from an interior pointer (something most
    generational collectors can already do due to card marking, see previous
    note).

    Even if there were some specific partial-freeing optimizations possible
    (e.g. through knowing that pointers only move forward, as mentioned below),
    I doubt much real-world gain would be had from applying them, at least not
    in a way that would be worth the complexity and correctness risk...
    I've heard that a small substring can hold a huge string alive in Java, and
    sometimes it presents a serious problem (put a small substring of a big
    request into a persistent container). I understand that it's possible to
    resolve it manually by copying the string, but wouldn't it be nice to
    resolve it automatically?
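    The same effect is easy to demonstrate in Go today: a small sub-slice keeps its entire backing array reachable, and the manual workaround is exactly the copy mentioned above.

```go
package main

import "fmt"

func main() {
	big := make([]byte, 1<<20) // 1 MB buffer
	small := big[:16]          // retains the full 1 MB backing array:
	//                            cap(small) is still 1<<20

	// Manual fix: copy the 16 bytes into a fresh allocation so the
	// 1 MB array becomes unreachable and collectible.
	detached := make([]byte, 16)
	copy(detached, small)

	fmt.Println(len(small), cap(small))       // 16 1048576
	fmt.Println(len(detached), cap(detached)) // 16 16
}
```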



    On Sunday, October 14, 2012 3:53:40 AM UTC-7, minux wrote:

    On Sun, Oct 14, 2012 at 2:35 AM, Robert Griesemer wrote:

    As has been pointed out before, the main problem with compacting schemes
    for Go is the presence of interior pointers. It certainly can be done at
    the cost of extra memory (a simple approach would be to make every pointer
    a pointer to the base of an object, plus an offset). But perhaps there's a
    smarter approach (along the lines of computing the bases and offsets when
    garbage collecting only, and undoing it again afterwards).
    Without unsafe, pointers in Go objects can't move backward (decrease in
    value), so I think the interior pointer problem is not that big. (I think
    it only affects finding out the type of value a pointer points to.)

    For example, if I allocate a large []byte to p and then return
    p[someLargeNumber:] (and don't retain any references to the slice), I
    think it would be great if the memory used by p[0:someLargeNumber] could
    be collected. The same applies to strings, too.

    Also, for a pointer to a struct field, if it's the only reference to that
    whole struct, then the memory occupied by all the other fields could be
    collected. Yes, this is odd, but valid (albeit aggressive) GC behavior,
    right?

    I might have gotten something fundamentally wrong here; please correct me
    if that's the case, thank you.
  • Patrick Higgins at Oct 15, 2012 at 6:18 pm
    I am glad this topic is being discussed, as it was my earliest concern. The
    first Go code I read was the garbage collector, and I was impressed by how
    simple it was. I bought some books on garbage collection and considered
    implementing a generational collector for Go, but then recalled that in my
    experience with Java, heap size was limited to 2-4 GB because compaction
    pauses became too long for a web server above that. That seems
    ridiculously low on modern hardware. C4 is the only thing I have heard
    of that allows really large heaps to be used with acceptable pause
    times. Is anyone familiar with anything else?

    Anyway, thinking about the problem also made me realize that Java is able
    to allow different GC strategies to be selected because it is JIT
    compiled--read barriers and write barriers can be inserted if needed for
    the chosen strategy. Has anyone thought about how Go could support
    different strategies without requiring a full recompile of your app and all
    its dependencies?
    On Sunday, October 14, 2012 10:10:09 PM UTC-6, Dmitry Vyukov wrote:


    On Monday, October 15, 2012 5:11:21 AM UTC+4, (unknown) wrote:

    Thinking about freeing partial objects makes my head hurt. I suspect it
    may make other things hurt too.

    From a liveness semantic point of view, the simple, "makes sense"
    treatment of interior pointers is to think of them as keeping the object
    they point *into* reachable (and not just the field they point to). This
    keeps the problem within the classic heap GC realm, and simply requires a
    way to derive the object address from an interior pointer (something most
    generational collectors can already do due to card marking, see previous
    note).

    Even if there was some specific partial-freeing optimization possible
    (e.g. through knowing that pointers only move forward as mentioned below),
    I doubt much real-world gains would be had from applying them, at least not
    in a way that would be worth the complexity and correctness risk...
    I've heard that a small substring can hold a huge string alive in Java,
    and sometimes that represents a serious problem (e.g. putting a small
    substring of a big request into a persistent container). I understand
    that it's possible to resolve it manually by copying the string, but
    wouldn't it be nice to resolve it automatically?



  • Florian Weimer at Oct 15, 2012 at 7:11 pm

    This keeps the problem within the classic heap GC realm, and simply
    requires a way to derive the object address from an interior pointer
    (something most generational collectors can already do due to card
    marking, see previous note).
    I'm not sure if the problems are comparable. Marked cards are rare,
    and they can be processed in the order of increasing addresses.
    Neither applies to generic pointer traversal.
  • Ian Lance Taylor at Oct 15, 2012 at 4:06 am

    On Sat, Oct 13, 2012 at 11:35 AM, Robert Griesemer wrote:
    As has been pointed out before, the main problem with compacting schemes for
    Go is the presence of interior pointers.
    Another issue is pointers passed into C code via cgo, which according
    to the current rules can be collected but cannot be moved. I don't
    know how significant that would be in practice.

    Ian
  • Gil at Oct 13, 2012 at 11:22 pm

    On Saturday, October 13, 2012 2:16:01 AM UTC-7, Florian Weimer wrote:
    * Gil Tene:
    Starting with a precise and generational stop-the-world
    implementation that is robust is a must, and a good launching pad
    towards a concurrent compacting collector (which is what a
    "pauseless" collector must be in server-scale environments).
    I wonder why compaction is required. We've got many long-running
    processes which use explicit memory management, and fragmentation does
    not seem to be a huge issue there. Does something specific to garbage
    collection make matters worse?
    Robert has already made the good point of allocation speed: there is
    nothing faster than a bump-pointer allocator, and that is only possible
    with large contiguous empty regions that come from compaction. But the
    other, and probably more important reason for compaction being critical to
    generational collection is efficiency.

    Generational collection is nothing more than an efficiency trick, but it's
    a very powerful one (typically a 20:1 or better improvement in work
    expended compared to non-generational collection of the same workload). The
    efficiency of generational collection derives primarily from the
    *combination* of using a collector mechanism whose complexity is linear
    in the live set, *within* a heap region whose live set is known to be
    dramatically smaller than the heap size itself.

    The weak generational hypothesis gives us the latter: "most objects die
    young," which means that the vast majority of the contents of the young
    generation heap is dead, as long as we continue to promote long-living
    objects out of it. The former quality, though, only comes from variants of
    copying and mark/compact collectors (basically from moving collectors whose
    complexity is linear only in the live set); sweeping will basically kill
    it. In general, collectors that keep objects in place are forced to
    scan/sweep the entire collected heap for dead matter, which is then managed
    in free lists (or similar structures), while copying and mark/compact
    collectors move the entire live contents of a "from" space to a "to" space
    and thereby avoid ever looking at, or spending cycles on, dead matter.
    The complexity of non-moving collectors is linear (at least in large part)
    in the collected heap size, so using such collection techniques for young
    generation collection would dramatically grow the cost of performing
    generational collection. It would probably grow it to the point where
    generational collection itself would not be that big a benefit...

    There actually are (in academic work at least) some generational non-moving
    collectors, but I think it's no accident that we don't tend to find them in
    the wild. They end up removing most of the win...
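    The allocation-speed half of the argument above can be sketched with a toy
    bump-pointer "to" space (illustrative only; a real copying collector also
    evacuates live objects, which is omitted here):

```go
package main

import "fmt"

// space is a toy "to" space: allocation is a single pointer bump,
// which is what makes compacted heaps so cheap to allocate from.
type space struct {
	buf []byte
	top int
}

// alloc reserves n bytes, or returns nil when the space is full,
// which is the signal to collect: evacuate the live set into a
// fresh space, then reset this one.
func (s *space) alloc(n int) []byte {
	if s.top+n > len(s.buf) {
		return nil
	}
	b := s.buf[s.top : s.top+n]
	s.top += n
	return b
}

// reset reclaims everything at once; the cost is O(1), independent
// of how much dead matter the space held.
func (s *space) reset() { s.top = 0 }

func main() {
	s := &space{buf: make([]byte, 1024)}
	a := s.alloc(100)
	b := s.alloc(100)
	fmt.Println(len(a), len(b), s.top) // 100 100 200
	s.reset()
	fmt.Println(s.top) // 0
}
```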
  • Kyle Lemons at Oct 10, 2012 at 9:04 pm

    On Wed, Oct 10, 2012 at 12:28 PM, Elazar Leibovich wrote:

    I really think you underestimate the importance of pauseless GC.
    Already today, users of Go are having problems with Go at scale: YouTube's
    vitess pushes memory off-heap to memcached [1] because of GC issues.

    Another example is indexing all of Wikipedia in memory, which is simply
    impossible even with Java's excellent GC algorithms (5% of requests have
    a delay of ~5 seconds) [2].

    Yet another example is the HBase book, which recommends not using a heap
    larger than 16 GB, since otherwise GC pauses will make your peers think
    you're dead (this is what got me thinking).

    These issues are inherent to servers, since those typically have hundreds
    of gigabytes of memory, and the situation will probably get "worse" in
    the near future.

    This is not a question of scoring a little higher in some benchmark game;
    it is a question of whether Go will be able to solve significant
    engineering problems while still offering a fully garbage-collected
    environment. It could be the added value that allows Go to fight the
    "good enough" solutions.
    Go, the language, does not specify a particular garbage collection
    mechanism. It is also very young in the grand scheme of things. I would
    expect the scheduler and the garbage collector to be two active areas of
    development for some time to come.

    PS
    I don't work at Azul's marketing department. Honest.

    [1] http://code.google.com/p/vitess/wiki/ProjectGoals#Why_did_we_choose_go
    [2]
    http://blog.mikemccandless.com/2012/07/lucene-index-in-ram-with-azuls-zing-jvm.html

    On Wed, Oct 10, 2012 at 6:19 PM, Adam Langley wrote:
    On Wed, Oct 10, 2012 at 12:02 PM, Gil Tene wrote:
    The paper is an ISMM/ACM publication, and is freely available to ACM
    members... You can also find a copy on the Azul site at
    http://www.azulsystems.com/products/zing/c4-java-garbage-collector-wp
    (but it will want you to register).
    The tests were on a 2-socket, 12-core Intel X5680 server with 96 GB of
    memory. They spanned heap sizes from 3 to 35 GB.

    Azul's GC work is seriously impressive. To answer the original
    question, I think a GC of that quality would be very valuable in Go,
    but I don't believe that anyone is currently working on it. As Gil
    pointed out, it is a substantial endeavor.


    Cheers

    AGL
  • Ian Lance Taylor at Oct 10, 2012 at 6:21 pm

    On Wed, Oct 10, 2012 at 9:02 AM, Gil Tene wrote:
    Thanks for the info. Any interest in trying to implement this algorithm for Go?
    Unfortunately, we're a bit busy working on the Zing JVM and on OpenJDK stuff, and don't have cycles to put into Go... Getting a full-blown concurrent GC implemented in an actual runtime is typically a year+ effort, mostly dominated by runtime-related details and edge conditions (and not by the classic GC'ed heap issues). E.g. class loading/unloading and namespace management, generated code lifecycle management, lock lifecycle management, weak/soft/phantom reference management, etc.
    Several of those issues don't arise for Go, of course. Go does have a
    different issue: the language permits pointers to the interior of
    objects.

    Ian
  • Gil Tene at Oct 10, 2012 at 5:51 pm
    Refs to fields and array members are not a significant issue for most algorithms. They are just a twist that takes some extra work and some annoying backtracking to deal with. .Net has those too.

    The starting point is usually a precise, generational stop-the-world collector. Before a runtime (or language execution environment) reaches that level, talking about concurrency is usually premature. The burden of supporting precise GC, write barriers (needed for any generational collector), and read barriers (needed for practical concurrent collectors) falls mostly on the compilers and runtime. This is true regardless of whether JITs or pre-compilers are used; whether the runtime takes the form of libraries or more intricate other stuff is not really relevant there (meta-circular runtimes are probably the easiest to deal with, but those rarely appear in production yet). Reflection and introspection (I don't know enough about Go to tell whether they matter here) add their own silly twists, depending on how they are implemented.

    -- Gil.
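    The write barrier mentioned above, the piece every generational collector
    needs, can be sketched as a remembered set that records old-to-young
    pointer stores (a toy model; the `obj` type and generation tags are
    illustrative, not Go's actual runtime):

```go
package main

import "fmt"

// obj is a toy heap object; gen 0 marks the young generation.
type obj struct {
	gen int
	ref *obj
}

// remembered collects old objects that point into the young
// generation; a young-generation collection scans this set instead
// of the entire old heap.
var remembered = map[*obj]bool{}

// writeRef stands in for the barrier the compiler would emit around
// every pointer store: record old-to-young edges as they are created.
func writeRef(src, dst *obj) {
	src.ref = dst
	if src.gen > 0 && dst != nil && dst.gen == 0 {
		remembered[src] = true
	}
}

func main() {
	old := &obj{gen: 1}
	young := &obj{gen: 0}
	writeRef(old, young) // crosses generations: barrier records it
	writeRef(young, old) // young-to-old: nothing to remember
	fmt.Println(len(remembered))
}
```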
  • Moru0011 at Dec 25, 2013 at 5:50 am
    Having built large realtime data processing systems, I'd like to point out
    that from an industry perspective, GC is one of the most important issues,
    if not the most important.
    Especially in distributed, latency-sensitive, message-driven systems
    (multicast messaging dislikes GC), GC is the major issue when using Java.
    I'd say in many cases losing 30% performance to get a near-pauseless GC
    could be considered a good tradeoff.
    In fact, the major point favouring C++ over Java is GC, not performance
    (with modern class libs and smart pointers, C++ performance is frequently
    below Java's).
    I can't see Go squeezing in between C++ and Java without a significant
    improvement in GC. As Gil Tene said above, the later one starts addressing
    the issue, the more expensive it gets. It should be top priority.



    On Wednesday, October 10, 2012 at 11:33:04 AM UTC+2, Elazar Leibovich wrote:
    In certain use cases, Java GC pauses impose real problem in certain use
    cases, see e.g. [1] [2]. Azul systems have an interesting solution which
    sounds really appealing [3], I know they're selling VMs whose main benefit
    is a pauseless GC algorithm.

    Is Go considering similar approach? I'm have very little understanding in
    GC, but it looks like pauseless GC will be a unique selling point for Go,
    not offered by any other open source language.

    Are there plans to support such an algorithm?

    [1]
    http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
    [2] http://www.slideshare.net/cloudera/hbase-hug-presentation
    [3] http://static.usenix.org/events/vee05/full_papers/p46-click.pdf
    --

    ---
    You received this message because you are subscribed to the Google Groups "golang-dev" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/groups/opt_out.
  • ⚛ at Dec 25, 2013 at 7:39 am
    The incomplete Tulgo compiler is aiming for immediate deallocation of
    objects that aren't forming a cycle. Tulgo uses reference counting as the
    starting point for implementing the garbage collector, while the standard
    Go compiler chose the stop-the-world garbage collector as the starting
    point.
  • Elazar Leibovich at Dec 25, 2013 at 7:44 am
    Won't cycle breaking still cause stop-the-world pauses? You still need to
    scan the entire heap from time to time.

  • ⚛ at Dec 25, 2013 at 8:06 am

    On Wednesday, December 25, 2013 8:44:49 AM UTC+1, Elazar Leibovich wrote:
    Won't cycle breaking still cause stop-the-world pauses? You still need to
    scan the entire heap from time to time.
    I do not want to comment on this now because the Tul compiler isn't there
    yet. Garbage collection experience with Tul will be different from the
    experience people have with other compilers.

    I imagine that (if the Tulgo compiler is ever finished) the biggest
    obstacle for use by people other than myself will be how to gain access to
    the compiler.


  • Rüdiger Möller at Dec 25, 2013 at 3:06 pm
    The issues with GC have persisted since Smalltalk's days. I think one
    should explore opportunities to relax the burden put onto garbage
    collectors by trying something in between fully automated GC and
    malloc/free allocation. Having tested Azul's C4, I can confirm it is
    really impressive; it even showed me some program-induced latencies which
    we had attributed to GC before :-).
    However, it seems to be very complex to implement; I always get a
    headache when Gil starts going into the details of concurrent GC in a
    concurrent system :-).

    As memory has become very cheap, heaps beyond 64 GB will become the norm.

    What are we doing in Java to deal with that?
    In many cases it is easy to identify the data structures which are
    semi-static and make up the bulk of heap consumption. In Java we put this
    stuff 'off-heap' (e.g. fast-serialization structs) and access it via a
    transaction-internal API facade. This memory is then managed manually;
    however, that is not that big of a problem, since stuff is copied to the
    "GC'ed" heap when needed (or referenced, in simple cases). So we have a
    two-heap model, where the dynamic data (frequently <500 MB) is GC-managed
    and the semi-static data is managed manually. However, dealing with
    off-heap memory in Java is a big hack.

    I could imagine introducing a dual-heap model at the language level.
    Objects living in the manual heap would be forbidden to reference objects
    in the GC'ed heap (this should be checked at runtime). The application
    would need to deep-copy an object to move it from the manual to the GC'ed
    heap and vice versa. Objects in the GC'ed heap could be allowed to
    contain references to the manual heap, since we'd like to contain the
    size of the memory being scanned.

    This way one can still write fully GC'ed applications, but there is an
    escape hatch in case you have to deal with 80 GB of reference data :-).
    For extreme realtime requirements, parts of an application can choose to
    stay completely GC-free.

    Regarding reference counting: AFAIK locality suffers a lot because each
    store touches two memory regions, so the performance degradation will be
    pretty hard due to cache misses.
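    The dual-heap discipline above can be approximated in today's Go with a
    pointer-free arena addressed by offsets, whose contents the GC never has
    to scan, with deep copies at the boundary (a toy sketch; the `arena` type
    and its API are illustrative):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// arena is a toy "manual heap": a flat byte buffer addressed by
// offsets instead of pointers. Because it contains no pointers, the
// GC never scans its contents, regardless of how large it grows.
type arena struct {
	buf []byte
}

// put deep-copies a value into the arena and returns its offset,
// the manual-heap equivalent of a reference.
func (a *arena) put(v uint64) int {
	off := len(a.buf)
	var tmp [8]byte
	binary.LittleEndian.PutUint64(tmp[:], v)
	a.buf = append(a.buf, tmp[:]...)
	return off
}

// get deep-copies a value back out into GC-managed memory, the only
// way values are allowed to cross the heap boundary in this model.
func (a *arena) get(off int) uint64 {
	return binary.LittleEndian.Uint64(a.buf[off : off+8])
}

func main() {
	a := &arena{}
	off := a.put(42)
	fmt.Println(a.get(off)) // 42
}
```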





    2013/12/25 ⚛ <0xe2.0x9a.0x9b@gmail.com>
    On Wednesday, December 25, 2013 8:44:49 AM UTC+1, Elazar Leibovich wrote:

    Won't cycle breaking still cause stop-the-world pauses? You still need to
    scan the entire heap from time to time.
    I do not want to comment on this now because the Tul compiler isn't there
    yet. Garbage collection experience with Tul will be different from the
    experience people have with other compilers.

    I imagine that (if the Tulgo compiler is ever finished) the biggest
    obstacle for use by people other than myself will be how to gain access to
    the compiler.

  • 2paul De at Dec 26, 2013 at 7:41 pm

    On Tuesday, December 24, 2013 11:02:33 PM UTC+1, Rüdiger Möller wrote:
    Having built large realtime data processing systems, i'd like to point out
    that from a industry perspective, GC is one of the most important issues if
    not the most important.
    Especially in distributed, latency sensitive message driven systems
    (multicast messaging dislikes GC), GC is the major issue when using Java.
    I'd say in many cases losing 30% performance to get a near pauseless GC
    could be considered a good tradeoff.
    In fact the major point in favouring C++ over Java is GC, not performance
    (with modern class libs and smart pointers, C++ performance is below Java
    frequently).
    I can't see GO squeeze inbetween C++ and Java without a significant
    improvement in GC. As Gil Tene said above, the later one starts adressing
    the issue, the more expensive it gets. Should be top priority
    As far as priorities go, have you read this thread? "Go compiler and
    other tools written in Go":
    https://groups.google.com/forum/#!searchin/golang-dev/russ$20cox/golang-dev/M_FU0599nZ0/0aLTwyEuY1sJ


    It looks to me like the priorities for the next 12 months have already
    been set, with sufficient ambition. Since it is planned that the code
    will be refactored and restructured into idiomatic Go in this process, I
    suspect that there may be openings (I am just speculating) for improving
    and innovating in storage management as well. At least I think it's a
    good opportunity to get involved.

    Frankly, a 30% performance hit looks like a very tough sale to me. I have
    come to understand that maintenance is also a big consideration.
  • Dave Cheney at Dec 26, 2013 at 10:19 pm

    On 27 Dec 2013, at 6:41, 2paul.de@googlemail.com wrote:


    On Tuesday, December 24, 2013 11:02:33 PM UTC+1, Rüdiger Möller wrote:
    Having built large realtime data processing systems, I'd like to point out that from an industry perspective, GC is one of the most important issues, if not the most important.
    Especially in distributed, latency-sensitive, message-driven systems (multicast messaging dislikes GC), GC is the major issue when using Java.
    I'd say in many cases losing 30% performance to get a near-pauseless GC could be considered a good tradeoff.
    In fact the major point favouring C++ over Java is GC, not performance (with modern class libs and smart pointers, C++ performance is frequently below Java's).
    I can't see Go squeezing in between C++ and Java without a significant improvement in GC. As Gil Tene said above, the later one starts addressing the issue, the more expensive it gets. It should be a top priority.
    As far as priorities go, have you read this thread?: Go compiler and other tools written in Go
    https://groups.google.com/forum/#!searchin/golang-dev/russ$20cox/golang-dev/M_FU0599nZ0/0aLTwyEuY1sJ

    It looks to me like the priorities for the next 12 months have already been set with sufficient ambition. Since in this process it is planned that the code will be refactored and restructured into idiomatic Go, I suspect that there may be openings (I am just speculating) for improving and innovating storage management as well. At least I think it's a good opportunity to get involved.
    I cannot think of any other runtime, save maybe C with LD_PRELOAD hacks, that lets you interchange your GC algorithm like the JVM does. Pluggable GC should be viewed as an outlier, one that has generated its own subspecialty of tooling and consulting, rather than the goal of every language runtime.
    Frankly, a 30% performance hit looks like a very tough sell to me. I have come to understand that maintenance is also a big consideration.
    I agree. Gil is amazingly important and Azul solves problems for people who have no other choice, but I think it is still preferable to avoid creating the garbage in the first place, which is what Go allows you to do if you are careful.
    Am Mittwoch, 10. Oktober 2012 11:33:04 UTC+2 schrieb Elazar Leibovich:
    In certain use cases, Java GC pauses impose real problems, see e.g. [1] [2]. Azul Systems has an interesting solution which sounds really appealing [3]; I know they're selling VMs whose main benefit is a pauseless GC algorithm.

    Is Go considering a similar approach? I have very little understanding of GC, but it looks like a pauseless GC would be a unique selling point for Go, not offered by any other open source language.

    Are there plans to support such an algorithm?

    [1] http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
    [2] http://www.slideshare.net/cloudera/hbase-hug-presentation
    [3] http://static.usenix.org/events/vee05/full_papers/p46-click.pdf
  • Rüdiger Möller at Dec 28, 2013 at 2:06 pm
    "
    but I think it is still preferable to avoid creating the garbage in the
    first place, which is what Go allows you to do if you are careful.
    "

    That does not help at all if you have 64 GB of data allocated (which is
    only 25%-50% of a modern server's available memory). At some point there
    will be a GC, and it will stop your Go/Java program for more than 20-40 seconds
    with a 64 GB heap. (Full) GC duration depends on the heap size, not
    the amount of garbage created.

    There is an underlying assumption in all GC implementations that the size
    of available memory scales proportionally with memory traversal speed. But
    this does not hold true. Generational GC and/or a careful programming style
    reduce the frequency, but not the duration, of full GCs. Even fully concurrent Azul-style
    collection suffers, because the more memory size and traversal speed
    diverge, the more buffer space is required, so it might easily require more
    than 3 times the heap size compared to the actual data size.

    I really think a two-heap memory model is urgently needed. It would just
    generalize the workarounds currently being made.

  • Dave Cheney at Dec 28, 2013 at 2:08 pm

    On Sun, Dec 29, 2013 at 1:06 AM, Rüdiger Möller wrote:
    "
    but I think it is still preferable to avoid creating the garbage in the
    first place, which is what Go allows you to do if you are careful.
    "

    That does not help at all if you have 64 GB of data allocated (which is
    only 25%-50% of a modern server's available memory). At some point there
    will be a GC, and it will stop your Go/Java program for more than 20-40
    seconds with a 64 GB heap. (Full) GC duration depends on the heap size, not
    the amount of garbage created.

    There is an underlying assumption in all GC implementations that the size
    of available memory scales proportionally with memory traversal speed. But this
    does not hold true. Generational GC and/or a careful programming style reduce
    the frequency, but not the duration, of full GCs. Even fully concurrent Azul-style
    collection suffers, because the more memory size and traversal speed diverge,
    the more buffer space is required, so it might easily require more than 3 times
    the heap size compared to the actual data size.

    I really think a two-heap memory model is urgently needed. It would just
    generalize the workarounds currently being made.
    I think a model that allows you to store values outside the heap is
    what is needed.
  • ⚛ at Dec 28, 2013 at 3:43 pm
    I find it hard to reason about this without having actual example code.

    If it is possible to reduce your application's code to no more than 100
    lines of Go code demonstrating the core problem, please send the code to
    golang-dev. I will try to analyse it and add it to the Tul compiler as
    a benchmark. The code should import standard Go packages only.
    On Saturday, December 28, 2013 3:06:52 PM UTC+1, Rüdiger Möller wrote:

    "
    but I think it is still preferable to avoid creating the garbage in the
    first place, which is what Go allows you to do if you are careful.
    "

    That does not help at all if you have 64 GB of data allocated (which is
    only 25%-50% of a modern server's available memory). At some point there
    will be a GC, and it will stop your Go/Java program for more than 20-40
    seconds with a 64 GB heap. (Full) GC duration depends on the heap size, not
    the amount of garbage created.

    There is an underlying assumption in all GC implementations that the size
    of available memory scales proportionally with memory traversal speed. But
    this does not hold true. Generational GC and/or a careful programming style
    reduce the frequency, but not the duration, of full GCs. Even fully concurrent Azul-style
    collection suffers, because the more memory size and traversal speed
    diverge, the more buffer space is required, so it might easily require more
    than 3 times the heap size compared to the actual data size.

    I really think a two-heap memory model is urgently needed. It would just
    generalize the workarounds currently being made.
  • Rüdiger Möller at Dec 28, 2013 at 2:13 pm
    "store values outside heap"

    Read my post; that's what I proposed. However, why abandon a language's
    object model to store values off-heap? By adding a second manually
    managed heap with proper migration semantics (GC heap <=> manually managed
    heap), use of off-heap memory could be much smoother and faster than with
    in-memory database/cache workarounds. It would still be possible to access
    off-heap objects directly from the language.

  • Dmitry Vyukov at Dec 28, 2013 at 2:25 pm

    On Sat, Dec 28, 2013 at 6:13 PM, Rüdiger Möller wrote:
    "store values outside heap"

    Read my post; that's what I proposed. However, why abandon a language's
    object model to store values off-heap? By adding a second manually managed
    heap with proper migration semantics (GC heap <=> manually managed heap), use
    of off-heap memory could be much smoother and faster than with in-memory
    database/cache workarounds. It would still be possible to access off-heap
    objects directly from the language.

    It's already available in Go.
    If you put your big data in big slices of structs without pointers, then
    the GC won't scan those slices at all. This allows you to have a 128 GB
    heap, but the GC will work as if the heap were 128 MB.

  • Rüdiger Möller at Dec 28, 2013 at 2:43 pm

    It's already available in Go.
    If you put your big data in big slices of structs without pointers, then
    the GC won't scan those slices at all. This allows you to have a 128 GB
    heap, but the GC will work as if the heap were 128 MB.
    That's nice, but it still prevents the use of common language structures like
    hashmaps etc. Adding a second "manual heap", then disallowing pointers from the
    manual to the GC'ed heap (and vice versa), would allow nearly uniform reuse of
    classes and data structures whether they are allocated in the GC'ed heap or in
    the "manual" heap. This could also open up really convenient data management
    down the memory hierarchy (e.g. persisting object graphs to SSD or HD).
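    One way to approximate such a "manual heap" in today's Go is to keep records in a single pointer-free slice and replace pointers with integer handles; the `arena`, `handle`, and `entry` names below are hypothetical illustrations, not an existing API or the feature proposed here. The collector then has nothing to trace inside the structure:

```go
package main

import "fmt"

// handle is an index into the arena, standing in for a pointer.
type handle int32

// entry holds no real pointers; next is an index (-1 means none),
// so the GC never has to trace chains of entries.
type entry struct {
	key  uint64
	next handle
	val  uint64
}

// arena is the "manual heap": one flat, pointer-free slice.
type arena struct{ entries []entry }

// alloc appends an entry and returns its handle.
func (a *arena) alloc(e entry) handle {
	a.entries = append(a.entries, e)
	return handle(len(a.entries) - 1)
}

func main() {
	a := &arena{}
	h1 := a.alloc(entry{key: 1, next: -1, val: 100})
	h2 := a.alloc(entry{key: 2, next: h1, val: 200})
	// Follow the "pointer" chain via indices.
	for h := h2; h != -1; h = a.entries[h].next {
		fmt.Println(a.entries[h].key, a.entries[h].val)
	}
	// prints:
	// 2 200
	// 1 100
}
```

    This is essentially the workaround the proposal would generalize: hashmaps and graphs can be rebuilt on handles, at the cost of giving up the ordinary pointer-based object model.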


Discussion Overview
group: golang-dev
categories: go
posted: Oct 10, '12 at 9:33a
active: Dec 28, '13 at 3:43p
posts: 40
users: 15
website: golang.org
