FAQ
I must create a lot of objects.
My concern is to ease the work of the garbage collector, to avoid frequent
and long pauses.

What is your opinion about the following sentences ?

- Instead of creating 100'000 objects of type T separately, is it better to
create one big array of 100'000 elements [100000]T ?

- Each of these objects contains 8 pointers *T to other objects. Is it
better to replace these pointers *T by indices (int) into the array of
objects ?

- The frequency of the GC is proportional to the garbage creation rate ?

- The GC duration is proportional to the total number of pointers used in
all live objects (mark phase), and the number of objects to free (sweep
phase) ?

- Each time a pointer is updated by the Go code, is it true that the
garbage collector must do some processing ?

- What is the most important, finally:
     - avoid creating garbage ?
     - avoid creating too many objects ?
     - avoid creating too many pointers ?
     - avoid code that update pointers with high frequency ?

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Search Discussions

  • Daniel Skinner at Oct 26, 2014 at 9:17 pm
    Certainly, someone more knowledgeable will comment on these things but in
    my experience what you mention is correct though I was working in the area
    of millions of objects that contained pointers to tens of millions of
    objects. I would suggest writing benchmarks to see if you should even care
    though. In my case, I was wanting to recreate these large structures many
    times per second and the related calculations were cheap enough that the GC
    overhead wasn't really call for. Also, I found it more fitting to provide a
    copy of an object since the original data structure that I was working with
    could change so this made sense for me as well.

    So in all, benchmark, see if you really need to be concerned about this.
    Profiling your tests can be really helpful here as well.
    On Sunday, October 26, 2014 3:48:07 PM UTC-5, nicolas...@gmail.com wrote:

    I must create a lot of objects.
    My concern is to ease the work of the garbage collector, to avoid frequent
    and long pauses.

    What is your opinion about the following sentences ?

    - Instead of creating 100'000 objects of type T separately, is it better
    to create one big array of 100'000 elements [100000]T ?

    - Each of these objects contains 8 pointers *T to other objects. Is it
    better to replace these pointers *T by indices (int) into the array of
    objects ?

    - The frequency of the GC is proportional to the garbage creation rate ?

    - The GC duration is proportional to the total number of pointers used in
    all live objects (mark phase), and the number of objects to free (sweep
    phase) ?

    - Each time a pointer is updated by the Go code, is it true that the
    garbage collector must do some processing ?

    - What is the most important, finally:
    - avoid creating garbage ?
    - avoid creating too many objects ?
    - avoid creating too many pointers ?
    - avoid code that update pointers with high frequency ?
    --
    You received this message because you are subscribed to the Google Groups "golang-nuts" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/d/optout.
  • Tahir at Oct 26, 2014 at 9:22 pm
    How do you discard your objects ? Can two objects T hold the same
    references (at least partially) ?

    What is your cache line(s) size ?
    On Sunday, October 26, 2014 8:48:07 PM UTC, nicolas...@gmail.com wrote:

    I must create a lot of objects.
    My concern is to ease the work of the garbage collector, to avoid frequent
    and long pauses.

    What is your opinion about the following sentences ?

    - Instead of creating 100'000 objects of type T separately, is it better
    to create one big array of 100'000 elements [100000]T ?

    - Each of these objects contains 8 pointers *T to other objects. Is it
    better to replace these pointers *T by indices (int) into the array of
    objects ?

    - The frequency of the GC is proportional to the garbage creation rate ?

    - The GC duration is proportional to the total number of pointers used in
    all live objects (mark phase), and the number of objects to free (sweep
    phase) ?

    - Each time a pointer is updated by the Go code, is it true that the
    garbage collector must do some processing ?

    - What is the most important, finally:
    - avoid creating garbage ?
    - avoid creating too many objects ?
    - avoid creating too many pointers ?
    - avoid code that update pointers with high frequency ?
    --
    You received this message because you are subscribed to the Google Groups "golang-nuts" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/d/optout.
  • Tahir at Oct 26, 2014 at 9:43 pm
    Btw, it seems like you could have :

    type T [8]int32

    and define L:= [100000]T{} and T would just be a permutation of 8 integers
    between 0 and 100,000.

    Maybe your example is a bit too contrived or I am mistaken.
    On Sunday, October 26, 2014 8:48:07 PM UTC, nicolas...@gmail.com wrote:

    I must create a lot of objects.
    My concern is to ease the work of the garbage collector, to avoid frequent
    and long pauses.

    What is your opinion about the following sentences ?

    - Instead of creating 100'000 objects of type T separately, is it better
    to create one big array of 100'000 elements [100000]T ?

    - Each of these objects contains 8 pointers *T to other objects. Is it
    better to replace these pointers *T by indices (int) into the array of
    objects ?

    - The frequency of the GC is proportional to the garbage creation rate ?

    - The GC duration is proportional to the total number of pointers used in
    all live objects (mark phase), and the number of objects to free (sweep
    phase) ?

    - Each time a pointer is updated by the Go code, is it true that the
    garbage collector must do some processing ?

    - What is the most important, finally:
    - avoid creating garbage ?
    - avoid creating too many objects ?
    - avoid creating too many pointers ?
    - avoid code that update pointers with high frequency ?
    --
    You received this message because you are subscribed to the Google Groups "golang-nuts" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/d/optout.
  • Tahir at Oct 26, 2014 at 9:52 pm
    *between 0 and 99,999* (wooops)
    On Sunday, October 26, 2014 9:43:53 PM UTC, Tahir wrote:

    Btw, it seems like you could have :

    type T [8]int32

    and define L:= [100000]T{} and T would just be a permutation of 8 integers
    between 0 and 100,000.

    Maybe your example is a bit too contrived or I am mistaken.
    On Sunday, October 26, 2014 8:48:07 PM UTC, nicolas...@gmail.com wrote:

    I must create a lot of objects.
    My concern is to ease the work of the garbage collector, to avoid
    frequent and long pauses.

    What is your opinion about the following sentences ?

    - Instead of creating 100'000 objects of type T separately, is it better
    to create one big array of 100'000 elements [100000]T ?

    - Each of these objects contains 8 pointers *T to other objects. Is it
    better to replace these pointers *T by indices (int) into the array of
    objects ?

    - The frequency of the GC is proportional to the garbage creation rate ?

    - The GC duration is proportional to the total number of pointers used in
    all live objects (mark phase), and the number of objects to free (sweep
    phase) ?

    - Each time a pointer is updated by the Go code, is it true that the
    garbage collector must do some processing ?

    - What is the most important, finally:
    - avoid creating garbage ?
    - avoid creating too many objects ?
    - avoid creating too many pointers ?
    - avoid code that update pointers with high frequency ?
    --
    You received this message because you are subscribed to the Google Groups "golang-nuts" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/d/optout.
  • Andrew Gerrand at Oct 26, 2014 at 10:01 pm

    On Mon Oct 27 2014 at 7:48:12 AM wrote:

    I must create a lot of objects.
    My concern is to ease the work of the garbage collector, to avoid frequent
    and long pauses.

    What is your opinion about the following sentences ?

    - Instead of creating 100'000 objects of type T separately, is it better
    to create one big array of 100'000 elements [100000]T ?
    This might help, because the GC can more easily see that whole range of
    values is still in use.

    - Each of these objects contains 8 pointers *T to other objects. Is it
    better to replace these pointers *T by indices (int) into the array of
    objects ?
    Probably not if you're already using the above trick, but benchmarking and
    profiling will tell the whole story.

    - The frequency of the GC is proportional to the garbage creation rate ?
    GC is triggered by allocations (when the heap size reaches some percentage
    of the heap when the GC was last run). If your program is not allocating
    then the GC will not run.

    - The GC duration is proportional to the total number of pointers used in
    all live objects (mark phase), and the number of objects to free (sweep
    phase) ?
    This is true, I believe.

    - Each time a pointer is updated by the Go code, is it true that the
    garbage collector must do some processing ?
    What do you mean by updating a pointer?

    - What is the most important, finally:
    - avoid creating garbage ?
    The most important, as if you're not allocating then you won't trigger the
    GC.

    - avoid creating too many objects ?
    Also important, as GC time is proportional to the number of objects in the
    heap.

    - avoid creating too many pointers ?
    Not so relevant.

    - avoid code that update pointers with high frequency ?
    Not sure what this means.

    Andrew

    --
    You received this message because you are subscribed to the Google Groups "golang-nuts" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/d/optout.
  • Nicolas Riesch at Oct 26, 2014 at 11:31 pm
    - Each time a pointer is updated by the Go code, is it true that the
    garbage collector must do some processing ?
    What do you mean by updating a pointer?
    I am sorry, I realize my assumption was incorrect.
    I thought that Go 1.3 has an incremental GC. It is precise, but not
    incremental yet.

    The problem that some work must be done each time a pointer is updated
    exists only with incremental GC, as indicated in the document below.
    But the impact of this "write barrier work" should not be significant,
    after all...

    Go 1.4+ Garbage Collection (GC) Plan and Roadmap

    https://docs.google.com/document/d/16Y4IsnNRCN43Mx0NZc5YXZLovrHvvLhK_h0KN8woTO4/edit

         "In the sometimes redundant jargon of GC what we propose for 1.5 is a
    non-generational, non-moving, concurrent, tri-color, mark and sweep
    collector.
          The mark phase establishes which objects are reachable.
          The sweep phase establishes which objects are unreachable. Both will
    be able to run concurrently with the mutators.
          Of course achieving low latency will impact the throughput of the
    mutators.
          In particular the mutators will have to monitor whether the mark phase
    is active and* if so perform some write barrier work when mutating
    pointers.*
          Furthermore each mutator will occasionally be asked to suspend and
    scan its stack.
          Similar to what we do today, mutator allocation may require the
    mutator doing sweep work on behalf of the GC.
         "


    --
    You received this message because you are subscribed to the Google Groups "golang-nuts" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/d/optout.
  • Nicolas Riesch at Oct 27, 2014 at 12:03 am
    @Daniel Skinner:

    By curiosity, did you create these tens of millions of objects from scratch
    many times per second, with acceptable performance ?
    Or did you put them in a cache for reuse ?

    --
    You received this message because you are subscribed to the Google Groups "golang-nuts" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/d/optout.
  • Daniel Skinner at Oct 27, 2014 at 2:57 am
    Well, yes I created them each second (but not the tens of millions ones),
    and no, the performance didn't permit what I mispoke before in the tens of
    millions (sorry!). You can see for yourself the project here:

    https://github.com/dskinner/htm

    A level 11 subdivided sphere takes about 1.4 seconds on my laptop and would
    produce roughly 4.2 million nodes if I'm remembering my math right :) but
    then there is also a data structure that tracks edges during subdivision
    for reuse later which produces quite a bit of data on top of that (the tens
    of millions).

    I doubt I could squeeze much more performance out of it that would be
    significant with the current approach, but in practice, the GC overhead was
    shy of 35% when I used pointers and dropping pointers allowed me to achieve
    what I was after at the time which was partial subdivisions down to level
    20 at about 30 frames per second.

    I don't recall how much data that was producing though but I'd ball park it
    in the 60k range of nodes in the tree plus accompanying edges.
    Or did you put them in a cache for reuse ?
    what I'm doing could very easily be cached to disk but I'm not doing that.
    On Sunday, October 26, 2014 7:03:24 PM UTC-5, nicolas...@gmail.com wrote:

    @Daniel Skinner:

    By curiosity, did you create these tens of millions of objects from
    scratch many times per second, with acceptable performance ?
    Or did you put them in a cache for reuse ?
    --
    You received this message because you are subscribed to the Google Groups "golang-nuts" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/d/optout.
  • Tahir at Oct 27, 2014 at 6:53 am
    I think he might have meant using a Pool of objects as in sync/Pool for
    caching.

    Might be my fault because I asked about the processor cache line size. This
    is a tiny bit orthogonal to the topic of GC but since he mentioned having a
    big array of objects it made me think that avoiding false sharing could be
    something to be kept in mind too.
    Still that's an issue that is only relevant if multiple processors are in
    play and I suspect that having exactly 8 pointers in T is an informed
    choice.

    (I still think that with the datastructure you presented, you could forego
    the use of pointers completely. Btw I've been bugging, T would be an array
    of uint64 not int32)

    Note that with `potentially` so many cross referencing objects, not sure
    any of them ever becomes garbage.

    --
    You received this message because you are subscribed to the Google Groups "golang-nuts" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/d/optout.
  • Daniel Skinner at Oct 27, 2014 at 1:27 pm
    I think he might have meant using a Pool of objects as in sync/Pool for
    caching

    Ah, well I did try that out but it didn't help my particular situation.

    Not to hijack the thread but to reiterate benchmarks (and the intel link
    from Dmitry is a great resource), for my case i found two different
    approaches for growing storage of my quad tree and my edges worked best.

    As new nodes for my tree are created, simply appending those to a slice
    instead of trying to allocate space up front worked best for the general
    case.

    As new edges are created, I guess due to the sheer number, I found that
    using an approach similar to bytes.Buffer for pre-allocating space plus
    growing the capacity by *= 2 provided significant speed improvements.

    Now, my code for edges as-is actually requires this due to how it's used,
    but before it did I found this rather curious and the benchmarks and
    profiles helped keep me from unnecessarily complicating code.
    On Mon, Oct 27, 2014 at 1:53 AM, Tahir wrote:

    I think he might have meant using a Pool of objects as in sync/Pool for
    caching.

    Might be my fault because I asked about the processor cache line size.
    This is a tiny bit orthogonal to the topic of GC but since he mentioned
    having a big array of objects it made me think that avoiding false sharing
    could be something to be kept in mind too.
    Still that's an issue that is only relevant if multiple processors are in
    play and I suspect that having exactly 8 pointers in T is an informed
    choice.

    (I still think that with the datastructure you presented, you could forego
    the use of pointers completely. Btw I've been bugging, T would be an array
    of uint64 not int32)

    Note that with `potentially` so many cross referencing objects, not sure
    any of them ever becomes garbage.

    --
    You received this message because you are subscribed to the Google Groups
    "golang-nuts" group.
    To unsubscribe from this group and stop receiving emails from it, send an
    email to golang-nuts+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/d/optout.
    --
    You received this message because you are subscribed to the Google Groups "golang-nuts" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/d/optout.
  • Dmitry Vyukov at Oct 27, 2014 at 8:05 am

    On Mon, Oct 27, 2014 at 12:48 AM, wrote:
    I must create a lot of objects.
    My concern is to ease the work of the garbage collector, to avoid frequent
    and long pauses.

    What is your opinion about the following sentences ?

    - Instead of creating 100'000 objects of type T separately, is it better to
    create one big array of 100'000 elements [100000]T ?
    Most likely yes. But note that in this case scanning of the array
    won't be parallelized. It's also highly tied to the next point.
    - Each of these objects contains 8 pointers *T to other objects. Is it
    better to replace these pointers *T by indices (int) into the array of
    objects ?
    There is a radical difference between an object with pointer and an
    objects w/o pointers at all. The latter is extremely efficient for GC.
    So if these pointers are the only pointers in the object, then, yes,
    it's worth doing if GC is a problem.

    - The frequency of the GC is proportional to the garbage creation rate ? Yes.
    - The GC duration is proportional to the total number of pointers used in
    all live objects (mark phase), and the number of objects to free (sweep
    phase) ?
    Depending on what you mean by GC duration. Check out:
    https://software.intel.com/en-us/blogs/2014/05/10/debugging-performance-issues-in-go-programs
    (Garbage Collector Trace section).



    - Each time a pointer is updated by the Go code, is it true that the garbage
    collector must do some processing ?
    This is currently not true. But will be true in future releases. Also
    usually the additional work must be done only if GC is currently in
    progress.
    - What is the most important, finally:
    - avoid creating garbage ?
    - avoid creating too many objects ?
    - avoid creating too many pointers ?
    - avoid code that update pointers with high frequency ?

    Creating less garbage for less frequent GCs.
    Preferring larger objects w/o pointers for faster GC.

    --
    You received this message because you are subscribed to the Google Groups "golang-nuts" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/d/optout.
  • Frank Schröder at Oct 27, 2014 at 8:15 am
    One of the things that helped us was to replace arrays of pointers with
    arrays of structs, i.e. lotsOfTs []T instead of lotsOfTs []*T

    It depends on how volatile your objects are, whether you delete them often.
    You might need to use a free list for managing the deleted objects.

    In our case that is not the case we don't have to delete individual
    objects. We can always update the objects content or replace the whole
    array.

    Frank

    --
    You received this message because you are subscribed to the Google Groups "golang-nuts" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/d/optout.

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupgolang-nuts @
categoriesgo
postedOct 26, '14 at 8:48p
activeOct 27, '14 at 1:27p
posts13
users6
websitegolang.org

People

Translate

site design / logo © 2021 Grokbase