On Saturday, November 17, 2012 2:06:59 PM UTC+1, Dmitry Vyukov wrote:On Sat, Nov 17, 2012 at 4:50 PM, ⚛ <0xe2.0x...@gmail.com <javascript:>>wrote:
On Saturday, November 17, 2012 12:49:40 PM UTC+1, Dmitry Vyukov wrote:
I am curious have you considered reordering fields in structs so that
pointers are packed together? I understand that it's impossible in general
case (when there are sub-structs), but in otherwise I think it's OK to
arbitrary reorder fields. Then you can say in the metainfo -- in this
object of size 128 scan only 4 words. This is, of course, complicates
things.
There is cgo so reordering fields in general is forbidden. Also the
runtime is sharing certain Go types and C types.
In my opinion, if GC already knows where the pointer fields are in the
struct (knows their byte-offsets in the struct) then reordering the fields
will not improve performance.
One reason for this is that in any case the GC needs to check the value
of every pointer in the struct. The performance gain obtained by reordering
fields seems negligible in comparison to checking the values.
A second reason is that while walking the heap each step ("instruction")
in the GC implementation should generate a finite number of pointers.
Typically, 1 step generates at most 1 pointer that needs to be checked.
There should be an upper bound on the number of pointers generated in one
step, so considering N consecutive words in one step would be problematic
if N can be an arbitrary number.
Each pointer needs to be considered separately and there is absolutely no
relation between any two pointers P and Q. This uncertainty appears to be
the main reason why reordering fields wouldn't make the GC much faster.
One area where reordering fields would help is when the structure is big,
the non-pointer fields in the structure are forming a gap, and this gap is
putting pointer fields on different cachelines.
I was thinking about an ideal case where you have only 1 instruction than
says "first N fields are pointers", and you just need a single for loop
1..N. It should be not much slower than current non-precise gc in situation
when structs contain only points.
But I think your analysis is correct.
Do I get it right that with type info there is no need check whether the
pointer points to an allocated chunk of memory. It can be either 0 or a
valid pointer to a valid allocated memory chunk, no other cases possible,
right?
It can be a pointer from C, or if the GC does not know the actual type of
the object then it may be an integer (not a pointer).
And additionally you know size of the pointed-to object -- it's either
determined by type, or if it's a slice then size is in the subsequent word.
Right?
In general, no. The actual size of an object is unknown to GC, because in
some cases it cannot be inferred from the type of the pointer. There are
however cases when the GC knows the actual type of the object and thus
knows the actual size.
The GC code is robust in the sense that it will work even if there is no
type information available about any object. If some type information is
available, GC will use it and may be able to free more objects.
It is impossible to decide whether to prefer the partial typeinfo about an
object or the full typeinfo. Getting the full typeinfo is more costly than
getting the partial one. In some cases the full typeinfo isn't available at
all. The GC implementation is primarily using the partial typeinfo and
tries to retrieve the full typeinfo only if something goes wrong.
The length and capacity of a slice are insufficient to determine where the
underlying array starts or ends. However, the knowledge that the object is
a slice can be used to determine the full typeinfo of the slice's element
(and thus the actual size of the slice's element). However, if the GC sees
a pointer (Go type: *T) to any part of the underlying array prior to seeing
that the block is actually of type [N]U, it will retrieve the full typeinfo
U - if U can be determined and T is insufficient. Typeinfo T may be
sufficient to process the whole array.
The easiest values to GC are for example Go's maps because they are
completely self-contained and it is impossible to get a pointer to their
interior.
--