Hi all,

Could I have a couple of reviews for this change? It is on the
critical-watch list for hs23, so it would be great if I could get some
reviews as early as today.

http://cr.openjdk.java.net/~brutisso/7103665/webrev.00/

Some background:

There are two problems that this fix solves:

(1) When we run with -XX:+UseNUMA we have to make the heap parsable
since we leave some holes in the heap. We do this by allocating fake
Integer arrays in those places. However, on a large heap those holes can
be larger than what a single Integer array can cover. The fix is to
allocate several arrays in those cases.

The easy fix would have been to call CollectedHeap::fill_with_objects()
instead of CollectedHeap::fill_with_object(). Note the extra "s" in the
method name. But MutableNUMASpace::ensure_parsability(), where the
change is needed, needs to know about all the objects that are allocated,
so I saw no better solution than to implement my own loop. Any ideas on
how I could re-use fill_with_objects() instead are welcome.
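
The loop I have in mind is roughly of this shape (a simplified sketch,
not the exact webrev code; the per-object NUMA bookkeeping is left out
and hole_start/hole_end are just illustrative names):

  HeapWord* cur = hole_start;
  size_t words_left_to_fill = pointer_delta(hole_end, hole_start);
  while (words_left_to_fill > 0) {
    // Never create a single filler larger than an int array can describe.
    size_t words_to_fill = MIN2(words_left_to_fill,
                                CollectedHeap::filler_array_max_size());
    assert(words_to_fill >= CollectedHeap::min_fill_size(), "too small to fill");
    CollectedHeap::fill_with_object(cur, words_to_fill);
    // ... per-object work (the invalid region checks) goes here ...
    cur += words_to_fill;
    words_left_to_fill -= words_to_fill;
  }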

(2) CollectedHeap::_filler_array_max_size should be the maximum size
that can be covered by a fake Integer array. This value is in words, but
since the word size differs between platforms while the Integer size is
always the same, we need to convert between words and sizeof(jint) at
some point. Unfortunately we do the conversion in the wrong direction,
which means that on 64 bit platforms we end up with a max size that is 4
times larger than intended. This in turn makes us overflow an int when
we convert from a word size to the length of the array that we are
setting up.
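
To illustrate the direction of the conversion (my own sketch of the
arithmetic, not the exact expression in the code):

  const size_t max_len = (size_t)arrayOopDesc::max_array_length(T_INT); // max number of jint elements
  const size_t elements_per_word = HeapWordSize / sizeof(jint);         // 2 on 64 bit, 1 on 32 bit
  // Correct direction: each word holds elements_per_word ints, so divide:
  //   max filler size in words ~= filler_array_hdr_size() + max_len / elements_per_word
  // Multiplying by elements_per_word instead of dividing yields a value that is
  // elements_per_word^2 = 4 times too large on 64 bit (and unchanged on 32 bit).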

Here is the code that overflows:

328 void
329 CollectedHeap::fill_with_array(HeapWord* start, size_t words, bool zap)
330 {
331   assert(words >= filler_array_min_size(), "too small for an array");
332   assert(words <= filler_array_max_size(), "too big for a single object");
333
334   const size_t payload_size = words - filler_array_hdr_size();
335   const size_t len = payload_size * HeapWordSize / sizeof(jint);
336
337   // Set the length first for concurrent GC.
338   ((arrayOop)start)->set_length((int)len);
339   post_allocation_setup_common(Universe::intArrayKlassObj(), start, words);
340   DEBUG_ONLY(zap_filler_array(start, words, zap);)
341 }

If filler_array_max_size() on line 332, which returns
CollectedHeap::_filler_array_max_size, is too large, we will overflow
the int conversion on line 338.
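
To make the overflow concrete (rough numbers for 64 bit): HeapWordSize is
8 and sizeof(jint) is 4, so line 335 computes len = payload_size * 2. The
maximum is intended to keep len at or below max_jint, but with a limit
that is 4 times too large, payload_size can approach 2 * max_jint words,
so len can approach 4 * max_jint and the (int) cast on line 338 wraps
around.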

Testing:
This fix solves the issue as seen in the reproducer I was able to set up
on a Linux x64 machine. I have asked the original reporter to run with
the fix on their Solaris amd64 setup, and I will also try to set up a
reproducer on a Solaris machine.

I also ran the fix through JPRT. It did not fail, but as far as I know
there are no NUMA tests in there. The change for issue (2) should at
least have been exercised, though.

Thanks,
Bengt


  • Azeem Jiva at Mar 23, 2012 at 6:17 am
    Looks good.


    Azeem Jiva
    @javawithjiva

  • Bengt Rutisson at Mar 23, 2012 at 7:36 am
    Thanks, Azeem.

    Bengt
  • Vitaly Davidovich at Mar 23, 2012 at 6:45 am
    Hi Bengt,

    mutableNUMASpace line 99 has a small typo in the assert message: "Remaining
    siz" should be "Remaining size".

    Vitaly

    Sent from my phone
    On Mar 23, 2012 9:06 AM, "Bengt Rutisson" wrote:


    Hi all,

    Could I have a couple of reviews for this change? This is on the
    critical-watch list for hs23, so it would be great if I could get some
    reviews already today.

    http://cr.openjdk.java.net/~**brutisso/7103665/webrev.00/<http://cr.openjdk.java.net/~brutisso/7103665/webrev.00/>

    Some background:

    There are two problems that this fix solves:

    (1) When we run with -XX:+UseNUMA we have to make the heap parsable since
    we leave some holes in the heap. We do this by faking Integer arrays in
    those places. However, on a large heap those holes can be larger than what
    can be covered by a single Integer array. The fix is to allocate several
    arrays in those cases.

    The easy fix would have been to call CollectedHeap::fill_with_**objects()
    instead of CollectedHeap::fill_with_**object(). Note the extra "s" in the
    method name. But MutableNUMASpace::ensure_**parsability(), where the
    change is needed, need to know about all the objects that are allocated so
    I saw no better solution than to implement my own loop. Any ideas on how I
    could re-use fill_with_objects() instead are welcome.

    (2) CollectedHeap::_filler_array_**max_size should be the maximum size
    that can be covered by a fake Integer array. This value is in words, but
    since the word size is different on different platforms but the Integer
    size is always the same we need to convert between word and sizeof(jint) at
    some point. Unfortunately we do the conversion in the wrong direction,
    which means that on 64 bit platforms we end up with a max size that is 4
    times larger than intended. This in turn makes us overflow an int when we
    convert from a word size to the length of the array that we are setting up.

    Here is the code that overflows:

    328 void
    329 CollectedHeap::fill_with_**array(HeapWord* start, size_t words, bool
    zap)
    330 {
    331 assert(words >= filler_array_min_size(), "too small for an array");
    332 assert(words <= filler_array_max_size(), "too big for a single
    object");
    333
    334 const size_t payload_size = words - filler_array_hdr_size();
    335 const size_t len = payload_size * HeapWordSize / sizeof(jint);
    336
    337 // Set the length first for concurrent GC.
    338 ((arrayOop)start)->set_length(**(int)len);
    339 post_allocation_setup_common(**Universe::intArrayKlassObj(),
    start, words);
    340 DEBUG_ONLY(zap_filler_array(**start, words, zap);)
    341 }

    If filler_array_max_size() on line 332, which returns
    CollectedHeap::_filler_array_**max_size, has a too large value we will
    overflow the int conversation on line 338.

    Testing:
    This fix solves the issue that was found in the reproducer that I could
    set up on a Linux x64 machine. I have asked the one who initially reported
    the issue to run on their Solaris amd64 setup. I will also try to set up a
    reproducer on a Solaris machine.

    I also ran the fix through JPRT. It did not fail, but there are no NUMA
    tests in there as far as I know. But the change for issue (2) was hopefully
    tested.

    Thanks,
    Bengt
    -------------- next part --------------
    An HTML attachment was scrubbed...
    URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120323/1c8c65d5/attachment.html
  • Bengt Rutisson at Mar 23, 2012 at 7:33 am
    Hi Vitaly,
    On 2012-03-23 14:45, Vitaly Davidovich wrote:

    Hi Bengt,

    mutableNUMASpace line 99 has a small typo in the assert message:
    "Remaining siz" should be "Remaining size".
    Thanks for seeing it. Fixed.

    BTW, you don't have an OpenJDK username, right? If you consider this a
    review I can list you as "Also-reviewed-by", when I push.

    Thanks,
    Bengt
  • Vitaly Davidovich at Mar 23, 2012 at 7:37 am
    I'm not a reviewer, just a curious bystander. :)

    Sent from my phone
    On Mar 23, 2012 10:32 AM, "Bengt Rutisson" wrote:


    Hi Vitaly,

    On 2012-03-23 14:45, Vitaly Davidovich wrote:

    Hi Bengt,

    mutableNUMASpace line 99 has a small typo in the assert message:
    "Remaining siz" should be "Remaining size".


    Thanks for seeing it. Fixed.

    BTW, you don't have an OpenJDK username, right? If you consider this a
    review I can list you as "Also-reviewed-by", when I push.

    Thanks,
    Bengt

    Vitaly

    Sent from my phone
    On Mar 23, 2012 9:06 AM, "Bengt Rutisson" wrote:


    Hi all,

    Could I have a couple of reviews for this change? This is on the
    critical-watch list for hs23, so it would be great if I could get some
    reviews already today.

    http://cr.openjdk.java.net/~brutisso/7103665/webrev.00/

    Some background:

    There are two problems that this fix solves:

    (1) When we run with -XX:+UseNUMA we have to make the heap parsable since
    we leave some holes in the heap. We do this by faking Integer arrays in
    those places. However, on a large heap those holes can be larger than what
    can be covered by a single Integer array. The fix is to allocate several
    arrays in those cases.

    The easy fix would have been to call CollectedHeap::fill_with_objects()
    instead of CollectedHeap::fill_with_object(). Note the extra "s" in the
    method name. But MutableNUMASpace::ensure_parsability(), where the change
    is needed, need to know about all the objects that are allocated so I saw
    no better solution than to implement my own loop. Any ideas on how I could
    re-use fill_with_objects() instead are welcome.

    (2) CollectedHeap::_filler_array_max_size should be the maximum size that
    can be covered by a fake Integer array. This value is in words, but since
    the word size is different on different platforms but the Integer size is
    always the same we need to convert between word and sizeof(jint) at some
    point. Unfortunately we do the conversion in the wrong direction, which
    means that on 64 bit platforms we end up with a max size that is 4 times
    larger than intended. This in turn makes us overflow an int when we convert
    from a word size to the length of the array that we are setting up.

    Here is the code that overflows:

    328 void
    329 CollectedHeap::fill_with_array(HeapWord* start, size_t words, bool
    zap)
    330 {
    331 assert(words >= filler_array_min_size(), "too small for an array");
    332 assert(words <= filler_array_max_size(), "too big for a single
    object");
    333
    334 const size_t payload_size = words - filler_array_hdr_size();
    335 const size_t len = payload_size * HeapWordSize / sizeof(jint);
    336
    337 // Set the length first for concurrent GC.
    338 ((arrayOop)start)->set_length((int)len);
    339 post_allocation_setup_common(Universe::intArrayKlassObj(), start,
    words);
    340 DEBUG_ONLY(zap_filler_array(start, words, zap);)
    341 }

    If filler_array_max_size() on line 332, which returns
    CollectedHeap::_filler_array_max_size, has a too large value we will
    overflow the int conversation on line 338.

    Testing:
    This fix solves the issue that was found in the reproducer that I could
    set up on a Linux x64 machine. I have asked the one who initially reported
    the issue to run on their Solaris amd64 setup. I will also try to set up a
    reproducer on a Solaris machine.

    I also ran the fix through JPRT. It did not fail, but there are no NUMA
    tests in there as far as I know. But the change for issue (2) was hopefully
    tested.

    Thanks,
    Bengt
    -------------- next part --------------
    An HTML attachment was scrubbed...
    URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120323/0a2770cc/attachment-0001.html
  • Bengt Rutisson at Mar 23, 2012 at 7:39 am
    OK. Thanks for showing interest!

    Bengt
  • Bengt Rutisson at Mar 23, 2012 at 7:38 am
    Just an update on the testing. I tried this out on a Solaris x64 machine
    and I can reproduce the crash. The crash goes away with my fix.

    Bengt
  • Jon Masamitsu at Mar 23, 2012 at 7:46 am
    Bengt,

    Is it correct that there is still the possibility that, during the final
    iteration of the loop that does the filling, the space to be filled can
    be too small to fit an object? I note the assertion

     98   assert(words_to_fill >= CollectedHeap::min_fill_size(),
     99     err_msg("Remaining siz ("SIZE_FORMAT ") is too small to fill (based on " SIZE_FORMAT " and " SIZE_FORMAT ")",
    100       words_to_fill, words_left_to_fill, CollectedHeap::filler_array_max_size()));

    You could fix that by checking for the situation in the next-to-last
    iteration and shortening the filler there?
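
    Roughly what I have in mind (an untested sketch on my side, reusing the
    names from your assert):

      size_t words_to_fill = MIN2(words_left_to_fill,
                                  CollectedHeap::filler_array_max_size());
      size_t remainder = words_left_to_fill - words_to_fill;
      if (remainder > 0 && remainder < CollectedHeap::min_fill_size()) {
        // Hold back a little so the final iteration still has room for a
        // minimal filler (alignment would also need to be considered here).
        words_to_fill -= (CollectedHeap::min_fill_size() - remainder);
      }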

    Also a question below.

    On 3/23/2012 5:56 AM, Bengt Rutisson wrote:

    Hi all,

    Could I have a couple of reviews for this change? This is on the
    critical-watch list for hs23, so it would be great if I could get some
    reviews already today.

    http://cr.openjdk.java.net/~brutisso/7103665/webrev.00/

    Some background:

    There are two problems that this fix solves:

    (1) When we run with -XX:+UseNUMA we have to make the heap parsable
    since we leave some holes in the heap. We do this by faking Integer
    arrays in those places. However, on a large heap those holes can be
    larger than what can be covered by a single Integer array. The fix is
    to allocate several arrays in those cases.

    The easy fix would have been to call
    CollectedHeap::fill_with_objects() instead of
    CollectedHeap::fill_with_object(). Note the extra "s" in the method
    name. But MutableNUMASpace::ensure_parsability(), where the change is
    needed, need to know about all the objects that are allocated so I saw
    no better solution than to implement my own loop. Any ideas on how I
    could re-use fill_with_objects() instead are welcome.
    Do you need to know all the objects that are allocated so that you can
    do the invalid region detection?

    113   if (crossing_start != crossing_end) {
    114     // If object header crossed a small page boundary we mark the area
    115     // as invalid rounding it to a page_size().


    Jon
  • Bengt Rutisson at Mar 23, 2012 at 8:31 am
    Jon,

    Thanks for looking at this!

    On 23 Mar 2012, at 15:46, Jon Masamitsu <jon.masamitsu at oracle.com> wrote:
    Bengt,

    Is it correct that there is still the possibility that during the final
    iteration of the loop to do the filling, that the space to be filled
    can be too small to fit an object? I note the assertion
    98 assert(words_to_fill>= CollectedHeap::min_fill_size(),
    99 err_msg("Remaining siz ("SIZE_FORMAT ") is too small to fill (based on " SIZE_FORMAT " and " SIZE_FORMAT ")",
    100 words_to_fill, words_left_to_fill, CollectedHeap::filler_array_max_size()));
    You could fix that by checking for the situation in the next to last
    iteration and shortening the filler for the next to last iteration?
    I think this should be safe. The heap size is object aligned and we allocate aligned objects, so the hole should be object aligned, right?

    The max array size is also aligned. So I think we are safe. But to be sure I added the assert.
    Also a question below.

    On 3/23/2012 5:56 AM, Bengt Rutisson wrote:

    Hi all,

    Could I have a couple of reviews for this change? This is on the critical-watch list for hs23, so it would be great if I could get some reviews already today.

    http://cr.openjdk.java.net/~brutisso/7103665/webrev.00/

    Some background:

    There are two problems that this fix solves:

    (1) When we run with -XX:+UseNUMA we have to make the heap parsable since we leave some holes in the heap. We do this by faking Integer arrays in those places. However, on a large heap those holes can be larger than what can be covered by a single Integer array. The fix is to allocate several arrays in those cases.

    The easy fix would have been to call CollectedHeap::fill_with_objects() instead of CollectedHeap::fill_with_object(). Note the extra "s" in the method name. But MutableNUMASpace::ensure_parsability(), where the change is needed, need to know about all the objects that are allocated so I saw no better solution than to implement my own loop. Any ideas on how I could re-use fill_with_objects() instead are welcome.
    Do you need to know all the objects that are allocated so that you can
    do the invalid region detection?

    113 if (crossing_start != crossing_end) {
    114 // If object header crossed a small page boundary we mark the area
    115 // as invalid rounding it to a page_size().
    Yes, that's why.

    Bengt

    Jon
  • Jon Masamitsu at Mar 23, 2012 at 8:57 am
    Bengt,

    The changes you made are correct and an improvement, so if this
    is time critical, I'm good with this version. I do still have some
    questions below.

    On 3/23/2012 8:31 AM, Bengt Rutisson wrote:
    Jon,

    Thanks for looking at this!

    On 23 Mar 2012, at 15:46, Jon Masamitsu <jon.masamitsu at oracle.com> wrote:
    Bengt,

    Is it correct that there is still the possibility that during the final
    iteration of the loop to do the filling, that the space to be filled
    can be too small to fit an object? I note the assertion
    98 assert(words_to_fill>= CollectedHeap::min_fill_size(),
    99 err_msg("Remaining siz ("SIZE_FORMAT ") is too small to fill (based on " SIZE_FORMAT " and " SIZE_FORMAT ")",
    100 words_to_fill, words_left_to_fill, CollectedHeap::filler_array_max_size()));
    You could fix that by checking for the situation in the next to last
    iteration and shortening the filler for the next to last iteration?
    I think this should be safe. The heap size is object aligned and we allocate aligned objects, so the hole should be object aligned, right?

    The max array size is also aligned. So I think we are safe. But to be sure I added the assert.
    This should be safe in 32 bit, but I'm not sure about 64 bit. The hole can
    be object aligned but still an odd number of words?

    Also a question below.

    On 3/23/2012 5:56 AM, Bengt Rutisson wrote:
    Hi all,

    Could I have a couple of reviews for this change? This is on the critical-watch list for hs23, so it would be great if I could get some reviews already today.

    http://cr.openjdk.java.net/~brutisso/7103665/webrev.00/

    Some background:

    There are two problems that this fix solves:

    (1) When we run with -XX:+UseNUMA we have to make the heap parsable since we leave some holes in the heap. We do this by faking Integer arrays in those places. However, on a large heap those holes can be larger than what can be covered by a single Integer array. The fix is to allocate several arrays in those cases.

    The easy fix would have been to call CollectedHeap::fill_with_objects() instead of CollectedHeap::fill_with_object(). Note the extra "s" in the method name. But MutableNUMASpace::ensure_parsability(), where the change is needed, need to know about all the objects that are allocated so I saw no better solution than to implement my own loop. Any ideas on how I could re-use fill_with_objects() instead are welcome.
    Do you need to know all the objects that are allocated so that you can
    do the invalid region detection?

    113 if (crossing_start != crossing_end) {
    114 // If object header crossed a small page boundary we mark the area
    115 // as invalid rounding it to a page_size().
    Yes, that's why.
    If you used fill_with_objects() (with an "s"), you do know where the
    first filler object is, so you could step through the objects using the
    object sizes to find the next filler object. Then it seems like you
    could do this check (which I can't say I understand what's going on
    there) in a second loop.
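
    Roughly (just a sketch of the idea; the names are made up):

      // Let the shared code fill the whole hole, possibly with several objects.
      CollectedHeap::fill_with_objects(hole_start, hole_words);

      // Then walk the fillers again for the NUMA-specific checks.
      HeapWord* cur = hole_start;
      while (cur < hole_start + hole_words) {
        oop obj = oop(cur);
        size_t obj_words = obj->size();
        // ... do the invalid region / page boundary check for
        //     [cur, cur + obj_words) here ...
        cur += obj_words;
      }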

    Jon
  • Bengt Rutisson at Mar 23, 2012 at 10:11 am
    Jon,
    On 2012-03-23 16:57, Jon Masamitsu wrote:
    Bengt,

    The changes you made are correct and an improvement, so if this
    is time critical, I'm good with this version. I do still have some
    questions below.
    OK. Thanks.
    On 3/23/2012 8:31 AM, Bengt Rutisson wrote:
    Jon,

    Thanks for looking at this!

    On 23 Mar 2012, at 15:46, Jon Masamitsu <jon.masamitsu at oracle.com> wrote:
    Bengt,

    Is it correct that there is still the possibility that during the final
    iteration of the loop to do the filling, that the space to be filled
    can be too small to fit an object? I note the assertion
    98 assert(words_to_fill>= CollectedHeap::min_fill_size(),
    99 err_msg("Remaining siz ("SIZE_FORMAT ") is too
    small to fill (based on " SIZE_FORMAT " and " SIZE_FORMAT ")",
    100 words_to_fill, words_left_to_fill,
    CollectedHeap::filler_array_max_size()));
    You could fix that by checking for the situation in the next to last
    iteration and shortening the filler for the next to last iteration?
    I think this should be safe. The heap size is object aligned and we
    allocate aligned objects, so the hole should be object aligned, right?

    The max array size is also aligned. So I think we are safe. But to be
    sure I added the assert.
    This should be safe in 32bit but I'm not sure about 64bit. The hole
    can be objected aligned
    but still an odd number of words?
    Good point. But I think it is unlikely that you get holes that are
    larger than what can be covered with one Integer array on 32 bit
    platforms, right?

    Can it be an issue with compressed oops on 64 bit? I think we are safe
    as long as we still don't have compressed headers. But please correct me
    if I am wrong. My head is spinning a bit from all the word/byte/jint
    conversions.
    Also a question below.

    On 3/23/2012 5:56 AM, Bengt Rutisson wrote:
    Hi all,

    Could I have a couple of reviews for this change? This is on the
    critical-watch list for hs23, so it would be great if I could get
    some reviews already today.

    http://cr.openjdk.java.net/~brutisso/7103665/webrev.00/

    Some background:

    There are two problems that this fix solves:

    (1) When we run with -XX:+UseNUMA we have to make the heap parsable
    since we leave some holes in the heap. We do this by faking Integer
    arrays in those places. However, on a large heap those holes can be
    larger than what can be covered by a single Integer array. The fix
    is to allocate several arrays in those cases.

    The easy fix would have been to call
    CollectedHeap::fill_with_objects() instead of
    CollectedHeap::fill_with_object(). Note the extra "s" in the method
    name. But MutableNUMASpace::ensure_parsability(), where the change
    is needed, need to know about all the objects that are allocated so
    I saw no better solution than to implement my own loop. Any ideas
    on how I could re-use fill_with_objects() instead are welcome.
    Do you need to know all the objects that are allocated so that you can
    do the invalid region detection?

    113 if (crossing_start != crossing_end) {
    114 // If object header crossed a small page boundary
    we mark the area
    115 // as invalid rounding it to a page_size().
    Yes, that's why.
    If you used fill_with_objects() (with an "s"), you do know where
    first filler object is so
    you could step through the objects using the object sizes to find the
    next filler object.
    Then it seems like you could do this check (which I can't say I
    understand what's going
    on there) in a second loop.
    Right. That's probably a nicer way of doing it. Especially since it is
    only needed on Solaris (I think).

    However, I'd like to get this in to have a theoretical chance of porting
    it to hs23. So, since you stated that you are ok with it as it is I'll
    go ahead and push what I have now.

    Thanks again for the review and the good comments!
    Bengt
  • Jon Masamitsu at Mar 23, 2012 at 10:29 am
    Bengt,

    Replies in-line.

    On 3/23/2012 10:11 AM, Bengt Rutisson wrote:

    Jon,
    On 2012-03-23 16:57, Jon Masamitsu wrote:
    Bengt,

    The changes you made are correct and an improvement, so if this
    is time critical, I'm good with this version. I do still have some
    questions below.
    OK. Thanks.
    On 3/23/2012 8:31 AM, Bengt Rutisson wrote:
    Jon,

    Thanks for looking at this!

    On 23 Mar 2012, at 15:46, Jon Masamitsu <jon.masamitsu at oracle.com> wrote:
    Bengt,

    Is it correct that there is still the possibility that during the
    final
    iteration of the loop to do the filling, that the space to be filled
    can be too small to fit an object? I note the assertion
    98 assert(words_to_fill>=
    CollectedHeap::min_fill_size(),
    99 err_msg("Remaining siz ("SIZE_FORMAT ") is too
    small to fill (based on " SIZE_FORMAT " and " SIZE_FORMAT ")",
    100 words_to_fill, words_left_to_fill,
    CollectedHeap::filler_array_max_size()));
    You could fix that by checking for the situation in the next to last
    iteration and shortening the filler for the next to last iteration?
    I think this should be safe. The heap size is object aligned and we
    allocate aligned objects, so the hole should be object aligned, right?

    The max array size is also aligned. So I think we are safe. But to
    be sure I added the assert.
    This should be safe in 32bit but I'm not sure about 64bit. The hole
    can be objected aligned
    but still an odd number of words?
    Good point. But I think it is unlikely that you get holes that are
    larger than what can be covered with one Integer array on 32 bit
    plaforms, right?
    Extremely unlikely but probably 5-10 lines of code to fix it? I'll let your
    conscience be your guide :-) But you don't have to do anything right now.
    Can it be an issue with compressed oops on 64 bit? I think we are safe
    as long as we still don't have compressed headers. But please correct
    me if I am wrong. My head is spinning a bit from all the
    word/byte/jint conversions.
    Also a question below.

    On 3/23/2012 5:56 AM, Bengt Rutisson wrote:
    Hi all,

    Could I have a couple of reviews for this change? This is on the
    critical-watch list for hs23, so it would be great if I could get
    some reviews already today.

    http://cr.openjdk.java.net/~brutisso/7103665/webrev.00/

    Some background:

    There are two problems that this fix solves:

    (1) When we run with -XX:+UseNUMA we have to make the heap
    parsable since we leave some holes in the heap. We do this by
    faking Integer arrays in those places. However, on a large heap
    those holes can be larger than what can be covered by a single
    Integer array. The fix is to allocate several arrays in those cases.

    The easy fix would have been to call
    CollectedHeap::fill_with_objects() instead of
    CollectedHeap::fill_with_object(). Note the extra "s" in the
    method name. But MutableNUMASpace::ensure_parsability(), where the
    change is needed, need to know about all the objects that are
    allocated so I saw no better solution than to implement my own
    loop. Any ideas on how I could re-use fill_with_objects() instead
    are welcome.
    Do you need to know all the objects that are allocated so that you can
    do the invalid region detection?

    113 if (crossing_start != crossing_end) {
    114 // If object header crossed a small page boundary
    we mark the area
    115 // as invalid rounding it to a page_size().
    Yes, that's why.
    If you used fill_with_objects() (with an "s"), you do know where
    first filler object is so
    you could step through the objects using the object sizes to find the
    next filler object.
    Then it seems like you could do this check (which I can't say I
    understand what's going
    on there) in a second loop.
    Right. That's probably a nicer way of doing it. Especially since it is
    only needed on Solaris (I think).

    However, I'd like to get this in to have a theoretical chance of
    porting it to hs23. So, since you stated that you are ok with it as it
    is I'll go ahead and push what I have now.
    I have absolutely no problem with this.

    Jon
  • Bengt Rutisson at Mar 23, 2012 at 1:43 pm
    Jon,

    Inline.
    On 2012-03-23 18:29, Jon Masamitsu wrote:
    Bengt,

    Replies in-line.

    On 3/23/2012 10:11 AM, Bengt Rutisson wrote:

    Jon,
    On 2012-03-23 16:57, Jon Masamitsu wrote:
    Bengt,

    The changes you made are correct and an improvement, so if this
    is time critical, I'm good with this version. I do still have some
    questions below.
    OK. Thanks.
    On 3/23/2012 8:31 AM, Bengt Rutisson wrote:
    Jon,

    Thanks for looking at this!

    On 23 Mar 2012, at 15:46, Jon Masamitsu <jon.masamitsu at oracle.com> wrote:
    Bengt,

    Is it correct that there is still the possibility that during the
    final
    iteration of the loop to do the filling, that the space to be filled
    can be too small to fit an object? I note the assertion
    98 assert(words_to_fill>=
    CollectedHeap::min_fill_size(),
    99 err_msg("Remaining siz ("SIZE_FORMAT ") is too
    small to fill (based on " SIZE_FORMAT " and " SIZE_FORMAT ")",
    100 words_to_fill, words_left_to_fill,
    CollectedHeap::filler_array_max_size()));
    You could fix that by checking for the situation in the next to last
    iteration and shortening the filler for the next to last iteration?
    I think this should be safe. The heap size is object aligned and we
    allocate aligned objects, so the hole should be object aligned, right?

    The max array size is also aligned. So I think we are safe. But to
    be sure I added the assert.
    This should be safe in 32bit but I'm not sure about 64bit. The hole
    can be objected aligned
    but still an odd number of words?
    Good point. But I think it is unlikely that you get holes that are
    larger than what can be covered with one Integer array on 32 bit
    plaforms, right?
    Extremely unlikely but probably 5-10 lines of code to fix it? I'll
    let your
    conscience be your guide :-) But you don't have to do anything right
    now.
    Thanks. :-)

    In fact I think it is not just unlikely, but actually impossible. On 32
    bit the filler_array_max_size will be 1G words. The word size is 4
    bytes, so a single filler array can cover a hole of 4 GB. This is of
    course also the max heap size on 32 bit, but since this is a NUMA
    problem it means that we have split the 4 GB up over several nodes. So,
    no hole can be 4 GB.

    I'll leave it as it is.
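
    For reference, the kind of next-to-last-chunk adjustment Jon describes
    could look roughly like this (a standalone sketch with made-up sizes,
    not the webrev code; it assumes max_fill >= 2 * min_fill):

    #include <cstdio>
    #include <cstddef>
    #include <algorithm>

    int main() {
      const size_t max_fill = 1000;  // stand-in for filler_array_max_size()
      const size_t min_fill = 2;     // stand-in for CollectedHeap::min_fill_size()

      size_t left = 3001;            // hypothetical hole size, in words
      while (left > 0) {
        size_t chunk = std::min(left, max_fill);
        // If a full chunk would leave a remainder too small to hold a filler
        // object, shorten this chunk so the remainder stays fillable.
        if (left - chunk != 0 && left - chunk < min_fill) {
          chunk = left - min_fill;
        }
        printf("fill %zu words\n", chunk);  // the real code would allocate a filler array here
        left -= chunk;
      }
      return 0;
    }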

    My push to hotspot-gc just went through - well in time for the nightly
    testing. Thanks again for helping out with this so quickly!

    Bengt
    Can it be an issue with compressed oops on 64 bit? I think we are
    safe as long as we still don't have compressed headers. But please
    correct me if I am wrong. My head is spinning a bit from all the
    word/byte/jint conversions.
    Also a question below.

    On 3/23/2012 5:56 AM, Bengt Rutisson wrote:
    Hi all,

    Could I have a couple of reviews for this change? This is on the
    critical-watch list for hs23, so it would be great if I could get
    some reviews already today.

    http://cr.openjdk.java.net/~brutisso/7103665/webrev.00/

    Some background:

    There are two problems that this fix solves:

    (1) When we run with -XX:+UseNUMA we have to make the heap
    parsable since we leave some holes in the heap. We do this by
    faking Integer arrays in those places. However, on a large heap
    those holes can be larger than what can be covered by a single
    Integer array. The fix is to allocate several arrays in those cases.

    The easy fix would have been to call
    CollectedHeap::fill_with_objects() instead of
    CollectedHeap::fill_with_object(). Note the extra "s" in the
    method name. But MutableNUMASpace::ensure_parsability(), where
    the change is needed, needs to know about all the objects that are
    allocated so I saw no better solution than to implement my own
    loop. Any ideas on how I could re-use fill_with_objects() instead
    are welcome.
    Do you need to know all the objects that are allocated so that you can
    do the invalid region detection?

    113   if (crossing_start != crossing_end) {
    114     // If object header crossed a small page boundary we mark the area
    115     // as invalid rounding it to a page_size().
    Yes, that's why.
    If you used fill_with_objects() (with an "s"), you do know where the
    first filler object is, so you could step through the objects using the
    object sizes to find the next filler object. Then it seems like you
    could do this check (which I can't say I fully understand) in a second
    loop.
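
    For illustration, that two-pass idea might look roughly like this (a toy
    C++ model, not HotSpot code; fill_with_objects here is a stand-in that
    records each filler's size in a header word):

    #include <cstdio>
    #include <cstddef>
    #include <vector>
    #include <algorithm>

    // Toy "heap": each filler object stores its own size (in words) in its
    // first slot, standing in for deriving the size from a real filler array.
    static void fill_with_objects(std::vector<size_t>& heap, size_t start,
                                  size_t words, size_t max_fill) {
      while (words > 0) {
        size_t chunk = std::min(words, max_fill);
        heap[start] = chunk;   // "header" records the object size
        start += chunk;
        words -= chunk;
      }
    }

    int main() {
      const size_t max_fill = 1000;
      std::vector<size_t> heap(3500, 0);

      // Pass 1: fill the hole [100, 3100) with as many fillers as needed.
      fill_with_objects(heap, 100, 3000, max_fill);

      // Pass 2: walk the hole object by object using each object's size;
      // this is where the per-filler invalid-region check could be done.
      for (size_t p = 100; p < 3100; p += heap[p]) {
        printf("filler at %zu, %zu words\n", p, heap[p]);
      }
      return 0;
    }
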
    Right. That's probably a nicer way of doing it. Especially since it
    is only needed on Solaris (I think).

    However, I'd like to get this in to have a theoretical chance of
    porting it to hs23. So, since you stated that you are ok with it as
    it is, I'll go ahead and push what I have now.
    I have absolutely no problem with this.

    Jon
    Thanks again for the review and the good comments!
    Bengt
  • Igor Veresov at Mar 23, 2012 at 9:46 am
    Bengt,

    Good catch! The fix looks good to me.

    igor

  • Bengt Rutisson at Mar 23, 2012 at 10:11 am
    Igor,

    Thanks for looking at this so quickly!

    Bengt
    On 2012-03-23 17:46, Igor Veresov wrote:
    Bengt,

    Good catch! The fix looks good to me.

    igor
