Float.floatToRawIntBits (in Java 1.4) gives the raw float bits without
normalization (like *(int*)&floatvar would in C). Since it doesn't do
normalization of NaN values, it's faster (and hopefully optimized to a
simple inline machine instruction by the JVM).
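
To make the difference concrete, here is a minimal sketch (the class name FloatBitsDemo and the NaN payload are made up for illustration, nothing here comes from Lucene). For ordinary values the two methods agree. For a NaN, floatToIntBits always returns the canonical pattern 0x7fc00000, while floatToRawIntBits skips that check and may return whatever payload the float carries.

```java
public class FloatBitsDemo {
    /** True when both conversions return the same int for f. */
    static boolean sameBits(float f) {
        return Float.floatToIntBits(f) == Float.floatToRawIntBits(f);
    }

    public static void main(String[] args) {
        // Non-NaN values: both methods return the IEEE 754 bit pattern.
        System.out.println(Integer.toHexString(Float.floatToRawIntBits(1.0f)));
        System.out.println(sameBits(1.0f) && sameBits(-0.0f) && sameBits(Float.MAX_VALUE));

        // A NaN with a non-canonical payload (arbitrary value chosen for the demo).
        float oddNaN = Float.intBitsToFloat(0x7fc00001);

        // floatToIntBits collapses every NaN to 0x7fc00000.
        System.out.println(Integer.toHexString(Float.floatToIntBits(oddNaN)));

        // floatToRawIntBits does no such normalization. The payload usually
        // survives, but that is not guaranteed on every platform.
        System.out.println(Integer.toHexString(Float.floatToRawIntBits(oddNaN)));
    }
}
```

So the raw variant is only a safe substitute where NaN inputs either cannot occur or do not need to compare equal bit for bit.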

On my Pentium 4, using floatToRawIntBits is over 5 times as fast as
floatToIntBits.
That can really add up in something like Similarity.floatToByte() for
encoding norms, especially if used as a way to compress an array of
float during query time as suggested by Doug.
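
As a rough illustration of where the call would sit, here is a hypothetical 8-bit float encoder built on the raw bits. It is a sketch of the general technique only, not Lucene's actual Similarity.floatToByte(); the class name TinyFloat, the shift and the offset are all assumptions picked for the example. Zero, negative and out-of-range inputs are handled by integer comparisons on the bits, so no NaN normalization is needed.

```java
public class TinyFloat {
    // Assumed layout: keep float bits 21..28 relative to an offset, so that
    // positive floats in range map to the codes 1..255.
    private static final int SHIFT = 21;
    private static final int OFFSET = (63 - 15) << 3; // 384

    /** Encodes f into one byte; treat the result as unsigned (b & 0xff). */
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> SHIFT; // arithmetic shift, negatives stay negative
        if (small <= OFFSET) {
            // Zero and negatives become 0, tiny positives become the smallest code.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= OFFSET + 0x100) {
            return (byte) 0xFF; // too large: clamp to the largest code
        }
        return (byte) (small - OFFSET);
    }

    /** Decodes a byte produced by encode back into a float (lossy). */
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        return Float.intBitsToFloat(((b & 0xff) + OFFSET) << SHIFT);
    }

    public static void main(String[] args) {
        byte one = encode(1.0f);
        System.out.println(one);         // 124
        System.out.println(decode(one)); // 1.0
    }
}
```

With this layout the codes are ordered the same way as the floats they came from, which is what makes a compressed array of them usable at query time.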

-Yonik
Now hiring -- http://forms.cnet.com/slink?231706

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

  • Paul Smith at Nov 16, 2005 at 9:21 pm
    I can confirm this takes ~20% of an overall indexing operation (see
    the linked YourKit screenshot).

    http://people.apache.org/~psmith/luceneYourkit.jpg

    Mind you, the whole "signalling via IOException" in the
    FastCharStream is a way bigger overhead, although I agree much harder
    to fix.

    Paul Smith
  • Yonik Seeley at Nov 16, 2005 at 10:12 pm
    Wow! A much larger gain than I expected!
    Thanks for the profile Paul!

    -Yonik
  • Doug Cutting at Nov 16, 2005 at 10:25 pm
    In general I would not take this sort of profiler output too literally.
    If floatToRawIntBits is 5x faster, then you'd expect a 16% improvement
    from using it, but my guess is you'll see far less. Still, it's
    probably worth switching & measuring as it might be significant.

    Doug
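
The 16% figure is plain arithmetic: if the conversion accounts for about 20% of the run (Paul's profile) and becomes 5 times faster (Yonik's measurement), the time saved is 0.20 * (1 - 1/5) = 0.16 of the total. A small sketch, with made-up class and method names, for plugging in other numbers:

```java
public class SpeedupEstimate {
    /**
     * Fraction of total runtime saved when a part that takes `fraction`
     * of the time is made `factor` times faster.
     */
    static double timeSaved(double fraction, double factor) {
        return fraction * (1.0 - 1.0 / factor);
    }

    public static void main(String[] args) {
        // 20% of the run, 5x faster: the figures quoted in this thread.
        long percent = Math.round(100 * timeSaved(0.20, 5.0));
        System.out.println(percent + "%"); // 16%
    }
}
```

This is an upper bound that assumes the profiler's 20% is accurate and that nothing else (IO, for instance) becomes the bottleneck, which is exactly the caveat Doug raises.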

  • Paul Smith at Nov 16, 2005 at 10:31 pm

    Yes, I don't think we'll get a 5x speedup, as it will likely move
    the bottleneck back to the IO layer, but still... If you can reduce
    CPU usage, then multithreaded indexing operations can gain better CPU
    utilization (doing other stuff while waiting for IO). Seems like an
    easy win and dead easy to unit test?

    I've been meaning to have a crack at reworking FastCharStream, but
    every time I start thinking about it I realise there is a bit of a
    dependency on this IOException signalling EOF, so I'm pretty sure it's
    going to be a much harder task. The JavaCC stuff is really designed
    for compiling trees, which is usually a 'once off' type of usage, but
    Lucene's usage of it (large indexing operations) means the flaws in it
    are exacerbated.

    Paul
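
For readers who have not looked at that code, the two ways of signalling end of input that Paul contrasts can be sketched like this. Everything below (the Source class, the method names, the exception message) is hypothetical and only illustrates the pattern; it is not the FastCharStream or JavaCC source.

```java
import java.io.IOException;

public class EofSignalling {
    /** Hypothetical in-memory character source offering both EOF conventions. */
    static final class Source {
        private final String text;
        private int pos = 0;

        Source(String text) {
            this.text = text;
        }

        /** Exception style: end of input is reported by throwing. */
        char readChar() throws IOException {
            if (pos >= text.length()) {
                throw new IOException("read past eof");
            }
            return text.charAt(pos++);
        }

        /** Sentinel style: end of input is reported by returning -1. */
        int read() {
            return pos < text.length() ? text.charAt(pos++) : -1;
        }
    }

    /** Counts characters, learning about EOF from the exception. */
    static int countWithException(String text) {
        Source in = new Source(text);
        int n = 0;
        try {
            while (true) {
                in.readChar();
                n++;
            }
        } catch (IOException eof) {
            // Reached once per input, however short: an exception is
            // created and unwound for every stream that is consumed.
            return n;
        }
    }

    /** Counts characters, learning about EOF from the -1 sentinel. */
    static int countWithSentinel(String text) {
        Source in = new Source(text);
        int n = 0;
        while (in.read() != -1) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(countWithException("lucene")); // 6
        System.out.println(countWithSentinel("lucene"));  // 6
    }
}
```

Both loops give the same answer. The catch is that callers written against the exception style rely on the throw as part of the contract, so switching conventions means touching every caller.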
  • Chris Lamprecht at Nov 16, 2005 at 11:21 pm
    1. Run profiler
    2. Sort methods by CPU time spent
    3. Optimize
    4. Repeat

    :)
  • Paul Smith at Nov 16, 2005 at 11:47 pm

    Umm, well I know I could make it quicker, it's just whether it still
    _works_ as expected.... Maintaining the contract means I'll need to
    develop some good JUnit tests that I feel confident cover the current
    workings before making changes. That's the hard bit.

    Paul

Discussion Overview
group: java-dev
category: lucene
posted: Nov 16, '05 at 8:21p
active: Nov 16, '05 at 11:47p
posts: 7
users: 4
website: lucene.apache.org
