FAQ
Hi

I am trying to run performance tests against Lucene, and am surprised
by the results.

I have a test that creates a queue of queries and a number of threads.
The threads run concurrently, each getting the next available query, performing
a query on the index and taking the top hits. The index is 2GB in size,
and was originally created from a database table of about 7 million rows.
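
A minimal sketch of the kind of harness described above, assuming modern Java (8+) rather than a 2009-era JDK. The names `QueryQueueBenchmark`, `run`, and `queriesPerSecond` are hypothetical, and the `search` callback merely stands in for the real Lucene search call; this is not the actual test code from the post.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.ToIntFunction;

final class QueryQueueBenchmark {

    /**
     * Drains a shared queue of queries with numThreads worker threads.
     * The search callback is a placeholder for the real index search
     * (hypothetical; it just returns a hit count). Returns the number
     * of queries that were executed.
     */
    static int run(List<String> queries, int numThreads, ToIntFunction<String> search) {
        Queue<String> pending = new ConcurrentLinkedQueue<>(queries);
        AtomicInteger completed = new AtomicInteger();
        Thread[] workers = new Thread[numThreads];
        for (int i = 0; i < numThreads; i++) {
            workers[i] = new Thread(() -> {
                String query;
                // Each worker takes the next available query until none remain.
                while ((query = pending.poll()) != null) {
                    search.applyAsInt(query);
                    completed.incrementAndGet();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }

    /** Throughput: number of queries divided by elapsed wall-clock seconds. */
    static double queriesPerSecond(int queries, double elapsedSeconds) {
        return queries / elapsedSeconds;
    }
}
```

With the figures quoted in this post, `queriesPerSecond(10000, 43.0)` comes to roughly 232.6, which rounds to the 233 queries/second mentioned below.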

I ran the test a number of times with 30 threads and a max memory of
3500MB. I was processing 10,000 records in about 43 seconds (233
queries/second). The index was stored on a solid state drive in
a MacBook Pro (2.66 GHz Intel Core 2 Duo, 4GB DDR). I don't really have a
view on whether this is a good result or not, but I was keen to try a few
other things to see if I could improve performance further. All my
efforts have had minimal effect.

I tried creating a RAMDirectory based on the file index, once the index
had been created (4 min 20 seconds) it again took
I copied the index to a slower external conventional hard drive and it
still took 43 seconds.

Reducing/increasing the memory allocated and the number of threads had
minimal impact.

The main thing I'm surprised about is that I was expecting a massive difference
from holding the index in memory instead of on disk.

thanks Paul



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

  • Michael McCandless at Mar 27, 2009 at 11:32 am
    Are you opening your IndexReader with readOnly=true? If not, you're
    likely hitting contention on the "isDeleted" method.
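
To illustrate the contention point, here is a toy model, not Lucene's actual code. It assumes that a writable reader guards its deleted-documents bits with a lock, while a read-only reader can treat them as immutable and skip the lock. All class names below are hypothetical.

```java
import java.util.BitSet;

final class DeletedDocsSketch {

    interface DeletedDocs {
        boolean isDeleted(int doc);
    }

    /** Writable variant: every lookup takes the monitor, so searcher threads serialize here. */
    static final class SynchronizedDeletedDocs implements DeletedDocs {
        private final BitSet deleted = new BitSet();

        synchronized void delete(int doc) {
            deleted.set(doc);
        }

        @Override
        public synchronized boolean isDeleted(int doc) {
            return deleted.get(doc);
        }

        /** Copies the current bits into an immutable, lock-free view. */
        synchronized ReadOnlyDeletedDocs snapshot() {
            return new ReadOnlyDeletedDocs((BitSet) deleted.clone());
        }
    }

    /** Read-only variant: the bits never change after construction, so no lock is needed. */
    static final class ReadOnlyDeletedDocs implements DeletedDocs {
        private final BitSet deleted;

        ReadOnlyDeletedDocs(BitSet deleted) {
            this.deleted = deleted;
        }

        @Override
        public boolean isDeleted(int doc) {
            return deleted.get(doc);
        }
    }
}
```

With many search threads calling `isDeleted` once per candidate document, the synchronized variant forces them through a single monitor, whereas the read-only snapshot lets them proceed in parallel.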

    When you run with a "normal" directory, either on a traditional hard
    drive or SSD device, do you use NIOFSDirectory? That removes
    contention, but it only works on non-Windows platforms due to a
    long-standing bug in Sun's JRE.

    Likely the OS is caching stuff in RAM, anyway, so you don't see much
    improvement when you explicitly load into a RAMDir.

    Mike
  • Spring at Mar 27, 2009 at 3:43 pm

    > Are you opening your IndexReader with readOnly=true? If not, you're
    > likely hitting contention on the "isDeleted" method.

    How can I open it "readonly"?


  • Ian Lea at Mar 27, 2009 at 3:50 pm

    >> Are you opening your IndexReader with readOnly=true?  If not, you're
    >> likely hitting contention on the "isDeleted" method.
    > How can I open it "readonly"?

    See the javadocs for IndexReader.

    --
    Ian.

  • Spring at Mar 27, 2009 at 3:55 pm

    >> How can I open it "readonly"?
    > See the javadocs for IndexReader.

    I already checked them for 2.3 and cannot find readOnly.


  • Michael McCandless at Mar 27, 2009 at 3:57 pm
    Alas, it's new as of 2.4. Can you upgrade?

    Mike
  • Simon Willnauer at Mar 27, 2009 at 4:02 pm
    The readOnly option was introduced in 2.4.
    From the javadoc: "...as of 2.4, it's possible to open a read-only
    IndexReader using one of the static open methods that accepts the
    boolean readOnly parameter."

    http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/index/IndexReader.html#open(org.apache.lucene.store.Directory,%20boolean)

    simon
  • Paul Taylor at Mar 27, 2009 at 4:07 pm

    Michael McCandless wrote:
    > Are you opening your IndexReader with readOnly=true? If not, you're
    > likely hitting contention on the "isDeleted" method.
    >
    > When you run with a "normal" directory, either on a traditional hard
    > drive or SSD device, do you use NIOFSDirectory? That removes
    > contention, but, it only works on non-Windows platform due to a
    > long-standing bug in Sun's JRE.

    It was a long lunch. Actually, I'm just creating an IndexSearcher directly
    on a file,

    i.e. Searcher searcher = new IndexSearcher(indexDir + "/track_index");

    I was struggling to see how to create an NIOFSDirectory until I realised
    I needed Lucene 2.9, which I've done as follows:

    Searcher searcher = new IndexSearcher(IndexReader.open(new
    NIOFSDirectory(new File(indexDir + "/track_index"), null), true));

    Anyway, the end result is that query times have been reduced from 43 seconds
    to 23 seconds, so a pretty good result (although I don't really
    understand why the RAMDirectory method didn't perform at least this well,
    because it would have no file I/O contention).

    thanks a lot, Paul


  • Michael McCandless at Mar 27, 2009 at 11:41 am
    Also, see here for other ideas that may help:

    http://wiki.apache.org/lucene-java/ImproveSearchingSpeed

    I just updated that page with readOnly IndexReader & NIOFSDirectory.

    Mike
  • Toke Eskildsen at Apr 1, 2009 at 10:04 am
    On Fri, 2009-03-27 at 12:07 +0100, Paul Taylor wrote:

    [2Gb index, 7 million documents(?)]
    > I ran the test a number of times with 30 threads, and max memory of
    > 3500mb I was processing 10,000 records in about 43 seconds ( 233
    > queries/second) , the index was stored on a solid state drive running on
    > a MacBook Pro (2.66 Ghz Intel Core 2 Duo, 4GB DDR). I dont really have a
    > view on whether this is a good result or not but I was keen to try a few
    > other things to see if I could improve performance further, but all my
    > efforts have had minimal effect.

    You might want to try reducing the number of threads all the way down to
    3 or 4 and queue pending searches instead, but I doubt it will change
    much - as far as I know, the SSDs in MacBooks are quite okay with regard
    to read-latency and with such a small index the system will probably
    cache most of it anyway.
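
A rough sketch of the "few threads plus a queue of pending searches" arrangement suggested above, assuming modern Java and the standard `ExecutorService`. `BoundedSearchPool`, `searchAll`, and the `search` callback are hypothetical names; the callback stands in for the real Lucene search and returns a hit count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToIntFunction;

final class BoundedSearchPool {

    /**
     * Runs every query on a fixed pool of poolSize threads. Queries beyond
     * the pool size wait in the executor's internal queue instead of each
     * getting their own thread. Results are returned in submission order.
     */
    static List<Integer> searchAll(List<String> queries, int poolSize,
                                   ToIntFunction<String> search) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String query : queries) {
                Callable<Integer> task = () -> search.applyAsInt(query);
                futures.add(pool.submit(task));
            }
            List<Integer> hitCounts = new ArrayList<>();
            for (Future<Integer> future : futures) {
                hitCounts.add(future.get()); // blocks until that search finishes
            }
            return hitCounts;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

On a dual-core machine like the one described, a pool of 3 or 4 threads keeps both cores busy while avoiding the scheduling overhead of 30 runnable threads.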

    I can see elsewhere that you have upped the speed to 466 q/sec by
    switching to NIOFSDirectory, so my guess is that you're now CPU and
    memory speed bound.

    You could try the free tool VisualVM, which profiles running Java
    applications. It is extremely easy to use (just run it and select your
    application from a list) and it will show you where the CPU time is
    used. Of course, if you're just using simple query analysis with
    Lucene-supplied Analyzers, there's probably not much you can do about it. On
    the other hand, it might show you that you're spending a lot of time
    generating queries or doing similar work outside Lucene.

    > The main thing Im suprised about is I was expecting a massive difference
    > in holding the index in memory instead on disk

    Solid state drives (and disk cache) rule. Our experiments show very
    little performance increase going from SSD to RAMDirectory.



Discussion Overview
group: java-user
categories: lucene
posted: Mar 27, '09 at 11:07a
active: Apr 1, '09 at 10:04a
posts: 10
users: 6
website: lucene.apache.org
