What are best practices for optimizing an Index?



We have several indexes, ranging in size from 2 to 8 GB. One index gets
updates about every 2 to 30 seconds. It takes less than a week for an
index to accumulate several thousand files. What is a good practice for
scheduling optimization? Wait until a certain number of files has been
created? Every n updates? Once a week or once a night?



I am interested in finding out what optimization strategies other
people use.



Andreas


  • Grant Ingersoll at Apr 25, 2007 at 7:45 pm
    Related discussion at
    http://www.gossamer-threads.com/lists/lucene/java-dev/47895?search_string=optimize;#47895

    Also, search this archive and the java-dev archive for optimize.

    --------------------------
    Grant Ingersoll
    Center for Natural Language Processing
    http://www.cnlp.org

    Read the Lucene Java FAQ at
    http://wiki.apache.org/jakarta-lucene/LuceneFAQ



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Otis Gospodnetic at Apr 26, 2007 at 5:41 pm
    Andreas,

    Look at the javadoc for mergeFactor (in the IndexWriter javadoc). It lets you control how many index files accumulate in your index dir and how often index segments are merged. Try a low value (setMergeFactor rejects anything below 2) and see what happens. But a low mergeFactor also means more merging and more IO, so you'll need to find the right balance there. You could also look into making your index a compound index if you are worried about the number of files.

    Optimizing should be done rarely, as it means the whole index will be rewritten on disk. If you want to do that, do it during the slow part of the day, and don't update the index while it runs.

    Otis

    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
    http://www.lucene-consulting.com/
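
The trade-off Otis describes (fewer files versus more merge IO) can be illustrated with a toy model. The sketch below is not Lucene code and does not call the Lucene API. It assumes a simplified logarithmic merge scheme in which every flush writes one unit-sized segment, and any mergeFactor segments of equal size are merged into one larger segment. The class name, method name, and the "units rewritten" IO proxy are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeFactorSketch {

    /** Outcome of a simulated indexing run (illustrative model only). */
    static final class Result {
        final int segments;        // segments left on disk at the end
        final long unitsRewritten; // flushed units copied by merges, a rough proxy for merge IO

        Result(int segments, long unitsRewritten) {
            this.segments = segments;
            this.unitsRewritten = unitsRewritten;
        }
    }

    /**
     * Simulates `flushes` flushes under a simplified logarithmic merge scheme.
     * Assumption: a level-k segment holds mergeFactor^k flushed units, and
     * mergeFactor segments at one level merge into one segment at the next.
     */
    static Result simulate(int mergeFactor, int flushes) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be at least 2");
        }
        List<Integer> counts = new ArrayList<>(); // counts.get(k) = segments at level k
        long rewritten = 0;
        for (int i = 0; i < flushes; i++) {
            int level = 0;
            long size = 1; // size (in flushed units) of the segment being placed
            while (true) {
                if (counts.size() <= level) {
                    counts.add(0);
                }
                int c = counts.get(level) + 1;
                if (c < mergeFactor) {
                    counts.set(level, c);
                    break;
                }
                // mergeFactor segments at this level: merge them into one at the next level
                counts.set(level, 0);
                size *= mergeFactor;
                rewritten += size; // the merge rewrites every unit in the merged segment
                level++;
            }
        }
        int segments = 0;
        for (int c : counts) {
            segments += c;
        }
        return new Result(segments, rewritten);
    }

    public static void main(String[] args) {
        for (int mf : new int[] {2, 10, 50}) {
            Result r = simulate(mf, 1000);
            System.out.println("mergeFactor=" + mf
                    + " segments=" + r.segments
                    + " unitsRewritten=" + r.unitsRewritten);
        }
    }
}
```

In this model a lower mergeFactor rewrites more data overall, while a higher one leaves more segments on disk between merges. Real segment counts and IO depend on the Lucene version, buffered document settings, and document sizes, so treat the numbers only as a qualitative guide.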

    ----- Original Message ----
    From: Andreas Guther <Andreas.Guther@markettools.com>
    To: java-user@lucene.apache.org
    Sent: Wednesday, April 25, 2007 3:03:30 PM
    Subject: How often to optimize an Index?

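One way to combine the ideas in this thread (optimize rarely, only in a quiet period, and only once enough files have piled up) is a small scheduling check. This is a minimal sketch under stated assumptions, not anything Lucene provides: the class, the method, the off-peak window, and the file-count threshold are all hypothetical, and the caller is assumed to count the files in the index directory and to pause updates before actually optimizing.

```java
import java.time.LocalTime;

public class OptimizeScheduleSketch {

    /**
     * Returns true when an optimize run looks reasonable: the current time is
     * inside the off-peak window and the index directory holds at least
     * fileThreshold files. All parameters are illustrative assumptions.
     * If windowStart equals windowEnd, the window covers the whole day.
     */
    static boolean shouldOptimize(LocalTime now,
                                  LocalTime windowStart,
                                  LocalTime windowEnd,
                                  int fileCount,
                                  int fileThreshold) {
        boolean inWindow;
        if (windowStart.isBefore(windowEnd)) {
            // Window within one day, e.g. 01:00 to 04:00
            inWindow = !now.isBefore(windowStart) && now.isBefore(windowEnd);
        } else {
            // Window wraps past midnight, e.g. 22:00 to 02:00
            inWindow = !now.isBefore(windowStart) || now.isBefore(windowEnd);
        }
        return inWindow && fileCount >= fileThreshold;
    }

    public static void main(String[] args) {
        LocalTime start = LocalTime.of(1, 0);
        LocalTime end = LocalTime.of(4, 0);
        // Hypothetical values: 3000 files in the index dir, threshold of 1000
        System.out.println(shouldOptimize(LocalTime.of(2, 30), start, end, 3000, 1000));
        System.out.println(shouldOptimize(LocalTime.of(14, 0), start, end, 3000, 1000));
    }
}
```

A nightly job could call such a check and, when it returns true, stop the updater, run the optimize, and resume updates afterwards. Suitable values for the window and the threshold depend on the traffic pattern and the update rate.
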
Discussion Overview
group: java-user @
category: lucene
posted: Apr 25, '07 at 7:04p
active: Apr 26, '07 at 5:41p
posts: 3
users: 3
website: lucene.apache.org