FAQ
Dear all,

I use Lucene to index document content 6 field (int) and 1 file (string)

I log the index process. Log said that,

INFO [CONTENT-FILTER INDEX-TIMER] 2010-12-29 15:45:52,707 Index 55 item in
10576 miliseconds
INFO [CONTENT-FILTER INDEX-TIMER] 2010-12-29 15:46:13,378 Index 19 item in
20670 miliseconds
INFO [CONTENT-FILTER INDEX-TIMER] 2010-12-29 15:46:28,391 Index 35 item in
15013 miliseconds

Why Lucene take much time for just index small amount of document?

Could you give me the best performance of Lucence and How to improve
performance?

Here is my code
Document doc = new Document();
doc.add(new Field("system_id", data.systemId + "",
Field.Store.YES, Index.NOT_ANALYZED));
doc.add(new Field("app_id", data.appId + "",
Field.Store.YES,
Index.NOT_ANALYZED));
doc.add(new Field("object_id", data.objectId + "",
Field.Store.YES, Index.NOT_ANALYZED));
doc.add(new Field("content_id", data.contentId + "",
Field.Store.YES, Index.NOT_ANALYZED));
doc.add(new Field("owner_id", data.ownerId + "",
Field.Store.YES, Index.NOT_ANALYZED));
doc.add(new Field("to_user_id", data.toUserId + "",
Field.Store.YES, Index.NOT_ANALYZED));
doc.add(new Field("parent_id", data.parentId + "",
Field.Store.YES, Index.NOT_ANALYZED));
doc.add(new Field("content", data.content + "",
Field.Store.YES, Index.ANALYZED));
doc.add(new Field("id", System.nanoTime() + "",
Field.Store.YES, Index.NOT_ANALYZED));
iWriter.addDocument(doc);

Thank a lot for suppot.

Search Discussions

  • Anshum at Dec 29, 2010 at 9:28 am
    Lucene intermittently takes longer as
    1. It flushes the buffered docs from memory to the disk and
    2. It merges the smaller index segments to form a larger segment on regular
    intervals as per the index writer settings.
    You may have a look at various IndexWriter params in the javadoc on the
    lucene page, starting at
    http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/index/IndexWriter.html#DEFAULT_RAM_BUFFER_SIZE_MB
    <http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/index/IndexWriter.html#DEFAULT_RAM_BUFFER_SIZE_MB>
    --
    Anshum Gupta
    http://ai-cafe.blogspot.com

    On Wed, Dec 29, 2010 at 2:21 PM, King JKing wrote:

    Dear all,

    I use Lucene to index document content 6 field (int) and 1 file (string)

    I log the index process. Log said that,

    INFO [CONTENT-FILTER INDEX-TIMER] 2010-12-29 15:45:52,707 Index 55 item in
    10576 miliseconds
    INFO [CONTENT-FILTER INDEX-TIMER] 2010-12-29 15:46:13,378 Index 19 item in
    20670 miliseconds
    INFO [CONTENT-FILTER INDEX-TIMER] 2010-12-29 15:46:28,391 Index 35 item in
    15013 miliseconds

    Why Lucene take much time for just index small amount of document?

    Could you give me the best performance of Lucence and How to improve
    performance?

    Here is my code
    Document doc = new Document();
    doc.add(new Field("system_id", data.systemId + "",
    Field.Store.YES, Index.NOT_ANALYZED));
    doc.add(new Field("app_id", data.appId + "",
    Field.Store.YES,
    Index.NOT_ANALYZED));
    doc.add(new Field("object_id", data.objectId + "",
    Field.Store.YES, Index.NOT_ANALYZED));
    doc.add(new Field("content_id", data.contentId + "",
    Field.Store.YES, Index.NOT_ANALYZED));
    doc.add(new Field("owner_id", data.ownerId + "",
    Field.Store.YES, Index.NOT_ANALYZED));
    doc.add(new Field("to_user_id", data.toUserId + "",
    Field.Store.YES, Index.NOT_ANALYZED));
    doc.add(new Field("parent_id", data.parentId + "",
    Field.Store.YES, Index.NOT_ANALYZED));
    doc.add(new Field("content", data.content + "",
    Field.Store.YES, Index.ANALYZED));
    doc.add(new Field("id", System.nanoTime() + "",
    Field.Store.YES, Index.NOT_ANALYZED));
    iWriter.addDocument(doc);

    Thank a lot for suppot.
  • Ian Lea at Dec 29, 2010 at 9:33 am
    You should also make sure that it is lucene that is taking the time.
    You don't say where your data is coming from but it is often slower to
    read the the source data rather than to index it with lucene.

    See also http://wiki.apache.org/lucene-java/ImproveIndexingSpeed


    --
    Ian.

    On Wed, Dec 29, 2010 at 9:27 AM, Anshum wrote:
    Lucene intermittently takes longer as
    1. It flushes the buffered docs from memory to the disk and
    2. It merges the smaller index segments to form a larger segment on regular
    intervals as per the index writer settings.
    You may have a look at various IndexWriter params in the javadoc on the
    lucene page, starting at
    http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/index/IndexWriter.html#DEFAULT_RAM_BUFFER_SIZE_MB
    <http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/index/IndexWriter.html#DEFAULT_RAM_BUFFER_SIZE_MB>
    --
    Anshum Gupta
    http://ai-cafe.blogspot.com

    On Wed, Dec 29, 2010 at 2:21 PM, King JKing wrote:

    Dear all,

    I use Lucene to index document content 6 field (int) and 1 file (string)

    I log the index process. Log said that,

    INFO [CONTENT-FILTER INDEX-TIMER] 2010-12-29 15:45:52,707 Index 55 item in
    10576 miliseconds
    INFO [CONTENT-FILTER INDEX-TIMER] 2010-12-29 15:46:13,378 Index 19 item in
    20670 miliseconds
    INFO [CONTENT-FILTER INDEX-TIMER] 2010-12-29 15:46:28,391 Index 35 item in
    15013 miliseconds

    Why Lucene take much time for just index small amount of document?

    Could you give me the best performance of Lucence and How to improve
    performance?

    Here is my code
    Document doc = new Document();
    doc.add(new Field("system_id", data.systemId + "",
    Field.Store.YES, Index.NOT_ANALYZED));
    doc.add(new Field("app_id", data.appId + "",
    Field.Store.YES,
    Index.NOT_ANALYZED));
    doc.add(new Field("object_id", data.objectId + "",
    Field.Store.YES, Index.NOT_ANALYZED));
    doc.add(new Field("content_id", data.contentId + "",
    Field.Store.YES, Index.NOT_ANALYZED));
    doc.add(new Field("owner_id", data.ownerId + "",
    Field.Store.YES, Index.NOT_ANALYZED));
    doc.add(new Field("to_user_id", data.toUserId + "",
    Field.Store.YES, Index.NOT_ANALYZED));
    doc.add(new Field("parent_id", data.parentId + "",
    Field.Store.YES, Index.NOT_ANALYZED));
    doc.add(new Field("content", data.content + "",
    Field.Store.YES, Index.ANALYZED));
    doc.add(new Field("id", System.nanoTime() + "",
    Field.Store.YES, Index.NOT_ANALYZED));
    iWriter.addDocument(doc);

    Thank a lot for suppot.
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupjava-user @
categorieslucene
postedDec 29, '10 at 8:52a
activeDec 29, '10 at 9:33a
posts3
users3
websitelucene.apache.org

3 users in discussion

Ian Lea: 1 post Anshum: 1 post King JKing: 1 post

People

Translate

site design / logo © 2022 Grokbase