Hi List,

It takes a pretty long time to index documents using Lucene.Net: about
3 seconds to add a thousand documents to the index. I've used Java
Lucene in the past and, as far as I remember, indexing should be about
20 times faster.

Here's the relevant code:

IndexWriter index_writer = new IndexWriter("index", new StandardAnalyzer(), true);
// index_writer.SetMergeFactor(10000);
// index_writer.SetMaxMergeDocs(10000);
// index_writer.SetMaxBufferedDocs(10000);
ExecuteSqlQuery("SELECT artist, title FROM songname");
int count = 0;
while (reader.Read()) {
    if (count > 0 && count % 1000 == 0) {
        Console.WriteLine(count);
    }
    Document document = new Document();
    document.Add(new Field("artist", reader.GetString("artist"),
        Field.Store.YES, Field.Index.TOKENIZED));
    document.Add(new Field("title", reader.GetString("title"),
        Field.Store.YES, Field.Index.TOKENIZED));
    index_writer.AddDocument(document);
    count++;
}

With the three commented-out lines enabled, indexing gets about 2x
faster, but that's not really significant.
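
A figure like "3 seconds per thousand documents" covers both reading rows from the database and the AddDocument call, so it can help to time the two separately. The following is only a minimal sketch of such a measurement: IndexTiming and MillisecondsPerThousand are hypothetical helper names, not part of Lucene.Net, and the step passed in Main is a stand-in rather than real indexing code.

```csharp
using System;
using System.Diagnostics;

public static class IndexTiming
{
    // Runs `step` once per item (passing the item number) and returns the
    // average elapsed time in milliseconds per 1000 calls.
    // Hypothetical helper for illustration; not a Lucene.Net API.
    public static double MillisecondsPerThousand(int itemCount, Action<int> step)
    {
        if (itemCount <= 0)
        {
            throw new ArgumentOutOfRangeException("itemCount");
        }

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < itemCount; i++)
        {
            step(i);
        }
        watch.Stop();

        return watch.Elapsed.TotalMilliseconds * 1000.0 / itemCount;
    }

    public static void Main()
    {
        // Stand-in step: in the loop above this would be "read one row and
        // build a Document", with or without the AddDocument call.
        int built = 0;
        double ms = MillisecondsPerThousand(5000, delegate(int i) { built++; });
        Console.WriteLine("{0} steps, {1:F3} ms per 1000 steps", built, ms);
    }
}
```

Running it once with a step that only reads a row and builds the Document, and once with a step that also calls AddDocument, gives two numbers whose difference approximates the cost of indexing itself.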

I'd really appreciate your insights about this speed issue.

Thanks in advance!


  • Ron Grabowski at Feb 16, 2009 at 10:05 pm
    What version of Lucene.Net are you using? I found that when I built the latest version from source, index building was blazingly fast compared to the latest binaries on the website.



    ----- Original Message ----
    From: László Monda <laci@monda.hu>
    To: lucene-net-user@incubator.apache.org
    Sent: Monday, February 16, 2009 4:18:35 PM
    Subject: IndexWriter.AddDocument is slow

  • Chintan Akhani at Dec 21, 2010 at 4:55 am
    Hi,

    Thanks for this post.

    I have a question about indexing and searching simultaneously. I am
    indexing more than 5 lakh (500,000) records, and for performance reasons I
    am not performing IndexWriter.close() or commit() while indexing them.
    Meanwhile, I want to search and delete documents that have been indexed
    but not yet committed to a physical location. Is that possible? If yes,
    please provide a code snippet for it.

    Thanks,
    Chintan
