Hi Paul:
Most of the time spent indexing big tables goes to the full table
scan and network data transfer.
Please take a quick look at my OOW08 presentation about Oracle
Lucene integration:
http://docs.google.com/present/view?id=ddgw7sjp_156gf9hczxv
especially slides 13 and 14, which show the time involved in
indexing a Wikipedia dump inside an Oracle database.
Best regards, Marcelo.
On Thu, Oct 22, 2009 at 9:45 AM, Paul Taylor wrote:
I'm building a Lucene index from a database, creating about 1 million
documents; unsurprisingly, this takes quite a long time.
I do this by sending a query to the db over a range of ids (10,000
records),
adding these results to Lucene,
then getting the next 10,000, and so on.
When indexing is complete I call optimize().
I also set indexWriter.setMaxBufferedDocs(1000) and
indexWriter.setMergeFactor(3000), but I don't fully understand these values.
Each document contains about 10 small fields.
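
The batching loop described above can be sketched roughly as follows. This is only an illustration, not the code actually used here: the class IdRanges, its Range type, and the split() helper are hypothetical names, and the sketch assumes ids are dense enough that fixed-width ranges yield roughly equal batches. Each range would supply the bounds for one query such as "WHERE id BETWEEN ? AND ?".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits an id space into fixed-size, inclusive ranges. */
final class IdRanges {

    /** One inclusive id range, e.g. the bounds for a BETWEEN query. */
    static final class Range {
        final long from;
        final long to;

        Range(long from, long to) {
            this.from = from;
            this.to = to;
        }

        @Override
        public String toString() {
            return "[" + from + ", " + to + "]";
        }
    }

    /** Returns consecutive ranges of at most batchSize ids covering minId..maxId. */
    static List<Range> split(long minId, long maxId, long batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<Range> ranges = new ArrayList<>();
        for (long from = minId; from <= maxId; from += batchSize) {
            // The last range is clipped so it never runs past maxId.
            long to = Math.min(from + batchSize - 1, maxId);
            ranges.add(new Range(from, to));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // 1 million ids in batches of 10,000, as in the description above.
        List<Range> ranges = split(1, 1_000_000, 10_000);
        System.out.println(ranges.size() + " batches, first " + ranges.get(0)
                + ", last " + ranges.get(ranges.size() - 1));
    }
}
```

If the ids have large gaps, fixed-width ranges give uneven batches; paging on the last id seen is a common alternative, though whether it helps depends on the schema.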

I'm looking for some ways to improve performance.

The index writing is single-threaded; is there a way I can multi-thread
writing to the index?
I only call optimize() once at the end; is that the best way to do it?
I'm going to run a profiler over the code, but are there any rules of thumb
for the best values to set for MaxBufferedDocs and MergeFactor?
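
On the multi-threading question, one possible shape is a small thread pool where each task fetches one id range and adds its documents to a single shared writer. Lucene's IndexWriter is documented as thread-safe, so sharing one instance across threads should be fine, but the sketch below deliberately avoids the Lucene API: DocSink and RangeFetcher are hypothetical stand-ins for the writer and the database query, and ParallelIndexSketch is an illustrative name, not an existing class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

final class ParallelIndexSketch {

    /** Hypothetical stand-in for a thread-safe writer shared by all workers. */
    interface DocSink {
        void add(String doc) throws Exception;
    }

    /** Hypothetical stand-in for the per-range database query. */
    interface RangeFetcher {
        List<String> fetch(long fromId, long toId) throws Exception;
    }

    /** Fetches and adds every id range on a fixed pool; returns the number of docs added. */
    static long indexAll(long minId, long maxId, long batchSize, int threads,
                         RangeFetcher fetcher, DocSink sink) throws Exception {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (long from = minId; from <= maxId; from += batchSize) {
                final long lo = from;
                final long hi = Math.min(from + batchSize - 1, maxId);
                // One task per range: query the rows, then hand each one to the sink.
                Callable<Long> task = () -> {
                    long added = 0;
                    for (String doc : fetcher.fetch(lo, hi)) {
                        sink.add(doc);
                        added++;
                    }
                    return added;
                };
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // rethrows any worker failure
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicLong seen = new AtomicLong();
        // Fake fetcher that fabricates one row per id instead of querying a database.
        RangeFetcher fake = (lo, hi) -> {
            List<String> rows = new ArrayList<>();
            for (long id = lo; id <= hi; id++) {
                rows.add("doc-" + id);
            }
            return rows;
        };
        long total = indexAll(1, 100_000, 10_000, 4, fake, doc -> seen.incrementAndGet());
        System.out.println("indexed " + total + " docs, sink saw " + seen.get());
    }
}
```

In this arrangement the final optimize() would still be called once, from a single thread, after indexAll returns. Whether the extra threads help depends on where the time actually goes; if the database scan dominates, parallel fetches may matter more than parallel writes.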

thanks Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


--
Marcelo F. Ochoa
http://marceloochoa.blogspot.com/
http://marcelo.ochoa.googlepages.com/home
______________
Want to integrate Lucene and Oracle?
http://marceloochoa.blogspot.com/2007/09/running-lucene-inside-your-oracle-jvm.html
Is Oracle 11g REST ready?
http://marceloochoa.blogspot.com/2008/02/is-oracle-11g-rest-ready.html


Discussion Overview
group: java-user
category: lucene
posted: Oct 22, '09 at 12:46p
active: Oct 27, '09 at 10:52a
posts: 10
users: 8
website: lucene.apache.org
