I’m using Lucene to index database records and text documents.
I want to provide efficient fuzzy queries over the data so I’m using a secondary
Lucene index for all of the distinct terms encountered in the primary index.
Each ‘document’ in the secondary index is a term from the primary index with
fields for its q-grams, phonetic key(s) and synonyms.
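To illustrate what I mean by q-grams, here is a minimal sketch in plain Java.
The class and method names are hypothetical, not part of Lucene, and it assumes
simple overlapping character q-grams with no padding at the ends of the term:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not a Lucene class.
final class QGrams {

    // Returns the overlapping character q-grams of a term, in order.
    // A term shorter than q yields an empty list (no padding is applied).
    static List<String> of(String term, int q) {
        if (q < 1) {
            throw new IllegalArgumentException("q must be >= 1");
        }
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + q <= term.length(); i++) {
            grams.add(term.substring(i, i + q));
        }
        return grams;
    }

    public static void main(String[] args) {
        // prints [luc, uce, cen, ene]
        System.out.println(of("lucene", 3));
    }
}
```

Each q-gram would then be stored as a value of the q-gram field of the term's
'document' in the secondary index.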
It’s easy to populate the secondary index with an IndexReader once all of the
records and text documents have been indexed. However, to keep the secondary
index up to date I need to recognise when a new term is encountered for the
first time. Even after looking deep into the Lucene code and stepping through
the indexing process, I haven’t found where this occurs. I presume that is
because it doesn’t happen in a single place but rather once in the in-memory
term cache, once when the cache is flushed into a segment, and again when
segments are optimised.
Is this correct? Can anyone suggest how to maintain a secondary index of terms?
Perhaps it should only be updated when the main index is optimised?
Thanks, Mike