Realtime Search

J. Delgado
Dec 26, 2008 at 6:48 pm
The addition of docs into tiny segments using the current data structures
seems like the right way to go. Some time back one of my engineers implemented
pseudo real-time search using MultiSearcher by having an in-memory (RAM-based)
"short-term" index that auto-merged into a disk-based "long-term" index, which
eventually got merged into "archive" indexes. Index optimization would take
place during these merges. The search we required was very time-sensitive
(searching last-minute breaking news wires). The advantage of having an
archive index is that very old documents in our applications were not
usually searched unless archives were explicitly selected.
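
As a rough illustration of the tiered layout described above, here is a toy sketch in plain Java. It is not the MultiSearcher-based implementation and uses no Lucene classes: each tier is just an in-memory list, and the class name TieredIndex, its methods, and the size-based merge thresholds are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: each "index" is a plain list of documents.
// TieredIndex and its methods are hypothetical, not Lucene APIs.
class TieredIndex {
    private final List<String> shortTerm = new ArrayList<>(); // stands in for the RAM index
    private final List<String> longTerm = new ArrayList<>();  // stands in for the disk index
    private final List<String> archive = new ArrayList<>();   // searched only on request
    private final int shortTermLimit; // assumed merge threshold (document count)
    private final int longTermLimit;  // assumed merge threshold (document count)

    TieredIndex(int shortTermLimit, int longTermLimit) {
        this.shortTermLimit = shortTermLimit;
        this.longTermLimit = longTermLimit;
    }

    // New documents always land in the short-term tier first.
    void add(String doc) {
        shortTerm.add(doc);
        if (shortTerm.size() >= shortTermLimit) {
            // "Auto-merge": move the whole short-term tier into the long-term tier.
            longTerm.addAll(shortTerm);
            shortTerm.clear();
        }
        if (longTerm.size() >= longTermLimit) {
            // Eventually the long-term tier is rolled into the archive.
            archive.addAll(longTerm);
            longTerm.clear();
        }
    }

    // Searches the two live tiers, and the archive only when explicitly selected.
    List<String> search(String term, boolean includeArchive) {
        List<String> hits = new ArrayList<>();
        collect(shortTerm, term, hits);
        collect(longTerm, term, hits);
        if (includeArchive) {
            collect(archive, term, hits);
        }
        return hits;
    }

    // Case-insensitive substring match, newest document first within a tier.
    private static void collect(List<String> tier, String term, List<String> hits) {
        String needle = term.toLowerCase();
        for (int i = tier.size() - 1; i >= 0; i--) {
            String doc = tier.get(i);
            if (doc.toLowerCase().contains(needle)) {
                hits.add(doc);
            }
        }
    }
}
```

In a real deployment the merge trigger would more likely be time- or size-on-disk-based, and the merge step is where index optimization would happen.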

-- Joaquin
On Fri, Dec 26, 2008 at 10:20 AM, Doug Cutting wrote:

Michael McCandless wrote:
So then I think we should start with approach #2 (build real-time on
top of the Lucene core) and iterate from there. Newly added docs go
into tiny segments, which IndexReader.reopen pulls in. Replaced or
deleted docs record the delete against the right SegmentReader (and
LUCENE-1314 lets reopen carry those pending deletes forward, in RAM).

I would take the simple approach first: use ordinary SegmentReader on
a RAMDirectory for the tiny segments. If that proves too slow, swap
in Memory/InstantiatedIndex for the tiny segments. If that proves too
slow, build a reader impl that reads from DocumentsWriter RAM buffer.
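
The control flow proposed above (tiny immutable segments, deletes recorded per segment in RAM, and a reopen that carries them forward) can be modeled with a toy sketch in plain Java. This is not Lucene code: ToySegment, ToyReader, and the reopen signature shown here are hypothetical stand-ins, and documents are matched by exact string equality purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a flushed segment: immutable once created.
final class ToySegment {
    final List<String> docs;

    ToySegment(List<String> docs) {
        this.docs = Collections.unmodifiableList(new ArrayList<>(docs));
    }
}

// Hypothetical point-in-time reader over a list of segments.
final class ToyReader {
    private final List<ToySegment> segments;
    private final List<Set<Integer>> deleted; // per-segment deleted positions, held in RAM

    ToyReader() {
        this(new ArrayList<>(), new ArrayList<>());
    }

    private ToyReader(List<ToySegment> segments, List<Set<Integer>> deleted) {
        this.segments = segments;
        this.deleted = deleted;
    }

    // Returns a new reader: existing segments are shared, pending deletes are
    // copied forward, new deletes are recorded against the segment that holds
    // the document, and the new docs arrive as one tiny segment.
    ToyReader reopen(List<String> newDocs, Set<String> docsToDelete) {
        List<ToySegment> segs = new ArrayList<>(segments);
        List<Set<Integer>> dels = new ArrayList<>();
        for (int s = 0; s < segments.size(); s++) {
            Set<Integer> copy = new HashSet<>(deleted.get(s));
            List<String> docs = segments.get(s).docs;
            for (int d = 0; d < docs.size(); d++) {
                if (docsToDelete.contains(docs.get(d))) {
                    copy.add(d);
                }
            }
            dels.add(copy);
        }
        if (!newDocs.isEmpty()) {
            segs.add(new ToySegment(newDocs));
            dels.add(new HashSet<>());
        }
        return new ToyReader(segs, dels);
    }

    // All documents visible to this reader, in segment order.
    List<String> liveDocs() {
        List<String> live = new ArrayList<>();
        for (int s = 0; s < segments.size(); s++) {
            List<String> docs = segments.get(s).docs;
            for (int d = 0; d < docs.size(); d++) {
                if (!deleted.get(s).contains(d)) {
                    live.add(docs.get(d));
                }
            }
        }
        return live;
    }
}
```

A replaced document is modeled as a delete of the old version plus an add of the new one in the same reopen call. Older readers keep their own view, since a reopen never mutates the reader it was called on.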

+1 This sounds like a good approach to me. I don't see any fundamental
reasons why we need different representations, and fewer implementations of
IndexWriter and IndexReader is generally better, unless they get way too
hairy. Mostly it seems that real-time can be done with our existing toolbox
of data structures, but with some slightly different control structures.
Once we have the control structure in place then we should look at
optimizing data structures as needed.

Doug

