If your index can fit in the IO cache, you should be using a completely
different implementation...

You should be writing a sequential transaction log for add/update/delete
operations, and storing the entire index in memory (RAMDirectory), with
periodic background flushes of the log.
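The log-plus-memory pattern above can be sketched with nothing but the JDK. This is an illustration of the technique, not a KS or Lucene API: `WalIndex`, its tab-separated log format, and the `checkpoint` method are all hypothetical names invented for this example, and the "index" is just a map standing in for a real in-memory index.

```java
import java.io.*;
import java.nio.file.*;
import java.util.*;

// Hypothetical sketch: an in-memory "index" (here just a map) fronted by a
// sequential write-ahead log. On startup the log is replayed, so the
// in-memory state survives a crash; a periodic checkpoint persists the
// state and truncates the log.
public class WalIndex {
    private final Map<String, String> docs = new HashMap<>();
    private final Path log;

    public WalIndex(Path log) throws IOException {
        this.log = log;
        if (Files.exists(log)) replay();           // recover after a crash
    }

    public void add(String id, String text) throws IOException {
        append("ADD\t" + id + "\t" + text);        // log first, then apply
        docs.put(id, text);
    }

    public void delete(String id) throws IOException {
        append("DEL\t" + id);
        docs.remove(id);
    }

    // Periodic background flush: persist the in-memory state, then truncate
    // the log so recovery restarts from the new checkpoint.
    public void checkpoint(Path snapshot) throws IOException {
        try (BufferedWriter w = Files.newBufferedWriter(snapshot)) {
            for (Map.Entry<String, String> e : docs.entrySet())
                w.write(e.getKey() + "\t" + e.getValue() + "\n");
        }
        Files.write(log, new byte[0]);             // log entries now durable
    }

    public String get(String id) { return docs.get(id); }

    private void append(String line) throws IOException {
        Files.write(log, (line + "\n").getBytes(),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    private void replay() throws IOException {
        for (String line : Files.readAllLines(log)) {
            String[] f = line.split("\t", 3);
            if (f[0].equals("ADD")) docs.put(f[1], f[2]);
            else if (f[0].equals("DEL")) docs.remove(f[1]);
        }
    }
}
```

Because the log is append-only, writes stay sequential, and reads never touch disk at all; the only seek-sensitive work happens during replay.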

If you are running multiple processes (in KS), who is invoking them
(inetd or similar)? If not, and other users are on the system, you can't
control what will happen to the IO cache...

If you want performance use a server based implementation.

If you don't care about performance, then performance is not an
issue, so use the simplest approach (which is probably the current
implementation).

Spending time and resources trying to make the current implementation
"better" (and more complex) to accommodate a poor design is just a
waste of both.

On Jan 9, 2009, at 3:30 PM, Marvin Humphrey wrote:
On Fri, Jan 09, 2009 at 08:11:31PM +0100, Karl Wettin wrote:

SSD is pretty close to RAM when it comes to seeking. Wouldn't that
mean that a bitset stored on an SSD would be more or less as fast as a
bitset in RAM?
Provided that your index can fit in the system i/o cache and stay there,
you get the speed of RAM regardless of the underlying permanent storage.
There's no reason to wait on SSDs before implementing such a feature.

One thing we've contemplated in Lucy/KS is a FilterWriter, which would
write out cached bitsets at index time. Adding that on would look like:

public class MyArchitecture extends Architecture {
    public ArrayList<SegDataWriter> segDataWriters(InvIndex invindex,
                                                   Segment segment) {
        ArrayList<SegDataWriter> writers
            = super.segDataWriters(invindex, segment);
        writers.add(new FilterWriter(invindex, segment));
        return writers;
    }
}

public class MySchema extends Schema {
    public Architecture architecture() { return new MyArchitecture(); }

    public MySchema() {
        TextField textFieldSpec = new TextField(new PolyAnalyzer("en"));
        specField("title", textFieldSpec);
        specField("content", textFieldSpec);
    }
}

IndexWriter writer = new IndexWriter(new MySchema().open("/path/

This isn't quite the same thing, because I believe you're talking about
adaptively caching filters on the fly at search time. However, I expect
this to work quite well when a finite set of filters is known in
advance, e.g. for faceting categories.
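To make the "finite set of filters known in advance" case concrete, here is a small sketch of the search-time side, assuming per-category bitsets were built at index time (as a FilterWriter might do). `CategoryFilters` and its methods are hypothetical names for this example, not part of KS or Lucene; `java.util.BitSet` stands in for a real filter representation.

```java
import java.util.*;

// Hypothetical sketch: precomputed bitsets for a known, finite set of
// categories. At search time, applying a filter is a single bitwise AND
// instead of a per-query scan over stored fields.
public class CategoryFilters {
    private final Map<String, BitSet> cached = new HashMap<>();
    private final int maxDoc;

    public CategoryFilters(int maxDoc) { this.maxDoc = maxDoc; }

    // "Index time": record which docs belong to a category.
    public void addDoc(String category, int docId) {
        cached.computeIfAbsent(category, k -> new BitSet(maxDoc)).set(docId);
    }

    // "Search time": restrict a hit set to one category by intersection.
    public BitSet filter(BitSet hits, String category) {
        BitSet result = (BitSet) hits.clone();
        result.and(cached.getOrDefault(category, new BitSet(maxDoc)));
        return result;
    }
}
```

Since the set of categories is fixed, the bitsets can be built once per segment and reused for every query, which is exactly what makes the index-time approach attractive for faceting.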

Marvin Humphrey

To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

