Well, why not just include a field that identifies each document's class?
Then, to search over all classes, just don't mention the class field.

When you *do* want to restrict by class, include "AND class:blah" in your
query.

This assumes you don't have a huge number of classes and that it wouldn't
be a problem to have a clause like
AND (class:cl1 OR class:cl2 OR class:cl3......). That said, a clause
containing hundreds of class:blah terms isn't a problem.
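
To make the class clause concrete, here is a minimal sketch that builds the
query string. It only assembles text in the AND/OR syntax shown above; the
helper name ClassQueryBuilder is made up for illustration, the field is
assumed to be called "class", and class names are assumed to be plain tokens
that need no escaping.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper: builds query-parser strings, not Lucene Query objects. */
final class ClassQueryBuilder {
    private ClassQueryBuilder() {}

    /**
     * Returns the user's query unchanged when no classes are given (so all
     * classes are searched), otherwise ANDs it with an OR-list of class terms.
     */
    static String restrictToClasses(String userQuery, List<String> classes) {
        if (classes.isEmpty()) {
            return userQuery; // no class clause: every class is searched
        }
        // Assumption: the identifying field is named "class".
        String classClause = classes.stream()
                .map(c -> "class:" + c)
                .collect(Collectors.joining(" OR "));
        return "(" + userQuery + ") AND (" + classClause + ")";
    }
}
```

Passing an empty list covers the "search everything" case, and a one-element
list gives the simple "AND class:blah" form.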

And if you want to get really fancy, create a Filter for each class and you
can just combine the filters appropriately when you want to query and
restrict to certain classes.
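
The idea of combining per-class filters can be sketched with plain
java.util.BitSet, where bit N set means "document N is in this class". This
is a stand-in for the idea only, not Lucene's Filter API; the class name
ClassFilterCombiner and both method names are made up for illustration.

```java
import java.util.BitSet;
import java.util.List;

/** Hypothetical stand-in: one BitSet per class, bit N set = doc N is in it. */
final class ClassFilterCombiner {
    private ClassFilterCombiner() {}

    /** ORs the per-class sets: a doc passes if it is in any selected class. */
    static BitSet anyOf(List<BitSet> classFilters) {
        BitSet allowed = new BitSet();
        for (BitSet filter : classFilters) {
            allowed.or(filter);
        }
        return allowed;
    }

    /** Keeps only those hits that the combined filter allows. */
    static BitSet apply(BitSet hits, BitSet allowed) {
        BitSet result = (BitSet) hits.clone(); // leave the caller's set intact
        result.and(allowed);
        return result;
    }
}
```

The per-class sets would be built once and reused, so restricting a search
to some classes costs only a few bitwise operations.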

Limiting the results to the top 10/20 per class is trickier, but see
HitCollector for a way to intervene in the hit selection process. You could
keep a simple counter that counted by class and reject each doc that came
through the HitCollector for a class after it reached your limit. Beware of
doing a Reader.get() on each hit, though; to find each doc's class you'd
have to do some work with TermDocs/TermEnum instead.
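
The counting part of that could look roughly like the sketch below. It is
plain Java with no Lucene types: PerClassLimiter is a made-up name, and it
assumes the caller already knows each hit's class cheaply (for example from
a lookup built ahead of time) rather than loading the document per hit.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-class counter to call from inside a hit collector. */
final class PerClassLimiter {
    private final int limit;
    private final Map<String, Integer> seen = new HashMap<>();

    PerClassLimiter(int limit) {
        this.limit = limit;
    }

    /** Returns true to keep the hit, false once its class has reached the limit. */
    boolean accept(String docClass) {
        int count = seen.getOrDefault(docClass, 0);
        if (count >= limit) {
            return false; // this class is already full, reject the doc
        }
        seen.put(docClass, count + 1);
        return true;
    }
}
```

Note that this keeps the first N hits per class in the order they arrive,
which is not necessarily the N best-scoring ones; getting a true top N per
class would need the kept hits to be ranked by score as well.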

All that said, I haven't really kept up on the faceting that's been talked
about in the archive, so you may want to look at the searchable mail archive
and see what's up with that.

How big is your index? That is, how many documents and how many fields
(approx) are you talking about here? That'll influence whether you would
be better off keeping them all in one index or splitting them up. But if you
can keep them in a single index, your maintenance will be *much* easier.

On Sat, Jun 7, 2008 at 10:34 AM, Sascha Fahl wrote:

I am quite new to the lucene scene and I need your help :-)
There are several document classes. Let's say documents from class A, B, C,
D and E. What I need is the following:

1) I want to search over all classes together. So the query should hit
results from all different classes - ideally it is possible
to limit the results from each class to let's say 10 or 20 results per
class.

2) I want to search over all classes separately.

My first idea was to have one index per class and use a MultiReader for
searching over all classes and use an IndexReader for
searching over the classes separately. Right now I have 3 questions:
1. Is that a good idea?
2. Is there a way to identify the index a result comes from?
3. Is it possible to limit the results to a number of hits from each

Thank you,


Discussion Overview
group: java-user @
posted: Jun 8, '08 at 11:20a
active: Jun 9, '08 at 7:58a


