Thank you very much.
From: "Erick Erickson" <erickerickson@gmail.com>
Reply-To: java-user@lucene.apache.org
To: java-user@lucene.apache.org
Subject: Re: Empty fields ...
Date: Wed, 19 Jul 2006 09:48:04 -0400

Try something like

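BitSet bits = new BitSet(reader.maxDoc());
TermDocs termDocs = reader.termDocs();
termDocs.seek(new Term("<relevant field name here>", ""));
while (termDocs.next()) {
    bits.set(termDocs.doc());   // mark each document returned for the term
}
termDocs.close();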

I *think* (I'm remembering things folks wrote; I haven't done this myself)
that the empty string for the Term matches all terms. If not, you might have
to wrap it in an outer loop that iterates over all the terms in the field,
something like

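BitSet bits = new BitSet(reader.maxDoc());

TermDocs termDocs = reader.termDocs();
// FilteredTermEnum is abstract and can't be instantiated directly, so
// enumerate with reader.terms(), which starts at the first term at or
// after (field, "").
TermEnum termEnum = reader.terms(new Term(field, ""));

do {
    Term term = termEnum.term();
    // Terms are ordered by field, so stop once we leave this field.
    if (term == null || !term.field().equals(field)) {
        break;
    }
    termDocs.seek(term);

    while (termDocs.next()) {
        bits.set(termDocs.doc());
    }
} while (termEnum.next());

termDocs.close();
termEnum.close();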



That said, it may be best for you to loop through each document and add that
doc to the relevant filters if it has the fields you're interested in. You'd
only be fetching each document once, so it'd only be one loop. I don't know
enough about the relative efficiencies to make a call here; it probably
depends upon how many docs you're dealing with. I'd stop at the first
solution that works with acceptable performance unless you expect your corpus
to grow significantly. And since this is done in off hours, there's no
pressing reason to go with the most efficient solution unless it takes too
long or you expect to have orders of magnitude more documents in your index
eventually.
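
A minimal sketch of that one-pass idea, in plain Java rather than against the
Lucene API. It assumes each document is already available as a
field-to-value map; the class name FieldFilters, the method build, and the
sample field names are made up for illustration. Against a real index you
would read the documents through the IndexReader instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an index: document i is docs.get(i).
public class FieldFilters {

    /**
     * Builds one BitSet per field of interest in a single pass over the
     * documents. Bit i is set when document i has a non-empty value for
     * that field.
     */
    public static Map<String, BitSet> build(List<Map<String, String>> docs,
                                            List<String> fields) {
        Map<String, BitSet> filters = new HashMap<>();
        for (String field : fields) {
            filters.put(field, new BitSet(docs.size()));
        }
        // Each document is visited exactly once.
        for (int docId = 0; docId < docs.size(); docId++) {
            Map<String, String> doc = docs.get(docId);
            for (String field : fields) {
                String value = doc.get(field);
                if (value != null && !value.isEmpty()) {
                    filters.get(field).set(docId);
                }
            }
        }
        return filters;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = new ArrayList<>();

        Map<String, String> d0 = new HashMap<>();
        d0.put("title", "first");
        d0.put("body", "some text");
        docs.add(d0);

        Map<String, String> d1 = new HashMap<>();
        d1.put("body", "no title here");
        docs.add(d1);

        Map<String, String> d2 = new HashMap<>();
        d2.put("title", "third");
        d2.put("body", "");          // empty field counts as missing
        docs.add(d2);

        Map<String, BitSet> filters = build(docs, Arrays.asList("title", "body"));
        System.out.println("title: " + filters.get("title")); // title: {0, 2}
        System.out.println("body: " + filters.get("body"));   // body: {0, 1}
    }
}
```

The trade-off is the one described above: this touches every document once
no matter how many fields you track, while the term-based loops touch only
the postings of one field at a time.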

Best
Erick

Discussion Overview
group: java-user
categories: lucene
posted: Jul 6, '06 at 7:53p
active: Jul 22, '06 at 8:16a
posts: 10
users: 6
website: lucene.apache.org
