I already asked this on the user mailing list, but maybe it's more
appropriate here:

As a simple example, imagine every document in your index has the fields
"language" and "country". A tuple of language+country is what I call a
context.

You want searches to be context-specific, i.e. language+country is always
part of the query (as a QueryFilter).

FuzzyTermEnum doesn't know about these contexts, so the resulting BooleanQuery
contains all similar terms. E.g. "hello" is "hallo" in German, only one
character different. But when searching in the context English+USA I don't
care about German terms, so I don't want or need "hallo" in the BooleanQuery
in this context.

So I came up with the idea of using reader.termDocs() instead of terms() in
FuzzyTermEnum. By means of a QueryFilter (or rather its BitSet) for each
context, I could determine whether a fuzzy term should be included in the
BooleanQuery or not.
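
To make the idea concrete, here is a minimal sketch of the check I have in
mind. It deliberately uses only java.util rather than Lucene classes: the
map of term to doc ids stands in for what reader.termDocs() would return, and
the BitSet stands in for the bits of the QueryFilter for one context. The
names ContextFuzzySketch, expand and occursInContext are made up for
illustration, and the plain Levenshtein distance is only an assumption about
how "similar" is measured.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ContextFuzzySketch {

    /** Plain Levenshtein edit distance (stand-in for the fuzzy similarity test). */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** True if at least one document containing the term lies inside the context. */
    static boolean occursInContext(int[] postings, BitSet contextDocs) {
        for (int doc : postings) {
            if (contextDocs.get(doc)) {
                return true; // first hit is enough, stop reading postings
            }
        }
        return false;
    }

    /** Similar terms that also occur in at least one document of the context. */
    static List<String> expand(String queryTerm, int maxEdits,
                               Map<String, int[]> termPostings, BitSet contextDocs) {
        List<String> accepted = new ArrayList<>();
        for (Map.Entry<String, int[]> e : termPostings.entrySet()) {
            if (editDistance(queryTerm, e.getKey()) <= maxEdits
                    && occursInContext(e.getValue(), contextDocs)) {
                accepted.add(e.getKey());
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Toy index: docs 0 and 1 are english+USA, docs 2 and 3 are german+Germany.
        Map<String, int[]> termPostings = new LinkedHashMap<>();
        termPostings.put("hallo", new int[] {2, 3});
        termPostings.put("hello", new int[] {0, 1});
        termPostings.put("hells", new int[] {1});
        termPostings.put("world", new int[] {0, 2});

        BitSet englishUsa = new BitSet();
        englishUsa.set(0);
        englishUsa.set(1);

        // "hallo" is within one edit of "hello" but has no doc in the context.
        System.out.println(expand("hello", 1, termPostings, englishUsa)); // [hello, hells]
    }
}
```

Note that the check has to read postings for every similar term, so the
smaller BooleanQuery is bought with extra work during term enumeration.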

This (potentially) results in a smaller BooleanQuery, but I wonder whether
this approach will yield any noticeable performance advantage (maybe reduced
IO?).

Thanks for any feedback.


Posted: Nov 7, '07 at 9:57a
Active: Nov 7, '07 at 4:40p


Timo Nentwig: 2 posts


