Hi All:

I've only been looking at Lucene for about a week now. I'm using 1.3
RC2.

I am searching a moderately sized repository of documents containing
assembly language source code and I'm trying to implement both case
sensitive and case insensitive queries (that's what the users want).

I'm using the WhitespaceAnalyzer to index the files, thereby preserving
the case of the tokens in the index. I'm now trying to query that
index.

The Analyzer used for my QueryParser is either a standard
WhitespaceAnalyzer (for case sensitive) or a
"LowerCaseWhitespaceAnalyzer" whose underlying tokenizer simply extends
the WhitespaceTokenizer and overrides normalize to lowercase the
characters (for case insensitive).
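
For illustration, the two tokenization modes described above can be
sketched in plain Java. This is only a sketch of the idea, not the
actual tokenizer: it uses no Lucene classes, and the class and method
names (CaseTokens, tokenize) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration only: no Lucene classes are used here,
// and every name in this block is hypothetical.
class CaseTokens {

    // Splits text on runs of whitespace. When caseSensitive is false,
    // each token is lowercased, mimicking a lowercasing whitespace tokenizer.
    static List<String> tokenize(String text, boolean caseSensitive) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (raw.isEmpty()) {
                continue; // blank input splits into a single empty string
            }
            tokens.add(caseSensitive ? raw : raw.toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "MOV AX, bx";
        System.out.println(tokenize(line, true));  // case preserved
        System.out.println(tokenize(line, false)); // case folded
    }
}
```

Note that punctuation stays attached to the token ("AX," above), since
only whitespace separates tokens.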

I'm also setting the QueryParser's lowercaseWildcardTerms attribute
appropriately.

For case sensitive searches, I use a standard IndexSearcher, whose
IndexReader provides tokens directly from the index (which have their
case preserved).

For case insensitive searches, I believe that I want to lowercase the
tokens from the index. Is this what the FilterIndexReader is for?
Should I extend that Reader and override its terms(...) methods to
provide lowercased terms?
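
To make the idea concrete, here is a minimal plain-Java sketch of what
"lowercasing the terms from the index" would involve. It assumes the
index can be modelled as a map from term to document ids; it does not
use or model FilterIndexReader's actual API, and all names
(LowercasedTermView, foldCase) are hypothetical. The point it shows is
that terms differing only in case collapse into one entry, so their
document lists have to be merged as well.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Plain-Java illustration only: the "index" is a simple term -> doc ids map,
// not a Lucene index, and every name in this block is hypothetical.
class LowercasedTermView {

    // Builds a case-folded copy of the index. Terms that differ only in
    // case map to the same lowercased key, and their doc ids are merged.
    static Map<String, Set<Integer>> foldCase(Map<String, Set<Integer>> index) {
        Map<String, Set<Integer>> folded = new TreeMap<>();
        for (Map.Entry<String, Set<Integer>> entry : index.entrySet()) {
            String key = entry.getKey().toLowerCase(Locale.ROOT);
            folded.computeIfAbsent(key, k -> new TreeSet<>())
                  .addAll(entry.getValue());
        }
        return folded;
    }

    public static void main(String[] args) {
        // Case-preserving index, as a WhitespaceAnalyzer-style index would hold.
        Map<String, Set<Integer>> index = new TreeMap<>();
        index.put("MOV", new TreeSet<>(Arrays.asList(1, 3)));
        index.put("mov", new TreeSet<>(Arrays.asList(2)));

        // A lowercased query term only finds the exact-case entry here...
        System.out.println(index.get("mov"));
        // ...but finds every case variant in the folded view.
        System.out.println(foldCase(index).get("mov"));
    }
}
```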

Or am I just on the wrong path altogether?

Help?

Thanks
-Brent Schneeman

Discussion Overview
group: java-user
category: lucene
posted: Oct 25, '03 at 1:00a
active: Oct 25, '03 at 1:00a
posts: 1
users: 1
website: lucene.apache.org