Van Nguyen wrote:
I have a field in my index that is being tokenized using the
StandardAnalyzer. Let’s say that field was:
TOOLS FOR TRAILER
The word “FOR” is a stop word, so it is not indexed (based on the
StandardAnalyzer). When someone types in TOOLS FOR TRAILER, I have
a BooleanQuery search for:
+CONTENTS:tools +CONTENTS:for +CONTENTS:trailer
This results in no match because of the “AND” search on
“+CONTENTS:for”.
Do I need any logic to strip the BooleanQuery of any stop words
used in the StandardAnalyzer?
If you are using the QueryParser, you can just pass in a StandardAnalyzer when you create it:
QueryParser parser = new QueryParser("defaultField", new StandardAnalyzer());
However, if you are generating the BooleanQuery yourself, you will want to
make sure that you run the text through the StandardAnalyzer and
construct the query from the tokens that the analyzer emits, e.g.:
Analyzer a = new StandardAnalyzer();
TokenStream ts = a.tokenStream("fieldName", new StringReader(query));
Token t = ts.next();
while (null != t) {
    String token = t.termText();
    // build your query using this token
    ...
    t = ts.next(); // advance to the next token, otherwise the loop never ends
}
...
Either method will eliminate the stop words from your query string.
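To make the idea concrete without depending on any particular Lucene version, here is a minimal plain-Java sketch of the same principle: filter the user's input with the same stop-word rules that were applied at index time, then build the all-terms-required query from whatever survives. Everything in it is an assumption for illustration: the class name StopWordQueryDemo, the helper methods analyze and buildQuery, and the small STOP_WORDS set (a hand-picked subset, not the analyzer's actual list). In real code the StandardAnalyzer would do this job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class StopWordQueryDemo {
    // Hypothetical, hand-picked subset of English stop words (assumption);
    // the real list belongs to the analyzer used at index time.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "for", "in", "of", "the", "to"));

    // Stand-in for the analyzer: lowercase the text, split on anything that
    // is not a letter or digit, and drop empty pieces and stop words.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!raw.isEmpty() && !STOP_WORDS.contains(raw)) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    // Builds a query string in which every surviving token is required (+).
    static String buildQuery(String field, String userInput) {
        StringBuilder sb = new StringBuilder();
        for (String token : analyze(userInput)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append('+').append(field).append(':').append(token);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "for" is filtered out, so no required clause is generated for it.
        System.out.println(buildQuery("CONTENTS", "TOOLS FOR TRAILER"));
    }
}
```

The point of the design is that the query never contains a required clause for a term that could not have been indexed, so the "AND" semantics only apply to tokens that can actually match.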
Hope that helps,
Ryan
Van
------------------------------------------------------------------------
United Rentals
Consider it done.™
800-UR-RENTS
unitedrentals.com
------------------------------------------------------------------------
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org