Ok... just got confused because you mentioned XML. Unless you're
actually indexing the raw XML in some of your fields, the fact that
you're indexing XML documents as your source content is irrelevant to
your choice of Analyzer.
Choice of indexer really depends on your specific project requirements
and what level of querying functionality your client needs. For example,
I started off using the StandardAnalyzer because it incorporates some
very useful and sophisticated functionality. But I found that it was
removing many stop words that my client requested the ability to query
with, so I will end up using my own custom analyzer class primarily
based on the StandardAnalyzer but modifying the stop word list.
-----Original Message-----
From: Malcolm
Sent: Tuesday, November 01, 2005 11:19 AM
To:
[email protected]Subject: Re: Analysis
Hi,
I'm just asking for opinions on Analyzer's for the indexing.
For example Otis in his article uses the WhitespaceAnalyzer
and the Sandbox program uses the StandardAnalyzer.I am just
gauging opinions on the subject with regard to XML.
I'm using a mix of the Sandbox XMLDocumentHandlerSAX and a
bit extra. I originally started using Digester but found that
I preferred the Sandbox implementation.
Thanks,
Malcolm Clark
---------------------------------------------------------------------
To unsubscribe, e-mail:
[email protected]For additional commands, e-mail:
[email protected]---------------------------------------------------------------------
To unsubscribe, e-mail:
[email protected]For additional commands, e-mail:
[email protected]