Hi,
I have three questions about indexing:
1) I am indexing HTML documents. How can I remove stop
words before indexing? I do not want stop words in the
index.
2) I can access the terms in one document, but how can
I get the name of the document in which these terms
appear?
3) I want to find phrases at the index level, e.g. find
the frequency of phrases in the whole collection, and
also their frequency in each document. How can I do
this in Lucene? Is there any sample code?
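
To make questions 1 and 3 concrete, here is a rough plain-Java sketch of the numbers I am after. It uses no Lucene classes at all; the class name PhraseStats, the stop list, and the sample documents are made up for illustration, and it assumes the HTML has already been reduced to plain text. I am hoping Lucene can give me the same counts from the index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene: counts phrases in plain text.
class PhraseStats {
    // Made-up stop list, for illustration only.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "of", "in", "is"));

    // Lowercase the text, split on non-alphanumeric characters, drop stop words.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Number of positions in doc where the phrase tokens occur consecutively.
    static int countInDocument(List<String> doc, List<String> phrase) {
        if (phrase.isEmpty()) {
            return 0;
        }
        int count = 0;
        for (int i = 0; i + phrase.size() <= doc.size(); i++) {
            if (doc.subList(i, i + phrase.size()).equals(phrase)) {
                count++;
            }
        }
        return count;
    }

    // Phrase frequency per document, keyed by document name.
    // The phrase goes through the same tokenizer, so its stop words are dropped too.
    static Map<String, Integer> perDocument(Map<String, String> docs, String phrase) {
        List<String> phraseTokens = tokenize(phrase);
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : docs.entrySet()) {
            result.put(e.getKey(), countInDocument(tokenize(e.getValue()), phraseTokens));
        }
        return result;
    }

    // Phrase frequency in the whole collection: the sum over all documents.
    static int inCollection(Map<String, Integer> perDoc) {
        int total = 0;
        for (int n : perDoc.values()) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("a.html", "The open source search engine is an open source project.");
        docs.put("b.html", "Search engine basics");
        Map<String, Integer> perDoc = perDocument(docs, "open source");
        System.out.println(perDoc + " total=" + inCollection(perDoc));
    }
}
```

So for a phrase like "open source" I would like the count per document name and the total over the collection, with stop words already removed.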
Thanks
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org