Grokbase Groups Lucene dev June 2004
Hello,

Does Lucene support searching and indexing Unicode data, in particular
Devanagari Unicode data?
Does it treat UTF-8 and UTF-16 documents differently? I ask because
Java strings use UTF-16 internally.

I tried indexing Devanagari Unicode data (both UTF-8 and UTF-16 files)
with IndexFiles and IndexHTML from the Lucene demo, and then searching
for a query using Tomcat. The results show only the UTF-8 files, and
they also include files that do not contain the query. In addition, the
summaries of the fetched documents are not displayed in the correct
format.
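
One thing that may be worth checking here (an assumption, not a confirmed diagnosis): if the indexing code opens files with the platform default encoding, UTF-16 files would be decoded into the wrong characters before they ever reach the analyzer. The sketch below uses only JDK classes to choose a charset from the byte order mark and decode with it explicitly. The class name `EncodingAwareReader`, its method names, and the fallback to UTF-8 when no UTF-16 byte order mark is found are all hypothetical choices made for illustration.

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingAwareReader {

    /** Picks UTF-16 when a UTF-16 byte order mark is present, otherwise UTF-8 (an assumption). */
    public static Charset detectCharset(byte[] head) {
        if (head.length >= 2) {
            int b0 = head[0] & 0xFF;
            int b1 = head[1] & 0xFF;
            boolean bigEndianBom = (b0 == 0xFE && b1 == 0xFF);
            boolean littleEndianBom = (b0 == 0xFF && b1 == 0xFE);
            if (bigEndianBom || littleEndianBom) {
                // The JDK UTF-16 decoder reads the mark and selects the byte order itself.
                return StandardCharsets.UTF_16;
            }
        }
        return StandardCharsets.UTF_8;
    }

    /** Decodes a whole file's bytes with the detected charset. */
    public static String decode(byte[] data) {
        String text = new String(data, detectCharset(data));
        // A UTF-8 byte order mark survives decoding as U+FEFF, so drop it if present.
        if (!text.isEmpty() && text.charAt(0) == '\uFEFF') {
            text = text.substring(1);
        }
        return text;
    }

    /** Wraps a stream in a Reader that uses an explicit charset instead of the platform default. */
    public static Reader reader(InputStream in, Charset charset) {
        return new InputStreamReader(in, charset);
    }

    public static void main(String[] args) {
        String word = "\u0926\u0947\u0935"; // a short Devanagari string
        byte[] utf8 = word.getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = word.getBytes(StandardCharsets.UTF_16); // encoder writes a byte order mark
        System.out.println(decode(utf8).equals(word));
        System.out.println(decode(utf16).equals(word));
    }
}
```

A Reader obtained from `reader(...)` could then be handed to whatever code builds the document, so that the same Java string reaches the index regardless of how the file was stored on disk.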

I have also changed the Unicode ranges in HTMLParser.jj, StandardTokenizer.jj
and QueryParser.jj, and in the analyzer used for indexing and for parsing the
query, but the output does not reflect any of these changes.

Do I have to write my own analyzer for Devanagari Unicode data, or will
StandardAnalyzer work for any language?
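
Whether a given analyzer works for Devanagari depends on which code points its tokenizer accepts as part of a word. The following sketch does not use Lucene and makes no claim about what StandardTokenizer's grammar does. It only illustrates, with JDK character classification, why a tokenizer that accepts letters and digits alone would break Devanagari words apart: vowel signs and the virama are combining marks, not letters. The class `DevanagariTokenSketch` and its `tokens` method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class DevanagariTokenSketch {

    /** True for Unicode combining marks (general categories Mn, Mc, Me). */
    public static boolean isMark(int codePoint) {
        int type = Character.getType(codePoint);
        return type == Character.NON_SPACING_MARK
                || type == Character.COMBINING_SPACING_MARK
                || type == Character.ENCLOSING_MARK;
    }

    /**
     * Splits text into runs of token characters. Letters and digits always count;
     * combining marks count only when keepMarks is true.
     */
    public static List<String> tokens(String text, boolean keepMarks) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            boolean tokenChar = Character.isLetterOrDigit(cp) || (keepMarks && isMark(cp));
            if (tokenChar) {
                current.appendCodePoint(cp);
            } else if (current.length() > 0) {
                // A non-token character closes the current run.
                out.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // HA, vowel sign I, NA, virama, DA, vowel sign II
        String word = "\u0939\u093F\u0928\u094D\u0926\u0940";
        System.out.println(tokens(word, false).size()); // letters only: word is fragmented
        System.out.println(tokens(word, true).size());  // marks kept: word stays whole
    }
}
```

If the tokenizer in use behaves like the letters-only variant, queries and indexed terms would both be reduced to bare consonants, which could explain matches in files that do not contain the query word.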

Or does it require more changes? Please point out the problems and possible solutions.

Thanks in advance
Satish Kagathara,
IIT Bombay.




---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org

Discussion Overview
group: dev @
categories: lucene
posted: Jun 8, '04 at 6:52a
active: Jun 8, '04 at 6:52a
posts: 1
users: 1
website: lucene.apache.org
