I am trying to implement a prefix query search that is case-insensitive
but not tokenized (I want to preserve exact phrases).



For now, I am storing both the exact phrase (as is, for retrieval) and
the lower-cased string (to search against) with no analyzers in the
index. When I search, I lower-case my query string, search against
my lower-cased index, and give the matching exact phrase back to the user.
This doesn't seem like the best approach, but I can't seem to make it
work any other way. Any suggestions?



Thanks in advance.


  • Chris Hostetter at Jun 6, 2007 at 6:25 pm
    : For now, I am storing both the exact phrase (as is, for retrieval) and
    : the string lower-cased (to search against) with no analyzers in the
    : index. When I search, I lower-case my query string and search against
    : my lower-cased index, I give the matching exact phrase back to the user.
    : This doesn't seem like the best approach but I can't seem to make it
    : work any other way. Any suggestions?

    1) If you only ever return the exact string, then you only need to store
    it; you don't need to store the lowercase version.
    2) If you only ever want to do case-insensitive searching, you only need
    to index the lowercase field; you don't need to index the exact
    string.
    3) The KeywordTokenizer along with a LowerCaseFilter should take care of
    everything you want, without needing to preprocess the input to lowercase
    it before using Lucene.

    -Hoss
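
For readers who want to see the idea behind points 1-3 in isolation, here is a minimal plain-Java sketch. It is not Lucene code and not necessarily what either poster wrote: the class `CaseInsensitivePrefixIndex` and its methods are hypothetical names made up for illustration. It models what the suggested chain does conceptually (KeywordTokenizer keeps the whole input as a single token, LowerCaseFilter lower-cases it), searches only the lower-cased form, and returns only the original form.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Lucene: models "one token per phrase, lower-cased".
class CaseInsensitivePrefixIndex {
    // Key: lower-cased whole phrase (the searchable form).
    // Value: the original phrases that normalize to that key (the stored form).
    private final TreeMap<String, List<String>> index = new TreeMap<>();

    // Stand-in for the analysis chain: the whole input stays one term, then is lower-cased.
    static String normalize(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    void add(String phrase) {
        index.computeIfAbsent(normalize(phrase), k -> new ArrayList<>()).add(phrase);
    }

    // The query goes through the same normalization as the indexed phrases.
    List<String> prefixSearch(String query) {
        String prefix = normalize(query);
        List<String> hits = new ArrayList<>();
        // tailMap starts at the first key >= prefix; keys sharing the prefix are contiguous,
        // so we can stop at the first key that no longer starts with it.
        for (Map.Entry<String, List<String>> e : index.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break;
            }
            hits.addAll(e.getValue());
        }
        return hits;
    }
}
```

With "New York City", "New Yorker" and "Newark" added, `prefixSearch("NEW YO")` returns the first two in their original casing, while `prefixSearch("york")` returns nothing, because the phrase is never split into words. In Lucene itself the normalization would live in the analyzer rather than in application code, which is the point of suggestion 3.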


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: [email protected]
    For additional commands, e-mail: [email protected]
  • Anna Putnam at Jun 7, 2007 at 11:10 pm
    Hoss,

    The KeywordTokenizer and LowerCaseFilter worked great and were exactly
    what I needed.

    Thanks!

    -----Original Message-----
    From: Chris Hostetter
    Sent: Wednesday, June 06, 2007 11:25 AM
    To: [email protected]
    Subject: Re: Case Insensitive but not Tokenized



Discussion Overview
group: java-user
categories: lucene
posted: Jun 6, '07 at 6:07p
active: Jun 7, '07 at 11:10p
posts: 3
users: 2
website: lucene.apache.org

2 users in discussion

Anna Putnam: 2 posts Chris Hostetter: 1 post
