Hi there,

I'm trying to do something that, from my point of view, should be simple.

I would like to search while ignoring the case of the information stored
in the index, using the following code:

reader = IndexReader.open(indexDirectory);

Searcher searcher = new IndexSearcher(reader);
Analyzer analyzer = new StandardAnalyzer();

// Created my own QueryParser to handle ranges like field:[1 TO 6]
QueryParser parser = new CustomQueryParser(FieldNames.CONTENTS, analyzer);
parser.setAllowLeadingWildcard(true);
parser.setLowercaseExpandedTerms(false);
Query query = parser.parse(queryLine);

TopDocs tmp = searcher.search(query, null, 20, sort);

To be more precise:

I have a field called filename that contains a filename, which can of
course be lowercase, uppercase, or a mixture.

I would like to do the following:

+filename:/*scm*.doc

That should result in getting things like

/...SCMtest.doc
/...scmtest.doc
/...scm.doc
etc.

Maybe someone can give me a hint on how to solve this...

kind regards
Karl Heinz Marbaise
--
SoftwareEntwicklung Beratung Schulung Tel.: +49 (0) 2405 / 415 893
Dipl.Ing.(FH) Karl Heinz Marbaise ICQ#: 135949029
Hauptstrasse 177 USt.IdNr: DE191347579
52146 Würselen http://www.soebes.de

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


  • Ian Lea at Jan 27, 2009 at 8:07 pm
    Hi


    Sounds like a job for RegexQuery. If you can't figure out how to use
    it Google will throw up some examples. You can downcase everything
    yourself or use an analyzer that does it or maybe use a case
    insensitive regexp.

    Depending on your file names you might want to avoid StandardAnalyzer.
    It is likely to split them. KeywordAnalyzer might be what you want.
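
A minimal plain-Java sketch of the case-insensitive regexp idea, assuming the whole filename is kept as a single term (as KeywordAnalyzer would do). It uses only java.util.regex and is not Lucene's RegexQuery; the class name WildcardMatch and the example paths are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: matches a wildcard pattern
// against a whole filename while ignoring case.
final class WildcardMatch {

    // Translate '*' (any run of characters) and '?' (exactly one character)
    // into a regex; every other character is quoted so it matches literally.
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    }

    static boolean matches(String wildcard, String filename) {
        return toPattern(wildcard).matcher(filename).matches();
    }

    public static void main(String[] args) {
        String query = "/*scm*.doc";
        System.out.println(matches(query, "/tmp/SCMtest.doc"));
        System.out.println(matches(query, "/tmp/scm.doc"));
        System.out.println(matches(query, "/tmp/readme.doc"));
    }
}
```

The case handling lives entirely in the compiled pattern here, so neither the stored filename nor the query has to be downcased first.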


    --
    Ian.

    On Tue, Jan 27, 2009 at 7:29 PM, Karl Heinz Marbaise wrote:
    [snip]
  • Antony Bowesman at Jan 28, 2009 at 12:47 am

    Karl Heinz Marbaise wrote:

    I have a field which is called filename and contains a filename which
    can of course be lowercase or uppercase or a mixture...

    I would like to do the following:

    +filename:/*scm*.doc

    That should result in getting things like

    /...SCMtest.doc
    /...scmtest.doc
    /...scm.doc
    etc.

    Maybe someone can give me a hint on how to solve this...

    It's all down to the analyzer you use when you index that field and how you
    choose to tokenize it. If you want to always search case insensitively, then
    you should lower case the filename when indexing.

    If your custom query parser supports wildcard queries and behaves anything
    like the standard QueryParser, it will not put the wildcard term through the
    analyzer. Once the indexed filenames are lower cased, a search for

    +filename:/*SCm*.doc

    would then not find anything, so you need to make sure you lower case all
    filename field searches at some point.
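
A rough plain-Java sketch of why the query side needs lower casing too. It assumes the indexed terms were lower cased at index time and stands in for the index with a simple list; LowercasedSearch, its methods, and the example paths are hypothetical and are not Lucene API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical stand-in for an index whose filename terms were lower-cased
// at index time; it is not Lucene code.
final class LowercasedSearch {

    static String normalize(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    // Case-sensitive wildcard match: '*' is any run of characters,
    // '?' is exactly one character, everything else is literal.
    static boolean wildcardMatches(String wildcard, String term) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.matches(regex.toString(), term);
    }

    // Returns the indexed terms matching the wildcard, optionally
    // lower-casing the query first.
    static List<String> search(List<String> indexedTerms, String wildcard,
                               boolean lowercaseQuery) {
        String query = lowercaseQuery ? normalize(wildcard) : wildcard;
        List<String> hits = new ArrayList<>();
        for (String term : indexedTerms) {
            if (wildcardMatches(query, term)) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> index = new ArrayList<>();
        index.add(normalize("/tmp/SCMtest.doc"));
        index.add(normalize("/tmp/scm.doc"));
        System.out.println(search(index, "/*SCm*.doc", false));
        System.out.println(search(index, "/*SCm*.doc", true));
    }
}
```

With the query left as typed, the mixed-case pattern cannot match any lower-cased term; normalizing it with the same function used at index time brings the two sides back in line.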

    I use a custom analyzer for filenames, which lower cases and tokenizes by
    letter, digit, or any custom chars, and my query parser supports custom
    analyzers for getFieldQuery().
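
To give a feel for what such a filename analyzer might produce, here is a plain-Java sketch that lower cases and splits on anything that is not a letter or digit. This is a guess at the general idea, not the actual analyzer described above and not a Lucene Analyzer; FilenameTokens and the sample filename are made up, and the "custom chars" part is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tokenizer sketch, not a Lucene Analyzer: emits lower-cased
// runs of letters and digits and treats every other character as a separator.
final class FilenameTokens {

    static List<String> tokenize(String filename) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < filename.length(); ) {
            int cp = filename.codePointAt(i);
            if (Character.isLetterOrDigit(cp)) {
                current.appendCodePoint(Character.toLowerCase(cp));
            } else if (current.length() > 0) {
                // Separator reached: close the token collected so far.
                tokens.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("/tmp/SCM_test-01.DOC"));
    }
}
```

Running the same function over the query text keeps index and search terms comparable, which is the role a per-field analyzer in getFieldQuery() would play.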

    If you want to keep the original filename, then just store the field as well as
    index it, then you can get the original back from the Document.
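
The store-plus-index idea can be pictured in plain Java as a map from the lower-cased search key to the original filenames. This is only an illustration of the principle, assuming a simple substring search; StoredFilenameIndex and its methods are hypothetical and say nothing about how Lucene stores fields internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration, not Lucene code: the lower-cased key plays the
// role of the indexed value, the list holds the stored original filenames.
final class StoredFilenameIndex {

    private final Map<String, List<String>> originalsByKey = new LinkedHashMap<>();

    void add(String filename) {
        String key = filename.toLowerCase(Locale.ROOT);
        // Names that differ only in case share one key but are all kept.
        originalsByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(filename);
    }

    // Case-insensitive substring search that returns the original spellings.
    List<String> findContaining(String fragment) {
        String needle = fragment.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : originalsByKey.entrySet()) {
            if (entry.getKey().contains(needle)) {
                hits.addAll(entry.getValue());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        StoredFilenameIndex index = new StoredFilenameIndex();
        index.add("/tmp/SCMtest.doc");
        index.add("/tmp/scm.doc");
        index.add("/tmp/readme.doc");
        System.out.println(index.findContaining("ScM"));
    }
}
```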

    Antony



Discussion Overview
group: java-user
categories: lucene
posted: Jan 27, '09 at 8:01p
active: Jan 28, '09 at 12:47a
posts: 3
users: 3
website: lucene.apache.org
