FAQ
Hi,
I noticed a behavior with wildcard searches and would like
to clarify it.
According to the jGuru FAQ entry
http://www.jguru.com/faq/view.jsp?EID=538312
the Analyzer is not used for wildcard queries.
In my case I have a document which contains the word
IMPORTANT. I use PorterStemFilter + StandardAnalyzer
for indexing & searching. I get the document if
I search for IM*. But if the analyzer is not used,
then what converts the word to lowercase?

My code looks like this.

---
QueryParser qp=new QueryParser("title",
new MyAnalyzer());
Query q = qp.parse(text);
---


Though I pass the text in uppercase (IM*), when I
print the Query object I can see it in lowercase,
something like (title:im*)

I am using lucene-1.3-final. Can someone explain this?

Thanks & regards,
George








---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


  • Peter Pimley at Sep 9, 2004 at 11:51 am
    Hello everyone.

    I'm in the process of writing "my first lucene app", and I've got to the
    bit where I get my search results back (very exciting! ;).

    My documents are not stored in their original form by Lucene, but in a
    separate database. My Lucene docs do however store the primary key, so
    that I can fetch the original version from the database to show the user
    (does that sound sane?)

    I see that the 'Hits' class has an id(int) method, which sounds
    interesting. The javadoc says "Returns the id for the nth document in
    this set.". However, I can't find any mention anywhere else about
    Document ids. Could anybody explain what this is?

    Many Thanks in Advance,
    Peter Pimley, Semantico


  • Morus Walter at Sep 9, 2004 at 12:26 pm

    Peter Pimley writes:

    > My documents are not stored in their original form by Lucene, but in a
    > separate database. My Lucene docs do however store the primary key, so
    > that I can fetch the original version from the database to show the user
    > (does that sound sane?)

    Yes.

    > I see that the 'Hits' class has an id(int) method, which sounds
    > interesting. The javadoc says "Returns the id for the nth document in
    > this set.". However, I can't find any mention anywhere else about
    > Document ids. Could anybody explain what this is?

    It's Lucene's internal id, or document number, which allows you to access
    the document and its stored fields.

    See
    IndexSearcher.doc(int i)
    or
    IndexReader.document(int n)

    The docs just don't name the parameter 'id'.
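
The relationship described above (hit position, then internal document number, then stored fields, then the primary key used against the external database) can be sketched in plain Java. This is a toy model under stated assumptions, not Lucene code: ToyIndex, addDocument, doc, fetchOriginal and the "pk" field name are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: ToyIndex and its methods are hypothetical, not Lucene classes.
final class ToyIndex {
    // The position in this list plays the role of the internal document number.
    private final List<Map<String, String>> storedFields = new ArrayList<>();

    // Stores one field ("pk") and returns the document number assigned to it.
    int addDocument(String primaryKey) {
        Map<String, String> fields = new HashMap<>();
        fields.put("pk", primaryKey);
        storedFields.add(fields);
        return storedFields.size() - 1;
    }

    // Looks up the stored fields of a document by its number.
    Map<String, String> doc(int docNumber) {
        return storedFields.get(docNumber);
    }

    // Resolves a document number to the original text held outside the index.
    static String fetchOriginal(ToyIndex index, Map<String, String> database, int docNumber) {
        String primaryKey = index.doc(docNumber).get("pk");
        return database.get(primaryKey);
    }

    public static void main(String[] args) {
        Map<String, String> database = new HashMap<>();
        database.put("row-17", "Full original text of the first document");
        ToyIndex index = new ToyIndex();
        int docNumber = index.addDocument("row-17");
        System.out.println(docNumber + " -> " + fetchOriginal(index, database, docNumber));
    }
}
```

In a real index the document number is an internal value that can change when the index is modified, so the stored primary key is the safer handle for fetching the original from the database.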

    Morus

  • Peter Pimley at Sep 9, 2004 at 3:24 pm
    Oh, it's that simple. :)
    Thanks for that!

    Peter



  • René Hackl at Sep 9, 2004 at 12:05 pm
    Hi George,

    I'm not sure about v1.3, but you may want to take a look
    at

    http://issues.apache.org/eyebrowse/ReadMsg?listName=lucene-user@jakarta.apache.org&msgNo=9342

    or

    http://issues.apache.org/eyebrowse/ReadMsg?listName=lucene-user@jakarta.apache.org&msgId=1806371

    cheers,
    René

  • Honey George at Sep 9, 2004 at 1:36 pm
    Thanks for the links, René.
    Those mails do not exactly cover my case, because
    the StandardAnalyzer which I use does lowercase the
    input. So it is the same scenario as the FAQ entry.

    -George





  • René Hackl at Sep 9, 2004 at 1:56 pm
    George,

    The QueryParser does toLowerCase() on WildcardQueries by default. Hence
    you'd need to follow Daniel's advice to use

    QueryParser's setLowercaseWildcardTerms(false)

    if you wanted IM* to stay IM*
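
Why the lowercasing matters for George's case can be sketched without Lucene. The following is a toy model under stated assumptions, not how QueryParser is implemented: WildcardCaseSketch, prefixTerm and matchingTerms are hypothetical names, the boolean parameter only stands in for the setting mentioned above, and the indexed terms are assumed to be what lowercasing plus Porter stemming would leave in the index for IMPORTANT.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Toy model only: this does not use Lucene and is not how QueryParser is implemented.
final class WildcardCaseSketch {

    // Stand-in for the parser's wildcard handling: the prefix is never analyzed,
    // it is only lowercased when the (hypothetical) flag is set.
    // Assumes the input ends with a single trailing '*'.
    static String prefixTerm(String userInput, boolean lowercaseWildcardTerms) {
        String prefix = userInput.substring(0, userInput.length() - 1);
        return lowercaseWildcardTerms ? prefix.toLowerCase(Locale.ROOT) : prefix;
    }

    // Returns the indexed terms that start with the given prefix (case-sensitive).
    static List<String> matchingTerms(List<String> indexedTerms, String prefix) {
        List<String> hits = new ArrayList<>();
        for (String term : indexedTerms) {
            if (term.startsWith(prefix)) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumed index content: "IMPORTANT" after lowercasing and stemming.
        List<String> indexed = Arrays.asList("import", "lucen");
        System.out.println(matchingTerms(indexed, prefixTerm("IM*", true)));  // [import]
        System.out.println(matchingTerms(indexed, prefixTerm("IM*", false))); // []
    }
}
```

With the prefix lowercased, "im" matches the lowercase indexed term; left as "IM", it matches nothing, because the terms in the index were lowercased by the analyzer at indexing time.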

    Cheers,
    René


