Hi,

Below is a document in my Lucene index:
---------------------------------------------
Field      Value
---------------------------------------------
ID         1
110_a      library and information
---------------------------------------------
I need a "starts with" search. Below are the queries I tried against the
above document:

------------------------------------------------------------------------------
Query                     Result
------------------------------------------------------------------------------
110_a:l*                  ID - 1
110_a:library*            ID - 1
110_a:library *           No Results
110_a:library a*          No Results
110_a:"library a*"        No Results
------------------------------------------------------------------------------
Here, a starts-with search on a single word finds the document,
but as soon as I add a space after the first word, nothing is found.

So, how do I write a query that does a starts-with search across multiple words?
--
View this message in context: http://www.nabble.com/how-to-search-for-starts-with-multiple-words-in-lucene-tp20697741p20697741.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


  • Erick Erickson at Nov 26, 2008 at 1:47 pm
    Your problem here is probably tokenization at query time.

    Queries like 110_a:library a* would search field 110_a for
    library and your default field for a*. You might try
    +110_a:library +110_a:a*, but I doubt that's really
    what you want since there's no guarantee that the terms
    will be next to each other.
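
    To make the field assignment above concrete, here is a minimal sketch in
    plain Java. It is a toy model, not Lucene's QueryParser: it only splits
    the query on whitespace and applies a default field to clauses that have
    no field prefix. The class name, the method name, and the default field
    "contents" are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of per-clause field assignment; NOT Lucene's QueryParser. */
class FieldAssignmentSketch {

    /** Splits the query on whitespace and returns one "field:term" string per clause. */
    static List<String> assignFields(String query, String defaultField) {
        List<String> clauses = new ArrayList<>();
        for (String clause : query.trim().split("\\s+")) {
            if (clause.indexOf(':') > 0) {
                // The clause carries its own field prefix, so keep it unchanged.
                clauses.add(clause);
            } else {
                // No prefix: the clause falls back to the default field.
                clauses.add(defaultField + ":" + clause);
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        // "contents" is a hypothetical default field name.
        System.out.println(assignFields("110_a:library a*", "contents"));
        // prints [110_a:library, contents:a*]
        System.out.println(assignFields("+110_a:library +110_a:a*", "contents"));
        // prints [+110_a:library, +110_a:a*]
    }
}
```

    In this model the field prefix binds only to the clause it is attached
    to, which is why the second clause of 110_a:library a* ends up on a
    different field.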

    Note that phrase queries don't go through the wildcard
    parsers, so searching for "library a*" (in quotes) won't do
    what you want.

    You might want to look at the SpanQuery family. It's unclear
    whether you'd expect a hit on something that started in the
    middle of the field, but you can rule that out by adding a synthetic
    token at the start of each field you index, then adding that token to
    each query. Something like:
    doc.add("field", "$ <original text>", blah blah)
    at index time, and then add the "$" (or whatever) at query time
    if you require that the match never hit in the middle.
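
    The synthetic-token idea can be sketched in plain Java as follows. This
    is only a model of the matching logic, not Lucene code: it assumes
    lowercased, whitespace-split tokens, treats every query term except the
    last as an exact match and the last one as a prefix, and anchors the
    match by prepending the same marker on both sides. All class and method
    names are hypothetical. In a real index the marker would have to be a
    token that the chosen analyzer keeps.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Toy model of an anchored, position-by-position "starts with" match; NOT Lucene code. */
class StartsWithSketch {

    /** Synthetic start-of-field marker, prepended at index time and at query time. */
    static final String START = "$";

    /** Assumed analysis: lowercase, then split on whitespace. */
    static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    /** Tokens as they would be stored for the field, marker first. */
    static List<String> indexTokens(String fieldValue) {
        return tokenize(START + " " + fieldValue);
    }

    /** True if the field begins with the user's words, the last word being a prefix. */
    static boolean startsWith(List<String> indexed, String userInput) {
        List<String> query = tokenize(START + " " + userInput);
        if (query.size() > indexed.size()) {
            return false;
        }
        int last = query.size() - 1;
        for (int i = 0; i < last; i++) {
            // Position i of the query must equal position i of the field.
            if (!indexed.get(i).equals(query.get(i))) {
                return false;
            }
        }
        // Only the final term is allowed to be a prefix.
        return indexed.get(last).startsWith(query.get(last));
    }

    public static void main(String[] args) {
        List<String> doc = indexTokens("library and information");
        System.out.println(startsWith(doc, "l"));          // true
        System.out.println(startsWith(doc, "library a"));  // true
        System.out.println(startsWith(doc, "and inf"));    // false: starts in the middle
    }
}
```

    Because the marker occupies position 0 on both sides, a query that
    begins with a word from the middle of the field fails on the very next
    position.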

    Another possibility would be to index using something like
    KeywordAnalyzer, but this assumes that you never want
    to search for anything in that field that starts in the middle.
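
    The single-token alternative can be modelled the same way. The sketch
    below is plain Java rather than Lucene code: it treats the whole field
    value as one term and does a prefix test on it. The lowercasing and
    whitespace collapsing in normalize are assumptions added for
    illustration, and the class and method names are hypothetical.

```java
import java.util.Locale;

/** Toy model of prefix matching on a field kept as a single token; NOT Lucene code. */
class SingleTokenPrefixSketch {

    /** Assumed normalization: lowercase and collapse runs of whitespace to one space. */
    static String normalize(String value) {
        return value.toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }

    /** The whole field value is one term, so spaces are ordinary characters of the prefix. */
    static boolean prefixMatch(String fieldValue, String userInput) {
        return normalize(fieldValue).startsWith(normalize(userInput));
    }

    public static void main(String[] args) {
        String value = "library and information";
        System.out.println(prefixMatch(value, "library a"));    // true
        System.out.println(prefixMatch(value, "library "));     // true
        System.out.println(prefixMatch(value, "information"));  // false: not at the start
    }
}
```

    The last call shows the trade-off mentioned above: with a single token
    per field, nothing that starts in the middle of the value can match.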

    Best
    Erick
    (In reply to naveen.a's message of Wed, Nov 26, 2008 at 4:48 AM.)


  • Naveen.a at Nov 28, 2008 at 10:11 am
    Hi,

    Thanks for your replies.

    Please see this thread for the actual problem:

    http://www.nabble.com/SpanFirstQuery-is-not-taking-wildcard-characters-(like-*)-as-a-logical-operator-for-the-preffix-td20719556.html#a20719556


  • AlexElba at Nov 26, 2008 at 7:30 pm
    Hi,
    I think you can achieve your goal by using StandardAnalyzer for both
    indexing and searching, and a WildcardQuery for the query. I think it will work!



Discussion Overview
group: java-user
categories: lucene
posted: Nov 26, '08 at 9:48a
active: Nov 28, '08 at 10:11a
posts: 4
users: 3
website: lucene.apache.org
