FAQ
Hi,

Merry Christmas and a Happy New Year to you all.

I have an index with few fields. Title, Description, Author etc.
For a search query "business development", the equivalent lucene query I
build is:

*(TITLE: business^9.00 OR AUTHOR: business^3.00 OR
DESCRIPTION:business^1.00) AND (TITLE: development^9.00 OR AUTHOR:
development^3.00 OR DESCRIPTION:development^1.00)
*
The expected result is to get all documents having both *business* and *
development* in the *TITLE* on top, followed by any one in TITLE and the
other in AUTHOR(if available) followed by both in AUTHOR and so on..., till
we get all docs having all the terms appearing anywhere in the document. But
the results are completely different from the expectation.

Please reply if you'd require any other information.

This might be a trivial issue for the pros. I have pulled my hair at it for
a while.
Any help with what's wrong here would be highly appreciated.

Thanks
Vikash.
--
If there's no way out, then let's make one.

Search Discussions

  • Prabin meitei at Dec 26, 2008 at 5:58 am
    try to construct a query like :
    BooleanQuery mainquery = new BooleanQuery();
    BooleanQuery query1 = new BooleanQuery();
    BooleanQuery query2 = new BooleanQuery();
    BooleanQuery query3 = new BooleanQuery();

    queryParser = new QueryParser("title", new StandardAnalyzer());
    query1 = queryParser.parse("business development");
    query1.setBoost(3.0);
    mainquery.add(query1, Occur.SHOULD);

    queryParser = new QueryParser("Author", new StandardAnalyzer());
    query2 = queryParser.parse("business development");
    query2.setBoost(2.0);
    mainquery.add(query2, Occur.SHOULD);

    queryParser = new QueryParser("Description", new StandardAnalyzer());
    query3 = queryParser.parse("business development");
    mainquery.add(query3, Occur.SHOULD);

    use mainquery for your searching

    Prabin
    toostep.com
    On Fri, Dec 26, 2008 at 11:16 AM, vikas bucha wrote:

    Hi,

    Merry Christmas and a Happy New Year to you all.

    I have an index with few fields. Title, Description, Author etc.
    For a search query "business development", the equivalent lucene query I
    build is:

    *(TITLE: business^9.00 OR AUTHOR: business^3.00 OR
    DESCRIPTION:business^1.00) AND (TITLE: development^9.00 OR AUTHOR:
    development^3.00 OR DESCRIPTION:development^1.00)
    *
    The expected result is to get all documents having both *business* and *
    development* in the *TITLE* on top, followed by any one in TITLE and the
    other in AUTHOR(if available) followed by both in AUTHOR and so on..., till
    we get all docs having all the terms appearing anywhere in the document.
    But
    the results are completely different from the expectation.

    Please reply if you'd require any other information.

    This might be a trivial issue for the pros. I have pulled my hair at it for
    a while.
    Any help with what's wrong here would be highly appreciated.

    Thanks
    Vikash.
    --
    If there's no way out, then let's make one.
  • Erick Erickson at Dec 26, 2008 at 7:02 pm
    Several points:

    1> Prabin's suggestion is, I think, equivalent to something like:
    title:(business AND author)^9.0 OR
    author:(business AND author)^3.0
    OR description:(business AND author)^1.0.
    Either way may give you results more in line with what
    you expect. (note, the syntax here is suspect, but you get the idea)
    I'm not sure what the boosting in your original clause
    really does, but it seems suspect... but then I'm no
    scoring expert.

    2> If you haven't already, get a copy of Luke and
    run a few of these scenarios through the "explain" function.
    That'll show you a lot about how the queries are scored. And
    Query.explain is your friend too.

    3> Boosts are a modifier of the score, so expecting the
    result sets to be simply predictable will probably surprise you.
    Boosting will *tend* to sort things the way you want, but
    a document whose description score is 10000 and title score is
    1 will still sort above other title scores (numbers pulled out of thin
    air). That said, since your title and author fields will very probably
    be shorter than description, they'll tend to sort first anyway
    I suspect.

    Best
    Erick
    On Fri, Dec 26, 2008 at 12:46 AM, vikas bucha wrote:

    Hi,

    Merry Christmas and a Happy New Year to you all.

    I have an index with few fields. Title, Description, Author etc.
    For a search query "business development", the equivalent lucene query I
    build is:

    *(TITLE: business^9.00 OR AUTHOR: business^3.00 OR
    DESCRIPTION:business^1.00) AND (TITLE: development^9.00 OR AUTHOR:
    development^3.00 OR DESCRIPTION:development^1.00)
    *
    The expected result is to get all documents having both *business* and *
    development* in the *TITLE* on top, followed by any one in TITLE and the
    other in AUTHOR(if available) followed by both in AUTHOR and so on..., till
    we get all docs having all the terms appearing anywhere in the document.
    But
    the results are completely different from the expectation.

    Please reply if you'd require any other information.

    This might be a trivial issue for the pros. I have pulled my hair at it for
    a while.
    Any help with what's wrong here would be highly appreciated.

    Thanks
    Vikash.
    --
    If there's no way out, then let's make one.

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupjava-user @
categorieslucene
postedDec 26, '08 at 5:46a
activeDec 26, '08 at 7:02p
posts3
users3
websitelucene.apache.org

People

Translate

site design / logo © 2022 Grokbase