How can I manipulate the score depending on the combination of query
terms contained in the result document? No single term is important enough
to be boosted on its own. What matters is the combination of terms.

The user searches for the terms A, B, C and D.
Of course, the document containing all terms has the highest score. The
document containing just the terms B and C has a higher score than the
document containing the terms A and B.

A+B+C+D > B+C > A+B

I know the boosting combinations at query time.

Does anybody have an idea how to do this?

Thanks. Sören

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]


  • Chris Hostetter at Nov 13, 2006 at 4:04 am
    : The user searches for the terms A, B, C and D.
    : Of course, the document containing all terms has the highest score. The
    : document containing just the terms B and C has a higher score than the
    : document containing the terms A and B.
    :
    : A+B+C+D > B+C > A+B
    :
    : I know the boosting combinations at query time.

    that's a pretty specific and not altogether intuitive ranking... can you
    elaborate on your actual use case? ... why is B+C better than A+B? ... are
    these rules specific to a known list of terms, or is it a general rule
    relating to how you parse the user's input?

    off the top of my head, i would suggest building one big BooleanQuery and
    putting each of the permutations you care about in it as subqueries with
    boosts that correspond to their importance. you'll probably want to
    disable the coord, and depending on how you want things to work if a doc
    matches your "A+B" clause *and* matches your "B+C" clause you may want to
    use a DisjunctionMaxQuery with a 0.0f tiebreaker value instead of a
    BooleanQuery.




    -Hoss
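
To illustrate the difference between the two options mentioned above, here is a minimal sketch in plain Java. It does not use the Lucene API at all: it only models how summing the boosts of all matching combinations (roughly a BooleanQuery of optional clauses with coord disabled) differs from taking the best matching combination (roughly a DisjunctionMaxQuery with a 0.0f tiebreaker). The class name, the method names and the boost values 10, 5 and 2 are assumptions made up for this example, and real Lucene scores would also depend on tf/idf and norms, which are ignored here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class CombinationScoring {

    // Hypothetical boosts known at query time: each key is a combination of
    // terms that must ALL occur in a document, each value is its boost.
    static Map<Set<String>, Double> boosts() {
        Map<Set<String>, Double> m = new LinkedHashMap<>();
        m.put(Set.of("A", "B", "C", "D"), 10.0);
        m.put(Set.of("B", "C"), 5.0);
        m.put(Set.of("A", "B"), 2.0);
        return m;
    }

    // Simplified model of one big BooleanQuery with coord disabled:
    // the boosts of every matching combination are added together.
    static double sumScore(Set<String> docTerms) {
        double score = 0.0;
        for (Map.Entry<Set<String>, Double> e : boosts().entrySet()) {
            if (docTerms.containsAll(e.getKey())) {
                score += e.getValue();
            }
        }
        return score;
    }

    // Simplified model of a DisjunctionMaxQuery with a 0.0f tiebreaker:
    // only the best matching combination contributes to the score.
    static double maxScore(Set<String> docTerms) {
        double score = 0.0;
        for (Map.Entry<Set<String>, Double> e : boosts().entrySet()) {
            if (docTerms.containsAll(e.getKey())) {
                score = Math.max(score, e.getValue());
            }
        }
        return score;
    }

    public static void main(String[] args) {
        // A document with A, B and C matches both the "B+C" and "A+B" clauses.
        Set<String> doc = Set.of("A", "B", "C");
        System.out.println("sum model: " + sumScore(doc));
        System.out.println("max model: " + maxScore(doc));
    }
}
```

In this model both variants keep the order A+B+C+D > B+C > A+B. They differ only for documents that match more than one clause: the sum model ranks a document containing A, B and C above one containing only B and C, while the max model scores the two the same.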


  • Soeren Pekrul at Nov 13, 2006 at 9:46 am

    Chris Hostetter wrote:
    : that's a pretty specific and not altogether intuitive ranking... can you
    : elaborate on your actual use case? ... why is B+C better than A+B? ... are
    : these rules specific to a known list of terms, or is it a general rule
    : relating to how you parse the user's input?

    The original user query was a Boolean query:
    +(A B) +(C D)

    It is possible that this query is too restrictive. So, in addition to
    the hits matching the original query, I would like to give the user
    further hits.

    : off the top of my head, i would suggest building one big BooleanQuery and
    : putting each of the permutations you care about in it as subqueries with
    : boosts that correspond to their importance. you'll probably want to
    : disable the coord, and depending on how you want things to work if a doc
    : matches your "A+B" clause *and* matches your "B+C" clause you may want to
    : use a DisjunctionMaxQuery with a 0.0f tiebreaker value instead of a
    : BooleanQuery.

    My first idea was subclassing TopDocCollector and overriding the
    collect function. In this function I wanted to ask for the terms of the
    current document, calculate the score, and call the collect function of
    the base class with the new score as argument. I am afraid that takes
    too much time.

    Boolean queries for each interesting combination with a corresponding
    boost value should be faster.

    Thank you, Hoss.

    Sören


Discussion Overview
group: java-user @
categories: lucene
posted: Nov 9, '06 at 9:25a
active: Nov 13, '06 at 9:46a
posts: 3
users: 2
website: lucene.apache.org

2 users in discussion

Soeren Pekrul: 2 posts Chris Hostetter: 1 post
