FAQ
How can I manipulate the score depending on which combination of query
terms a result document contains? It is not a single term that is important
(that could simply be boosted). What matters is the combination of terms.

The user searches for the terms A, B, C and D.
Of course, the document containing all terms has the highest score. But the
document containing just the terms B and C should have a higher score than
the document containing just the terms A and B.

A+B+C+D > B+C > A+B

I know the boosting combinations at query time.

Does anybody have an idea how to do this?

Thanks. Sören

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

## Search Discussions

•  Chris Hostetter, Nov 13, 2006 at 4:04 am
: The user searches for the terms A, B, C and D.
: Of course, the document containing all terms has the highest score. The
: document containing just the terms B and C has a higher score than the
: document containing the terms A and B.
:
: A+B+C+D > B+C > A+B
:
: I know the boosting combinations at query time.

that's a pretty specific and not altogether intuitive ranking... can you
elaborate on your actual use case? ... why is B+C better than A+B ? .. are
these rules specific to a known list of terms, or is it a general rule
relating to how you parse the user's input?

off the top of my head, i would suggest building one big BooleanQuery and
putting each of the combinations you care about in it as subqueries with
boosts that correspond to their importance. you'll probably want to
disable the coord factor, and depending on how you want things to work if
a doc matches your "A+B" clause *and* matches your "B+C" clause, you may
want to use a DisjunctionMaxQuery with a 0.0f tiebreaker value instead of
a BooleanQuery.
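
To make the difference between the two suggestions concrete, here is a minimal sketch in plain Java. It does not call Lucene at all: it is a toy model in which each combination contributes only its boost, whereas real Lucene scores also include term-frequency and other factors. The class name `ComboScoring`, the helper names, and the boost values 10, 5 and 2 are all hypothetical, chosen only for illustration. `sumScore` roughly mimics a BooleanQuery with coord disabled (the boosts of all matching clauses add up), and `maxScore` roughly mimics a DisjunctionMaxQuery with a 0.0f tiebreaker (only the best matching clause counts).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model only: no Lucene classes are used, and the boost values are hypothetical.
class ComboScoring {

    // Convenience helper to build a set of terms.
    static Set<String> terms(String... t) {
        return new HashSet<>(Arrays.asList(t));
    }

    // Combinations known at query time, each with an assumed boost.
    static Map<Set<String>, Float> exampleBoosts() {
        Map<Set<String>, Float> boosts = new LinkedHashMap<>();
        boosts.put(terms("A", "B", "C", "D"), 10.0f);
        boosts.put(terms("B", "C"), 5.0f);
        boosts.put(terms("A", "B"), 2.0f);
        return boosts;
    }

    // Sum of the boosts of every combination fully contained in the document
    // (rough analogue of a BooleanQuery of optional clauses with coord disabled).
    static float sumScore(Set<String> docTerms, Map<Set<String>, Float> boosts) {
        float score = 0.0f;
        for (Map.Entry<Set<String>, Float> e : boosts.entrySet()) {
            if (docTerms.containsAll(e.getKey())) {
                score += e.getValue();
            }
        }
        return score;
    }

    // Boost of the single best matching combination
    // (rough analogue of a DisjunctionMaxQuery with a 0.0f tiebreaker).
    static float maxScore(Set<String> docTerms, Map<Set<String>, Float> boosts) {
        float best = 0.0f;
        for (Map.Entry<Set<String>, Float> e : boosts.entrySet()) {
            if (docTerms.containsAll(e.getKey())) {
                best = Math.max(best, e.getValue());
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<Set<String>, Float> boosts = exampleBoosts();
        Set<String> doc = terms("A", "B", "C"); // matches both "A+B" and "B+C"
        System.out.println("sum: " + sumScore(doc, boosts));
        System.out.println("max: " + maxScore(doc, boosts));
    }
}
```

In this model a document containing A, B and C matches both the "A+B" and the "B+C" combination. With summing it outranks a document containing only B and C; with the maximum the two tie. Which behavior is wanted decides between the two query types.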

-Hoss

•  Sören, Nov 13, 2006 at 9:46 am

Chris Hostetter wrote:
: that's a pretty specific and not altogether intuitive ranking... can you
: elaborate on your actual use case? ... why is B+C better than A+B ? .. are
: these rules specific to a known list of terms, or is it a general rule
: relating to how you parse the user's input?

The original user query was a Boolean query:
+(A B) +(C D)

It is possible that this query is too restrictive. So, in addition to the
hits matching the original query, I would like to give the user further hits.
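
One way to picture this is sketched below in plain Java, again without any Lucene calls. It is only an assumption about what the relaxation might look like: a document that satisfies every required group of `+(A B) +(C D)` lands in the top tier, and a document that satisfies only some of the groups is kept as an additional, lower-ranked hit. The class name `RelaxedMatch`, the method names, and the tier values 2, 1 and 0 are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: illustrates strict vs. relaxed matching of "+(A B) +(C D)".
class RelaxedMatch {

    // True if the document contains at least one term of the group.
    static boolean matchesAny(Set<String> docTerms, List<String> group) {
        for (String term : group) {
            if (docTerms.contains(term)) {
                return true;
            }
        }
        return false;
    }

    // 2 = every group matched (the original strict query),
    // 1 = at least one group matched (an additional hit),
    // 0 = no group matched (not a hit at all).
    static int tier(Set<String> docTerms, List<List<String>> groups) {
        int matched = 0;
        for (List<String> group : groups) {
            if (matchesAny(docTerms, group)) {
                matched++;
            }
        }
        if (matched == groups.size()) {
            return 2;
        }
        return matched > 0 ? 1 : 0;
    }

    public static void main(String[] args) {
        List<List<String>> groups = Arrays.asList(
                Arrays.asList("A", "B"),
                Arrays.asList("C", "D"));
        Set<String> strictHit = new HashSet<>(Arrays.asList("A", "C"));
        Set<String> extraHit = new HashSet<>(Arrays.asList("A", "B"));
        System.out.println("A+C tier: " + tier(strictHit, groups));
        System.out.println("A+B tier: " + tier(extraHit, groups));
    }
}
```
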

: off the top of my head, i would suggest building one big BooleanQuery and
: putting each of the combinations you care about in it as subqueries with
: boosts that correspond to their importance. you'll probably want to
: disable the coord factor, and depending on how you want things to work if
: a doc matches your "A+B" clause *and* matches your "B+C" clause, you may
: want to use a DisjunctionMaxQuery with a 0.0f tiebreaker value instead of
: a BooleanQuery.

My first idea was subclassing TopDocCollector and overriding the
collect function. In this function I wanted to look up the terms of the
current document, calculate the score, and call the collect function of
the base class with the new score as the argument. I am afraid that would
take too much time.

Boolean queries for each interesting combination with a corresponding
boost value should be faster.

Thank you, Hoss.

Sören


Discussion Overview
group: java-user · categories: lucene · posted: Nov 9, '06 at 9:25a · active: Nov 13, '06 at 9:46a · posts: 3 · users: 2 · website: lucene.apache.org
