So, if you had something that gives you the "how many query terms matched" info, would that satisfy your requirement?
Query: term1 term2
Doc1: term1 term2 => n=2 => 100%
Doc2: term1 term2 term3 term4 => n=2 => 100%
Doc3: term1 term1 term3 => n=1 => 50%
Doc4: term2 term3 term4 => n=1 => 50%
If yes, Explanation will give you that info in its coord part. For example, coord(1/3) means that one query term matched out of 3 query terms in total.
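To make the "n => percentage" arithmetic above concrete, here is a minimal sketch in plain Java. It does not use Lucene at all: the class MatchPercent and the method percentMatched are hypothetical names, and splitting on whitespace is only an assumption standing in for whatever analyzer you actually use.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class MatchPercent {
    // Percentage of distinct query terms that occur at least once in the document.
    // Term frequency is ignored, so "term1 term1" counts as one matched term.
    static int percentMatched(String query, String doc) {
        Set<String> queryTerms = new HashSet<>(Arrays.asList(query.trim().split("\\s+")));
        Set<String> docTerms = new HashSet<>(Arrays.asList(doc.trim().split("\\s+")));
        int matched = 0;
        for (String term : queryTerms) {
            if (docTerms.contains(term)) {
                matched++;
            }
        }
        return Math.round(100f * matched / queryTerms.size());
    }

    public static void main(String[] args) {
        String query = "term1 term2";
        System.out.println(percentMatched(query, "term1 term2"));             // Doc1
        System.out.println(percentMatched(query, "term1 term2 term3 term4")); // Doc2
        System.out.println(percentMatched(query, "term1 term1 term3"));       // Doc3
        System.out.println(percentMatched(query, "term2 term3 term4"));       // Doc4
    }
}
```

The point is that this percentage depends only on the query and the document itself, not on what the best-scoring document in the result set happens to be.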
Here is an example Explanation:
0.013397463 = (MATCH) product of:
  0.040192388 = (MATCH) sum of:
    0.040192388 = (MATCH) weight(pagetext:para in 34930), product of:
      0.46250778 = queryWeight(pagetext:para), product of:
        3.1780937 = idf(docFreq=5546, maxDocs=48977)
        0.14552994 = queryNorm
      0.086901 = (MATCH) fieldWeight(pagetext:para in 34930), product of:
        1.0 = tf(termFreq(pagetext:para)=1)
        3.1780937 = idf(docFreq=5546, maxDocs=48977)
        0.02734375 = fieldNorm(field=pagetext, doc=34930)
  0.33333334 = coord(1/3)
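If you only have the explanation as text, like the output above, one rough way to get the percentage is to pull the two numbers out of the coord(x/y) line. This is just a sketch under assumptions: CoordParser and percentFromExplanation are hypothetical names, the method works on a plain String rather than on any Lucene class, and it only looks at the first coord entry it finds, so nested boolean queries would need more care. It returns -1 when no coord entry is present, leaving it to you to decide what that means for your query.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CoordParser {
    // Matches entries such as "coord(1/3)" and captures both numbers.
    private static final Pattern COORD = Pattern.compile("coord\\((\\d+)/(\\d+)\\)");

    // Returns the matched-terms percentage from the first coord entry,
    // or -1 if the text contains no usable coord entry.
    static int percentFromExplanation(String explanationText) {
        Matcher m = COORD.matcher(explanationText);
        if (!m.find()) {
            return -1;
        }
        int matched = Integer.parseInt(m.group(1));
        int total = Integer.parseInt(m.group(2));
        if (total == 0) {
            return -1;
        }
        return Math.round(100f * matched / total);
    }

    public static void main(String[] args) {
        String text = "0.013397463 = (MATCH) product of:\n"
                    + "  0.33333334 = coord(1/3)\n";
        System.out.println(percentFromExplanation(text)); // one of three terms matched
    }
}
```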
--- On Mon, 1/3/11, Amr ElAdawy wrote:
From: Amr ElAdawy <[email protected]>
Subject: Re: Search Score percentage, Should not be relative to the highest score
To: [email protected]
Date: Monday, January 3, 2011, 3:09 PM
Consider the following.
Query: term1 term2
Doc1: term1 term2
Doc2: term1 term2 term3 term4
Doc3: term1 term1 term3
Doc4: term3 term4
For the above documents, Doc1 and Doc2 will be exact matches (as they contain all the terms in the search query). Doc3 is a partial match, as it contains term1 only (we neglect the term frequency, so tf is always 1).
The score percentages (calculated by Lucene in Hits.java, line 133) will be:
Doc1: 100%
Doc2: 100%
Doc3: 80%
This is not a problem at all. The problem occurs when there is no exactly matching document, as in the following:
Query: term1 term2
Doc1: term1 term3
Doc2: term2 term3 term4
Doc3: term1 term1 term3
Doc4: term3 term4
The score will be calculated as
Doc1: 100%
Doc2: 100%
Doc3: 50%
You can see that Doc1 and Doc2 got 100% even though they are not exact matches. Because they got the highest score, Lucene considers them a 100% match.
This is my problem.
All I need is to make the percentage correct in the second case, so that it will be something like:
Doc1: 50%
Doc2: 50%
Doc3: 30%
I hope I made myself clear.
--
View this message in context: http://lucene.472066.n3.nabble.com/Search-Score-percentage-Should-not-be-relative-to-the-highest-score-tp2183420p2184613.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]