Thank you both for the assistance.
I ended up going the tf(float) override route rather than sloppyFreq.
I want to keep the ability to specify how far part something is
allowed to be, but from what I understood of Doron's response, I might
lose that if I overrode sloppyFreq.

Because my application is a matching tool rather than a searching
tool, it is okay for a term or phrase that matches multiple times to
have the same score as a single match. Multiple matches don't mean
anything good in my application.
On 5/30/07, Doron Cohen wrote:
Chris Hostetter <hossman_lucene@fucit.org> wrote on 29/05/2007 12:51:38:
: I've found that trying to specify a near query using something like:
: actor_name_mv:"Foster, Jody"~2
: matches "Foster, Jody" with a tf score of 1, but it matches "Jody
: Foster" with a tf score of .577 The phraseFreq in the first case is 1
: and the phraseFreq in the second is 1/3.

as i recall, phraseFreq is passed to the tf(float) function of your
similarity, you can get differnet behavior between the tf() for phrase
queries and simple term queries by having different tf impls
for tf(float)
- used for phrases; and tf(int) - used for terms. by default, tf(int)
calls tf(float)

so you could make tf(float) round up to hte nearest int, and then
fractional phraseFreqs should score the same as exact phraseFreqs. do
some testing of cases where the phrase matches more then once on the same
field though ... it may not be what you expect ( i believe the
phraseFreqs are summed before calling tf() so there is no way to tell the
differnce between 1 exact match with a phraseFreq of "1" and 2 sloppy
matches each with phraseFreqs of 0.5.
Yes they are summed before calling tf(). Would perhaps be
better to override Similarity.sloppyFreq(int) to return 1
(when searching those queries) - this would actually mean
that the sloppiness degree is ignored. It would not be symmetric
though, in the sense that eg query "A B"~3, while it would
score the same these docs: "A B"; "B A"; "A X B"; "B X A",
it would find match "A X Y Z B" but not "B Z Y X A". In
other words, this would not be equivalent to having
SpanQuery's inOrder = false.


To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Search Discussions

Discussion Posts


Follow ups

Related Discussions

Discussion Navigation
viewthread | post
posts ‹ prev | 4 of 6 | next ›
Discussion Overview
groupjava-user @
postedMay 29, '07 at 3:38p
activeMay 31, '07 at 12:31a



site design / logo © 2022 Grokbase