Grokbase Groups Lucene dev July 2010
[ https://issues.apache.org/jira/browse/LUCENE-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingkei Ly updated LUCENE-2557:
-------------------------------

Attachment: LUCENE-2557.patch

I've had a crack at implementing a fix, based on suggestions in LUCENE-329. If the term used in the FuzzyQuery exists in the index, the patch uses that term's IDF as the IDF; if the term is not in the index, it uses the average IDF of all the terms.
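The selection rule described above can be sketched in a few lines of plain Java. This is only an illustration, not the code in LUCENE-2557.patch: the class and method names (`FuzzyIdfSketch`, `sharedIdf`) and the map of document frequencies are hypothetical, "all the terms" is assumed here to mean the terms the fuzzy query expands to, and the IDF formula is the classic `1 + ln(numDocs / (docFreq + 1))` used by Lucene's DefaultSimilarity in the 3.x line.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FuzzyIdfSketch {

    // Classic Lucene 3.x DefaultSimilarity IDF: 1 + ln(numDocs / (docFreq + 1)).
    public static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log(numDocs / (double) (docFreq + 1));
    }

    // Hypothetical helper: picks one IDF to share across all expanded terms.
    // If the query term itself is indexed, its IDF is used; otherwise the
    // average IDF of the expanded terms is used (an assumed reading of
    // "the average IDF of all the terms").
    public static double sharedIdf(String queryTerm,
                                   Map<String, Integer> expandedDocFreqs,
                                   int numDocs) {
        Integer queryDocFreq = expandedDocFreqs.get(queryTerm);
        if (queryDocFreq != null && queryDocFreq > 0) {
            return idf(queryDocFreq, numDocs);
        }
        if (expandedDocFreqs.isEmpty()) {
            return 0.0; // nothing matched, so there is nothing to score
        }
        double sum = 0.0;
        for (int docFreq : expandedDocFreqs.values()) {
            sum += idf(docFreq, numDocs);
        }
        return sum / expandedDocFreqs.size();
    }

    public static void main(String[] args) {
        // Made-up document frequencies for a 10,000-document surname index.
        Map<String, Integer> docFreqs = new LinkedHashMap<>();
        docFreqs.put("smith", 1000);
        docFreqs.put("smiith", 2);
        docFreqs.put("smiht", 1);

        // "smith" is indexed, so every expanded term is scored with its IDF.
        System.out.println(sharedIdf("smith", docFreqs, 10000));
        // "smyth" is not indexed, so the average IDF is used instead.
        System.out.println(sharedIdf("smyth", docFreqs, 10000));
    }
}
```

With a single shared IDF, a rare misspelling no longer gets a scoring advantage over the exact term purely because it occurs in fewer documents.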

It is implemented as a rewrite method similar to TopTermsBoostOnlyBooleanQueryRewrite from LUCENE-124, although it required small modifications to TopTermsBooleanQueryRewrite.

FuzzyQuery - fuzzy terms and misspellings are ranked higher than exact matches
------------------------------------------------------------------------------

Key: LUCENE-2557
URL: https://issues.apache.org/jira/browse/LUCENE-2557
Project: Lucene - Java
Issue Type: Bug
Components: Query/Scoring
Affects Versions: 3.0.2
Reporter: Jingkei Ly
Attachments: idf-scoring-test-case.patch, LUCENE-2557.patch


FuzzyQuery often ranks misspellings higher than the exact match, which seems generally undesirable.
For example, in an index of surnames, if I search using a FuzzyQuery for "smith", misspellings such as "smiith" or "smiht" appear near the top of the search results, ahead of documents that match "smith".
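To see why per-term IDF can produce this ordering, the following small sketch compares the IDF of a common term with that of a rare misspelling. The document frequencies are invented for illustration, the class name `IdfRankingDemo` is hypothetical, and the formula is the classic `1 + ln(numDocs / (docFreq + 1))` from Lucene's DefaultSimilarity in the 3.x line; the real score also depends on other factors such as the fuzzy boost and norms, which are left out here.

```java
public class IdfRankingDemo {

    // Classic Lucene 3.x DefaultSimilarity IDF: 1 + ln(numDocs / (docFreq + 1)).
    public static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log(numDocs / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        int numDocs = 10000;    // made-up index size
        int smithDocFreq = 1000; // made-up: the correct spelling is common
        int smihtDocFreq = 1;    // made-up: the misspelling is rare

        double smithIdf = idf(smithDocFreq, numDocs);
        double smihtIdf = idf(smihtDocFreq, numDocs);

        System.out.println("idf(smith) = " + smithIdf);
        System.out.println("idf(smiht) = " + smihtIdf);
        // The rarer term gets the larger IDF, which can outweigh the
        // penalty a fuzzy match receives for its edit distance.
        System.out.println("misspelling has higher IDF: " + (smihtIdf > smithIdf));
    }
}
```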

Posted: Jul 23, '10 at 4:23p
Active: Jul 26, '10 at 4:26p