[ https://issues.apache.org/jira/browse/LUCENE-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661302#action_12661302 ]

Otis Gospodnetic commented on LUCENE-1513:
------------------------------------------

I feel like I missed some FastSS discussion on the list... was there one?

I took a quick look at the paper and the code. Is the following the general idea:
# index "fuzzy"/"misspelled" terms in addition to the normal terms (=> larger index, slower indexing). How much fuzziness one wants to allow or handle is decided at index time.
# rewrite the query to include variations/misspellings of each term and use that to search (=> more clauses, slower than a normal search, but faster than the "normal" fuzzy query, whose speed depends on the number of indexed terms)
?
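
For readers following the thread, here is a minimal sketch of the deletion-neighborhood idea that FastSS-style indexing builds on. It is not taken from the attached fastSSfuzzy.zip; the class name DeletionNeighborhood and the method neighborhood are hypothetical, and k is the maximum number of deleted characters, assumed to be fixed at index time. The same function would be applied to the query term, and any indexed term sharing a neighborhood entry with it becomes a candidate.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration, not the code attached to LUCENE-1513.
public class DeletionNeighborhood {

    /** Returns the word itself plus every string obtained by deleting up to k characters. */
    public static Set<String> neighborhood(String word, int k) {
        Set<String> result = new HashSet<>();
        collect(word, k, result);
        return result;
    }

    private static void collect(String word, int remaining, Set<String> out) {
        // A string's length fixes how many deletions produced it, so a string
        // that was already added has already been expanded with the same budget.
        if (!out.add(word) || remaining == 0) {
            return;
        }
        for (int i = 0; i < word.length(); i++) {
            String shorter = word.substring(0, i) + word.substring(i + 1);
            collect(shorter, remaining - 1, out);
        }
    }

    public static void main(String[] args) {
        // "test" and "text" differ by one substitution; both neighborhoods contain "tet".
        System.out.println(neighborhood("test", 1));
        System.out.println(neighborhood("text", 1));
    }
}
```

The neighborhood grows quickly with k and with term length, which is where the "larger index, slower indexing" cost in the first point comes from.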

Quick code comments:
* Need to add ASL
* Need to replace tabs with 2 spaces and fix the formatting in FuzzyHitCollector
* No @author
* Unit test if possible
* Shouldn't FastSSwC be able to take a variable K?
* Should variables named after types (e.g. "set" in public static String getNeighborhoodString(Set<String> set) { ) be renamed, so they describe what's in them instead? (easier to understand API?)

fastss fuzzyquery
-----------------

Key: LUCENE-1513
URL: https://issues.apache.org/jira/browse/LUCENE-1513
Project: Lucene - Java
Issue Type: New Feature
Components: contrib/*
Reporter: Robert Muir
Priority: Minor
Attachments: fastSSfuzzy.zip


Code for doing fuzzy queries with the FastSSwC algorithm.
FuzzyIndexer: given a Lucene field, it enumerates all terms and creates an auxiliary offline index for fuzzy queries.
FastFuzzyQuery: similar to a fuzzy query, except that it queries the auxiliary index to retrieve a candidate list. This list is then verified with the Levenshtein algorithm.
Sorry, but the code is a bit messy... what I'm actually using is very different from this, so it's pretty much untested. But at least you can see what's going on or fix it up.
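
The verification step described above can be sketched as follows. This is an assumption-laden illustration, not the FastFuzzyQuery code from the attachment: the class CandidateVerifier, the method verify, and the maxEdits parameter are hypothetical names, and the candidate list is assumed to come from some auxiliary index lookup. Verification is needed because two terms can share a deletion-neighborhood entry while being more than k edits apart (for k=1, "ab" and "ba" both reduce to "a" but are 2 edits apart).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not the code attached to LUCENE-1513.
public class CandidateVerifier {

    /** Classic two-row dynamic-programming Levenshtein distance. */
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Keeps only the candidates within maxEdits of the query term. */
    public static List<String> verify(String queryTerm, Iterable<String> candidates, int maxEdits) {
        List<String> accepted = new ArrayList<>();
        for (String candidate : candidates) {
            if (levenshtein(queryTerm, candidate) <= maxEdits) {
                accepted.add(candidate);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<String> candidates = List.of("lucene", "lucent", "licence", "solr");
        System.out.println(verify("lucene", candidates, 1)); // prints [lucene, lucent]
    }
}
```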
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Discussion Overview
group: java-dev
categories: lucene
posted: Jan 6, '09 at 6:03p
active: Jan 7, '09 at 1:30a
posts: 17
users: 3
website: lucene.apache.org