
Michael McCandless commented on LUCENE-6336:

We could explore field collapsing / grouping, but that's maybe somewhat tricky to do with early termination (see LUCENE-7341) and it's somewhat wasteful ... it seems better to dedup once at indexing time? And if it's a simple wrapper around the dictionary, other suggesters could just use that too.
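
To make the "simple wrapper around the dictionary" idea concrete, here is a minimal sketch in plain Java. It does not use the Lucene API: {{SuggestEntry}} and {{DedupingEntries}} are hypothetical names standing in for a dictionary entry and the wrapper, and buffering everything in memory is an assumption made only to keep the example short.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for one dictionary entry (suggest text plus weight). */
final class SuggestEntry {
    final String text;
    final long weight;

    SuggestEntry(String text, long weight) {
        this.text = text;
        this.weight = weight;
    }
}

/** Hypothetical wrapper: drains a source of entries and keeps the heaviest entry per text. */
final class DedupingEntries implements Iterable<SuggestEntry> {
    // LinkedHashMap keeps the order in which each text was first seen.
    private final Map<String, SuggestEntry> best = new LinkedHashMap<>();

    DedupingEntries(Iterator<SuggestEntry> source) {
        while (source.hasNext()) {
            SuggestEntry e = source.next();
            SuggestEntry seen = best.get(e.text);
            // Replace the stored entry only when the new one has a strictly higher weight.
            if (seen == null || e.weight > seen.weight) {
                best.put(e.text, e);
            }
        }
    }

    @Override
    public Iterator<SuggestEntry> iterator() {
        return best.values().iterator();
    }

    public static void main(String[] args) {
        List<SuggestEntry> raw = Arrays.asList(
            new SuggestEntry("lucene", 10),
            new SuggestEntry("solr", 5),
            new SuggestEntry("lucene", 90),
            new SuggestEntry("solr", 3));
        // Prints "lucene 90" then "solr 5".
        for (SuggestEntry e : new DedupingEntries(raw.iterator())) {
            System.out.println(e.text + " " + e.weight);
        }
    }
}
```

Because the wrapper only sees entries, any suggester fed from it would get the deduplicated stream without changes to its lookup path.
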

AnalyzingInfixSuggester needs duplicate handling

Key: LUCENE-6336
Project: Lucene - Core
Issue Type: Bug
Affects Versions: 4.10.3, 5.0
Reporter: Jan Høydahl
Labels: lookup, suggester
Attachments: LUCENE-6336.patch

Spinoff from LUCENE-5833 but otherwise unrelated.
I am using {{AnalyzingInfixSuggester}}, which is backed by a Lucene index and stores payload and score together with the suggest text.
I did some testing with Solr, producing the DocumentDictionary from an index with multiple documents containing the same text, but with random weights between 0 and 100. I then got duplicate, identical suggestions sorted by weight:
---etc all the way down to 0---
I also reproduced the same behavior in AnalyzingInfixSuggester directly. So there is a need for some duplicate removal here, either while building the local suggest index or during lookup. Only the highest weight suggestion for a given term should be returned.
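
For the "during lookup" option, a minimal sketch in plain Java follows. It is not the suggester's actual code: {{Hit}} and {{LookupDedup.topUnique}} are hypothetical names, and the input list is assumed to be already sorted by descending weight, so the first hit seen for a text is also its highest-weight one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for one lookup hit (suggest text plus weight). */
final class Hit {
    final String text;
    final long weight;

    Hit(String text, long weight) {
        this.text = text;
        this.weight = weight;
    }
}

final class LookupDedup {
    /** Keeps the first hit per text from a list assumed sorted by descending weight. */
    static List<Hit> topUnique(List<Hit> sortedByWeightDesc, int num) {
        List<Hit> unique = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (Hit hit : sortedByWeightDesc) {
            if (unique.size() >= num) {
                break; // enough distinct suggestions collected
            }
            // Set.add returns false when the text was already seen, so duplicates are skipped.
            if (seen.add(hit.text)) {
                unique.add(hit);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
            new Hit("lucene", 100),
            new Hit("lucene", 99),
            new Hit("solr", 50),
            new Hit("lucene", 0));
        // Prints "lucene 100" then "solr 50".
        for (Hit h : topUnique(hits, 2)) {
            System.out.println(h.text + " " + h.weight);
        }
    }
}
```

Note that with this approach the caller has to fetch more than {{num}} raw hits to end up with {{num}} distinct ones, which is extra work on every lookup compared with removing duplicates once when the suggest index is built.
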

Posted Jun 18, '16 at 9:49a