[ https://issues.apache.org/jira/browse/LUCENE-3038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021847#comment-13021847 ]

Robert Muir commented on LUCENE-3038:
-------------------------------------

This is a duplicate of LUCENE-3022 (since you are using the onlyLongestMatch=true option).

Can we discuss this over there, please, so the discussion is all in one place? Thanks for creating the issue.

DictionaryCompoundWordTokenFilter fails to create some tokens for final parts of words
--------------------------------------------------------------------------------------

Key: LUCENE-3038
URL: https://issues.apache.org/jira/browse/LUCENE-3038
Project: Lucene - Java
Issue Type: Bug
Components: Analysis
Affects Versions: 3.1, 4.0
Reporter: Filip Svendsen
Fix For: 3.1, 4.0

Attachments: LUCENE-3038.patch


DictionaryCompoundWordTokenFilter: Due to an off-by-one error, a word component placed last in a compound word will not get a token if its length is equal to the minimum sub-word length.
Example:
min sub-word length: 4
Dictionary: {"alfa", "beta"}
word: "alfabeta"
Created tokens: {"alfabeta", "alfa"}
Expected tokens: {"alfabeta", "alfa", "beta"}
I have a patch containing a bugfix and a test case that fails on versions 3.1 and 4.0 (and probably on everything in between, as well as on earlier versions).
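
To illustrate the kind of off-by-one error described above, here is a minimal, self-contained sketch. It is not the Lucene source and not the attached patch: the class CompoundSketch, the decompose method, its inclusiveBound flag, and the maximum sub-word length of 15 are all hypothetical names and values chosen for illustration. It assumes the bug comes from an exclusive upper bound on the start offset of candidate sub-words, which is one plausible way to get the behavior reported here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified stand-in for a dictionary-based decompounder.
// It is NOT the real DictionaryCompoundWordTokenFilter.
public class CompoundSketch {

    // Returns the original word followed by every dictionary entry found inside it.
    // inclusiveBound=false models the assumed bug, inclusiveBound=true the assumed fix.
    static List<String> decompose(String word, Set<String> dictionary,
                                  int minSubwordSize, int maxSubwordSize,
                                  boolean inclusiveBound) {
        List<String> tokens = new ArrayList<>();
        tokens.add(word); // the compound itself is always emitted

        // Last start offset at which a sub-word of minimum length still fits.
        int lastStart = word.length() - minSubwordSize;

        for (int start = 0;
             inclusiveBound ? start <= lastStart : start < lastStart;
             start++) {
            for (int len = minSubwordSize; len <= maxSubwordSize; len++) {
                if (start + len > word.length()) {
                    break; // candidate would run past the end of the word
                }
                String candidate = word.substring(start, start + len);
                if (dictionary.contains(candidate)) {
                    tokens.add(candidate);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> dictionary = Set.of("alfa", "beta");
        // Exclusive bound: start offset 4 is never tried, so "beta" is missed.
        System.out.println(decompose("alfabeta", dictionary, 4, 15, false));
        // Inclusive bound: start offset 4 is tried, so "beta" is found.
        System.out.println(decompose("alfabeta", dictionary, 4, 15, true));
    }
}
```

With the exclusive bound, the loop stops one position before the only offset where a final component of exactly the minimum length can begin, which matches the "alfabeta" example in the report.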

Posted: Apr 19, '11 at 9:48p
Active: Apr 19, '11 at 11:04p