Hello.

I'm trying to find something like an analyzer to solve the following
problem.

What I need is this: search on a query string step by step, trimming
the last character at each step. A small example:

The index contains: abc, abcdef, xyz
When searching on abcdefgh, the most relevant result should be abcdef, while
when searching on abcde, the best one is abc.

Thanks.

Sincerely,
Artyom Sokolov


  • AlexElba at Apr 21, 2009 at 10:48 pm
    Try using RegexQuery.



    --
    View this message in context: http://www.nabble.com/Appropriate-analyzer-tp23164855p23166323.html
    Sent from the Lucene - Java Users mailing list archive at Nabble.com.


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Chris Hostetter at Apr 28, 2009 at 10:36 pm
    : try to use RegexQuery

    Except that his input string is longer than the terms he wants to match
    on.

    It sounds like what you are looking for is essentially a simplified use
    case of the "longest matching sub-phrase" problem...
    http://www.nabble.com/Dictionary-lookup-possibilities-to22977277.html#a23087470

    ...except that you have the special case where (unless you simplified your
    example) you only care about the "longest matching prefix"

    You could write an analyzer that splits the input on each character and
    then appends each character's offset:

    Input: abcdef
    Output: a_1, b_2, c_3, d_4, e_5, f_6

    ...in which case you use that analyzer when indexing; at query time
    you use the same analyzer to build a BooleanQuery (instead of the
    PhraseQuery that QueryParser would build by default). Now a search for
    "abcdef" will match "abcde" with a higher score than "abcd", but it won't
    match "bcdef" at all.

    Out of curiosity: what's your specific use case? I've never heard of
    anyone wanting to match on something character-by-character like this
    (usually it's the reverse: people want "abcd" to match "abcde")




    -Hoss
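
To make the position-tagged token idea above concrete, here is a minimal sketch in plain Java. It does not use any Lucene classes: the class name PositionTaggedChars and the helpers tokenize and overlap are hypothetical, and the simple overlap count is only a stand-in for a BooleanQuery of optional clauses, whose real scoring involves more than counting matches.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java illustration only; no Lucene classes are used.
class PositionTaggedChars {

    // Hypothetical helper: "abc" -> [a_1, b_2, c_3]
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            tokens.add(text.charAt(i) + "_" + (i + 1));
        }
        return tokens;
    }

    // Counts how many query tokens also occur in the indexed term.
    // Assumption: this stands in for a BooleanQuery of optional clauses.
    static int overlap(String query, String indexedTerm) {
        Set<String> indexed = new HashSet<>(tokenize(indexedTerm));
        int matches = 0;
        for (String token : tokenize(query)) {
            if (indexed.contains(token)) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Query "abcdef" against the three terms from the example above.
        for (String term : new String[] {"abcde", "abcd", "bcdef"}) {
            System.out.println(term + " -> " + overlap("abcdef", term));
        }
    }
}
```

For the query "abcdef" this counts 5 shared tokens for "abcde", 4 for "abcd" and 0 for "bcdef", because every character of "bcdef" sits at a different position. Under this plain count a query for "abcde" would also hit an indexed "abcdef" on five tokens, so the count alone does not guarantee that the whole indexed term is a prefix of the query.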


  • Erick Erickson at Apr 22, 2009 at 12:45 pm
    *If* your terms are simple (that is, not wildcarded), you may get
    some joy from TermEnum. The idea here would be to find the
    longest term *already in your index* that satisfies your need and
    use that to form a simple TermQuery....

    Essentially, you would call TermEnum.skipTo on successively shorter
    strings until it returned true...

    Best
    Erick
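
The "successively shorter strings" lookup can be sketched roughly as follows. This is not Lucene code: a java.util.TreeSet stands in for the sorted term dictionary, and the class name LongestIndexedPrefix and the method find are hypothetical. The ceiling-then-compare step is an assumption about how one might confirm an exact hit after seeking to the first term at or after the candidate.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Plain-Java illustration only; a TreeSet stands in for the term dictionary.
class LongestIndexedPrefix {

    // Returns the longest prefix of query that is itself an indexed term,
    // or null if no prefix of the query is in the set.
    static String find(TreeSet<String> terms, String query) {
        for (int end = query.length(); end > 0; end--) {
            String candidate = query.substring(0, end);
            // Seek to the first term at or after the candidate,
            // then check whether it is an exact match.
            String firstAtOrAfter = terms.ceiling(candidate);
            if (candidate.equals(firstAtOrAfter)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(Arrays.asList("abc", "abcdef", "xyz"));
        System.out.println(find(terms, "abcdefgh"));
        System.out.println(find(terms, "abcde"));
    }
}
```

With the terms abc, abcdef and xyz, the lookup for "abcdefgh" stops at "abcdef" and the lookup for "abcde" stops at "abc", which matches the behaviour asked for in the original question. The term found this way could then be used to form the simple TermQuery mentioned above.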

Discussion Overview
group: java-user
categories: lucene
posted: Apr 21, '09 at 8:58p
active: Apr 28, '09 at 10:36p
posts: 4
users: 4
website: lucene.apache.org
