Hello,

I use the TermsComponent to provide autocomplete on my website. I use
terms.prefix and get the following results.

Searching for manch:

manchester city(10)
manchester united(2)

When a user searches for ches, I want the following results:

chesterfield united(13)
manchester united(2)

I want to match in the middle of words. How can I do that? I have tried
the NGram filter at index time,
but it doesn't seem to work with the TermsComponent.

My current configuration:

<fieldType name="suggestion" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>



--
View this message in context: http://lucene.472066.n3.nabble.com/Autocomplete-terms-middle-of-words-tp2878694p2878694.html
Sent from the Solr - User mailing list archive at Nabble.com.
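
To illustrate why a prefix lookup cannot return "manchester united" for "ches", here is a minimal Python sketch. It assumes each indexed term is the whole lowercased field value, which is what KeywordTokenizerFactory plus LowerCaseFilterFactory produce. The terms and counts come from the examples in the question; the helper names are hypothetical and this is not Solr code.

```python
# Hypothetical stand-in for the indexed terms and their document counts.
TERMS = {
    "manchester city": 10,
    "manchester united": 2,
    "chesterfield united": 13,
}

def prefix_matches(prefix):
    """Terms that start with the prefix, like a terms.prefix lookup."""
    p = prefix.lower()
    return sorted(t for t in TERMS if t.startswith(p))

def infix_matches(fragment):
    """Terms that contain the fragment anywhere (the behaviour asked for)."""
    f = fragment.lower()
    return sorted(t for t in TERMS if f in t)

print(prefix_matches("ches"))
print(infix_matches("ches"))
```

Note that a true infix match on ches would also return "manchester city", since that term contains the fragment too.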


  • Grijesh at Apr 29, 2011 at 10:51 am
    NGrams will work for you if you want to search in the middle of a word. You can
    also look at wildcard search for that.
    NGrams increase the size of the index, while wildcard queries are slow.


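
A rough way to see the index-size cost mentioned above: a Python sketch that enumerates the character n-grams of a single term. The ngrams helper is hypothetical and only approximates what an n-gram filter emits; the gram sizes 1 to 15 are chosen for illustration.

```python
def ngrams(term, min_size, max_size):
    """All character n-grams of term with min_size <= length <= max_size."""
    grams = []
    for size in range(min_size, min(max_size, len(term)) + 1):
        for start in range(len(term) - size + 1):
            grams.append(term[start:start + size])
    return grams

grams = ngrams("manchester", 1, 15)
print(len(grams))        # one term expands into many grams
print("ches" in grams)   # the middle-of-word fragment becomes a term
```

Every one of those grams is stored as an index term, which is where the extra index size comes from.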
  • roySolr at Apr 29, 2011 at 11:49 am
    OK, I tried NGrams. My configuration looks like this:

    <fieldType name="suggestion" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="15"/>
      </analyzer>
      <analyzer type="query">
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

    <field name="suggestionField" type="suggestion" indexed="true" stored="true"/>

    I then ran this query:

    http://localhost:8983/solr/terms?terms.fl=suggestionField&terms.prefix=chest

    Result:
    chest
    cheste
    chester


    The result is not what I expected. Is the query wrong?
  • Lboutros at Apr 29, 2011 at 11:55 am
    You could use EdgeNGramFilterFactory:

    http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory

    You should combine front and back n-gram filters in your analyzer:

    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="back"/>

    Is that better?

    Ludovic.
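
A minimal Python sketch of what chaining a front and a back edge n-gram filter would emit, assuming the second filter is applied to every gram produced by the first. The edge_ngrams and chained helpers are hypothetical and mirror only the minGramSize, maxGramSize and side parameters from the snippet above.

```python
def edge_ngrams(token, min_size, max_size, side):
    """Grams anchored at the front or the back of token."""
    sizes = range(min_size, min(max_size, len(token)) + 1)
    if side == "front":
        return [token[:n] for n in sizes]
    return [token[-n:] for n in sizes]

def chained(token, min_size=2, max_size=15):
    """Front grams first, then back grams of each front gram."""
    out = set()
    for front in edge_ngrams(token, min_size, max_size, "front"):
        out.update(edge_ngrams(front, min_size, max_size, "back"))
    return out

terms = chained("manchester")
print("ches" in terms, "manch" in terms)
```

Under that assumption the chain yields substrings from the middle of the word (such as ches), but the indexed terms are still fragments, not whole suggestions.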
  • Quentin Proust at Apr 29, 2011 at 12:03 pm
    You can do it without NGrams, with a query like this:

    http://localhost:8983/solr/terms?terms=true&terms.fl=suggestionField&terms.regex=(.*)chest(.*)&terms.regex.flag=case_insensitive
    In my case I had to URL-encode (.*), so replace it with %28.*%29 if needed.
    This uses a regex; I don't know whether it affects performance.
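
If the parentheses need encoding, one option is to let a library build the query string. A small Python sketch using the standard urllib.parse module; the host, handler path and field name are the ones from the query above, and `*` is deliberately left unencoded so the output matches the %28.*%29 form.

```python
from urllib.parse import quote, urlencode

params = {
    "terms": "true",
    "terms.fl": "suggestionField",
    "terms.regex": "(.*)chest(.*)",
    "terms.regex.flag": "case_insensitive",
}

# quote_via=quote percent-encodes reserved characters; safe="*" keeps
# the asterisk literal.
query = urlencode(params, quote_via=quote, safe="*")
url = "http://localhost:8983/solr/terms?" + query
print(url)
```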
  • roySolr at Apr 29, 2011 at 12:16 pm
    terms.regex doesn't work for me. Prefix works fine. I use Solr 1.4. Is it
    compatible?
  • Quentin Proust at Apr 29, 2011 at 2:06 pm
    @roySolr: terms.regex exists only from Solr 3.1 onward, so Solr 1.4 doesn't
    seem to be compatible.

    @ramires: Did you try with a space in your regex? Something like
    terms.regex=(.*) book (.*) (note the space before and after book). If that
    doesn't work, try replacing the space with %20. I haven't tried it, so I
    don't know whether it works.

  • Ramires at Apr 29, 2011 at 2:21 pm
    Hi,
    I already tried both %20 and " "; neither worked. Also, regex=(.*)(book)
    drops the spaces and returns merged results like:

    thebook
    asbook
    atbook
    songbook
    yearbook
  • Grijesh at Apr 29, 2011 at 5:21 pm
    Solr 1.4 does not support terms.regex, so you need to upgrade to
    Solr 3.1.
  • Ramires at May 2, 2011 at 6:16 am
    I already use Nutch trunk 4.0. My problem is with the space.
  • Ramires at Apr 29, 2011 at 1:28 pm
    Hi,
    I have a question about terms.regex. I am trying to find the terms that come
    before and after a word, but I can't send a blank (space) character. How can
    I pass it through?

    terms?terms=true&terms.fl=content&terms.regex=(.*)( book)&terms.regex.flag=case_insensitive&terms.limit=50
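
One possible explanation, offered as an assumption rather than a diagnosis: if the content field is tokenized on whitespace, no single indexed term contains a space, so a pattern with a space has nothing to match. The Python sketch below imitates that with re.fullmatch. The sample text and the matching_terms helper are hypothetical, and it is also assumed here that the pattern must match a whole term.

```python
import re

TEXT = "the book of songs and a songbook"

def matching_terms(terms, pattern):
    """Terms that the pattern matches in full, ignoring case."""
    rx = re.compile(pattern, re.IGNORECASE)
    return sorted({t for t in terms if rx.fullmatch(t)})

tokenized = TEXT.split()   # whitespace-tokenized field: one term per word
untokenized = [TEXT]       # keyword-style field: the whole value is one term

print(matching_terms(tokenized, r"(.*)( book)"))
print(matching_terms(tokenized, r"(.*)(book)"))
print(matching_terms(untokenized, r"(.*)( book)(.*)"))
```

In this sketch the pattern with a space only matches when the whole value is kept as one term, so the field's analyzer may matter more than how the space is encoded in the URL.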
  • roySolr at Apr 29, 2011 at 12:04 pm
    The words are now split into n-grams in the index. It looks like this:

    m
    ma
    man
    manc
    manch
    manche
    manches
    manchest
    mancheste
    manchester

    The TermsComponent does not see it as one word (manchester). It returns the
    n-grams themselves (m, ma, man, etc.).
  • Grijesh at Apr 29, 2011 at 11:57 am
    Hello,
    If you are using NGrams, do not use the TermsComponent. Query normally, like:
    http://localhost:8983/solr/select?q=suggestionField:chest

    It will give you the desired suggestions.
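
A rough Python sketch of the idea behind querying instead of listing terms, assuming index-time n-grams of sizes 1 to 15 and a lowercased query that is not split into grams: a document matches when the query string is one of its grams, and its stored value is returned whole. The in-memory index and helper names are hypothetical, not Solr APIs.

```python
def ngrams(term, min_size=1, max_size=15):
    """Set of character n-grams of term within the size bounds."""
    return {
        term[i:i + n]
        for n in range(min_size, min(max_size, len(term)) + 1)
        for i in range(len(term) - n + 1)
    }

# Hypothetical stored values, taken from the examples in this thread.
STORED = ["manchester city", "manchester united", "chesterfield united"]
INDEX = {value: ngrams(value.lower()) for value in STORED}

def suggest(query):
    """Stored values whose index-time grams contain the query."""
    q = query.lower()
    return sorted(value for value, grams in INDEX.items() if q in grams)

print(suggest("chest"))
print(suggest("united"))
```

In this sketch a query longer than the maximum gram size (for example the full 17-character "manchester united") matches nothing, so the upper gram bound limits how much the user can type.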

Discussion Overview
group: solr-user
category: lucene
posted: Apr 29, '11 at 10:18a
active: May 2, '11 at 6:16a
posts: 13
users: 5
website: lucene.apache.org...
