How does one search for words containing characters like # and +? I have tried
searching Solr with "#test" and "\#test", but my results always come up
with "test" and not "#test". Is there some kind of configuration option I
need to set in Solr?



  • François Schiettecatte at Jul 22, 2011 at 2:44 pm
    Check your analyzers to make sure that these characters are not being stripped out during tokenization. The URL for the analysis page in 3.3 is somewhere along the lines of:

    http://localhost/solr/admin/analysis.jsp?highlight=on

    And you should indeed be searching on "\#test".

    François
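
One thing that may be worth ruling out alongside the analyzer check: if the query is sent as part of a URL, both '#' and '+' have a meaning of their own there ('#' starts a fragment, '+' decodes to a space), so they need percent-encoding in addition to any backslash escaping. The sketch below is only an illustration. The base URL is modeled on the one above, the field name `text` is hypothetical, and adding '#' to the escaped set follows the advice in this reply rather than any official list.

```python
from urllib.parse import urlencode

# Operator characters of the Lucene classic query syntax, plus '#', which is
# added here only because the reply above suggests searching on "\#test".
SPECIAL = set('+-&|!(){}[]^"~*?:\\/#')

def escape_term(term):
    """Backslash-escape every character that appears in SPECIAL."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in term)

def select_url(base, field, term):
    """Build a /select URL; urlencode percent-encodes '#', '+' and '\\'."""
    params = {"q": f"{field}:{escape_term(term)}"}
    return base + "/select?" + urlencode(params)

# Hypothetical base URL and field name, for illustration only.
print(select_url("http://localhost/solr", "text", "#test"))
print(select_url("http://localhost/solr", "text", "c++"))
```

Even with correct escaping and encoding, the search will still match plain "test" if the field's analyzer removes '#' at index or query time.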
  • Shawn Heisey at Jul 22, 2011 at 2:50 pm

    On 7/22/2011 8:34 AM, Jason Toy wrote:
    > How does one search for words with characters like # and +. I have tried
    > searching solr with "#test" and "\#test" but all my results always come up
    > with "test" and not "#test". Is this some kind of configuration option I
    > need to set in solr?

    I would guess that your analysis chain (in schema.xml) includes
    something that removes and/or splits terms at non-alphanumeric
    characters. There are several components that do this, but
    WordDelimiterFilter is the one that comes to mind most readily. I've
    never used the StandardTokenizer, but I believe it might do something
    similar.

    Thanks,
    Shawn
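
To make the effect described above concrete, here is a toy comparison in plain Python. It is not Solr's WordDelimiterFilter or StandardTokenizer, only a stand-in that splits on non-alphanumeric characters, set against a simple whitespace split. The function names and the sample sentence are made up for the illustration.

```python
import re

def split_on_non_alnum(text):
    """Toy analyzer: split at every run of non-alphanumeric characters."""
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]

def split_on_whitespace(text):
    """Toy analyzer: split on whitespace only, keeping punctuation."""
    return [t.lower() for t in text.split()]

doc = "Tagged #test in C++"
print(split_on_non_alnum(doc))   # ['tagged', 'test', 'in', 'c']
print(split_on_whitespace(doc))  # ['tagged', '#test', 'in', 'c++']
```

With the first kind of chain, "#test" and "test" become the same indexed term, which would explain why both queries return the same results.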
  • François Schiettecatte at Jul 22, 2011 at 2:57 pm
    Adding to my previous reply, I just did a quick check on the 'text_en' and 'text_en_splitting' field types and they both strip leading '#'.

    Cheers

    François
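
If those field types strip the '#', one possible direction is a field type that tokenizes on whitespace only. The snippet below holds a sketch of such a schema.xml entry in a Python string and merely checks that it is well-formed XML. The field type name `text_keep_symbols` is hypothetical, and whether whitespace-only tokenization suits a given index is an assumption to verify on the analysis page.

```python
import xml.etree.ElementTree as ET

# Hypothetical fieldType for schema.xml: whitespace tokenization plus
# lowercasing, so a token such as "#test" is indexed with its '#'.
FIELD_TYPE = """
<fieldType name="text_keep_symbols" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
""".strip()

# Parsing only confirms well-formedness; it does not validate against Solr.
root = ET.fromstring(FIELD_TYPE)
classes = [el.get("class") for el in root.iter() if el.tag in ("tokenizer", "filter")]
print(classes)
```

Fields using a changed type need to be reindexed before queries for "#test" can match, since existing index terms were produced by the old analyzer.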

Discussion Overview
group: solr-user
category: lucene
posted: Jul 22, '11 at 2:35p
active: Jul 22, '11 at 2:57p
posts: 4
users: 3
website: lucene.apache.org...
