FAQ
Hi All,

I have a question regarding the indexing and searching for german
characters. For eg. when I search for the word "müller" also I want to
search for the word "mueller". How to achieve this in lucene.

Thanks,
supriya

--
Mit freundlichen Grüßen / Regards

Supriya Kumar Shyamal

Software Developer
tel +49 (30) 443 50 99 -22
fax +49 (30) 443 50 99 -99
email supriya.shyamal@artnology.com
___________________________
artnology GmbH
Milastr. 4
10437 Berlin
___________________________

http://www.artnology.com
__________________________________________________________________________

News / Aktuelle Projekte:
* artnology gewinnt Ausschreibung des Bundesministeriums des Innern:
Softwarelösung für die Verwaltung der Sammlung zeitgenössischer
Kunstwerke zur kulturellen Repräsentation des Bundes.

Projektreferenzen:
* Globaler eShop und Corporate-Site für Springer: www.springeronline.com
* E-Detailing-Portal für Novartis: www.interaktiv.novartis.de
* Service-Center-Plattform für Biogen: www.ms-life.de
* eCRM-System für Grünenthal: www.gruenenthal.com

___________________________________________________________________________


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Search Discussions

  • Mile Rosu at Jun 20, 2006 at 12:41 pm
    Hello Supriya,

    One possibility would be to search for both müller and mueller from the interface. It means you should "normalize" in some way the search query you are doing. This solution would not affect the content of the existing index (no reindexing needed).

    Greets,
    Mile

    -----Original Message-----
    From: Supriya Kumar Shyamal
    Sent: Tuesday, June 20, 2006 3:09 PM
    To: java-user@lucene.apache.org
    Subject: How to search for europian word with and without special characters

    Hi All,

    I have a question regarding the indexing and searching for german
    characters. For eg. when I search for the word "müller" also I want to
    search for the word "mueller". How to achieve this in lucene.

    Thanks,
    supriya

    --
    Mit freundlichen Grüßen / Regards

    Supriya Kumar Shyamal

    Software Developer
    tel +49 (30) 443 50 99 -22
    fax +49 (30) 443 50 99 -99
    email supriya.shyamal@artnology.com
    ___________________________
    artnology GmbH
    Milastr. 4
    10437 Berlin
    ___________________________

    http://www.artnology.com
    __________________________________________________________________________

    News / Aktuelle Projekte:
    * artnology gewinnt Ausschreibung des Bundesministeriums des Innern:
    Softwarelösung für die Verwaltung der Sammlung zeitgenössischer
    Kunstwerke zur kulturellen Repräsentation des Bundes.

    Projektreferenzen:
    * Globaler eShop und Corporate-Site für Springer: www.springeronline.com
    * E-Detailing-Portal für Novartis: www.interaktiv.novartis.de
    * Service-Center-Plattform für Biogen: www.ms-life.de
    * eCRM-System für Grünenthal: www.gruenenthal.com

    ___________________________________________________________________________


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Otis Gospodnetic at Jun 20, 2006 at 3:47 pm
    I think you'll want to write your own Analyzer + Tokenizer, detect tokens with umlauts, and then emit two tokens at the same position (think of them as synonyms), one being the original one with the umlaut, and the other one with the umlaut transformed according to the rules (e.g. ü -> ue). Hm, I wonder if GermanAnalyzer already does this... maybe, have a look.

    Otis

    ----- Original Message ----
    From: Supriya Kumar Shyamal <supriya.shyamal@artnology.com>
    To: java-user@lucene.apache.org
    Sent: Tuesday, June 20, 2006 8:09:18 AM
    Subject: How to search for europian word with and without special characters

    Hi All,

    I have a question regarding the indexing and searching for german
    characters. For eg. when I search for the word "müller" also I want to
    search for the word "mueller". How to achieve this in lucene.

    Thanks,
    supriya

    --
    Mit freundlichen Grüßen / Regards

    Supriya Kumar Shyamal

    Software Developer
    tel +49 (30) 443 50 99 -22
    fax +49 (30) 443 50 99 -99
    email supriya.shyamal@artnology.com
    ___________________________
    artnology GmbH
    Milastr. 4
    10437 Berlin
    ___________________________

    http://www.artnology.com
    __________________________________________________________________________

    News / Aktuelle Projekte:
    * artnology gewinnt Ausschreibung des Bundesministeriums des Innern:
    Softwarelösung für die Verwaltung der Sammlung zeitgenössischer
    Kunstwerke zur kulturellen Repräsentation des Bundes.

    Projektreferenzen:
    * Globaler eShop und Corporate-Site für Springer: www.springeronline.com
    * E-Detailing-Portal für Novartis: www.interaktiv.novartis.de
    * Service-Center-Plattform für Biogen: www.ms-life.de
    * eCRM-System für Grünenthal: www.gruenenthal.com

    ___________________________________________________________________________


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org





    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Chris Hostetter at Jun 20, 2006 at 7:45 pm
    take a look at the ISOLatin1AccentFilter .. it doesn't seem to do exactly
    what you want (replacing "ü" with "ue" .. it just uses "u") but it should
    give you an idea of what you can do.

    There was also a discussion recently about how you can use a modified
    version of this Filter at index time to get both versions of the word
    indexed with the same position...

    http://www.nabble.com/Question-about-special-characters-t1676416.html#a4545641



    : Date: Tue, 20 Jun 2006 14:09:18 +0200
    : From: Supriya Kumar Shyamal <supriya.shyamal@artnology.com>
    : Reply-To: java-user@lucene.apache.org
    : To: java-user@lucene.apache.org
    : Subject: How to search for europian word with and without special
    : characters
    :
    : Hi All,
    :
    : I have a question regarding the indexing and searching for german
    : characters. For eg. when I search for the word "müller" also I want to
    : search for the word "mueller". How to achieve this in lucene.
    :
    : Thanks,
    : supriya
    :
    : --
    : Mit freundlichen Grüßen / Regards
    :
    : Supriya Kumar Shyamal
    :
    : Software Developer
    : tel +49 (30) 443 50 99 -22
    : fax +49 (30) 443 50 99 -99
    : email supriya.shyamal@artnology.com
    : ___________________________
    : artnology GmbH
    : Milastr. 4
    : 10437 Berlin
    : ___________________________
    :
    : http://www.artnology.com
    : __________________________________________________________________________
    :
    : News / Aktuelle Projekte:
    : * artnology gewinnt Ausschreibung des Bundesministeriums des Innern:
    : Softwarelösung für die Verwaltung der Sammlung zeitgenössischer
    : Kunstwerke zur kulturellen Repräsentation des Bundes.
    :
    : Projektreferenzen:
    : * Globaler eShop und Corporate-Site für Springer: www.springeronline.com
    : * E-Detailing-Portal für Novartis: www.interaktiv.novartis.de
    : * Service-Center-Plattform für Biogen: www.ms-life.de
    : * eCRM-System für Grünenthal: www.gruenenthal.com
    :
    : ___________________________________________________________________________
    :
    :
    : ---------------------------------------------------------------------
    : To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    : For additional commands, e-mail: java-user-help@lucene.apache.org
    :



    -Hoss


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupjava-user @
categorieslucene
postedJun 20, '06 at 12:09p
activeJun 20, '06 at 7:45p
posts4
users4
websitelucene.apache.org

People

Translate

site design / logo © 2022 Grokbase