FAQ
Hi all,

I am trying to index documents by phrases (multiple words) in the text, and
want to get around the StandardAnalyzer for this field. (however, I will
still
use standardAnalyzer for the other fields in the same document).

so, how should I approach it? is there a way to construct a field by
directly
providing the terms without analyzer? (by tokenstream?) which class can
construct a set of keywords into a tokenstream?


Thank you.

Yuhan

Search Discussions

  • Anshum at Feb 15, 2011 at 4:15 am
    Hi Yuhan,

    By what I understand you are trying to construct an index and add a field to
    it which would not be analyzed. Am I correct? You could simply declare that
    field as
    new Field(...... , Index.NOT_ANALYZED)
    http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/document/Field.html

    Also, if you want to analyze it using an analyzer that is different from the
    one that you'd use for the other fields, you could use the perfieldanalyzer
    <snip>
    analyzer = new PerFieldAnalyzerWrapper(new StandardAnalyzer(
    Version.LUCENE_30));
    analyzer.addAnalyzer("anotherfield", new KeywordAnalyzer());
    </snip>

    In the above snip, I instantiate an analyzer which by default would use the
    StandardAnalyzer but for 'anotherfield' would use KeywordAnalyzer.

    Hope this helps you.

    --
    Anshum Gupta
    http://ai-cafe.blogspot.com

    On Tue, Feb 15, 2011 at 2:19 AM, Yuhan Zhang wrote:

    Hi all,

    I am trying to index documents by phrases (multiple words) in the text, and
    want to get around the StandardAnalyzer for this field. (however, I will
    still
    use standardAnalyzer for the other fields in the same document).

    so, how should I approach it? is there a way to construct a field by
    directly
    providing the terms without analyzer? (by tokenstream?) which class can
    construct a set of keywords into a tokenstream?


    Thank you.

    Yuhan
  • Ian Lea at Feb 15, 2011 at 9:39 am
    Or maybe WhitespaceAnalyzer. That would split a set of keywords into
    a token stream if the set looked like "word1 word2 word3", without any
    other processing on the keywords. Use via PerFieldAnalyzerWrapper as
    Anshum says.

    --
    Ian.

    On Tue, Feb 15, 2011 at 4:14 AM, Anshum wrote:
    Hi Yuhan,

    By what I understand you are trying to construct an index and add a field to
    it which would not be analyzed. Am I correct?  You could simply declare that
    field as
    new Field(...... , Index.NOT_ANALYZED)
    http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/document/Field.html

    Also, if you want to analyze it using an analyzer that is different from the
    one that you'd use for the other fields, you could use the perfieldanalyzer
    <snip>
    analyzer = new PerFieldAnalyzerWrapper(new StandardAnalyzer(
    Version.LUCENE_30));
    analyzer.addAnalyzer("anotherfield", new KeywordAnalyzer());
    </snip>

    In the above snip, I instantiate an analyzer which by default would use the
    StandardAnalyzer but for 'anotherfield' would use KeywordAnalyzer.

    Hope this helps you.

    --
    Anshum Gupta
    http://ai-cafe.blogspot.com

    On Tue, Feb 15, 2011 at 2:19 AM, Yuhan Zhang wrote:

    Hi all,

    I am trying to index documents by phrases (multiple words) in the text, and
    want to get around the StandardAnalyzer for this field. (however, I will
    still
    use standardAnalyzer for the other fields in the same document).

    so, how should I approach it? is there a way to construct a field by
    directly
    providing the terms without analyzer? (by tokenstream?) which class can
    construct a set of keywords into a tokenstream?


    Thank you.

    Yuhan
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: [email protected]
    For additional commands, e-mail: [email protected]

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupjava-user @
categorieslucene
postedFeb 14, '11 at 8:49p
activeFeb 15, '11 at 9:39a
posts3
users3
websitelucene.apache.org

3 users in discussion

Yuhan Zhang: 1 post Ian Lea: 1 post Anshum: 1 post

People

Translate

site design / logo © 2023 Grokbase