Grokbase Groups Lucene dev March 2009
Brazilian Analyzer doesn't remove stopwords when uppercase is given
-------------------------------------------------------------------

Key: LUCENE-1576
URL: https://issues.apache.org/jira/browse/LUCENE-1576
Project: Lucene - Java
Issue Type: Bug
Components: contrib/analyzers
Affects Versions: 2.3.3, 2.4.2, 2.9, 3.0
Environment: not applicable
Reporter: Douglas Campos


The order of the filters matters here: the lowercase token filter needs to be applied before the stopwords are removed. The current chain is:

result = new StopFilter( result, stoptable );
result = new BrazilianStemFilter( result, excltable );
// Convert to lowercase after stemming!
result = new LowerCaseFilter( result );

LowerCaseFilter must come before StopFilter, and therefore before BrazilianStemFilter as well.

I'll attach a patch at the end of the day; it's straightforward.
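
To illustrate why the order matters, here is a minimal, self-contained sketch. It does not use the Lucene classes: the lowerCase and removeStopWords helpers, the FilterOrderDemo class, and the three-word stop set are hypothetical stand-ins for LowerCaseFilter, StopFilter, and stoptable, and stemming is left out. It assumes the stop set holds lowercase entries and is matched case-sensitively, which is the behaviour this report describes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class FilterOrderDemo {
    // Hypothetical stop set; assumed to contain lowercase entries only.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("de", "a", "o"));

    // Stand-in for LowerCaseFilter: lowercases every token.
    static List<String> lowerCase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(t.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Stand-in for StopFilter: drops tokens that match the stop set exactly.
    static List<String> removeStopWords(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (!STOP_WORDS.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    // Order from the report: stop removal first, lowercase last.
    static List<String> stopThenLower(List<String> tokens) {
        return lowerCase(removeStopWords(tokens));
    }

    // Proposed order: lowercase first, then stop removal.
    static List<String> lowerThenStop(List<String> tokens) {
        return removeStopWords(lowerCase(tokens));
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("Rio", "DE", "Janeiro");
        System.out.println(stopThenLower(tokens)); // prints [rio, de, janeiro]
        System.out.println(lowerThenStop(tokens)); // prints [rio, janeiro]
    }
}
```

With the reported order, the uppercase "DE" slips past the stop set and is only lowercased afterwards; with lowercase applied first, it is removed as expected.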

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


  • Adriano Crestani (JIRA) at Mar 27, 2009 at 5:19 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689973#action_12689973 ]

    Adriano Crestani commented on LUCENE-1576:
    ------------------------------------------

    FYI, this topic was already discussed on this thread: http://markmail.org/thread/5wjjl6jx4yoxake5
  • Douglas Campos (JIRA) at Mar 27, 2009 at 5:25 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689975#action_12689975 ]

    Douglas Campos commented on LUCENE-1576:
    ----------------------------------------

    After reading this discussion, the next step is to provide the patches, right?
  • Michael McCandless (JIRA) at Mar 27, 2009 at 6:57 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12690005#action_12690005 ]

    Michael McCandless commented on LUCENE-1576:
    --------------------------------------------

    No need for a patch -- I see it in the thread. Thanks!
  • Michael McCandless (JIRA) at Mar 27, 2009 at 6:57 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michael McCandless reassigned LUCENE-1576:
    ------------------------------------------

    Assignee: Michael McCandless
  • Michael McCandless (JIRA) at Mar 27, 2009 at 6:57 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michael McCandless updated LUCENE-1576:
    ---------------------------------------

    Fix Version/s: 2.9
  • Michael McCandless (JIRA) at Mar 27, 2009 at 7:05 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michael McCandless resolved LUCENE-1576.
    ----------------------------------------

    Resolution: Fixed

    Thanks!
    Brazilian Analyzer doesn't remove stopwords when uppercase is given
    -------------------------------------------------------------------

    Key: LUCENE-1576
    URL: https://issues.apache.org/jira/browse/LUCENE-1576
    Project: Lucene - Java
    Issue Type: Bug
    Components: contrib/analyzers
    Affects Versions: 2.3.3, 2.4.2, 2.9, 3.0
    Environment: not applicable
    Reporter: Douglas Campos
    Assignee: Michael McCandless
    Fix For: 2.9

    Original Estimate: 0.25h
    Remaining Estimate: 0.25h


Discussion Overview
group: dev @
categories: lucene
posted: Mar 27, '09 at 4:55p
active: Mar 27, '09 at 7:05p
posts: 7
users: 1
website: lucene.apache.org
