Lazy Atomic Loading Stopwords in SmartCN
-----------------------------------------

Key: LUCENE-1965
URL: https://issues.apache.org/jira/browse/LUCENE-1965
Project: Lucene - Java
Issue Type: Improvement
Components: contrib/analyzers
Affects Versions: 2.9
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Fix For: 3.0
Attachments: LUCENE-1965.patch

The default constructor in SmartChineseAnalyzer loads the default (jar-embedded) stopwords every time it is invoked.
The default stopwords should instead be loaded lazily and atomically, exactly once, into an unmodifiable set that all instances share.
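
One common way to get lazy, atomic, load-once behavior in Java is the initialization-on-demand holder idiom, which relies on the JVM's guarantee that a class is initialized at most once and in a thread-safe way. The sketch below is only an illustration of that idiom, not the contents of the attached patch: the class names, the LOAD_COUNT counter, and the in-memory stand-in for the jar-embedded stopword file are all assumptions made to keep the example self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an analyzer class that owns a default stopword set. */
final class StopwordDefaults {

    // Counts how often the loader ran; present only to make the behavior observable.
    static final AtomicInteger LOAD_COUNT = new AtomicInteger();

    // Assumed stand-in for the jar-embedded file: one entry per line, "//" starts a comment.
    private static final String EMBEDDED = "// sample entries\n,\n.\n\u3002\n";

    private StopwordDefaults() {}

    /** Returns the shared, unmodifiable default set; the first call triggers the load. */
    static Set<String> getDefaultStopSet() {
        return DefaultSetHolder.DEFAULT_STOP_SET;
    }

    // The JVM initializes this nested class on first access only, and class
    // initialization is thread-safe, so load() runs at most once.
    private static final class DefaultSetHolder {
        static final Set<String> DEFAULT_STOP_SET = load();

        private static Set<String> load() {
            LOAD_COUNT.incrementAndGet();
            Set<String> words = new HashSet<>();
            try (BufferedReader in = new BufferedReader(new StringReader(EMBEDDED))) {
                String line;
                while ((line = in.readLine()) != null) {
                    String word = line.trim();
                    if (!word.isEmpty() && !word.startsWith("//")) {
                        words.add(word);
                    }
                }
            } catch (IOException e) {
                // An in-memory reader should not fail; a real resource load could.
                throw new IllegalStateException("unable to load default stopword set", e);
            }
            return Collections.unmodifiableSet(words);
        }
    }
}
```

With this shape, a default constructor can simply ask for the shared set, so constructing many analyzers costs one load in total, and callers cannot mutate the shared defaults.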



--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


  • Simon Willnauer (JIRA) at Oct 8, 2009 at 6:33 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Simon Willnauer updated LUCENE-1965:
    ------------------------------------

    Attachment: LUCENE-1965.patch

    Attached patch.
  • Simon Willnauer (JIRA) at Oct 8, 2009 at 6:35 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Simon Willnauer updated LUCENE-1965:
    ------------------------------------

    Priority: Trivial (was: Major)
  • Robert Muir (JIRA) at Oct 8, 2009 at 6:43 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763621#action_12763621 ]

    Robert Muir commented on LUCENE-1965:
    -------------------------------------

    Simon, everything is OK, but I have one comment:

    I think the new test, testChineseStopWordsNull, is a duplicate of the one above it. Here is the context:
    {code}
    /*
     * Punctuation is handled in a strange way if you disable stopwords
     * In this example the IDEOGRAPHIC FULL STOP is converted into a comma.
     * if you don't supply (true) to the constructor, or use a different stopwords list,
     * then punctuation is indexed.
     */
    public void testChineseStopWordsOff() throws Exception {
      Analyzer ca = new SmartChineseAnalyzer(false); /* doesnt load stopwords */
      String sentence = "我购买了道具和服装。";
      String result[] = { "我", "购买", "了", "道具", "和", "服装", "," };
      assertAnalyzesTo(ca, sentence, result);
    }

    public void testChineseStopWordsNull() throws IOException {
      Analyzer ca = new SmartChineseAnalyzer(false); /* sets stopwords to empty set */
      String sentence = "我购买了道具和服装。";
      String result[] = { "我", "购买", "了", "道具", "和", "服装", "," };
      assertAnalyzesTo(ca, sentence, result);
      assertAnalyzesToReuse(ca, sentence, result);
    }
    {code}
  • Simon Willnauer (JIRA) at Oct 8, 2009 at 7:09 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Simon Willnauer updated LUCENE-1965:
    ------------------------------------

    Attachment: LUCENE-1965.patch

    Thanks Robert, good catch! I was adding a test that passes null to the constructor, but apparently I never finished it.
    I merged it into testChineseStopWordsOff().

    Patch attached.

  • Robert Muir (JIRA) at Oct 8, 2009 at 7:13 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763646#action_12763646 ]

    Robert Muir commented on LUCENE-1965:
    -------------------------------------

    Simon, cool. I like it now. I think it's a good improvement, the same as was done for Persian and Arabic. Thanks :)
  • Simon Willnauer (JIRA) at Oct 8, 2009 at 7:34 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Simon Willnauer closed LUCENE-1965.
    -----------------------------------

    Resolution: Fixed

    Committed in r823285.

    Thanks, Robert, for reviewing.
