make it easier to access default stopwords for language analyzers
-----------------------------------------------------------------

Key: LUCENE-1967
URL: https://issues.apache.org/jira/browse/LUCENE-1967
Project: Lucene - Java
Issue Type: Improvement
Components: contrib/analyzers
Reporter: Robert Muir
Priority: Minor


DM Smith made the following comment: sometimes it is hard to dig out the stop set from the analyzers.

Looking around, some of these analyzers store their default stopword lists in very different ways.
One idea is to generalize what Simon did in LUCENE-1965 and LUCENE-1962:
store all stopword lists as .txt files in the resources folder, and expose the default set through an accessor such as:

{code}
/**
* Returns an unmodifiable instance of the default stop-words set.
* @return an unmodifiable instance of the default stop-words set.
*/
public static Set<String> getDefaultStopSet()
{code}
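To make the proposal concrete, here is a minimal sketch of how such an accessor could be backed by a plain-text word list. It is not taken from any Lucene patch: the class name ExampleStopwords, the helper loadStopSet, the in-memory stand-in for the .txt resource, and the convention that lines starting with '#' are comments are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical analyzer-like class for illustration; not part of Lucene. */
final class ExampleStopwords {

    // Stand-in for the contents of a stopwords .txt resource file (assumption).
    // A real analyzer would more likely open the file from its resources folder.
    private static final String DEFAULT_STOPWORDS_TXT =
        "# comment lines and blank lines are skipped\n"
        + "a\n"
        + "an\n"
        + "\n"
        + "the\n";

    /** Lazy holder: the default set is parsed once, on first access. */
    private static final class DefaultSetHolder {
        static final Set<String> DEFAULT_STOP_SET =
            loadStopSet(new StringReader(DEFAULT_STOPWORDS_TXT));
    }

    private ExampleStopwords() {}

    /** Returns an unmodifiable instance of the default stop-words set. */
    public static Set<String> getDefaultStopSet() {
        return DefaultSetHolder.DEFAULT_STOP_SET;
    }

    /** Reads one stopword per line, skipping blank lines and lines starting with '#'. */
    static Set<String> loadStopSet(Reader source) {
        Set<String> words = new LinkedHashSet<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty() && !word.startsWith("#")) {
                    words.add(word);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Wrap the set so callers cannot alter the shared default.
        return Collections.unmodifiableSet(words);
    }

    public static void main(String[] args) {
        // Prints the default stop set loaded from the stand-in word list.
        System.out.println(getDefaultStopSet());
    }
}
```

With this shape, callers would ask any analyzer for its defaults the same way, for example ExampleStopwords.getDefaultStopSet(), and an attempt to modify the returned set would throw UnsupportedOperationException.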


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


  • Simon Willnauer (JIRA) at Oct 9, 2009 at 2:00 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Simon Willnauer reassigned LUCENE-1967:
    ---------------------------------------

    Assignee: Simon Willnauer
  • Simon Willnauer (JIRA) at Oct 9, 2009 at 2:00 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764025#action_12764025 ]

    Simon Willnauer commented on LUCENE-1967:
    -----------------------------------------

    Thanks Robert for bringing this up in a general context. I will take care of it soon.
  • Robert Muir (JIRA) at Jan 8, 2010 at 5:33 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798087#action_12798087 ]

    Robert Muir commented on LUCENE-1967:
    -------------------------------------

    Simon, can I close this? I think you have fixed it with LUCENE-2034.
  • Simon Willnauer (JIRA) at Jan 8, 2010 at 7:14 pm
    [ https://issues.apache.org/jira/browse/LUCENE-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Simon Willnauer closed LUCENE-1967.
    -----------------------------------

    Resolution: Fixed

    Incorporated in LUCENE-2034.

Discussion Overview
group: dev @
categories: lucene
posted: Oct 9, '09 at 1:56p
active: Jan 8, '10 at 7:14p
posts: 5
users: 1
website: lucene.apache.org
