Stop storing TermsEnum in CloseableThreadLocal inside Terms instance
--------------------------------------------------------------------

Key: LUCENE-3562
URL: https://issues.apache.org/jira/browse/LUCENE-3562
Project: Lucene - Java
Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 4.0


We have sugar methods in Terms.java (docFreq, totalTermFreq, docs,
docsAndPositions) that use a saved thread-private TermsEnum to do the
lookups.

But on apps that send many threads through Lucene, and/or have many
segments, this can add up to a lot of RAM (roughly one saved enum per
thread per segment), especially if the codec impl holds onto stuff.

Also, Terms has a close method (closes the CloseableThreadLocal) which
must be called, but we fail to do so in some places.

These saved enums are the cause of the recent OOME in TestNRTManager
(TestNRTManager.testNRTManager -seed
2aa27e1aec20c4a2:-4a5a5ecf46837d0e:-7c4f651f1f0b75d7 -mult 3
-nightly).

Really, sharing these enums is a holdover from before Lucene queries
would share state (i.e., save the TermState from the first pass, and use
it later to pull enums, get docFreq, etc.). It's not helpful anymore,
and it can use gobs of RAM, so I'd like to remove it.
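
To make the retention pattern concrete, here is a minimal sketch using
hypothetical stand-in classes (FakeTermsEnum, CachingTerms); it is not
Lucene's code and only models how a per-Terms thread-local enum multiplies
across threads and segments. The 1 KB buffer size is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model of the pattern described above; not Lucene classes. */
final class TermsEnumCacheSketch {

    /** Stand-in for a per-segment enum whose codec impl holds onto buffers. */
    static final class FakeTermsEnum {
        final byte[] codecState = new byte[1024]; // assumed size, for illustration
    }

    /** Stand-in for a per-segment Terms that saves one enum per thread. */
    static final class CachingTerms {
        final AtomicInteger enumsCreated = new AtomicInteger();

        // One enum is lazily created for every thread that touches this Terms.
        private final ThreadLocal<FakeTermsEnum> cachedEnum =
            ThreadLocal.withInitial(() -> {
                enumsCreated.incrementAndGet();
                return new FakeTermsEnum();
            });

        /** Sugar-style lookup that goes through the saved thread-private enum. */
        int docFreq(String term) {
            FakeTermsEnum termsEnum = cachedEnum.get();
            return termsEnum.codecState.length > 0 ? 1 : 0; // placeholder result
        }
    }

    /** Runs `threads` workers over `segments` Terms and counts saved enums. */
    static int cachedEnums(int threads, int segments) {
        List<CachingTerms> segmentTerms = new ArrayList<>();
        for (int i = 0; i < segments; i++) {
            segmentTerms.add(new CachingTerms());
        }
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (CachingTerms terms : segmentTerms) {
                    terms.docFreq("lucene");
                }
            });
            workers.add(worker);
            worker.start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int total = 0;
        for (CachingTerms terms : segmentTerms) {
            total += terms.enumsCreated.get();
        }
        return total; // one enum per (thread, segment) pair
    }
}
```

In this model, cachedEnums(4, 5) returns 20: every thread parks its own enum
in every segment's Terms. With long-lived pooled threads those enums would
stay reachable until the thread-local is explicitly cleared.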


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


  • Michael McCandless (Updated) (JIRA) at Nov 5, 2011 at 8:57 pm
    [ https://issues.apache.org/jira/browse/LUCENE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michael McCandless updated LUCENE-3562:
    ---------------------------------------

    Attachment: LUCENE-3562.patch

    Patch.
  • Uwe Schindler (Commented) (JIRA) at Nov 5, 2011 at 10:19 pm
    [ https://issues.apache.org/jira/browse/LUCENE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144852#comment-13144852 ]

    Uwe Schindler commented on LUCENE-3562:
    ---------------------------------------

    +1
  • Simon Willnauer (Commented) (JIRA) at Nov 14, 2011 at 10:19 am
    [ https://issues.apache.org/jira/browse/LUCENE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149531#comment-13149531 ]

    Simon Willnauer commented on LUCENE-3562:
    -----------------------------------------

    mike I think you should commit this - patch looks good to me
  • Michael McCandless (Updated) (JIRA) at Nov 17, 2011 at 4:43 pm
    [ https://issues.apache.org/jira/browse/LUCENE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michael McCandless updated LUCENE-3562:
    ---------------------------------------

    Attachment: LUCENE-3562.patch

    New patch; also cuts over MultiPhraseQuery to save the TermStates from weight -> scorer, and optimizes BlockTree's TermsEnum to reduce cost of init + seekExact only usages.

    I think it's ready!
  • Robert Muir (Commented) (JIRA) at Nov 17, 2011 at 5:33 pm
    [ https://issues.apache.org/jira/browse/LUCENE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152151#comment-13152151 ]

    Robert Muir commented on LUCENE-3562:
    -------------------------------------

    +1
  • Michael McCandless (Resolved) (JIRA) at Nov 17, 2011 at 5:37 pm
    [ https://issues.apache.org/jira/browse/LUCENE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Michael McCandless resolved LUCENE-3562.
    ----------------------------------------

    Resolution: Fixed
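
The Nov 17 patch note about saving the TermStates from weight -> scorer can
be illustrated with a small sketch. All names here (FakeTermState,
FakeTermsDict, seeksForTwoPasses) are hypothetical stand-ins rather than
Lucene's API; the point is only that a state saved on the first pass lets the
second pass skip the dictionary lookup.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of reusing a saved term state across two passes. */
final class TermStateReuseSketch {

    /** Stand-in for the saved result of a term lookup. */
    static final class FakeTermState {
        final int docFreq;
        final long postingsOffset;

        FakeTermState(int docFreq, long postingsOffset) {
            this.docFreq = docFreq;
            this.postingsOffset = postingsOffset;
        }
    }

    /** Stand-in for a per-segment terms dictionary that counts its seeks. */
    static final class FakeTermsDict {
        private final Map<String, FakeTermState> entries = new HashMap<>();
        int seeks = 0;

        void add(String term, int docFreq, long postingsOffset) {
            entries.put(term, new FakeTermState(docFreq, postingsOffset));
        }

        /** The costly path: look the term up in the dictionary. */
        FakeTermState seekExact(String term) {
            seeks++;
            return entries.get(term);
        }

        /** The cheap path: the caller already holds the state, so no lookup. */
        long postingsFor(FakeTermState state) {
            return state.postingsOffset;
        }
    }

    /** Returns how many dictionary seeks the two passes needed. */
    static int seeksForTwoPasses(boolean reuseState) {
        FakeTermsDict dict = new FakeTermsDict();
        dict.add("lucene", 42, 1000L); // made-up statistics

        // First pass (the "weight"): seek once to read statistics.
        FakeTermState saved = dict.seekExact("lucene");

        // Second pass (the "scorer"): reuse the saved state, or seek again.
        FakeTermState forScoring = reuseState ? saved : dict.seekExact("lucene");
        dict.postingsFor(forScoring);

        return dict.seeks;
    }
}
```

Here seeksForTwoPasses(true) performs 1 seek and seeksForTwoPasses(false)
performs 2, which is why a long-lived saved enum is no longer needed once
queries carry the term state themselves.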

Discussion Overview
group: dev @
categories: lucene
posted: Nov 5, '11 at 8:55p
active: Nov 17, '11 at 5:37p
posts: 7
users: 1
website: lucene.apache.org
