FAQ
I am running a simple search and, after profiling my application with
NetBeans, I see constant heap growth and eventually a server (Tomcat)
crash due to an "out of memory" error. The thread count also keeps
increasing, and most of the threads are in the "wait" state.

Please let me know what I am doing wrong so that I can avoid the server
crash. I am using Lucene 2.4.0.


IndexSearcher indexSearcher = IndexSearcherFactory.getInstance().getIndexSearcher();

// Create the query and search
QueryParser queryParser = new QueryParser("contents", new StandardAnalyzer());
Query query = queryParser.parse(searchCriteria);

TermsFilter categoryFilter = null;

// Create the filter if it is needed.
if (filter != null) {
    Term aTerm = new Term(Constants.WATCH_LIST_TYPE_TERM);
    categoryFilter = new TermsFilter();
    for (int i = 0; i < filter.length; i++) {
        aTerm = aTerm.createTerm(filter[i]);
        categoryFilter.addTerm(aTerm);
    }
}

// Create sort criteria
SortField[] sortFields = new SortField[2];
SortField watchList = new SortField(Constants.WATCH_LIST_TYPE_TERM, SortField.STRING);
SortField score = SortField.FIELD_SCORE;
if (sortByWatchList) {
    sortFields[0] = watchList;
    sortFields[1] = score;
} else {
    sortFields[0] = score;
    sortFields[1] = watchList;
}
Sort sort = new Sort(sortFields);

// Collect results
TopDocs topDocs = indexSearcher.search(query, categoryFilter, Constants.MAX_HITS, sort);
ScoreDoc[] scoreDoc = topDocs.scoreDocs;
int numDocs = scoreDoc.length;
if (numDocs > 0) results = scoreDoc;

--
View this message in context: http://www.nabble.com/Memory-Leak--tp22663917p22663917.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


  • Michael McCandless at Mar 23, 2009 at 6:08 pm
    Are you not closing the IndexSearcher?

    Mike

  • Chetan Shah at Mar 23, 2009 at 6:21 pm
    No, I get my searcher from a singleton, and it is kept throughout the
    application.



  • Chetan Shah at Mar 23, 2009 at 6:23 pm
    After reading this forum post:
    http://www.nabble.com/Lucene-Memory-Leak-tt19276999.html#a19364866

    I created a singleton for the StandardAnalyzer too, but the problem still
    persists.

    I now have two singletons: one for the StandardAnalyzer and one for the
    IndexSearcher.

    The code is as follows:

    package watchlistsearch.core;

    import java.io.IOException;

    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.RAMDirectory;

    import watchlistsearch.utils.Constants;

    public class IndexSearcherFactory {

        private static IndexSearcherFactory instance = null;

        private IndexSearcher indexSearcher;

        private IndexSearcherFactory() {
        }

        public static IndexSearcherFactory getInstance() {
            if (IndexSearcherFactory.instance == null) {
                IndexSearcherFactory.instance = new IndexSearcherFactory();
            }
            return IndexSearcherFactory.instance;
        }

        public IndexSearcher getIndexSearcher() throws IOException {
            if (this.indexSearcher == null) {
                Directory directory = new RAMDirectory(Constants.INDEX_DIRECTORY);
                indexSearcher = new IndexSearcher(directory);
            }
            return this.indexSearcher;
        }
    }

    ---------------------------------------------------------------

    package watchlistsearch.core;

    import java.io.IOException;

    import org.apache.log4j.Logger;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;

    public class AnalyzerFactory {

        private static AnalyzerFactory instance = null;

        private StandardAnalyzer standardAnalyzer;

        Logger logger = Logger.getLogger(AnalyzerFactory.class);

        private AnalyzerFactory() {
        }

        public static AnalyzerFactory getInstance() {
            if (AnalyzerFactory.instance == null) {
                AnalyzerFactory.instance = new AnalyzerFactory();
            }
            return AnalyzerFactory.instance;
        }

        public StandardAnalyzer getStandardAnalyzer() throws IOException {
            if (this.standardAnalyzer == null) {
                this.standardAnalyzer = new StandardAnalyzer();
                logger.debug("StandardAnalyzer initialized..");
            }
            return this.standardAnalyzer;
        }
    }
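    As an aside, lazy getters like the ones above are not thread-safe: two
    concurrent servlet requests could both see null and each construct a
    RAMDirectory, doubling memory use. A minimal sketch of a race-free
    alternative using the initialization-on-demand holder idiom (the class
    below is illustrative, not from this thread):

```java
// Initialization-on-demand holder idiom: the JVM guarantees the nested
// Holder class is initialized exactly once, on first use, so getInstance()
// needs no locking and the expensive resource can never be built twice.
class SearcherHolder {

    private SearcherHolder() {
        // construct the expensive shared resource here (e.g. the searcher)
    }

    private static class Holder {
        static final SearcherHolder INSTANCE = new SearcherHolder();
    }

    public static SearcherHolder getInstance() {
        return Holder.INSTANCE;
    }
}
```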

  • Michael McCandless at Mar 23, 2009 at 7:00 pm
    Hmm... after how many queries do you see the crash?

    Can you post the full OOME stack trace?

    You're using a RAMDirectory to hold the entire index... how large is
    your index?

    Mike

  • Chetan Shah at Mar 23, 2009 at 7:37 pm
    The stack trace is attached:
    http://www.nabble.com/file/p22667542/dump

    The file sizes are:
    _30.cfx - 1462 KB
    _32.cfs - 3432 KB
    _30.cfs - 645 KB






  • Matthew Hall at Mar 23, 2009 at 7:48 pm
    Perhaps this is a simple question, but looking at your stack trace I'm
    not seeing where it was set during the Tomcat initialization, so here goes:

    Are you setting the JVM's heap size anywhere in your Tomcat startup?

    If not, that could well be part of your issue. The default JVM heap size
    varies from platform to platform, so your Windows-based installation of
    Tomcat simply might not have enough heap available to completely
    instantiate your RAMDirectory.

    So, to start: what is your heap currently set to for Tomcat?

    Secondly, if you increase it to a more reasonable value (say 512 MB
    or 1 GB), do you still run into this issue?

    Matt
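    For reference, one common way to raise Tomcat's heap is a setenv script
    that catalina.sh sources from $CATALINA_HOME/bin if present. The file
    name is standard Tomcat convention, but the values below are only
    illustrative, not taken from this thread:

```shell
# setenv.sh -- sourced by catalina.sh when present in $CATALINA_HOME/bin.
# Heap values here are illustrative; tune them to the index size.
CATALINA_OPTS="$CATALINA_OPTS -Xms256m -Xmx512m"
export CATALINA_OPTS
```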

    Chetan Shah wrote:
    The stack trace is attached.
    http://www.nabble.com/file/p22667542/dump dump


    The file size of
    _30.cfx - 1462KB
    _32.cfs - 3432KB
    _30.cfs - 645KB


    The source code of WatchListHTMLUtilities.getHTMLTitle is as follows :

    File f = new File(htmlFileName);
    FileInputStream fis = new FileInputStream(f);
    org.apache.lucene.demo.html.HTMLParser parser = new HTMLParser(fis);
    String title = parser.getTitle();
    fis.close();
    fis = null;
    f = null;
    return title;





  • Chetan Shah at Mar 23, 2009 at 8:20 pm
    I am using the default heap size, which according to NetBeans is around
    65 MB.

    If the RAMDirectory were not initialized correctly, how would I be getting
    valid search results? I am able to execute searches for quite some time
    before I get the OOME.

    Does that make sense? Or maybe I am missing something; please let me know.



  • Michael McCandless at Mar 23, 2009 at 8:28 pm
    Is there anything else in this JRE?

    65 MB ought to be plenty for what you are trying to do with just Lucene,
    I think.

    Though to differentiate "you are not giving Lucene enough RAM" from "you
    truly have a memory leak", you should try increasing the heap size to
    something absurdly big (256 MB?) and then see if you can still hit the
    OOME. If you do get the OOME, it's a real leak, and I think the next step
    after that is to get a heap dump to see what's using all the RAM.

    Mike
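    To get such a heap dump, two common options. Both the JVM flag and the
    jmap tool are standard JDK features, but the paths and the pid below are
    placeholders:

```shell
# Have the JVM write a heap dump automatically when an OutOfMemoryError
# is thrown (dump lands in the directory given by -XX:HeapDumpPath):
CATALINA_OPTS="$CATALINA_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp"
export CATALINA_OPTS

# Or snapshot a running JVM by pid (placeholder) with the JDK's jmap:
#   jmap -dump:format=b,file=/tmp/heap.hprof <pid>
```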

    Chetan Shah wrote:
    I am using the default heap size which according to Netbeans is
    around 65MB.

    If the RAM directory was not initialized correctly, how am I getting
    valid
    search results? I am able to execute searches for quite some time
    before I
    get OOME.

    Makes Sense? Or Maybe I am missing something, please let me know.



    Matthew Hall-7 wrote:
    Perhaps this is a simple question, but looking at your stack trace,
    I'm
    not seeing where it was set during the tomcat initialization, so here
    goes:

    Are you setting up the jvm's heap size during your Tomcat
    initialization
    somewhere?

    If not, that very well could be part of your issue, as the standard
    JVM
    heapsize varies from platform to platform, so your windows based
    installation of tomcat simply might not have enough JVM Heap
    available
    to completely instantiate your RAMDirectory.

    So, to start what is your heap currently set at for tomcat?

    Secondly, if you try to increase it to a more reasonable value (say
    512M
    or 1G) do you still run into this issue?

    Matt

    Chetan Shah wrote:
    The stack trace is attached.
    http://www.nabble.com/file/p22667542/dump dump


    The file size of
    _30.cfx - 1462KB
    _32.cfs - 3432KB
    _30.cfs - 645KB


    The source code of WatchListHTMLUtilities.getHTMLTitle is as
    follows :

    File f = new File(htmlFileName);
    FileInputStream fis = new FileInputStream(f);
    org.apache.lucene.demo.html.HTMLParser parser = new
    HTMLParser(fis);
    String title = parser.getTitle();
    fis.close();
    fis = null;
    f = null;
    return title;





    Michael McCandless-2 wrote:
    Hmm... after how many queries do you see the crash?

    Can you post the full OOME stack trace?

    You're using a RAMDirectory to hold the entire index... how large
    is
    your index?

    Mike

    Chetan Shah wrote:

    After reading this forum post :
    http://www.nabble.com/Lucene-Memory-Leak-tt19276999.html#a19364866

    I created a Singleton For Standard Analyzer too. But the problem
    still
    persists.

    I have 2 singletons now. 1 for Standard Analyzer and other for
    IndexSearcher.

    The code is as follows :

    package watchlistsearch.core;

    import java.io.IOException;

    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.RAMDirectory;

    import watchlistsearch.utils.Constants;

    public class IndexSearcherFactory {

    private static IndexSearcherFactory instance = null;

    private IndexSearcher indexSearcher;

    private IndexSearcherFactory() {

    }

    public static IndexSearcherFactory getInstance() {

    if (IndexSearcherFactory.instance == null) {
    IndexSearcherFactory.instance = new IndexSearcherFactory();
    }

    return IndexSearcherFactory.instance;

    }

    public IndexSearcher getIndexSearcher() throws IOException {

    if (this.indexSearcher == null) {
    Directory directory = new
    RAMDirectory(Constants.INDEX_DIRECTORY);
    indexSearcher = new IndexSearcher(directory);
    }

    return this.indexSearcher;
    }

    }



    package watchlistsearch.core;

    import java.io.IOException;

    import org.apache.log4j.Logger;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;


    ---------------------------------------------------------------

    public class AnalyzerFactory {

    private static AnalyzerFactory instance = null;

    private StandardAnalyzer standardAnalyzer;

    Logger logger = Logger.getLogger(AnalyzerFactory.class);

    private AnalyzerFactory() {

    }

    public static AnalyzerFactory getInstance() {

    if (AnalyzerFactory.instance == null) {
    AnalyzerFactory.instance = new AnalyzerFactory();
    }

    return AnalyzerFactory.instance;

    }

    public StandardAnalyzer getStandardAnalyzer() throws
    IOException {

    if (this.standardAnalyzer == null) {
    this.standardAnalyzer = new StandardAnalyzer();
    logger.debug("StandardAnalyzer Initialized..");

    }

    return this.standardAnalyzer;
    }

    }

    --
    View this message in context:
    http://www.nabble.com/Memory-Leak--tp22663917p22666121.html
    Sent from the Lucene - Java Users mailing list archive at
    Nabble.com.


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org

  • Chetan Shah at Mar 24, 2009 at 5:22 pm
    After some more research I discovered that the following code snippet
    seems to be the culprit. I have to call this to get the "title" of the
    indexed HTML page, and it is called 10 times because I display 10 results
    on a page.

    Any suggestions on how to achieve this without the OOME issue?


    File f = new File(htmlFileName);
    FileInputStream fis = new FileInputStream(f);
    try {
        HTMLParser parser = new HTMLParser(fis);
        return parser.getTitle();
    } finally {
        // Close the stream even when parsing throws; nulling the locals
        // (as I originally did "for sanity") has no effect on GC here.
        fis.close();
    }


  • Michael McCandless at Mar 24, 2009 at 6:03 pm
    Odd. I don't know of any memory leaks w/ the demo HTMLParser, though
    it's doing some fairly scary stuff in its getReader() method.
    E.g. it spawns a new thread every time you run it. And it's parsing
    the entire HTML document even though you only want the title.

    You may want to switch to a better-supported HTML parser, e.g. NekoHTML.

    Plus, it would be better if you extracted the title during indexing
    and stored it in the document, rather than doing all this work at search
    time. You want CPU at search time to be minimized (think of all the
    electricity...).
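As a sketch of that indexing-time approach: pull the title out once, when the document is indexed, and store it alongside the contents. The extractor below is illustrative only (the class name and the regex are mine, and a regex is no substitute for a real HTML parser on messy markup):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: extract the <title> once, at indexing time, so the search-time
// code never has to re-parse HTML. Deliberately simplistic.
class TitleExtractor {

    private static final Pattern TITLE = Pattern.compile(
        "<title[^>]*>(.*?)</title>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the title text, or "" when no <title> element is present.
    static String extractTitle(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1).trim() : "";
    }
}
```

At index time the result would then be stored on the document (in the 2.4 API, something along the lines of `doc.add(new Field("title", title, Field.Store.YES, Field.Index.NO))`), so the search path only needs `searcher.doc(id).get("title")`.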

    But: if you increase the HEAP do you still eventually hit OOME?

    Mike

  • Chetan Shah at Mar 24, 2009 at 6:21 pm
    I highly appreciate your replies, Michael.

    No, I don't hit OOME if I comment out the call to getHTMLTitle. The heap
    behaves perfectly.

    I completely agree with you: the thread count goes haywire the moment I call
    HTMLParser.getTitle(). I have seen a thread count of around 600 before I hit
    OOME (with the getTitle() call on), and 90% of those threads are in the wait
    state. They are not doing anything but just sitting there forever; I am sure
    they are consuming the heap and never giving it back.

    Does my hypothesis make sense?

  • Michael McCandless at Mar 24, 2009 at 6:34 pm
    Actually, I was hoping you could try leaving the getHTML calls in, but
    increase the heap size of your Tomcat instance.

    Ie, to be sure there really is a leak vs you're just not giving the
    JRE enough memory.

    I do like your hypothesis, but looking at HTMLParser it seems like the
    thread should exit after parsing the HTML. Or, maybe there's
    something about the particular HTML documents you're parsing? I just
    tested this test case:

    public void testHTMLParserLeak() throws Exception {
        for (int i = 0; i < 100000; i++) {
            InputStream is = new ByteArrayInputStream("<title>Here</title>".getBytes());
            HTMLParser parser = new HTMLParser(is);
            String title = parser.getTitle();
            assertEquals("Here", title);
            is.close();
        }
    }

    And it runs fine and memory seems stable. Can you try that test case,
    but swap in some of your own HTML docs?

    Also: can you run "kill -QUIT" on your app to get a full thread dump?
    (Hmm, I think you may be on Windows; I'm not sure what the equivalent
    operation is.)
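For what it's worth, a platform-neutral alternative to kill -QUIT is to dump the threads from inside the JVM via java.lang.management, e.g. from a temporary debug servlet. A minimal sketch (the class name and structure are illustrative):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Platform-neutral stand-in for kill -QUIT: dump every live thread's
// name, state, and stack trace from within the running JVM.
class ThreadDumper {

    static String dumpThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip locked-monitor/synchronizer details for speed.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }
}
```

From the command line, `jstack <pid>` (bundled with recent JDKs) produces a similar dump on Windows as well.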

    Mike

  • Paul Smith at Mar 24, 2009 at 9:04 pm


    Just FYI, on Linux platforms (and I think Windows) the default stack
    size for a thread is 1Mb. 600 extra threads is 600Mb of virtual
    address space; that's outside the heap, though, so it is unlikely to be
    the cause of an actual OutOfMemoryError (if that is actually what you're
    seeing; it's not a different sort of memory error, is it?). Even if
    you fix the OOM condition but still have 600 threads lying around,
    you're on your way to a serious problem on a 32-bit operating system,
    which usually causes a process a horrible death when its virtual size
    reaches the magic 3Gb mark. It only takes 3000 threads (only x5
    more than you have), even without any _heap_ space utilising the
    virtual address space, before you reach the cliff with the jagged rocks
    of process death below.
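Paul's figures as a quick worked calculation, assuming the 1Mb default per-thread stack size he cites (class and method names are illustrative):

```java
// Back-of-the-envelope check of the thread-stack figures above,
// assuming the 1 Mb default per-thread stack size cited in the post.
class StackMath {

    static final long MB = 1L << 20;  // 1 Mb in bytes

    // Virtual address space reserved for thread stacks alone.
    static long stackBytes(int threads) {
        return threads * MB;
    }

    public static void main(String[] args) {
        System.out.println(stackBytes(600) / MB);   // 600 threads -> 600 Mb of stacks
        System.out.println(stackBytes(3000) / MB);  // 3000 threads -> 3000 Mb, near the ~3 Gb 32-bit ceiling
    }
}
```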

    Hope that helps too.

    cheers,

    Paul
  • Chetan Shah at Mar 26, 2009 at 12:37 pm
    OK, I was able to conclude that I am getting OOME due to my usage of the
    HTML parser to get the HTML title and text. I display 10 results per page
    and therefore end up calling org.apache.lucene.demo.html.HTMLParser 10
    times.

    I modified my code to store the title and HTML summary in the index itself
    and found that the OOME problem is gone.

    I tested this with a 256MB heap size.

    Thank you all for your valuable advice and help.



  • Michael McCandless at Mar 26, 2009 at 4:21 pm
    OK thanks for bringing closure.

    Mike

Discussion Overview
group: java-user
categories: lucene
posted: Mar 23, '09 at 5:19p
active: Mar 26, '09 at 4:21p
posts: 16
users: 4
website: lucene.apache.org

People

Translate

site design / logo © 2022 Grokbase