Out of memory problem in search
Hello friends,

I recently ran into a memory problem with Lucene search, apparently because the index is so large. (I have indexed several kinds of information, and the index is a little over 40 GB.)

I search the index with org.apache.lucene.search.Searcher.search(query, null, offset + limit, new Sort(new SortField("time", SortField.LONG, true)));
(This returns the top (offset + limit) records.)

I page through the results by range. For example, the first web page shows the records in the range [0, 100], the second page shows [100, 200], and so on.
There are nearly 200,000 records in total. When I go to the last page, i.e. the records in [199,900, 200,000], the JVM throws an out-of-memory error (the machine has 4 GB of RAM).

Is there a way to overcome this memory problem?

Thanks

--
ilkay POLAT  Software Engineer
TURKEY

Gsm : (+90) 532 542 36 71
E-mail : ilkay_polat@yahoo.com


  • Findbestopensource at Jul 14, 2010 at 12:00 pm
    Certainly it will. Either you need to increase your memory or refine your
    query. Even though you display paginated results, the first few pages will
    display fine, but the pages towards the end may run into this problem. This
    is because 200,000 objects are created and iterated, 199,900 objects are
    skipped, and the last 100 objects are returned. The memory is consumed in
    creating these objects.

    Regards
    Aditya
    www.findbestopensource.com
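
The effect described above can be illustrated with a minimal plain-Java sketch. It is a model of top-N paging in general, not Lucene's actual collector code; the class name TopNPagingSketch and the use of bare long timestamps in place of documents are assumptions made for illustration. The point it shows is that the number of retained entries grows with offset + limit, so the last page retains as many entries as there are hits.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Illustration only: a plain-Java model of top-N paging, not Lucene's collector.
class TopNPagingSketch {

    /** Largest number of entries retained in the queue during the last call to page(). */
    static int peakRetained = 0;

    /**
     * Returns the page [offset, offset + limit) of the timestamps sorted newest first,
     * by first collecting the top (offset + limit) values in a bounded min-heap.
     */
    static List<Long> page(long[] times, int offset, int limit) {
        int n = offset + limit;
        PriorityQueue<Long> heap = new PriorityQueue<>(); // smallest retained value on top
        peakRetained = 0;
        for (long t : times) {
            heap.add(t);
            if (heap.size() > n) {
                heap.poll(); // drop the oldest value so that at most n remain
            }
            peakRetained = Math.max(peakRetained, heap.size());
        }
        List<Long> top = new ArrayList<>(heap);
        top.sort(Collections.reverseOrder()); // newest first
        int from = Math.min(offset, top.size());
        return new ArrayList<>(top.subList(from, top.size())); // skip the first `offset` entries
    }
}
```

With 200,000 values, page(times, 0, 100) retains 100 entries, while page(times, 199900, 100) retains all 200,000 only to return the last 100.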



  • Ilkay polat at Jul 14, 2010 at 12:55 pm
    Hi,
    We have hardware restrictions (the maximum RAM is 8 GB), so unfortunately increasing memory is not an option for us at the moment.

    Yes, as you said, the problem appears on the last pages of the search screen, because the search method finds the top n records. In other words, asking for the last page amounts to "search everything, return everything".

    I am now looking into whether Lucene offers a way to do this search that spends time instead of memory. Any other ideas?

    Thanks
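
One general way to trade time for memory here is cursor-style paging: remember the sort value of the last record on the previous page and, for the next page, collect only records beyond it. The sketch below models this in plain Java over bare timestamps. It is not taken from the thread and does not use Lucene; the class name CursorPagingSketch is hypothetical, and it assumes that timestamps are distinct (with duplicates, a tiebreaker such as a document id would be needed) and that pages are visited in order rather than jumped to directly. In a real index the "older than the cursor" condition would have to be expressed as some range restriction on the time field.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Illustration only: cursor-based paging over plain timestamps, not Lucene code.
class CursorPagingSketch {

    /**
     * Returns up to `limit` timestamps strictly older than `before`, newest first.
     * Pass Long.MAX_VALUE as `before` for the first page. Assumes distinct timestamps.
     */
    static List<Long> nextPage(long[] times, long before, int limit) {
        PriorityQueue<Long> heap = new PriorityQueue<>(); // never holds more than limit + 1 entries
        for (long t : times) {
            if (t >= before) {
                continue; // already shown on an earlier page
            }
            heap.add(t);
            if (heap.size() > limit) {
                heap.poll(); // keep only the `limit` newest remaining values
            }
        }
        List<Long> page = new ArrayList<>(heap);
        page.sort(Collections.reverseOrder()); // newest first
        return page;
    }
}
```

Each page scans all candidates again (more time), but the queue stays at `limit` entries no matter how deep the page is (bounded memory).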


  • Ilkay polat at Jul 14, 2010 at 1:29 pm
    I am also confused about memory management in Lucene.

    Which of these two reasons is the main cause of the out-of-memory problem?

    Reason-1: The problem comes from searching a large index (nearly 40 GB). In that case, even a search that returns only 100 records from a 60 GB index would hit the same problem.
    OR
    Reason-2: The problem comes from matching so many records (nearly 200,000), which puts 200,000 Java objects on the heap. In that case, even a small 10 GB index would hit the same problem if the search returned that many records.

    Is there any document that describes the general memory management issues when searching in Lucene?

    Thanks


    ilkay POLAT   Software Engineer  Gsm : (+90) 532 542 36 71
    E-mail : ilkay_polat@yahoo.com


  • Erick Erickson at Jul 15, 2010 at 12:52 am
    This doesn't make sense to me. Are you saying that you only have 200,000
    documents in your index? Because keeping a score for 200K documents should
    consume a relatively trivial amount of memory. The fact that you're sorting
    by time is a red flag, but it's only a long, so 200K documents shouldn't
    strain memory due to sorting either. The critical thing here isn't
    necessarily the size of your index, but the number of documents in that
    index and the number of unique values you're sorting by. By the way, what
    happens if you don't sort?

    Since it doesn't make sense to me, that must mean I don't understand the
    problem very thoroughly. Could you provide some index characteristics?
    Saying it's 40G leaves a lot open to speculation. That could be 39G of
    stored text which is mostly irrelevant for searching. Or it could be
    entirely indexed, tokenized data which would be a different thing. How many
    documents do you have in your index? What does your query look like?

    You can get an idea of how much of your index holds indexed tokens by
    NOT storing any of the fields, just indexing them (Field.Store.NO).

    What version of Lucene are you using? How do you start your process? If you
    start the application with java's default memory, that's not very much (64M
    if memory serves). You may be using nowhere near your hardware limits. Try
    specifying -Xmx512M and/or the -server option.

    Best
    Erick
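
The two checks suggested above can be sketched in a few lines of plain Java. This is a rough back-of-the-envelope aid, not a measurement of what Lucene allocates: it assumes one primitive long per document for the sort values, and the class name SortMemoryEstimate and the document count passed in are placeholders.

```java
// Illustration only: rough arithmetic plus a look at the heap limit the JVM is running with.
class SortMemoryEstimate {

    /** Bytes needed to hold one primitive long per document. */
    static long bytesForLongPerDoc(long numDocs) {
        return numDocs * Long.BYTES;
    }

    public static void main(String[] args) {
        // Assumed document count; replace with the real number of documents in the index.
        long numDocs = 200_000L;
        long sortBytes = bytesForLongPerDoc(numDocs);
        // Maximum heap this JVM will try to use (controlled by -Xmx).
        long maxHeap = Runtime.getRuntime().maxMemory();

        System.out.println("long sort values for " + numDocs + " docs: " + sortBytes + " bytes");
        System.out.println("JVM max heap: " + (maxHeap / (1024 * 1024)) + " MB");
    }
}
```

If the reported maximum heap is far below the machine's RAM, the process is running into the JVM's heap limit rather than the hardware limit, and -Xmx is the first thing to adjust.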

  • Uwe Schindler at Jul 14, 2010 at 12:26 pm
    Reverse the query sorting to display the last page.

    -----
    Uwe Schindler
    H.-H.-Meier-Allee 63, D-28213 Bremen
    http://www.thetaphi.de
    eMail: uwe@thetaphi.de
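
The reversal suggested above can be sketched as a small planning step that decides, per page, which sort direction needs fewer top hits. This is a plain-Java illustration under stated assumptions, not code from the thread: the names ReversePagingSketch and Plan are hypothetical, and it assumes the total hit count is already known (for example from an earlier search).

```java
// Illustration only: choose the cheaper sort direction for a given page.
class ReversePagingSketch {

    /** What to request from the search and which slice of the result to show. */
    static final class Plan {
        final boolean reversed; // true: use the opposite sort order
        final int topN;         // number of top hits to request
        final int from;         // start of the page within the requested hits
        final int count;        // number of hits on the page

        Plan(boolean reversed, int topN, int from, int count) {
            this.reversed = reversed;
            this.topN = topN;
            this.from = from;
            this.count = count;
        }
    }

    static Plan plan(int totalHits, int offset, int limit) {
        int count = Math.max(0, Math.min(limit, totalHits - offset));
        int end = offset + count;
        if (end <= totalHits - offset) {
            // Page is in the first half: keep the normal order and request `end` hits.
            return new Plan(false, end, offset, count);
        }
        // Page is nearer the end: reverse the sort and read the page from the other side.
        // The page then comes back in reverse order and must be flipped before display.
        int from = totalHits - end;
        return new Plan(true, from + count, from, count);
    }
}
```

For 200,000 hits, the last page then needs only the top 100 hits of the reversed sort, and the most expensive page is the one in the middle, at roughly half the total.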

  • Ilkay polat at Jul 14, 2010 at 12:49 pm
    Indeed, this is a good solution for this kind of problem. But the same problem can occur in the future as more logs are added to the index.
    For example, the problem currently shows up at 200,000 records (these logs were collected over 13 days).
    With the reversed sort, the maximum search depth becomes 100,000.
    But with 400,000 records the same problem will occur again (the maximum search depth is 200,000 again).
    Is there another way that does not consume so much memory, or that uses a bounded amount of memory and spends time instead? This restriction comes from our project's hardware limits (at most 8 GB of memory).


Discussion Overview
group: java-user
categories: lucene
posted: Jul 14, '10 at 10:45a
active: Jul 15, '10 at 12:52a
posts: 7
users: 4
website: lucene.apache.org
