Lucene going non-responsive under heavy load
Hello guys,
We have a dilemma on a few of our Lucene machines. We have a Tomcat
running our servlets for searching and indexing on each of these
machines. It's a live index where documents are being added to the index
while online searches are also being served at the same time. Indexing
happens every 5 minutes and if there are new documents added, the index
gets reopened. Most of the time the performance is very good, but
under heavy load of searches, the machine goes non-responsive. We can
still telnet to the machine and see that CPU-wise it's not bad, but I/O seems
to be a problem. Is there anything we might be doing to cause it, or
anything that we can do to avoid it? I know I did not provide a lot of
information about how we are indexing and searching, but I will answer
any question anyone might have.

thanks in advance
-siraj

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

  • Michael McCandless at Dec 22, 2009 at 10:29 pm
    Is it possible a large merge is running?

    You can turn on IndexWriter.setInfoStream to see more details about
    what IW is doing, including merging.

    Mike
  • Siraj Haider at Dec 22, 2009 at 10:58 pm
    Hi Mike,
    You are right, sometimes there is an implicit merge running when the
    machine goes non-responsive. How can we avoid running those merges
    during the day and how can we minimize the effect it will have on searches?

    -siraj

  • Michael McCandless at Dec 22, 2009 at 11:45 pm
    A merge shouldn't make the machine completely non-responsive, just
    slower to run searches / index documents.

    What kind of machine / IO system is this?

    You can set maxMergeDocs to limit how large the merge is allowed to
    be. But, be careful, since if you set this too small you'll wind up
    with way too many segments over time, which'll slow down search and
    risk file handle exhaustion. Likewise, increase mergeFactor. You
    could also try decreasing the merge thread priority in
    ConcurrentMergeScheduler... though that risks starvation of the merge
    thread if your CPUs are really saturated doing indexing/searching.
    Another thing to try is the BalancedSegmentMergePolicy (in
    contrib/misc); it also tries to avoid big merges.

    Also, how are you opening new readers? Can you share more how you are
    using Lucene?

    Mike
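
To make the trade-off described above concrete, here is a toy model in plain Java with no Lucene dependency. It is only a sketch under loud assumptions: every flush produces an equal-sized segment, `mergeFactor` equal-sized segments are merged into one as soon as they exist, and segments larger than `maxMergeDocs` are never merged again. The class `MergeModel` and its method names are invented for illustration; Lucene's real merge policy is more involved, so the numbers should be read as trends only.

```java
/** Toy model of level-based segment merging. NOT Lucene's real merge policy. */
final class MergeModel {

    /** Outcome of one simulated indexing run. */
    static final class Result {
        final long segments;     // segments left in the index at the end
        final long largestMerge; // docs written by the biggest single merge
        final long docsCopied;   // docs rewritten by all merges combined

        Result(long segments, long largestMerge, long docsCopied) {
            this.segments = segments;
            this.largestMerge = largestMerge;
            this.docsCopied = docsCopied;
        }
    }

    /** Simulates `flushes` flushes of `docsPerFlush` docs each (assumed equal-sized). */
    static Result simulate(int flushes, long docsPerFlush, int mergeFactor, long maxMergeDocs) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be >= 2");
        }
        long[] counts = new long[64]; // counts[k] = number of segments at level k
        long largestMerge = 0;
        long docsCopied = 0;
        for (int f = 0; f < flushes; f++) {
            counts[0]++;
            long size = docsPerFlush; // size of one segment at the current level
            // Cascade: merge a full level unless its segments exceed maxMergeDocs.
            for (int level = 0; counts[level] >= mergeFactor && size <= maxMergeDocs; level++) {
                counts[level] -= mergeFactor;
                counts[level + 1]++;
                size *= mergeFactor; // the merged segment holds mergeFactor times as many docs
                largestMerge = Math.max(largestMerge, size);
                docsCopied += size;
            }
        }
        long segments = 0;
        for (long c : counts) {
            segments += c;
        }
        return new Result(segments, largestMerge, docsCopied);
    }

    public static void main(String[] args) {
        // Each row: {mergeFactor, maxMergeDocs}; values chosen only for illustration.
        long[][] configs = { {10, Long.MAX_VALUE}, {20, Long.MAX_VALUE}, {10, 50000} };
        for (long[] c : configs) {
            Result r = simulate(1000, 1000, (int) c[0], c[1]);
            System.out.println("mergeFactor=" + c[0] + " maxMergeDocs=" + c[1]
                    + " -> segments=" + r.segments
                    + " largestMerge=" + r.largestMerge
                    + " docsCopied=" + r.docsCopied);
        }
    }
}
```

In this model, raising mergeFactor or capping maxMergeDocs both shrink the largest single merge, and both leave more segments behind at the end, which is the cost the reply warns about.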
  • Siraj Haider at Dec 23, 2009 at 2:13 pm
    We have dual-CPU Intel Xeon machines running "Red Hat Enterprise Linux
    ES release 3 (Taroon Update 6)". We have 4GB of memory on these machines
    with 2GB allocated to Tomcat.
    After modifying the index we open a new one, warm it up, make it live
    and then close the old one.

    -siraj
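
The open, warm, make live, close old sequence described above can be sketched in plain Java. This is not the poster's code and it does not use Lucene classes: `SearcherHolder` and `Ref` are hypothetical names, and the searcher is modelled as any `Closeable`. The assumption being illustrated is that the old resource should be closed only after the last in-flight search has released it, which the sketch handles with a reference count.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Publishes a warmed replacement and closes the old resource only after its last user is done. */
final class SearcherHolder<T extends Closeable> {

    /** A resource plus a reference count; the resource is closed when the count hits zero. */
    static final class Ref<T extends Closeable> {
        private final T resource;
        private final AtomicInteger refs = new AtomicInteger(1); // 1 = the holder's own reference

        Ref(T resource) {
            this.resource = resource;
        }

        T get() {
            return resource;
        }

        /** Returns false if the resource was already closed, so the caller must retry. */
        boolean tryAcquire() {
            for (;;) {
                int n = refs.get();
                if (n == 0) {
                    return false;
                }
                if (refs.compareAndSet(n, n + 1)) {
                    return true;
                }
            }
        }

        /** Drops one reference and closes the resource when none are left. */
        void release() {
            if (refs.decrementAndGet() == 0) {
                try {
                    resource.close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        }
    }

    private final AtomicReference<Ref<T>> live;

    SearcherHolder(T initial) {
        live = new AtomicReference<>(new Ref<>(initial));
    }

    /** Borrows the live resource for one search; the caller must call release() when done. */
    Ref<T> acquire() {
        for (;;) {
            Ref<T> r = live.get();
            if (r.tryAcquire()) {
                return r;
            }
            // The resource was swapped out and closed in between; read the new one.
        }
    }

    /** Makes an already warmed replacement live and drops the holder's reference to the old one. */
    void swap(T warmed) {
        Ref<T> old = live.getAndSet(new Ref<>(warmed));
        old.release();
    }
}
```

A search thread would call `acquire()`, run its query against `get()`, and call `release()` in a `finally` block; the indexing thread calls `swap()` once the new resource has been warmed.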

  • Michael McCandless at Dec 23, 2009 at 2:37 pm
    Are you using IndexReader.reopen to open a new reader, from an
    existing one? That's much more efficient than opening a new reader.

    I think a good next step is to run with IndexWriter.setInfoStream on,
    and run your JRE with verbose GC, to see more details.

    Mike
  • Uwe Schindler at Dec 22, 2009 at 10:30 pm
    If it suddenly gets unresponsive, you may have a GC (garbage collector)
    problem. Can you post your Java options, Java VM version, max heap and so
    on? Maybe you are doing some strange things.

    For us the best GC to use (we have Java 1.5) was the "Parallel New GC" /
    "The Concurrent Low Pause Collector". Here are some of our options that
    helped solve the problem: -XX:+UseConcMarkSweepGC -XX:+UseParNewGC. You
    may also read Mark Miller's blog post:

    http://www.lucidimagination.com/blog/2009/09/19/java-garbage-collection-boot-camp-draft/

    Uwe

    -----
    Uwe Schindler
    H.-H.-Meier-Allee 63, D-28213 Bremen
    http://www.thetaphi.de
    eMail: uwe@thetaphi.de
  • Siraj Haider at Dec 22, 2009 at 11:02 pm
    Hi Uwe,
    Here is the info:
    Java VM: "1.6.0_05"
    Java Options: -server -Xms512m -Xmx2048m
    We are not using any specific GC.

    regards
    -siraj

  • Uwe Schindler at Dec 23, 2009 at 2:56 pm
    Your behaviour sounds really like a GC issue. I would switch the GC to
    verbose and see what's happening (like Mark does in his blog post).
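
Verbose GC itself is a JVM startup option (for example `-verbose:gc`). As a complement, the sketch below reads the standard `java.lang.management` GC beans from inside the application, which could be used to log GC activity next to slow searches. `GcSnapshot` is a name invented here, and the reported time is the collectors' accumulated collection time, which for a concurrent collector is not necessarily the same as pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Point-in-time totals of GC activity, summed over all collectors in this JVM. */
final class GcSnapshot {
    final long collections; // collections so far
    final long millis;      // approximate accumulated collection time, in milliseconds

    private GcSnapshot(long collections, long millis) {
        this.collections = collections;
        this.millis = millis;
    }

    /** Reads the current counters from the platform GC beans. */
    static GcSnapshot take() {
        long count = 0;
        long time = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value.
            count += Math.max(0, gc.getCollectionCount());
            time += Math.max(0, gc.getCollectionTime());
        }
        return new GcSnapshot(count, time);
    }

    /** GC work done between an earlier snapshot and this one. */
    GcSnapshot since(GcSnapshot earlier) {
        return new GcSnapshot(collections - earlier.collections, millis - earlier.millis);
    }

    public static void main(String[] args) {
        GcSnapshot before = take();
        // Allocate short-lived garbage so the collector may have something to do.
        for (int i = 0; i < 200000; i++) {
            byte[] junk = new byte[1024];
            junk[0] = (byte) i;
        }
        GcSnapshot delta = take().since(before);
        System.out.println("collections=" + delta.collections + " gcMillis=" + delta.millis);
    }
}
```

Taking one snapshot before a search and one after gives a rough idea of whether a slow request coincided with heavy collection work.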


Discussion Overview
group: java-user
categories: lucene
posted: Dec 22, '09 at 10:20p
active: Dec 23, '09 at 2:56p
posts: 9
users: 3
website: lucene.apache.org
