Hi,

I've got a few machines that post documents concurrently to a Solr
instance. They do not issue commits themselves; instead, I've got
autocommit set up on the Solr server side:
<autoCommit>
<maxDocs>50000</maxDocs> <!-- commit at least every 50000 docs -->
<maxTime>60000</maxTime> <!-- Stays max 60s without commit -->
</autoCommit>

This usually works fine, but sometimes the server goes into a deadlocked
state. Here are the errors I get from the log (these go on forever
until I delete the index and restart from scratch):

02-Nov-2009 10:35:27 org.apache.solr.update.SolrIndexWriter finalize
SEVERE: SolrIndexWriter was not closed prior to finalize(), indicates
a bug -- POSSIBLE RESOURCE LEAK!!!
...
[ multiple messages like this ]
...
02-Nov-2009 10:35:27 org.apache.solr.common.SolrException log
SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain
timed out: NativeFSLock@/home/solrdata/jobs/index/lucene-703db99881e56205cb910a2e5fd816d3-write.lock
at org.apache.lucene.store.Lock.obtain(Lock.java:85)
at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1538)
at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:190)
at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:98)
at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:173)
at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:220)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)


I'm wondering what could be the reason for this (if a commit takes
more than 60 seconds, for instance?), and whether I should use better
locking or autocommitting options.

Here's the locking conf I've got at the moment:
<writeLockTimeout>1000</writeLockTimeout>
<commitLockTimeout>10000</commitLockTimeout>
<lockType>native</lockType>
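
For reference, these settings would normally sit in solrconfig.xml roughly as
sketched below (this is the stock 1.4-era layout, trimmed down, not my exact
file):

<indexDefaults>
  <!-- index writer lock settings -->
  <writeLockTimeout>1000</writeLockTimeout>
  <commitLockTimeout>10000</commitLockTimeout>
  <lockType>native</lockType>
</indexDefaults>

<updateHandler class="solr.DirectUpdateHandler2">
  <!-- commit whenever either limit is reached first -->
  <autoCommit>
    <maxDocs>50000</maxDocs>
    <maxTime>60000</maxTime>
  </autoCommit>
</updateHandler>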

I'm using Solr trunk from 12 Oct 2009 within Tomcat.

Thanks for any help.

Jerome.

--
Jerome Eteve.
http://www.eteve.net
jerome@eteve.net

  • Chris Hostetter at Nov 4, 2009 at 12:58 am
    : 02-Nov-2009 10:35:27 org.apache.solr.update.SolrIndexWriter finalize
    : SEVERE: SolrIndexWriter was not closed prior to finalize(), indicates
    : a bug -- POSSIBLE RESOURCE LEAK!!!

    can you post some context showing what the logs look like just before
    these errors?

    I'm not sure what might be causing the lock collision, but your guess about
    commits taking too long and overlapping is a good one -- what do the log
    messages about the commits say around the time these errors start? A commit
    logs when it finishes and how long it took, so it's easy to spot.

    Increasing your writeLockTimeout is probably a good idea, but I'm still
    confused as to why the whole server would lock up until you delete the
    index and restart. At worst I would expect the update/commit attempts that
    time out getting the lock to complain loudly, but then the "slow" one
    would eventually finish and subsequent attempts would work OK.
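
    for example, raising it in solrconfig.xml would be a one-line change like
    the sketch below -- the 10 second value here is purely illustrative, not a
    tested recommendation:

    <writeLockTimeout>10000</writeLockTimeout>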

    ...very odd.

    -Hoss
  • Jérôme Etévé at Nov 4, 2009 at 2:38 pm
    Hi,

    It seems this situation is caused by some No space left on device exceptions:
    SEVERE: java.io.IOException: No space left on device
    at java.io.RandomAccessFile.writeBytes(Native Method)
    at java.io.RandomAccessFile.write(RandomAccessFile.java:466)
    at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexOutput.flushBuffer(SimpleFSDirectory.java:192)
    at org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:96)


    I'd better try to set my maxMergeDocs and mergeFactor to more
    adequate values for my app (I'm indexing ~15 GB of data on a 20 GB
    device, so I guess there's a problem when Solr tries to merge the index
    segments being built).

    At the moment, they are set to <mergeFactor>100</mergeFactor> and
    <maxMergeDocs>2147483647</maxMergeDocs>
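
    A more conservative setup might look roughly like the sketch below in
    solrconfig.xml (the mergeFactor of 10 is just the usual default, not a
    value I've tested against this index):

    <mainIndex>
      <!-- smaller, more frequent merges keep the transient extra disk space
           needed during any single merge lower than with mergeFactor 100 -->
      <mergeFactor>10</mergeFactor>
      <maxMergeDocs>2147483647</maxMergeDocs>
    </mainIndex>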

    Jerome.

    --
    Jerome Eteve.
    http://www.eteve.net
    jerome@eteve.net
  • Lance Norskog at Nov 5, 2009 at 1:28 am
    This will never work reliably. You should have 2x total disk space
    for the index. Optimize, for one, requires this.

    --
    Lance Norskog
    goksron@gmail.com
  • Mike anderson at Jan 25, 2010 at 9:15 pm
    I am getting this exception as well, but disk space is not my problem. What
    else can I do to debug this? The Solr log doesn't appear to offer any other
    clues.

    Jan 25, 2010 4:02:22 PM org.apache.solr.core.SolrCore execute
    INFO: [] webapp=/solr path=/update params={} status=500 QTime=199
    Jan 25, 2010 4:02:22 PM org.apache.solr.common.SolrException log
    SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/solr8984/index/lucene-98c1cb272eb9e828b1357f68112231e0-write.lock
    at org.apache.lucene.store.Lock.obtain(Lock.java:85)
    at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1545)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1402)
    at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:190)
    at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:98)
    at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:173)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:220)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
    at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
    at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)


    Should I consider changing the lock timeout settings (currently set to
    defaults)? If so, I'm not sure what to base these values on.

    Thanks in advance,
    mike

  • Ian Connor at Jan 26, 2010 at 3:23 pm
    We traced one of the lock files, and it had been around for 3 hours. A
    restart removed it - but is 3 hours normal for one of these locks?

    Ian.
  • Ian Connor at Jan 27, 2010 at 3:35 pm
    Can anyone think of a reason why these locks would hang around for more than
    2 hours?

    I have been monitoring them and they look like they are very short lived.

    --
    Regards,

    Ian Connor
    1 Leighton St #723
    Cambridge, MA 02141
    Call Center Phone: +1 (714) 239 3875 (24 hrs)
    Fax: +1(770) 818 5697
    Skype: ian.connor
  • Chris Hostetter at Jan 30, 2010 at 12:54 am
    : Can anyone think of a reason why these locks would hang around for more than
    : 2 hours?
    :
    : I have been monitoring them and they look like they are very short lived.

    Typically the lock files are only left around for more than a few seconds
    when there was a fatal crash of some kind ... an OOM Error for example, or
    as already mentioned in this thread...

    : >> > > SEVERE: java.io.IOException: No space left on device

    ...if you check your solr logs for messages in the immediate time frame
    following the lastModified time of the lock file you'll probably find
    something interesting.


    -Hoss

Discussion Overview
group: solr-user
categories: lucene
posted: Nov 2, '09 at 11:13a
active: Jan 30, '10 at 12:54a
posts: 8
users: 5
website: lucene.apache.org...
