FAQ
Hello,
I am having a problem with reopening the IndexReader with Lucene 2.4 ( I
updated to 2.4.1, but still no luck). The exception is preceded by an
exception in optimizing the index. I am not reopening the reader while the
commit or optimization is going on in the writer (optimizing happens in the
same thread, but much less often). The issues go away once I turn off
optimizations. I was also getting this problem before I turned off the use
of compound files. I would appreciate any guidance.

Thanks!

Regards,
Khawaja


2009-04-09 15:57:47,033 (941820) [Index Maint Thread] ERROR
gov.nasa.ensemble.core.indexer.Indexer - java.io.IOException: background
merge hit exception: _8:C41258 _9:C11382 into _a [optimize]
java.io.IOException: background merge hit exception: _8:C41258 _9:C11382
into _a [optimize]
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2273)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2218)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2198)
at
gov.nasa.ensemble.core.indexer.Indexer$IndexMaintThread.run(Indexer.java:102)
Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ
for segment _8: fieldsReader shows 30074 but segmentInfo shows 41258
at
org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:362)
at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:260)
at
org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4220)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3884)
at
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
at
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)
Exception in thread "Lucene Merge Thread #0"
org.apache.lucene.index.MergePolicy$MergeException:
org.apache.lucene.index.CorruptIndexException: doc counts differ for segment
_8: fieldsReader shows 30074 but segmentInfo shows 41258
at
org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:309)
at
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:286)
Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ
for segment _8: fieldsReader shows 30074 but segmentInfo shows 41258
at
org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:362)
at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:260)
at
org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4220)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3884)
at
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
at
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)
2009-04-09 15:57:52,814 (947601) [Index Maint Thread] ERROR
gov.nasa.ensemble.core.indexer.Indexer - java.io.FileNotFoundException:
/opt/users/merops/index/_a.fnm (No such file or directory)
java.io.FileNotFoundException: /opt/users/merops/index/_a.fnm (No such file
or directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.(FSDirectory.java:552)
at
org.apache.lucene.store.FSDirectory$FSIndexInput.(FSDirectory.java:488)
at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
at org.apache.lucene.index.FieldInfos.(SegmentReader.java:341)
at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:269)
at
org.apache.lucene.index.MultiSegmentReader.doReopen(MultiSegmentReader.java:201)
at
org.apache.lucene.index.DirectoryIndexReader$2.doBody(DirectoryIndexReader.java:157)
at
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
at
org.apache.lucene.index.DirectoryIndexReader.reopen(DirectoryIndexReader.java:179)
at
gov.nasa.ensemble.core.indexer.Indexer$IndexMaintThread.run(Indexer.java:106)

Search Discussions

  • Michael McCandless at Apr 10, 2009 at 12:06 am
    These are serious corruption exceptions.

    Is it at all possible two writers are accessing the index at the same time?

    Can you describe more about how you're using Lucene?

    Mike
    On Thu, Apr 9, 2009 at 7:59 PM, Khawaja Shams wrote:
    Hello,
    I am having a problem with reopening the IndexReader with Lucene 2.4 ( I
    updated to 2.4.1, but still no luck). The exception is preceded by an
    exception in optimizing the index. I am not reopening the reader while the
    commit or optimization is going on in the writer (optimizing happens in the
    same thread, but much less often). The issues go away once I turn off
    optimizations. I was also getting this problem before I turned off the use
    of compound files. I would appreciate any guidance.

    Thanks!

    Regards,
    Khawaja


    2009-04-09 15:57:47,033 (941820) [Index Maint Thread] ERROR
    gov.nasa.ensemble.core.indexer.Indexer  - java.io.IOException: background
    merge hit exception: _8:C41258 _9:C11382 into _a [optimize]
    java.io.IOException: background merge hit exception: _8:C41258 _9:C11382
    into _a [optimize]
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2273)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2218)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2198)
    at
    gov.nasa.ensemble.core.indexer.Indexer$IndexMaintThread.run(Indexer.java:102)
    Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ
    for segment _8: fieldsReader shows 30074 but segmentInfo shows 41258
    at
    org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:362)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:260)
    at
    org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4220)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3884)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)
    Exception in thread "Lucene Merge Thread #0"
    org.apache.lucene.index.MergePolicy$MergeException:
    org.apache.lucene.index.CorruptIndexException: doc counts differ for segment
    _8: fieldsReader shows 30074 but segmentInfo shows 41258
    at
    org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:309)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:286)
    Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ
    for segment _8: fieldsReader shows 30074 but segmentInfo shows 41258
    at
    org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:362)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:260)
    at
    org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4220)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3884)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)
    2009-04-09 15:57:52,814 (947601) [Index Maint Thread] ERROR
    gov.nasa.ensemble.core.indexer.Indexer  - java.io.FileNotFoundException:
    /opt/users/merops/index/_a.fnm (No such file or directory)
    java.io.FileNotFoundException: /opt/users/merops/index/_a.fnm (No such file
    or directory)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
    at
    org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:552)
    at
    org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:582)
    at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
    at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
    at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:58)
    at
    org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:341)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:269)
    at
    org.apache.lucene.index.MultiSegmentReader.doReopen(MultiSegmentReader.java:201)
    at
    org.apache.lucene.index.DirectoryIndexReader$2.doBody(DirectoryIndexReader.java:157)
    at
    org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
    at
    org.apache.lucene.index.DirectoryIndexReader.reopen(DirectoryIndexReader.java:179)
    at
    gov.nasa.ensemble.core.indexer.Indexer$IndexMaintThread.run(Indexer.java:106)
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Khawaja Shams at Apr 10, 2009 at 12:33 am
    Hi Michael,
    Thanks for the quick response. I only have one IndexWriter, and there are
    no other processes accessing this particular index. I have tried deleting
    the entire index and reconstructing it, but the index corruption is
    repeatable. Incidentally, there are no new writes since the last commit when
    the merge happens. I have over-padded my code with ReadWrite locks to make
    sure that no writes/read are happening between the commits, optimizations,
    and reopening of the index.


    Here is a snippet of the thread I use to maintain the Index ( I hope that I
    am not doing something terribly wrong):
    while (true) {
    try {
    getWriteLock();
    indexWriter.commit();
    if (shouldOptimize()) {
    indexWriter.optimize();
    }

    IndexReader oldIR = indexSearcher.getIndexReader();
    IndexReader ir = oldIR.reopen();
    if (ir != oldIR) {
    IndexSearcher oldIS = indexSearcher;
    indexSearcher = new IndexSearcher(ir);
    oldIS.close();
    oldIR.close();
    } catch (Throwable t) {
    trace.error(t, t);
    } finally {
    releaseWriteLock();
    }


    Regards,
    Khawaja
    On Thu, Apr 9, 2009 at 5:05 PM, Michael McCandless wrote:

    These are serious corruption exceptions.

    Is it at all possible two writers are accessing the index at the same time?

    Can you describe more about how you're using Lucene?

    Mike
    On Thu, Apr 9, 2009 at 7:59 PM, Khawaja Shams wrote:
    Hello,
    I am having a problem with reopening the IndexReader with Lucene 2.4 ( I
    updated to 2.4.1, but still no luck). The exception is preceded by an
    exception in optimizing the index. I am not reopening the reader while the
    commit or optimization is going on in the writer (optimizing happens in the
    same thread, but much less often). The issues go away once I turn off
    optimizations. I was also getting this problem before I turned off the use
    of compound files. I would appreciate any guidance.

    Thanks!

    Regards,
    Khawaja


    2009-04-09 15:57:47,033 (941820) [Index Maint Thread] ERROR
    gov.nasa.ensemble.core.indexer.Indexer - java.io.IOException: background
    merge hit exception: _8:C41258 _9:C11382 into _a [optimize]
    java.io.IOException: background merge hit exception: _8:C41258 _9:C11382
    into _a [optimize]
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2273)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2218)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2198)
    at
    gov.nasa.ensemble.core.indexer.Indexer$IndexMaintThread.run(Indexer.java:102)
    Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ
    for segment _8: fieldsReader shows 30074 but segmentInfo shows 41258
    at
    org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:362)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:260)
    at
    org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4220)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3884)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)
    Exception in thread "Lucene Merge Thread #0"
    org.apache.lucene.index.MergePolicy$MergeException:
    org.apache.lucene.index.CorruptIndexException: doc counts differ for segment
    _8: fieldsReader shows 30074 but segmentInfo shows 41258
    at
    org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:309)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:286)
    Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ
    for segment _8: fieldsReader shows 30074 but segmentInfo shows 41258
    at
    org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:362)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:260)
    at
    org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4220)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3884)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)
    2009-04-09 15:57:52,814 (947601) [Index Maint Thread] ERROR
    gov.nasa.ensemble.core.indexer.Indexer - java.io.FileNotFoundException:
    /opt/users/merops/index/_a.fnm (No such file or directory)
    java.io.FileNotFoundException: /opt/users/merops/index/_a.fnm (No such file
    or directory)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
    at
    org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:552)
    at
    org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:582)
    at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
    at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
    at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:58)
    at
    org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:341)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:269)
    at
    org.apache.lucene.index.MultiSegmentReader.doReopen(MultiSegmentReader.java:201)
    at
    org.apache.lucene.index.DirectoryIndexReader$2.doBody(DirectoryIndexReader.java:157)
    at
    org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
    at
    org.apache.lucene.index.DirectoryIndexReader.reopen(DirectoryIndexReader.java:179)
    at
    gov.nasa.ensemble.core.indexer.Indexer$IndexMaintThread.run(Indexer.java:106)
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Michael McCandless at Apr 10, 2009 at 12:44 am
    That code looks right. Are there multiple threads that may enter it?

    Can you show the code where you create the IndexWriter, add docs, etc?

    Can you call IndexWriter.setInfoStream for the entire life of the
    index, up until when the optimize error happens, and post back?

    Mike
    On Thu, Apr 9, 2009 at 8:33 PM, Khawaja Shams wrote:
    Hi Michael,
    Thanks for the quick response. I only have one IndexWriter, and there are
    no other processes accessing this particular index. I have tried deleting
    the entire index and reconstructing it, but the index corruption is
    repeatable. Incidentally, there are no new writes since the last commit when
    the merge happens. I have over-padded my code with ReadWrite locks to make
    sure that no writes/read are happening between the commits, optimizations,
    and reopening of the index.


    Here is a snippet of the thread I use to maintain the Index ( I hope that I
    am not doing something terribly wrong):
    while (true) {
    try {
    getWriteLock();
    indexWriter.commit();
    if (shouldOptimize()) {
    indexWriter.optimize();
    }

    IndexReader oldIR = indexSearcher.getIndexReader();
    IndexReader ir = oldIR.reopen();
    if (ir != oldIR) {
    IndexSearcher oldIS = indexSearcher;
    indexSearcher = new IndexSearcher(ir);
    oldIS.close();
    oldIR.close();
    } catch (Throwable t) {
    trace.error(t, t);
    } finally {
    releaseWriteLock();
    }


    Regards,
    Khawaja

    On Thu, Apr 9, 2009 at 5:05 PM, Michael McCandless <
    lucene@mikemccandless.com> wrote:
    These are serious corruption exceptions.

    Is it at all possible two writers are accessing the index at the same time?

    Can you describe more about how you're using Lucene?

    Mike
    On Thu, Apr 9, 2009 at 7:59 PM, Khawaja Shams wrote:
    Hello,
    I am having a problem with reopening the IndexReader with Lucene 2.4 ( I
    updated to 2.4.1, but still no luck). The exception is preceded by an
    exception in optimizing the index. I am not reopening the reader while the
    commit or optimization is going on in the writer (optimizing happens in the
    same thread, but much less often). The issues go away once I turn off
    optimizations. I was also getting this problem before I turned off the use
    of compound files. I would appreciate any guidance.

    Thanks!

    Regards,
    Khawaja


    2009-04-09 15:57:47,033 (941820) [Index Maint Thread] ERROR
    gov.nasa.ensemble.core.indexer.Indexer  - java.io.IOException: background
    merge hit exception: _8:C41258 _9:C11382 into _a [optimize]
    java.io.IOException: background merge hit exception: _8:C41258 _9:C11382
    into _a [optimize]
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2273)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2218)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2198)
    at
    gov.nasa.ensemble.core.indexer.Indexer$IndexMaintThread.run(Indexer.java:102)
    Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ
    for segment _8: fieldsReader shows 30074 but segmentInfo shows 41258
    at
    org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:362)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:260)
    at
    org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4220)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3884)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)
    Exception in thread "Lucene Merge Thread #0"
    org.apache.lucene.index.MergePolicy$MergeException:
    org.apache.lucene.index.CorruptIndexException: doc counts differ for segment
    _8: fieldsReader shows 30074 but segmentInfo shows 41258
    at
    org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:309)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:286)
    Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ
    for segment _8: fieldsReader shows 30074 but segmentInfo shows 41258
    at
    org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:362)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:260)
    at
    org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4220)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3884)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)
    2009-04-09 15:57:52,814 (947601) [Index Maint Thread] ERROR
    gov.nasa.ensemble.core.indexer.Indexer  - java.io.FileNotFoundException:
    /opt/users/merops/index/_a.fnm (No such file or directory)
    java.io.FileNotFoundException: /opt/users/merops/index/_a.fnm (No such file
    or directory)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
    at
    org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:552)
    at
    org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:582)
    at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
    at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
    at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:58)
    at
    org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:341)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:269)
    at
    org.apache.lucene.index.MultiSegmentReader.doReopen(MultiSegmentReader.java:201)
    at
    org.apache.lucene.index.DirectoryIndexReader$2.doBody(DirectoryIndexReader.java:157)
    at
    org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
    at
    org.apache.lucene.index.DirectoryIndexReader.reopen(DirectoryIndexReader.java:179)
    at
    gov.nasa.ensemble.core.indexer.Indexer$IndexMaintThread.run(Indexer.java:106)
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Khawaja Shams at Apr 10, 2009 at 4:43 am
    Mike,
    I am sorry for wasting your time :). There were indeed two threads that
    were performing this operation. Out of curiosity, which part of this is not
    thread safe? An indexreader reopening while a commit is going on? Thanks
    again for your help.

    Regards,
    Khawaja


    On Thu, Apr 9, 2009 at 5:44 PM, Michael McCandless wrote:

    That code looks right. Are there multiple threads that may enter it?

    Can you show the code where you create the IndexWriter, add docs, etc?

    Can you call IndexWriter.setInfoStream for the entire life of the
    index, up until when the optimize error happens, and post back?

    Mike
    On Thu, Apr 9, 2009 at 8:33 PM, Khawaja Shams wrote:
    Hi Michael,
    Thanks for the quick response. I only have one IndexWriter, and there are
    no other processes accessing this particular index. I have tried deleting
    the entire index and reconstructing it, but the index corruption is
    repeatable. Incidentally, there are no new writes since the last commit when
    the merge happens. I have over-padded my code with ReadWrite locks to make
    sure that no writes/read are happening between the commits,
    optimizations,
    and reopening of the index.


    Here is a snippet of the thread I use to maintain the Index ( I hope that I
    am not doing something terribly wrong):
    while (true) {
    try {
    getWriteLock();
    indexWriter.commit();
    if (shouldOptimize()) {
    indexWriter.optimize();
    }

    IndexReader oldIR = indexSearcher.getIndexReader();
    IndexReader ir = oldIR.reopen();
    if (ir != oldIR) {
    IndexSearcher oldIS = indexSearcher;
    indexSearcher = new IndexSearcher(ir);
    oldIS.close();
    oldIR.close();
    } catch (Throwable t) {
    trace.error(t, t);
    } finally {
    releaseWriteLock();
    }


    Regards,
    Khawaja

    On Thu, Apr 9, 2009 at 5:05 PM, Michael McCandless <
    lucene@mikemccandless.com> wrote:
    These are serious corruption exceptions.

    Is it at all possible two writers are accessing the index at the same
    time?
    Can you describe more about how you're using Lucene?

    Mike
    On Thu, Apr 9, 2009 at 7:59 PM, Khawaja Shams wrote:
    Hello,
    I am having a problem with reopening the IndexReader with Lucene 2.4
    ( I
    updated to 2.4.1, but still no luck). The exception is preceded by an
    exception in optimizing the index. I am not reopening the reader while the
    commit or optimization is going on in the writer (optimizing happens
    in
    the
    same thread, but much less often). The issues go away once I turn off
    optimizations. I was also getting this problem before I turned off the use
    of compound files. I would appreciate any guidance.

    Thanks!

    Regards,
    Khawaja


    2009-04-09 15:57:47,033 (941820) [Index Maint Thread] ERROR
    gov.nasa.ensemble.core.indexer.Indexer - java.io.IOException:
    background
    merge hit exception: _8:C41258 _9:C11382 into _a [optimize]
    java.io.IOException: background merge hit exception: _8:C41258
    _9:C11382
    into _a [optimize]
    at
    org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2273)
    at
    org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2218)
    at
    org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2198)
    at
    gov.nasa.ensemble.core.indexer.Indexer$IndexMaintThread.run(Indexer.java:102)
    Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ
    for segment _8: fieldsReader shows 30074 but segmentInfo shows 41258
    at
    org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:362)
    at
    org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
    at
    org.apache.lucene.index.SegmentReader.get(SegmentReader.java:260)
    at
    org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4220)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3884)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)
    Exception in thread "Lucene Merge Thread #0"
    org.apache.lucene.index.MergePolicy$MergeException:
    org.apache.lucene.index.CorruptIndexException: doc counts differ for segment
    _8: fieldsReader shows 30074 but segmentInfo shows 41258
    at
    org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:309)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:286)
    Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ
    for segment _8: fieldsReader shows 30074 but segmentInfo shows 41258
    at
    org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:362)
    at
    org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
    at
    org.apache.lucene.index.SegmentReader.get(SegmentReader.java:260)
    at
    org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4220)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3884)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)
    2009-04-09 15:57:52,814 (947601) [Index Maint Thread] ERROR
    gov.nasa.ensemble.core.indexer.Indexer -
    java.io.FileNotFoundException:
    /opt/users/merops/index/_a.fnm (No such file or directory)
    java.io.FileNotFoundException: /opt/users/merops/index/_a.fnm (No such file
    or directory)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
    at
    org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:552)
    at
    org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:582)
    at
    org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
    at
    org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
    at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:58)
    at
    org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:341)
    at
    org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
    at
    org.apache.lucene.index.SegmentReader.get(SegmentReader.java:269)
    at
    org.apache.lucene.index.MultiSegmentReader.doReopen(MultiSegmentReader.java:201)
    at
    org.apache.lucene.index.DirectoryIndexReader$2.doBody(DirectoryIndexReader.java:157)
    at
    org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
    at
    org.apache.lucene.index.DirectoryIndexReader.reopen(DirectoryIndexReader.java:179)
    at
    gov.nasa.ensemble.core.indexer.Indexer$IndexMaintThread.run(Indexer.java:106)
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Michael McCandless at Apr 10, 2009 at 9:37 am
    Actually it's perfectly fine for two threads to enter that code
    fragment (you obtain a write lock to protect the code so that "there
    can be only one").

    Second off, even if you didn't have your write lock, the code should
    still be safe in that no index corruption is possible. Multiple
    threads may call optimize(), commit() etc. on an IndexWriter without
    harm.

    The reopen code is also "safe" (will not cause corruption), but you
    may accidentally have readers that you fail to close, or close readers
    that are in-use by in-flight searches). I'd recommend using the
    "lia.admin.SearcherManager" class from the upcoming Lucene in Action
    revision (it's in the book's source code, which you can download from
    http://www.manning.com/hatcher3/LIAsourcecode.zip) to manage/reopen
    the searcher.

    But one surefire way to cause index corruption is if two separate
    IndexWriters are open on the same index. This is normally not easy to
    do, since Lucene protects itself with the write lock in the index
    directory. So if you 1) turn off this locking (eg use NoLockFactory),
    and 2) accidentally allow two writers on once on the same index,
    you'll get corruption.

    So.... I'm not sure that we've actually explained your corruption?

    Mike
    On Fri, Apr 10, 2009 at 12:42 AM, Khawaja Shams wrote:
    Mike,
    I am sorry for wasting your time :). There were indeed two threads that
    were performing this operation. Out of curiosity, which part of this is not
    thread safe? An indexreader reopening while a commit is going on? Thanks
    again for your help.

    Regards,
    Khawaja



    On Thu, Apr 9, 2009 at 5:44 PM, Michael McCandless <
    lucene@mikemccandless.com> wrote:
    That code looks right.  Are there multiple threads that may enter it?

    Can you show the code where you create the IndexWriter, add docs, etc?

    Can you call IndexWriter.setInfoStream for the entire life of the
    index, up until when the optimize error happens, and post back?

    Mike
    On Thu, Apr 9, 2009 at 8:33 PM, Khawaja Shams wrote:
    Hi Michael,
    Thanks for the quick response. I only have one IndexWriter, and there are
    no other processes accessing this particular index. I have tried deleting
    the entire index and reconstructing it, but the index corruption is
    repeatable. Incidentally, there are no new writes since the last commit when
    the merge happens. I have over-padded my code with ReadWrite locks to make
    sure that no writes/read are happening between the commits,
    optimizations,
    and reopening of the index.


    Here is a snippet of the thread I use to maintain the Index ( I hope that I
    am not doing something terribly wrong):
    while (true) {
    try {
    getWriteLock();
    indexWriter.commit();
    if (shouldOptimize()) {
    indexWriter.optimize();
    }

    IndexReader oldIR = indexSearcher.getIndexReader();
    IndexReader ir = oldIR.reopen();
    if (ir != oldIR) {
    IndexSearcher oldIS = indexSearcher;
    indexSearcher = new IndexSearcher(ir);
    oldIS.close();
    oldIR.close();
    } catch (Throwable t) {
    trace.error(t, t);
    } finally {
    releaseWriteLock();
    }


    Regards,
    Khawaja

    On Thu, Apr 9, 2009 at 5:05 PM, Michael McCandless <
    lucene@mikemccandless.com> wrote:
    These are serious corruption exceptions.

    Is it at all possible two writers are accessing the index at the same
    time?
    Can you describe more about how you're using Lucene?

    Mike

    On Thu, Apr 9, 2009 at 7:59 PM, Khawaja Shams <ksshams@gmail.com>
    wrote:
    Hello,
    I am having a problem with reopening the IndexReader with Lucene 2.4
    ( I
    updated to 2.4.1, but still no luck). The exception is preceded by an
    exception in optimizing the index. I am not reopening the reader while the
    commit or optimization is going on in the writer (optimizing happens
    in
    the
    same thread, but much less often). The issues go away once I turn off
    optimizations. I was also getting this problem before I turned off the use
    of compound files. I would appreciate any guidance.

    Thanks!

    Regards,
    Khawaja


    2009-04-09 15:57:47,033 (941820) [Index Maint Thread] ERROR
    gov.nasa.ensemble.core.indexer.Indexer  - java.io.IOException:
    background
    merge hit exception: _8:C41258 _9:C11382 into _a [optimize]
    java.io.IOException: background merge hit exception: _8:C41258
    _9:C11382
    into _a [optimize]
    at
    org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2273)
    at
    org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2218)
    at
    org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2198)
    at
    gov.nasa.ensemble.core.indexer.Indexer$IndexMaintThread.run(Indexer.java:102)
    Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ
    for segment _8: fieldsReader shows 30074 but segmentInfo shows 41258
    at
    org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:362)
    at
    org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
    at
    org.apache.lucene.index.SegmentReader.get(SegmentReader.java:260)
    at
    org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4220)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3884)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)
    Exception in thread "Lucene Merge Thread #0"
    org.apache.lucene.index.MergePolicy$MergeException:
    org.apache.lucene.index.CorruptIndexException: doc counts differ for segment
    _8: fieldsReader shows 30074 but segmentInfo shows 41258
    at
    org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:309)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:286)
    Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ
    for segment _8: fieldsReader shows 30074 but segmentInfo shows 41258
    at
    org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:362)
    at
    org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
    at
    org.apache.lucene.index.SegmentReader.get(SegmentReader.java:260)
    at
    org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4220)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3884)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
    at
    org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)
    2009-04-09 15:57:52,814 (947601) [Index Maint Thread] ERROR
    gov.nasa.ensemble.core.indexer.Indexer  -
    java.io.FileNotFoundException:
    /opt/users/merops/index/_a.fnm (No such file or directory)
    java.io.FileNotFoundException: /opt/users/merops/index/_a.fnm (No such file
    or directory)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
    at
    org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:552)
    at
    org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:582)
    at
    org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
    at
    org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:482)
    at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:58)
    at
    org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:341)
    at
    org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
    at
    org.apache.lucene.index.SegmentReader.get(SegmentReader.java:269)
    at
    org.apache.lucene.index.MultiSegmentReader.doReopen(MultiSegmentReader.java:201)
    at
    org.apache.lucene.index.DirectoryIndexReader$2.doBody(DirectoryIndexReader.java:157)
    at
    org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
    at
    org.apache.lucene.index.DirectoryIndexReader.reopen(DirectoryIndexReader.java:179)
    at
    gov.nasa.ensemble.core.indexer.Indexer$IndexMaintThread.run(Indexer.java:106)
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupjava-user @
categorieslucene
postedApr 9, '09 at 11:59p
activeApr 10, '09 at 9:37a
posts6
users2
websitelucene.apache.org

People

Translate

site design / logo © 2022 Grokbase