I am getting this on a 10GB index (via solr 1.3) during an optimize:

Jan 2, 2009 6:51:52 PM org.apache.solr.common.SolrException log
SEVERE: java.io.IOException: background merge hit exception: _ks4:C2504982
_oaw:C514635 _tll:C827949 _tdx:C18372 _te8:C19929 _tej:C22201 _1agw:C1717926
_1agz:C1 into _1ah2 [optimize]
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2346)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2280)
at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:355)
at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:77)
...

Exception in thread "Lucene Merge Thread #2"
org.apache.lucene.index.MergePolicy$MergeException:
java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 34950
at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:314)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:291)
Caused by: java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 34950
at org.apache.lucene.util.BitVector.get(BitVector.java:91)
at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:125)
at org.apache.lucene.index.SegmentTermPositions.next(SegmentTermPositions.java:98)
...


Does anyone know what causes this and how I can fix it? It happens on every
optimize. Commits were also very slow on this index (40x as slow as on a
similar index on another machine), and I have plenty of free disk space
(many hundreds of GB).


  • Michael McCandless at Jan 2, 2009 at 8:45 pm
    It looks like your index has some kind of corruption. Were there any other
    exceptions prior to this one, or any previous problems with the OS/IO
    system?

    Can you run CheckIndex (java org.apache.lucene.index.CheckIndex to see
    usage) and post the output?
    Mike

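Mike's CheckIndex suggestion can be run from a shell. A minimal sketch, assuming a Lucene 2.4 lucene-core jar and using the index path from this thread as an example (adjust both for your setup):

```shell
# Hedged sketch: LUCENE_JAR and INDEX_DIR are example values; point them at
# your own lucene-core jar and index directory.
LUCENE_JAR=${LUCENE_JAR:-lucene-core-2.4.0.jar}
INDEX_DIR=${INDEX_DIR:-/vol/solr/data/index}

if [ -f "$LUCENE_JAR" ]; then
  # Read-only inspection. -ea:org.apache.lucene... enables Lucene's internal
  # assertions for a more thorough check (the tool itself prints this hint).
  java -ea:org.apache.lucene... -cp "$LUCENE_JAR" \
    org.apache.lucene.index.CheckIndex "$INDEX_DIR"
else
  echo "lucene-core jar not found; set LUCENE_JAR" >&2
fi
```

Run this way (without -fix) CheckIndex only reports; nothing in the index is modified.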
  • Brian Whitman at Jan 2, 2009 at 8:48 pm
    I will, but I bet I can guess what happened -- this index has many
    duplicates in it as well (the same uniqueKey id multiple times). This
    happened to us once before, and it was because the Solr server went down
    during an add. We may have to re-index, but I will run CheckIndex now.
    Thanks.
    (Thread for the dupes here:
    http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200803.mbox/%3c4ED8C459-1B0F-41CC-986C-4FFCEEF82E55@variogr.am%3e)
  • Brian Whitman at Jan 2, 2009 at 9:03 pm
    Here's the CheckIndex output:

    NOTE: testing will be more thorough if you run java with
    '-ea:org.apache.lucene', so assertions are enabled

    Opening index @ /vol/solr/data/index/

    Segments file=segments_vxx numSegments=8 version=FORMAT_HAS_PROX [Lucene 2.4]
    1 of 8: name=_ks4 docCount=2504982
    compound=false
    hasProx=true
    numFiles=11
    size (MB)=3,965.695
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [343 fields]
    test: terms, freq, prox...OK [37238560 terms; 161527224 terms/docs
    pairs; 186273362 tokens]
    test: stored fields.......OK [55813402 total field count; avg 22.281
    fields per doc]
    test: term vectors........OK [7998458 total vector count; avg 3.193
    term/freq vector fields per doc]

    2 of 8: name=_oaw docCount=514635
    compound=false
    hasProx=true
    numFiles=12
    size (MB)=746.887
    has deletions [delFileName=_oaw_1rb.del]
    test: open reader.........OK [155528 deleted docs]
    test: fields, norms.......OK [172 fields]
    test: terms, freq, prox...OK [7396227 terms; 28146962 terms/docs pairs;
    17298364 tokens]
    test: stored fields.......OK [5736012 total field count; avg 15.973
    fields per doc]
    test: term vectors........OK [1045176 total vector count; avg 2.91
    term/freq vector fields per doc]

    3 of 8: name=_tll docCount=827949
    compound=false
    hasProx=true
    numFiles=12
    size (MB)=761.782
    has deletions [delFileName=_tll_2fs.del]
    test: open reader.........OK [39283 deleted docs]
    test: fields, norms.......OK [180 fields]
    test: terms, freq, prox...OK [10925397 terms; 43361019 terms/docs pairs;
    42123294 tokens]
    test: stored fields.......OK [8673255 total field count; avg 10.997
    fields per doc]
    test: term vectors........OK [880272 total vector count; avg 1.116
    term/freq vector fields per doc]

    4 of 8: name=_tdx docCount=18372
    compound=false
    hasProx=true
    numFiles=12
    size (MB)=56.856
    has deletions [delFileName=_tdx_9.del]
    test: open reader.........OK [18368 deleted docs]
    test: fields, norms.......OK [50 fields]
    test: terms, freq, prox...OK [261974 terms; 2018842 terms/docs pairs;
    150 tokens]
    test: stored fields.......OK [76 total field count; avg 19 fields per
    doc]
    test: term vectors........OK [14 total vector count; avg 3.5 term/freq
    vector fields per doc]

    5 of 8: name=_te8 docCount=19929
    compound=false
    hasProx=true
    numFiles=12
    size (MB)=60.475
    has deletions [delFileName=_te8_a.del]
    test: open reader.........OK [19900 deleted docs]
    test: fields, norms.......OK [72 fields]
    test: terms, freq, prox...OK [276045 terms; 2166958 terms/docs pairs;
    1196 tokens]
    test: stored fields.......OK [522 total field count; avg 18 fields per
    doc]
    test: term vectors........OK [132 total vector count; avg 4.552
    term/freq vector fields per doc]

    6 of 8: name=_tej docCount=22201
    compound=false
    hasProx=true
    numFiles=12
    size (MB)=65.827
    has deletions [delFileName=_tej_o.del]
    test: open reader.........OK [22171 deleted docs]
    test: fields, norms.......OK [50 fields]
    test: terms, freq, prox...FAILED
    WARNING: would remove reference to this segment (-fix was not
    specified); full exception:
    java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 34950
    at org.apache.lucene.util.BitVector.get(BitVector.java:91)
    at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:125)
    at
    org.apache.lucene.index.SegmentTermPositions.next(SegmentTermPositions.java:98)
    at org.apache.lucene.index.CheckIndex.check(CheckIndex.java:222)
    at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:433)

    7 of 8: name=_1agw docCount=1717926
    compound=false
    hasProx=true
    numFiles=12
    size (MB)=2,390.413
    has deletions [delFileName=_1agw_1.del]
    test: open reader.........OK [1 deleted docs]
    test: fields, norms.......OK [438 fields]
    test: terms, freq, prox...OK [20959015 terms; 101603282 terms/docs
    pairs; 123561985 tokens]
    test: stored fields.......OK [26248407 total field count; avg 15.279
    fields per doc]
    test: term vectors........OK [4911368 total vector count; avg 2.859
    term/freq vector fields per doc]

    8 of 8: name=_1agz docCount=1
    compound=false
    hasProx=true
    numFiles=8
    size (MB)=0
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [6 fields]
    test: terms, freq, prox...OK [6 terms; 6 terms/docs pairs; 6 tokens]
    test: stored fields.......OK [6 total field count; avg 6 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq
    vector fields per doc]

    WARNING: 1 broken segments detected
    WARNING: 30 documents would be lost if -fix were specified

    NOTE: would write new segments file [-fix was not specified]


  • Michael McCandless at Jan 2, 2009 at 10:37 pm
    So you have a segment (_tej) with 22201 docs, all but 30 of which are
    deleted, and somehow one of the posting lists in _tej.frq is referencing
    an out-of-bounds docID, 34950. Odd...

    Are you sure the IO system doesn't have any consistency issues? What
    environment are you running on (machine, OS, filesystem, JVM)?

    You could re-run CheckIndex with -fix to remove that one problematic
    segment (you'd lose the 30 docs in it, though).
    Mike

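A sketch of the -fix invocation Mike describes, with hypothetical paths. Note that -fix rewrites the segments file and permanently drops the broken segment (here, the 30 remaining docs in _tej), so back up the index directory and make sure no IndexWriter has it open first:

```shell
# Hedged sketch: example paths; adapt to your installation.
LUCENE_JAR=${LUCENE_JAR:-lucene-core-2.4.0.jar}
INDEX_DIR=${INDEX_DIR:-/vol/solr/data/index}

if [ -f "$LUCENE_JAR" ]; then
  # -fix removes references to broken segments, losing their docs;
  # take a backup of $INDEX_DIR before running this.
  java -cp "$LUCENE_JAR" org.apache.lucene.index.CheckIndex "$INDEX_DIR" -fix
else
  echo "lucene-core jar not found; set LUCENE_JAR" >&2
fi
```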
  • Yonik Seeley at Jan 2, 2009 at 10:10 pm

    On Fri, Jan 2, 2009 at 3:47 PM, Brian Whitman wrote:

    > I will but I bet I can guess what happened -- this index has many
    > duplicates in it as well (same uniqueKey id multiple times) - this
    > happened to us once before and it was because the solr server went
    > down during an add.

    That should no longer be possible with Solr 1.3, which uses Lucene for
    handling the duplicates in a transactional manner.

    -Yonik

    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Michael McCandless at Jan 2, 2009 at 10:26 pm
    Also, this (Solr server going down during an add) should not be able to
    cause this kind of corruption.
    Mike

  • Brian Whitman at Jan 2, 2009 at 11:40 pm
    So my apologies for the duplicate comments -- I went to get proof of the
    duplicates and was confused, as we apparently now have duplicates across
    different shards in our distributed setup (a bug on our end). I assumed
    when I saw duplicates that it was the same problem as last time. Still
    doesn't help me get at my segment corruption problem, though :(

    Michael, in answer to your question: Java 1.6 64-bit, Debian Linux, an
    Amazon EC2 machine with the index on an Elastic Block Store volume. No
    other problems with that setup for a few months now.

    I ran CheckIndex with -fix, and optimize still throws the same error.

  • Michael McCandless at Jan 3, 2009 at 1:55 pm
    It's remotely possible you hit a JVM bug in the past and that caused the
    corruption. E.g., there is at least one JVM bug lurking that can affect
    Lucene (though apparently with an OS-level fault):
    https://issues.apache.org/jira/browse/LUCENE-1342

    I don't know much about Amazon's Elastic Block Store, but presumably it's
    unlikely to have undetected IO errors.

    Did this corruption happen only once? (You mentioned hitting dups in the
    past... but did you also see corruption?)

    It's very strange that CheckIndex -fix did not resolve the issue. After
    fixing it, if you re-run CheckIndex on the index, do you still see that
    original broken segment present? CheckIndex should have removed the
    reference to that one segment.

    Mike

  • Brian Whitman at Jan 3, 2009 at 4:05 pm


    > It's very strange that CheckIndex -fix did not resolve the issue. After
    > fixing it, if you re-run CheckIndex on the index do you still see that
    > original one broken segment present? CheckIndex should have removed
    > reference to that one segment.

    I just ran it again, and it detected the same error and claimed to fix
    it. I then shut down the Solr server (I wasn't sure if this would be an
    issue), ran it a third time (where it again found and claimed to fix the
    error), then a fourth, where it found no problems, and now the optimize()
    call on the running server does not throw the merge exception.

    > Did this corruption happen only once? (You mentioned hitting dups in
    > the past... but did you also see corruption too?)

    Not that we know of, but it's very likely we never noticed. (The only
    reason I discovered this was that our commits were taking 20-40x longer
    on this index than on others.)
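The sequence that finally worked above can be sketched as a script. The Solr stop/start commands and paths are placeholders for whatever your deployment uses:

```shell
# Hedged sketch of the stop / fix / verify cycle; example paths.
LUCENE_JAR=${LUCENE_JAR:-lucene-core-2.4.0.jar}
INDEX_DIR=${INDEX_DIR:-/vol/solr/data/index}

# 1. Stop Solr first so no IndexWriter holds the index while it is rewritten
#    (hypothetical command; use your own service manager):
#    /etc/init.d/solr stop

if [ -f "$LUCENE_JAR" ]; then
  # 2. Repair, repeating until a run reports no broken segments...
  java -cp "$LUCENE_JAR" org.apache.lucene.index.CheckIndex "$INDEX_DIR" -fix
  # 3. ...then verify with a read-only pass before restarting Solr.
  java -cp "$LUCENE_JAR" org.apache.lucene.index.CheckIndex "$INDEX_DIR"
else
  echo "lucene-core jar not found; set LUCENE_JAR" >&2
fi

# 4. Restart Solr (hypothetical): /etc/init.d/solr start
```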
  • Vivek sar at Feb 26, 2009 at 1:30 am
    Hi,

    We ran into the same issue (a corrupted index) using Lucene 2.4.0. There
    was no outage or system reboot -- not sure how it could have gotten
    corrupted. Here is the exception:

    Caused by: java.io.IOException: background merge hit exception:
    _io5:c66777491 _nh9:c10656736 _taq:c2021563 _s8m:c1421051
    _uh5:c2065961 _r0y:c1124653 _s4s:c2477731 _t6w:c4340938 _ucx:c8018451
    _xkb:c13842776 _xkd:c3394 _xke:c3379 _xkg:c1231 _xkh:c1252 _xkj:c1680
    _xkk:c1689 into _xkl [optimize]
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2258)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2194)
    at com.packetmotion.manager.query.fulltext.index.Indexer.optimizeMasterIndex(Indexer.java:887)
    at com.packetmotion.manager.query.fulltext.index.Indexer.createNewIndexPartition(Indexer.java:783)
    at com.packetmotion.manager.query.fulltext.index.Indexer.indexByDevice(Indexer.java:582)
    at com.packetmotion.manager.query.fulltext.index.Indexer.indexData(Indexer.java:440)
    at com.packetmotion.manager.query.fulltext.index.Indexer$$FastClassByCGLIB$$97fb7e9b.invoke((MethodProxy.java:149)
    at org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:628)
    at com.packetmotion.manager.query.fulltext.index.Indexer$$EnhancerByCGLIB$$ebbe3914.indexData((IndexerJob.java:38)
    ... 8 more
    Caused by: java.lang.IndexOutOfBoundsException: Index: 51, Size: 26
    at java.util.ArrayList.RangeCheck(ArrayList.java:547)
    at java.util.ArrayList.get(ArrayList.java:322)
    at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:265)
    at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:185)
    at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:729)
    at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:359)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4226)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3877)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)


    I ran the CheckIndex tool to fix the corrupted index. Here is its output:


    Opening index @ /opt/ps_manager/apps/conf/index/MasterIndex

    Segments file=segments_1587 numSegments=18 version=FORMAT_HAS_PROX [Lucene 2.4]
    1 of 18: name=_io5 docCount=66777491
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=6,680.574
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [30 fields]
    test: terms, freq, prox...OK [368415 terms; 1761057989 terms/docs
    pairs; 2095636359 tokens]
    test: stored fields.......OK [66777491 total field count; avg 1
    fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    2 of 18: name=_nh9 docCount=10656736
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=1,058.869
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [30 fields]
    test: terms, freq, prox...OK [118964 terms; 278176718 terms/docs
    pairs; 329825350 tokens]
    test: stored fields.......OK [10656736 total field count; avg 1
    fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    3 of 18: name=_taq docCount=2021563
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=208.544
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [30 fields]
    test: terms, freq, prox...OK [145494 terms; 54131697 terms/docs
    pairs; 65411701 tokens]
    test: stored fields.......OK [2021563 total field count; avg 1
    fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    4 of 18: name=_s8m docCount=1421051
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=162.443
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [30 fields]
    test: terms, freq, prox...OK [210276 terms; 42491363 terms/docs
    pairs; 53054214 tokens]
    test: stored fields.......OK [1421051 total field count; avg 1
    fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    5 of 18: name=_uh5 docCount=2065961
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=229.394
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [30 fields]
    test: terms, freq, prox...OK [126789 terms; 60416663 terms/docs
    pairs; 75120265 tokens]
    test: stored fields.......OK [2065964 total field count; avg 1
    fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    6 of 18: name=_r0y docCount=1124653
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=115.792
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [26 fields]
    test: terms, freq, prox...FAILED
    WARNING: fixIndex() would remove reference to this segment; full exception:
    java.lang.RuntimeException: term desthost:wir docFreq=6 != num docs
    seen 0 + num docs deleted 0
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:475)
    at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:676)

    7 of 18: name=_s4s docCount=2477731
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=230.698
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [25 fields]
    test: terms, freq, prox...OK [90327 terms; 60212096 terms/docs
    pairs; 72740383 tokens]
    test: stored fields.......OK [2477731 total field count; avg 1
    fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    8 of 18: name=_t6w docCount=4340938
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=451.389
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [26 fields]
    test: terms, freq, prox...OK [157817 terms; 121753990 terms/docs
    pairs; 151141568 tokens]
    test: stored fields.......OK [4340938 total field count; avg 1
    fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    9 of 18: name=_ucx docCount=8018451
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=845.968
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [26 fields]
    test: terms, freq, prox...OK [410057 terms; 227617455 terms/docs
    pairs; 283398975 tokens]
    test: stored fields.......OK [8018451 total field count; avg 1
    fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    10 of 18: name=_xl9 docCount=13891933
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=1,408.881
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [26 fields]
    test: terms, freq, prox...OK [535226 terms; 376394493 terms/docs
    pairs; 465003060 tokens]
    test: stored fields.......OK [13891933 total field count; avg 1
    fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    11 of 18: name=_xlb docCount=3521
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.411
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [25 fields]
    test: terms, freq, prox...OK [2070 terms; 108496 terms/docs pairs;
    136365 tokens]
    test: stored fields.......OK [3521 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    12 of 18: name=_xlc docCount=3529
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.412
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [22 fields]
    test: terms, freq, prox...OK [2214 terms; 108633 terms/docs pairs;
    136384 tokens]
    test: stored fields.......OK [3529 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    13 of 18: name=_xle docCount=1401
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.188
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [21 fields]
    test: terms, freq, prox...OK [1854 terms; 48703 terms/docs pairs;
    62221 tokens]
    test: stored fields.......OK [1401 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    14 of 18: name=_xlf docCount=1399
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.188
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [21 fields]
    test: terms, freq, prox...OK [1763 terms; 48910 terms/docs pairs;
    63035 tokens]
    test: stored fields.......OK [1399 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    15 of 18: name=_xlh docCount=1727
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.24
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [21 fields]
    test: terms, freq, prox...OK [2318 terms; 62596 terms/docs pairs;
    80688 tokens]
    test: stored fields.......OK [1727 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    16 of 18: name=_xli docCount=1716
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.237
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [21 fields]
    test: terms, freq, prox...OK [2264 terms; 61867 terms/docs pairs;
    79497 tokens]
    test: stored fields.......OK [1716 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    17 of 18: name=_xlk docCount=2921
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.364
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [25 fields]
    test: terms, freq, prox...OK [2077 terms; 96536 terms/docs pairs;
    123166 tokens]
    test: stored fields.......OK [2921 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    18 of 18: name=_xll docCount=3876
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.476
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [21 fields]
    test: terms, freq, prox...OK [2261 terms; 130104 terms/docs pairs;
    166867 tokens]
    test: stored fields.......OK [3876 total field count; avg 1 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    WARNING: 1 broken segments (containing 1124653 documents) detected
    WARNING: 1124653 documents will be lost

    NOTE: will write new segments file in 5 seconds; this will remove
    1124653 docs from the index. THIS IS YOUR LAST CHANCE TO CTRL+C!
    5...
    4...
    3...
    2...
    1...
    Writing...
    OK
    Wrote new segments file "segments_1588"
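
    For reference, the run above was produced by invoking the CheckIndex tool
    roughly as follows (the jar name is assumed for a 2.4.0 install; -fix is what
    rewrites the segments file and drops the broken segment's documents, so run it
    without -fix first if you only want a report):

    ```shell
    # Check only (read-only report):
    java -cp lucene-core-2.4.0.jar org.apache.lucene.index.CheckIndex \
      /opt/ps_manager/apps/conf/index/MasterIndex

    # Repair: removes references to any broken segments (data loss!):
    java -cp lucene-core-2.4.0.jar org.apache.lucene.index.CheckIndex \
      /opt/ps_manager/apps/conf/index/MasterIndex -fix
    ```

    Make sure no IndexWriter is open on the index while -fix runs.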


    Any ideas how the index could have gotten corrupted?

    Thanks,
    -vivek

    On Sat, Jan 3, 2009 at 8:04 AM, Brian Whitman wrote:


    It's very strange that CheckIndex -fix did not resolve the issue.  After
    fixing it, if you re-run CheckIndex on the index do you still see that
    original one broken segment present?  CheckIndex should have removed
    reference to that one segment.
    I just ran it again, and it detected the same error and claimed to fix it. I
    then shut down the solr server (I wasn't sure if this would be an issue),
    ran it a third time (where it again found and claimed to fix the error),
    then a fourth where it did not find any problems, and now the optimize()
    call on the running server does not throw the merge exception.

    Did this corruption happen only once?  (You mentioned hitting dups in
    the past...
    but did you also see corruption too?)

    Not that we know of, but it's very likely we never noticed. (The only reason
    I discovered this was our commits were taking 20-40x longer on this index
    than others)
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org
  • Michael McCandless at Feb 26, 2009 at 11:52 am
    The exception you hit during optimize indicates some corruption in the
    stored fields file (_X.fdt). Then, the exception hit during
    CheckIndex was different -- the postings entry (_r0y.frq) for term
    "desthost:wir" was supposed to have 6 entries but had 0.

    Did "CheckIndex -fix" allow you to optimize successfully? In which
    case, both of these different corruptions were in the same segment
    (_r0y) that CheckIndex removed.

    Was this index fully created with 2.4.0 or a prior release?

    Which exact JRE was used to create/write to this index? And which one
    was used for searching it?

    How often do you hit this? Is it repeatable?

    It's remotely possible you have bad hardware (RAM, hard drive).

    Most likely it's gone... but if somehow I could get access to that bad
    segment, I could try to dig.
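
    For readers wondering what the "desthost:wir docFreq=6 != num docs seen 0"
    failure means: this is not Lucene's actual code, just a minimal sketch of the
    invariant CheckIndex enforces per term - the docFreq recorded in the term
    dictionary must equal the number of documents actually enumerated from the
    postings (.frq) file plus any of those documents that are marked deleted. The
    class and method names below are illustrative only.

    ```java
    import java.util.Arrays;
    import java.util.Collections;
    import java.util.List;

    public class DocFreqCheck {

        // Sketch of CheckIndex's per-term consistency test: the stored docFreq
        // must match docs actually seen in the postings plus deleted docs.
        static int verify(String term, int storedDocFreq,
                          List<Integer> postingsDocIds, int numDeleted) {
            int seen = postingsDocIds.size();
            if (storedDocFreq != seen + numDeleted) {
                throw new RuntimeException("term " + term + " docFreq=" + storedDocFreq
                        + " != num docs seen " + seen
                        + " + num docs deleted " + numDeleted);
            }
            return seen;
        }

        public static void main(String[] args) {
            // Healthy term: dictionary and postings agree.
            System.out.println(verify("host:ok", 3, Arrays.asList(1, 5, 9), 0));
            // The corrupt case from segment _r0y: the dictionary claims 6 docs
            // for desthost:wir, but the postings yield none.
            try {
                verify("desthost:wir", 6, Collections.<Integer>emptyList(), 0);
            } catch (RuntimeException e) {
                System.out.println(e.getMessage());
            }
        }
    }
    ```

    When the two counts disagree, CheckIndex marks the segment broken, and -fix
    drops the whole segment (here, all 1124653 docs in _r0y).
    
    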

    vivek sar wrote:
    Hi,

    We ran into the same issue (corrupted index) using Lucene 2.4.0.
    There was no outage or system reboot - not sure how could it get
    corrupted. Here is the exception,

    [quoted exception, CheckIndex output, and earlier replies snipped -
    identical to the message above]
  • Michael McCandless at Feb 26, 2009 at 11:58 am
    Also: what sorts of stored fields are you storing? Binary?
    Compressed? Text with unicode characters? Roughly how many stored
    fields per document?

    Mike

    vivek sar wrote:
    Hi,

    We ran into the same issue (corrupted index) using Lucene 2.4.0.
    There was no outage or system reboot - not sure how could it get
    corrupted. Here is the exception,

    [quoted exception and CheckIndex output snipped - identical to the
    message above]
    fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    15 of 18: name=_xlh docCount=1727
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.24
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [21 fields]
    test: terms, freq, prox...OK [2318 terms; 62596 terms/docs pairs;
    80688 tokens]
    test: stored fields.......OK [1727 total field count; avg 1
    fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    16 of 18: name=_xli docCount=1716
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.237
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [21 fields]
    test: terms, freq, prox...OK [2264 terms; 61867 terms/docs pairs;
    79497 tokens]
    test: stored fields.......OK [1716 total field count; avg 1
    fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    17 of 18: name=_xlk docCount=2921
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.364
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [25 fields]
    test: terms, freq, prox...OK [2077 terms; 96536 terms/docs pairs;
    123166 tokens]
    test: stored fields.......OK [2921 total field count; avg 1
    fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    18 of 18: name=_xll docCount=3876
    compound=true
    hasProx=true
    numFiles=1
    size (MB)=0.476
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [21 fields]
    test: terms, freq, prox...OK [2261 terms; 130104 terms/docs pairs;
    166867 tokens]
    test: stored fields.......OK [3876 total field count; avg 1
    fields per doc]
    test: term vectors........OK [0 total vector count; avg 0
    term/freq vector fields per doc]

    WARNING: 1 broken segments (containing 1124653 documents) detected
    WARNING: 1124653 documents will be lost

    NOTE: will write new segments file in 5 seconds; this will remove
    1124653 docs from the index. THIS IS YOUR LAST CHANCE TO CTRL+C!
    5...
    4...
    3...
    2...
    1...
    Writing...
    OK
    Wrote new segments file "segments_1588"


    Any ideas how would the index get corrupted?

    Thanks,
    -vivek
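    [Editor's note: the per-segment report above is regular enough to parse
    mechanically when triaging similar output. A minimal sketch in plain
    Python; the regexes assume the exact plain-text format shown in the
    transcript above, nothing more:]

    ```python
    import re

    def summarize_checkindex(output: str) -> dict:
        """Summarize CheckIndex plain-text output (format as in the transcript).

        Returns the per-segment (name, docCount) pairs plus the totals from
        the WARNING lines CheckIndex prints when it detects corruption.
        """
        # Each segment section contains a line like: "name=_s4s docCount=2477731"
        segments = [(name, int(dc))
                    for name, dc in re.findall(r"name=(\S+) docCount=(\d+)", output)]
        # e.g. "WARNING: 1 broken segments (containing 1124653 documents) detected"
        m = re.search(r"WARNING: (\d+) broken segments? \(containing (\d+) documents?\)",
                      output)
        return {
            "segments": segments,
            "broken_segments": int(m.group(1)) if m else 0,
            "docs_lost": int(m.group(2)) if m else 0,
        }
    ```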

    On Sat, Jan 3, 2009 at 8:04 AM, Brian Whitman wrote:

    > It's very strange that CheckIndex -fix did not resolve the issue. After
    > fixing it, if you re-run CheckIndex on the index do you still see that
    > original one broken segment present? CheckIndex should have removed the
    > reference to that one segment.

    I just ran it again, and it detected the same error and claimed to fix it.
    I then shut down the solr server (I wasn't sure if this would be an issue),
    ran it a third time (where it again found and claimed to fix the error),
    then a fourth where it did not find any problems, and now the optimize()
    call on the running server does not throw the merge exception.

    > Did this corruption happen only once? (You mentioned hitting dups in the
    > past... but did you also see corruption too?)

    Not that we know of, but it's very likely we never noticed. (The only
    reason I discovered this was our commits were taking 20-40x longer on this
    index than others.)
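    [Editor's note: on the "how did it get corrupted" question, the
    ArrayIndexOutOfBoundsException in the first post came out of
    BitVector.get, and is consistent with a segment whose deletions
    bitvector is out of sync with its document count (for example a stale
    or truncated .del file). A toy model, not Lucene code; only the bound
    34950 is taken from the stack trace:]

    ```python
    class ToyBitVector:
        """Toy stand-in for Lucene's deletions bitvector (not the real class)."""

        def __init__(self, size: int):
            self.size = size
            self.bits = bytearray((size + 7) // 8)

        def get(self, n: int) -> int:
            # Lucene's real BitVector indexes its backing array directly, so a
            # doc id past the end surfaces as ArrayIndexOutOfBoundsException;
            # here we raise the Python equivalent explicitly.
            if n < 0 or n >= self.size:
                raise IndexError(f"Array index out of range: {n}")
            return (self.bits[n >> 3] >> (n & 7)) & 1

    # Segment metadata claims more docs than the deletions bitvector covers:
    deletions = ToyBitVector(34950)   # bound taken from the stack trace
    try:
        deletions.get(34950)          # first doc id past the stale bitvector
    except IndexError as exc:
        print(exc)                    # Array index out of range: 34950
    ```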
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org


Discussion Overview
group: java-user @ lucene.apache.org
categories: lucene
posted: Jan 2, '09 at 7:34p
active: Feb 26, '09 at 11:58a
posts: 13
users: 4
