More frustration with Lucene/Java file i/o on Windows
Hi...



It was a little comforting to know that other people have
seen Windows Explorer refreshes crash Java Lucene on Windows. We seem
to be running into a long list of file system issues with Lucene, and I
was wondering if other people had noticed these sorts of things (and
hopefully have any tips and tricks for working around them).



We've got a process running as a Windows service that's
trying to keep a set of Lucene indexes up to date from a database. The
corpus is pretty small, so we copy the last index build to a temp
directory and then try to do an incremental index of the changes on the
working copy. Since our software is evolving, we've put a version
number in the metadata of the index files, which we check when we're
starting up. If the version numbers don't match, we scrap the whole
thing and start over. The problem is that Java Lucene on Windows
doesn't do "scrap" very well.



The current hypothesis is that File.renameTo, File.delete,
and other JVM file operations on Windows fail if any other handle is
open on the file, and that Lucene objects aren't closing/finalizing their
file handles cleanly or reliably, so other things blow up later.
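
If that hypothesis is right, one workaround we've seen suggested for this
class of problem is to force a GC so that abandoned streams get finalized,
then retry the delete. A sketch, not something we've verified:

// Hypothetical helper: retry a delete after nudging the GC, in case a
// forgotten handle is waiting to be finalized. Illustrative only.
private static boolean deleteWithRetry(java.io.File f, int attempts) {
    for (int i = 0; i < attempts; i++) {
        if (f.delete()) {
            return true;
        }
        System.gc();               // encourage finalization of lost file handles
        try { Thread.sleep(100); } // give the OS a moment to release them
        catch (InterruptedException ie) { /* ignore */ }
    }
    return !f.exists();
}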



Here's the chain of events we had in our service process
last night:

20060817T165611.682,EDT [Indexer.java 813]: Exception occurred deleting
document 183971: Lock obtain timed out:
Lock@C:\WINDOWS\TEMP\lucene-22b8462f0f541160a41abfdff8d52f94-write.lock

20060817T165612.682,EDT [Indexer.java 813]: Exception occurred deleting
document 257265: Lock obtain timed out:
Lock@C:\WINDOWS\TEMP\lucene-22b8462f0f541160a41abfdff8d52f94-write.lock

20060817T165613.744,EDT [Indexer.java 1184]: Indexing failure, db
changes will be rolled back and partial index deleted.

java.io.IOException: Lock obtain timed out:
Lock@C:\WINDOWS\TEMP\lucene-22b8462f0f541160a41abfdff8d52f94-write.lock

    at org.apache.lucene.store.Lock.obtain(Lock.java:56)
    at org.apache.lucene.index.IndexReader.aquireWriteLock(IndexReader.java:489)
    at org.apache.lucene.index.IndexReader.deleteDocument(IndexReader.java:514)
    at org.apache.lucene.index.IndexReader.deleteDocuments(IndexReader.java:541)
    at ...Indexer.buildIncrementalIndex(Indexer.java:915)
    ...



Why it couldn't obtain that temporary lock file, I don't know. The same
process runs continuously, and I don't know whether Lucene reuses the same
temp names from run to run. If the lock files were left around because of
these JVM file system errors, maybe that would explain it.



The "partial index deleted" part of our message means that we did a
recursive (java) delete of all the index directories we were working
with. Our attempt to clean the slate got everything but
contact_index\_3zhe.cfs.
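
For reference, the recursive delete is along these lines (a simplified
sketch, not our exact code):

// Simplified sketch of the recursive delete; logs anything it can't remove.
private static void deleteRecursively(java.io.File f) {
    if (f.isDirectory()) {
        java.io.File[] children = f.listFiles();
        for (int i = 0; children != null && i < children.length; i++) {
            deleteRecursively(children[i]);
        }
    }
    if (!f.delete()) {
        System.err.println("Could not delete: " + f);
    }
}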



An hour later, we come back and try to start another index build. We
find the directory still exists, so we try to validate the index version
number using:

searcher = new IndexSearcher(FSDirectory.getDirectory(indexFile, false));
TermQuery tq = new TermQuery(new Term(METADATA_DOCUMENT_FIELD,
        METADATA_DOCUMENT_FIELD_VALUE));
Hits h = searcher.search(tq);
if (h.length() == 1)
    ...
finally
{
    if (searcher != null)
    {
        try { searcher.close(); } catch (Exception e) { /* ignore */ }
    }
}
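
Filled out, the whole check has this shape (a sketch; anything not in the
fragment above, like the version comparison itself, is illustrative):

IndexSearcher searcher = null;
try {
    searcher = new IndexSearcher(FSDirectory.getDirectory(indexFile, false));
    TermQuery tq = new TermQuery(new Term(METADATA_DOCUMENT_FIELD,
            METADATA_DOCUMENT_FIELD_VALUE));
    Hits h = searcher.search(tq);
    if (h.length() == 1) {
        // exactly one metadata document: compare its stored version
        // number against the current software version (elided above)
    }
} finally {
    if (searcher != null) {
        try { searcher.close(); } catch (Exception e) { /* ignore */ }
    }
}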



Obviously with only the _3zhe.cfs file left, it's not a valid index, so
the attempt to get the metadata fails. No matter what happens, we do a
searcher.close(). My suspicion is that IndexSearcher.close() isn't really
closing all of the file handles it's using, so you have to wait
until the searcher is garbage collected and its file objects finalized
before things will work - because immediately after this check, Lucene
fails a full build with:

20060817T175614.323,EDT [Indexer.java 843]: Error building full index

java.io.IOException: Cannot delete
\\xx.xx.xx.xx\indexbuild2\contact_index\_3zhe.cfs

    at org.apache.lucene.store.FSDirectory.create(FSDirectory.java:198)
    at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:144)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:224)
    ...



Doing a full build, Lucene makes its own attempt to clear out the
leftovers, which fails because it can't delete the file. And we're stuck
in this loop all night.



The guy who wrote the indexing code says the version check is the only
place in the code where we have an IndexSearcher created using a file
path string - all the others are built from a pre-existing IndexReader.
He wants me to try it that way instead, so that we can explicitly close
the reader and hopefully clear that loose file handle.
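
Roughly, what he's proposing looks like this (a sketch of the idea, not
tested code):

// Open the IndexReader explicitly so it can be closed explicitly,
// instead of letting the IndexSearcher own the only reference to it.
IndexReader reader = IndexReader.open(FSDirectory.getDirectory(indexFile, false));
IndexSearcher searcher = new IndexSearcher(reader);
try {
    // ... run the metadata query as before ...
} finally {
    try { searcher.close(); } catch (Exception e) { /* ignore */ }
    try { reader.close(); } catch (Exception e) { /* ignore */ }
}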



Sorry for the long-winded vent, but does anyone have any advice for
getting Java Lucene working on Windows? Any idea why it would seize up
on the lock files? This service process is the only Lucene process on
the system, and the finished indexes are copied off to another server to
serve the search requests, so it's puzzling that the daemon process
would block itself... Anyone know if IndexReader.close() would do a
better job of cleaning up the file handles than IndexSearcher.close()?



Thanks

-Mark




  • Michael McCandless at Aug 18, 2006 at 7:11 pm

    > It was a little comforting to know that other people have
    > seen Windows Explorer refreshes crash Java Lucene on Windows. We seem
    > to be running into a long list of file system issues with Lucene, and I
    > was wondering if other people had noticed these sorts of things (and
    > hopefully have any tips and tricks for working around them).

    Sorry you're having so many troubles. Keep these emails, questions &
    issues coming because this is how we [gradually] fix Lucene to be more
    robust!

    OK a few quick possibilities / suggestions:

    * Make sure in your Indexer.java that when you delete docs, you
    close any open IndexWriters before you try to call
    deleteDocuments from your IndexReader. Only one writer
    (an IndexWriter adding docs or an IndexReader deleting docs) can be
    open at once, and if you fail to do this you'll get exactly that
    "lock obtain timed out" error. You could also use IndexModifier,
    which under the hood does this open-close logic for you (see the
    sketch after this list). But: try to buffer up adds and deletes
    together if possible to minimize the cost of opens/closes.

    * That one file really seems to have an open file handle on it. Are
    you sure you called close on all IndexReaders (IndexSearchers)?
    That file is a "compound file format" segment, and IndexReaders
    hold an open file handle to these files (IndexWriters do as well,
    but they quickly close the file handles after writing to them).

    * There was a thread recently, similar to this issue, where
    File.renameTo was failing, and there was a suggestion that this is
    a bug in some JVMs and to get the JVM to GC (System.gc()) to see
    if that then closes the underlying file.

    * IndexSearcher.close() will only close the underlying IndexReader
    if you created it with a String. If you create it with just an
    IndexReader it will not close that reader. You have to separately
    call IndexReader.close to close the reader.

    * If the JVM exited un-gracefully then the lock files will be left
    on disk and Lucene will incorrectly think the lock is held by
    another process (and then hit that "lock obtain timed out"
    error). You can just remove the lock files (from
    c:\windows\temp\...) if you are certain no Lucene processes are
    running.

    We are working towards using native locks in Lucene (for a future
    release) so that even un-graceful exits of the JVM will properly
    free the lock.

    * Perhaps change your "build a new index" logic so that it does so
    in an entirely fresh directory? Just to avoid any hazards at all
    of anything holding files open in the old directory ...
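
    To illustrate the IndexModifier route from the first bullet, here's
    a minimal sketch (assuming a Lucene release that has IndexModifier,
    1.9 or later; the path, analyzer, and field names are placeholders,
    not taken from your setup):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexModifier;
    import org.apache.lucene.index.Term;

    // IndexModifier does the reader/writer open-close switching for you.
    IndexModifier modifier = new IndexModifier("C:\\indexbuild\\contact_index",
            new StandardAnalyzer(), false);  // false = open an existing index
    try {
        // Batch deletes and adds through the one object; it alternates
        // the underlying IndexReader/IndexWriter as needed.
        modifier.deleteDocuments(new Term("id", "183971"));
        Document doc = new Document();
        doc.add(new Field("id", "183971", Field.Store.YES,
                Field.Index.UN_TOKENIZED));
        modifier.addDocument(doc);
    } finally {
        modifier.close();  // releases the write lock and file handles
    }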

    Mike

  • Chris Lu at Aug 18, 2006 at 7:17 pm
    Hi, Mark,

    I had the same issue with Lucene when maintaining a Lucene index on
    Windows. It's mostly because the Windows OS cannot delete a file that
    still has an open handle. While some versioning can alleviate the
    problem somewhat, my advice, based on my experience, is to move to Linux.

    Regarding the lock, you need to delete all locks before the indexing starts.

    Regarding the leftover index, you can create a new index in a new
    directory and copy the new index to your target directory.
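
    A minimal sketch of that lock cleanup (it assumes the default lock
    directory, java.io.tmpdir, which matches the C:\WINDOWS\TEMP paths in
    the logs above, and it must only run when no other Lucene process is
    alive):

    import java.io.File;
    import java.io.FilenameFilter;

    // Remove stale Lucene lock files left behind by an unclean exit.
    File lockDir = new File(System.getProperty("java.io.tmpdir"));
    File[] stale = lockDir.listFiles(new FilenameFilter() {
        public boolean accept(File dir, String name) {
            return name.startsWith("lucene-") && name.endsWith(".lock");
        }
    });
    for (int i = 0; stale != null && i < stale.length; i++) {
        if (!stale[i].delete()) {
            System.err.println("Could not delete stale lock: " + stale[i]);
        }
    }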

    Chris Lu
    ------------------------------------
    Lucene Search Server for Any Databases/Applications
    http://www.dbsight.net
  • Mark Modrall at Aug 18, 2006 at 9:54 pm
    Hi Mike,

    I do appreciate the thoroughness and graciousness of your
    responses, and I hope there's nothing in my frustration that you would
    take personally. Googling around, I've found other references to the
    Sun JVM's handling of the Windows file system being, well, quixotic at
    best.

    In our current system, we have two modes of operation: full
    index recreation and incremental indexing. Which to use is determined
    by a quick validation check (check that the path exists and is
    a directory; if it is, make an IndexSearcher to check the metadata as
    below; if the reader passes the test, build incrementally; otherwise
    delete the directory and start fresh):

    searcher = new IndexSearcher(FSDirectory.getDirectory(indexFile,
    false));
    TermQuery tq = new TermQuery(new Term(METADATA_DOCUMENT_FIELD,
    METADATA_DOCUMENT_FIELD_VALUE));
    Hits h = searcher.search(tq);

    The validation IndexSearcher gets closed in a finally block, so
    there shouldn't be anything left over from that.

    If it's a full rebuild, we just have an IndexWriter (no reader).
    If it's incremental, there's an IndexReader to delete old documents,
    which is closed, followed by an IndexWriter that is also closed (when
    things go well).

    I haven't gone looking in the source to figure out what goes
    into the middle of the lucene-<xxx>-write.lock naming convention, but as
    you say they could have been left over from some abnormal termination.

    Our indexing scheme bats back and forth between two build dirs;
    one's supposed to be the last successful build, and the other is the one you
    can work on. When a successful build is finished, all the files are
    copied over into the scratch dir, and the next build goes in the scratch
    dir. If part of the glorp in the lock file name is a hash of the
    directory path, we could run for a while and not hit the locking issue
    for a couple of builds.

    I still can't figure out how the .cfs file delete would fail,
    though, unless the IndexSearcher.close() hadn't really let go of the
    file. What would happen with an IndexSearcher on a malformed directory?
    I.e. if there was only a .cfs file there? Would .close() know to
    release the one handle it had?

    Anyway, I'll implement something at the root to delete the lock
    files before starting to do anything, to make sure the slate is clean, and
    cross my fingers.

    Thanks
    -Mark





  • Michael McCandless at Aug 18, 2006 at 10:35 pm

    > I do appreciate the thoroughness and graciousness of your
    > responses, and I hope there's nothing in my frustration that you would
    > take personally. Googling around, I've found other references to the
    > Sun JVM's handling of the Windows file system being, well, quixotic at
    > best.

    No problem!

    And I suspect Sun doesn't like Microsoft :)

    > In our current system, we have two modes of operation: full
    > index recreation and incremental indexing. Which to use is determined
    > by a quick validation check (check that the path exists and is
    > a directory; if it is, make an IndexSearcher to check the metadata as
    > below; if the reader passes the test, build incrementally; otherwise
    > delete the directory and start fresh):
    >
    > searcher = new IndexSearcher(FSDirectory.getDirectory(indexFile,
    > false));
    > TermQuery tq = new TermQuery(new Term(METADATA_DOCUMENT_FIELD,
    > METADATA_DOCUMENT_FIELD_VALUE));
    > Hits h = searcher.search(tq);
    >
    > The validation IndexSearcher gets closed in a finally block, so
    > there shouldn't be anything left over from that.

    OK, this sounds fine.

    > If it's a full rebuild, we just have an IndexWriter (no reader).
    > If it's incremental, there's an IndexReader to delete old documents,
    > which is closed, followed by an IndexWriter that is also closed (when
    > things go well).

    OK, but be real careful in the incremental case: you can only have
    exactly one of IndexReader or IndexWriter open at a time. In other
    words, you have to close one in order to open the other, and vice versa.
    It sounds like you do all deletes with an IndexReader, then close it,
    then open an IndexWriter, do all your adds, then close it? In which
    case that should be fine... are the closes also in finally blocks?
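
    Something along these lines, with each close in a finally block (a
    sketch; the directory, analyzer, term, and document are placeholders):

    // All deletes first, through an IndexReader, closed before any writer opens.
    IndexReader reader = IndexReader.open(indexDir);
    try {
        reader.deleteDocuments(new Term("id", "183971"));  // batch deletes here
    } finally {
        reader.close();  // releases the write lock
    }

    // Then all adds, through an IndexWriter, also closed in a finally block.
    IndexWriter writer = new IndexWriter(indexDir, new StandardAnalyzer(), false);
    try {
        writer.addDocument(doc);  // batch adds here
    } finally {
        writer.close();
    }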

    > I haven't gone looking in the source to figure out what goes
    > into the middle of the lucene-<xxx>-write.lock naming convention, but as
    > you say they could have been left over from some abnormal termination.

    The Lucene classes have finalizers that try to release these locks, so
    "in theory" (cross fingers) it should only be a hard KILL or a C-level
    exception in the JVM that would cause these lock files to be left behind.

    > Our indexing scheme bats back and forth between two build dirs;
    > one's supposed to be the last successful build, and the other is the one you
    > can work on. When a successful build is finished, all the files are
    > copied over into the scratch dir, and the next build goes in the scratch
    > dir. If part of the glorp in the lock file name is a hash of the
    > directory path, we could run for a while and not hit the locking issue
    > for a couple of builds.

    OK, I see. Yes, indeed, the glorp is a "digest" of the directory name ...

    > I still can't figure out how the .cfs file delete would fail,
    > though, unless the IndexSearcher.close() hadn't really let go of the
    > file. What would happen with an IndexSearcher on a malformed directory?
    > I.e. if there was only a .cfs file there? Would .close() know to
    > release the one handle it had?

    Yeah, the fact that the OS wouldn't let Lucene or you delete the CFS
    file means it was indeed still open. That, combined with the write locks
    stuck in the filesystem, really sorta feels like there was an
    IndexSearcher that didn't get closed. Or it could indeed be the lurking
    [possible] bug in the JVM that fails to really close a file even when
    you close it from Java.

    What JVM & version of Lucene are you using?

    > Anyway, I'll implement something at the root to delete the lock
    > files before starting to do anything, to make sure the slate is clean, and
    > cross my fingers.

    OK, good luck!

    Mike

