Hello,

I'm starting to wonder how "bullet proof" Lucene indexes are. Do they
get corrupted easily? If so, is there a way to rebuild them?

I've started getting the following exception left and right...

"04/25 18:34:39 (Warning) Indexer.indexObjectWithValues:
java.io.IOException: _91.fnm already exists"

I built a little app (http://homepage.mac.com/zoe_info/) that uses
Lucene quite extensively, and I would like to keep it that way. However,
I'm starting to have second thoughts about Lucene's reliability... :-(

I'm sure I'm doing something wrong somewhere, but I really cannot see
what...

Any help or insight greatly appreciated.

Thanks.

PA.


--
To unsubscribe, e-mail:
For additional commands, e-mail:


  • Karl Øie at Apr 26, 2002 at 12:09 pm
    There are some strange problems with FSDirectory. I have found that building
    chunks in a RAMDirectory and then merging them into an FSDirectory is more
    stable than indexing directly into the FSDirectory. I ran into your problem
    and the dreaded "too many open files" problem when indexing large documents
    with many fields...

    Using a RAMDir as a middleman solved my problems...

    Kind regards, Karl Øie
  • Petite_abeille at Apr 26, 2002 at 12:23 pm
    > Using a RAMDir as a middleman solved my problems...

    Thanks. What is your heuristic for flushing the RAMDirectory? Also, how do
    you deal with System.exit() or application death? E.g., you are indexing
    something and the application dies or is killed.

    Thanks for any input.

    R.


  • Karl Øie at Apr 26, 2002 at 1:26 pm
    That is a big problem with Lucene: because it stores the index in an FSDir,
    it has no sense of transaction handling. For critical indexes I serialize a
    RAMDir to a database blob so I can perform a rollback if needed, but this is
    an enormous overhead...

    > Thanks. What is your heuristic for flushing the RAMDirectory?

    Please explain this, because I don't understand English that well :-(

    Kind regards, Karl Øie
  • Petite_abeille at Apr 26, 2002 at 1:33 pm

    > > Thanks. What is your heuristic for flushing the RAMDirectory?
    > Please explain this, because I don't understand English that well :-(

    That's ok, I don't really understand English either :-)

    Simply put, when do you "flush" the RAMDirectory into the FSDirectory?
    Every five documents? Ten? A thousand? What is a good balance between
    RAM and FS?

    Thanks.

    PA.


  • Karl Øie at Apr 26, 2002 at 1:41 pm
    Ah, now I see. I have a server with 512 MB of RAM, and I have used two
    different approaches; both work OK:

    1 - I index a fixed number of documents into a RAMDir, say 10 (each doc is
    an XML document of about 1.5-2 MB), then I optimize the RAMDir, merge it
    into the FSDir, and optimize the FSDir...

    2 - I use Runtime.freeMemory() and Runtime.totalMemory() to see whether I
    have used more than 80% of the available memory. If so, I optimize the
    RAMDir, merge it, and optimize the FSDir; if not, I just add more documents
    to the RAMDir...
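
The first approach can be sketched without any Lucene calls. In this minimal sketch, `BatchingIndexer`, its `buffer` list (standing in for the RAMDir), and the `mergeIntoMainIndex` callback (standing in for "optimize the RAMDir, merge it into the FSDir, optimize the FSDir") are all hypothetical names, and documents are plain strings for simplicity. Only the batching logic is illustrated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: buffers documents in memory and merges them in fixed-size batches. */
final class BatchingIndexer {
    private final int batchSize;
    // Assumption: stands in for merging the in-memory index into the on-disk index.
    private final Consumer<List<String>> mergeIntoMainIndex;
    // Assumption: stands in for the RAMDir.
    private final List<String> buffer = new ArrayList<>();

    BatchingIndexer(int batchSize, Consumer<List<String>> mergeIntoMainIndex) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.mergeIntoMainIndex = mergeIntoMainIndex;
    }

    /** Adds one document and merges as soon as the batch is full. */
    void add(String document) {
        buffer.add(document);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Merges whatever is buffered; does nothing if the buffer is empty. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        // Hand over a copy so clearing the buffer does not affect the merged batch.
        mergeIntoMainIndex.accept(new ArrayList<>(buffer));
        buffer.clear();
    }

    /** Number of documents currently held only in memory. */
    int buffered() {
        return buffer.size();
    }
}
```

With a batch size of 10 this matches the "10 documents per merge" figure above. Calling `flush()` once more at the end of a run matters in this sketch, since anything still in the buffer exists only in memory.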

    As far as I have tested, I have never experienced a failure while merging a
    RAMDir into an FSDir, regardless of size, so it's my system's memory that is
    the problem...

    Kind regards, Karl Øie

  • Karl Øie at Apr 26, 2002 at 1:47 pm
    Forgot this:

    It's a bit hard to determine a good balance while indexing XML documents,
    because the internal relations of a DOM can make an XML document nearly 21
    times as big in memory as on disk (I am not lying, I have seen it
    myself)...

    Also, the RAMDir must be kept in memory while indexing and merging, so
    checking the system's free memory is easier than trying to calculate memory
    usage...
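
The memory check described here might look like the following. This is a sketch under assumptions: the class name `MemoryFlushPolicy` and its methods are hypothetical, and the 80% figure comes from the earlier post. Only `Runtime.freeMemory()` and `Runtime.totalMemory()` are real JDK calls. Note that `totalMemory()` reports the heap currently allocated to the JVM, which can grow up to `Runtime.maxMemory()`, so the ratio is relative to the current heap rather than the configured maximum.

```java
/** Hypothetical policy: flush when the used share of the JVM heap exceeds a threshold. */
final class MemoryFlushPolicy {
    private final double threshold;

    /** @param threshold used-heap fraction above which a flush is due, e.g. 0.8 for 80% */
    MemoryFlushPolicy(double threshold) {
        if (threshold <= 0.0 || threshold > 1.0) {
            throw new IllegalArgumentException("threshold must be in (0, 1]");
        }
        this.threshold = threshold;
    }

    /** Fraction of the given total that is in use; kept static and pure so it is easy to test. */
    static double usedFraction(long freeBytes, long totalBytes) {
        if (totalBytes <= 0) {
            throw new IllegalArgumentException("totalBytes must be positive");
        }
        return (double) (totalBytes - freeBytes) / (double) totalBytes;
    }

    /** Decision for explicit figures. */
    boolean shouldFlush(long freeBytes, long totalBytes) {
        return usedFraction(freeBytes, totalBytes) > threshold;
    }

    /** Decision based on the running JVM's current heap figures. */
    boolean shouldFlush() {
        Runtime rt = Runtime.getRuntime();
        return shouldFlush(rt.freeMemory(), rt.totalMemory());
    }
}
```

An indexing loop would call `shouldFlush()` after each added document and merge the in-memory index only when it returns true.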

    Kind regards, Karl Øie



  • Petite_abeille at Apr 26, 2002 at 1:52 pm

    > Also, the RAMDir must be kept in memory while indexing and merging, so
    > checking the system's free memory is easier than trying to calculate
    > memory usage...

    I see... I don't deal with XML, so I guess I have a better grasp of the
    memory requirements of my objects. In any case, I'm afraid I might be
    abusing Lucene a bit, as I'm building a kind of OODBMS on top of it... Oh,
    well...

    Thanks for your help.

    PA.


  • Petite_abeille at Apr 26, 2002 at 1:47 pm

    > Ah, now I see. I have a server with 512 MB of RAM, and I have used two
    > different approaches; both work OK:

    Thanks a lot! I will give it a try...

    PA.


  • Otis Gospodnetic at Apr 26, 2002 at 2:30 pm
    Morning,
    > I'm starting to wonder how "bullet proof" Lucene indexes are. Do they
    > get corrupted easily? If so, is there a way to rebuild them?

    There is no tool for detecting index corruption, repairing an index, or
    rebuilding one. The last of these anyone can (and has to) do on their own.

    > I've started getting the following exception left and right...
    >
    > "04/25 18:34:39 (Warning) Indexer.indexObjectWithValues:
    > java.io.IOException: _91.fnm already exists"

    I've seen people asking about this on the list, but I never encountered
    this particular exception.

    > I built a little app (http://homepage.mac.com/zoe_info/) that uses
    > Lucene quite extensively, and I would like to keep it that way.
    > However, I'm starting to have second thoughts about Lucene's
    > reliability... :-(
    >
    > I'm sure I'm doing something wrong somewhere, but I really cannot see
    > what...

    Maybe it's not a Lucene issue then, although I've seen this mentioned
    so often that the documentation could clearly be improved to prevent
    people from making the same mistakes that others have already made.

    Otis


  • Petite_abeille at Apr 26, 2002 at 3:32 pm
    Hello again,
    > There is no tool for detecting index corruption, repairing an index, or
    > rebuilding one. The last of these anyone can (and has to) do on their own.

    :-( Well, that's *very* sad, to say the least... How do I know whether my
    indexes are corrupted, even if everything seems to be working fine?
    Don't tell me I'm the first one to run into this kind of issue?!? How
    can I "trust" an index if there is *no* way of checking its integrity?
    And even if you happen to notice that something is fishy, there is no
    way to rebuild the index, short of re-indexing everything from scratch?
    That does not sound like a very "healthy" situation to me. "Fragile"
    would be a kind way of describing it...

    > I've seen people asking about this on the list, but I never encountered
    > this particular exception.

    Lucky you...

    > Maybe it's not a Lucene issue then, although I've seen this mentioned
    > so often that the documentation could clearly be improved to prevent
    > people from making the same mistakes that others have already made.

    Maybe, maybe not. And most likely I'm doing something odd. In any case,
    could you point me to the "mistakes that others have already made"? Or
    did I miss something obvious here?

    Thanks.

    PA


  • Otis Gospodnetic at Apr 26, 2002 at 3:51 pm
    Hello,
    > > There is no tool for detecting index corruption, repairing an index, or
    > > rebuilding one. The last of these anyone can (and has to) do on their
    > > own.
    >
    > :-( Well, that's *very* sad, to say the least... How do I know whether my
    > indexes are corrupted, even if everything seems to be working fine?
    > Don't tell me I'm the first one to run into this kind of issue?!? How
    > can I "trust" an index if there is *no* way of checking its integrity?
    > And even if you happen to notice that something is fishy, there is no
    > way to rebuild the index, short of re-indexing everything from scratch?
    > That does not sound like a very "healthy" situation to me. "Fragile"
    > would be a kind way of describing it...

    Yes, that's all unfortunate. If you come up with anything, please
    share it. Or you can use the Lucene Sandbox and develop stuff there.

    > > I've seen people asking about this on the list, but I never
    > > encountered this particular exception.
    >
    > Lucky you...

    :)

    > > Maybe it's not a Lucene issue then, although I've seen this mentioned
    > > so often that the documentation could clearly be improved to prevent
    > > people from making the same mistakes that others have already made.
    >
    > Maybe, maybe not. And most likely I'm doing something odd. In any case,
    > could you point me to the "mistakes that others have already made"? Or
    > did I miss something obvious here?

    Nah, the only thing I can suggest is to check the list archives; that is
    where the mistakes of others would be recorded.

    Otis



Discussion Overview
group: java-user @
categories: lucene
posted: Apr 26, '02 at 11:54a
active: Apr 26, '02 at 3:51p
posts: 12
users: 3
website: lucene.apache.org
