FAQ
I have not been able to find much information about this, hence this
question.

Currently I use Lucene through Compass with the data stored in RAM. The
indexed information is updated daily and therefore I create a new
Compass/Lucene combination every day, let it load the new data and then
swap the active search engine (simply assigning a 'global' variable).
So, all in memory.

I'm not confident what this will do with the memory. In my setupo each
Compass/Lucene engine gets a separate ram store based on a different id
per engine:

...setSetting(CompassEnvironment.CONNECTION, "ram://" + id)

And this seems to be working okay. However:
1. Is this the correct way to have 2 engines running next to each other
(while the second one is loading the new data)?
2. After the engines are swapped, is the RAM store of the now not used
engine automatically cleaned up? If not, how do I?

Thanks for any help!

Tom


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Search Discussions

  • Anshum at Jun 10, 2008 at 9:51 am
    Hi Tom, perhaps this is regarding Compass and you would have wanted to post
    this query there.
    In case you are looking and asking about running 2 instances of lucene on
    the machine with 2 separate indexes in RAM, could you specify :
    * Do you use tmpfs or do you load it using the RamDirectory class?
    * In case you use tmpfs, you'd have to clean it manually after releasing the
    hold of the engine (by stopping or closing the old index reader). The
    cleaning is done straight off by deleting the index or moving the index off
    the tmpfs after closing the index reader. Incase you happen to do it without
    closing the reader, though the dir won't be listed under the directory
    listing but it would still be using all of that space (which you would only
    realise if you run out of space i.e. have space constraints).
    * In the other case, if you are just using the OS cache mechanism or
    something like it, it would not get cleaned up but gradually be cleaned as
    and when the OS needs to allocate more memory (that is how caching works).

    --
    Anshum
    Naukri Labs!
    On Tue, Jun 10, 2008 at 2:53 PM, Tom wrote:

    I have not been able to find much information about this, hence this
    question.

    Currently I use Lucene through Compass with the data stored in RAM. The
    indexed information is updated daily and therefore I create a new
    Compass/Lucene combination every day, let it load the new data and then swap
    the active search engine (simply assigning a 'global' variable). So, all in
    memory.

    I'm not confident what this will do with the memory. In my setupo each
    Compass/Lucene engine gets a separate ram store based on a different id per
    engine:

    ...setSetting(CompassEnvironment.CONNECTION, "ram://" + id)

    And this seems to be working okay. However:
    1. Is this the correct way to have 2 engines running next to each other
    (while the second one is loading the new data)?
    2. After the engines are swapped, is the RAM store of the now not used
    engine automatically cleaned up? If not, how do I?

    Thanks for any help!

    Tom


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org

    --
    --
    The facts expressed here belong to everybody, the opinions to me.
    The distinction is yours to draw............
  • Tom at Jun 10, 2008 at 1:46 pm
    Thanks for reacting Anshum,

    I also posted this on the Compass forum, but I was not sure where the
    responsibility was. But your information did give me a good start. From
    the Compass' RAMDirectoryStore source code: "uses
    org.apache.lucene.store.RAMDirectory"

    So appearantly I use two separate Lucene RamDirectory instances, one for
    each Compass/Lucene engine, which would mean that the data is garbage
    collected. Good!

    I also have the sheer impression that the id I postfix to the "ram://"
    is quite useless. I cannot find any location where that information is
    actually used to configure separate filesystems on a ram drive. As a
    comparison, the FSDirectoryStore has a "configure" method which actually
    uses the part behind the ://, but the RAMDirectoryStore only is created.
    In fact the RAMDirectoryStore does nothing else aside from being
    created; no deleteIndex is implemented, no cleanIndex, no nothing. It
    seems a miracle it works at all ;-) The only thing that is present is a
    delete-files loop in beforeCopyFrom.

    So I think I dare to assume that the memory is garbage collected.

    Thanks again.

    Tom





    Anshum wrote:
    Hi Tom, perhaps this is regarding Compass and you would have wanted to post
    this query there.
    In case you are looking and asking about running 2 instances of lucene on
    the machine with 2 separate indexes in RAM, could you specify :
    * Do you use tmpfs or do you load it using the RamDirectory class?
    * In case you use tmpfs, you'd have to clean it manually after releasing the
    hold of the engine (by stopping or closing the old index reader). The
    cleaning is done straight off by deleting the index or moving the index off
    the tmpfs after closing the index reader. Incase you happen to do it without
    closing the reader, though the dir won't be listed under the directory
    listing but it would still be using all of that space (which you would only
    realise if you run out of space i.e. have space constraints).
    * In the other case, if you are just using the OS cache mechanism or
    something like it, it would not get cleaned up but gradually be cleaned as
    and when the OS needs to allocate more memory (that is how caching works).

    --
    Anshum
    Naukri Labs!

    On Tue, Jun 10, 2008 at 2:53 PM, Tom wrote:

    I have not been able to find much information about this, hence this
    question.

    Currently I use Lucene through Compass with the data stored in RAM. The
    indexed information is updated daily and therefore I create a new
    Compass/Lucene combination every day, let it load the new data and then swap
    the active search engine (simply assigning a 'global' variable). So, all in
    memory.

    I'm not confident what this will do with the memory. In my setupo each
    Compass/Lucene engine gets a separate ram store based on a different id per
    engine:

    ...setSetting(CompassEnvironment.CONNECTION, "ram://" + id)

    And this seems to be working okay. However:
    1. Is this the correct way to have 2 engines running next to each other
    (while the second one is loading the new data)?
    2. After the engines are swapped, is the RAM store of the now not used
    engine automatically cleaned up? If not, how do I?

    Thanks for any help!

    Tom


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org

  • Anshum at Jun 11, 2008 at 4:10 am
    Hi Tom,

    Just a suggestion, why don't you try explicitly using tmpfs. (considering
    you use a unix/linux server. I tried both, RAMDirectory implementation and
    tmpfs and found the latter more appealing. Firstly, in this case you'll have
    more control of cleaning up/retaining as per your needs. Also, using
    RAMDirectory depends on the architecture i.e. In case you run a 32 bit
    machine with a 32 bit OS and a 32 bit JVM, you wouldnt be able to allocate
    more than 2 Gigs of memory to any process(in this case JVM) and hence the
    total memory available for utilization by the program (includes both, the
    index as well as program size) would be restricted by this limit.
    In case of tmpfs, there is no such limitation. All you need to do is mount a
    tmpfs type partion (which uses your ram, and the total RAM available is the
    only limit here) and copy the index to that mount. The just open readers to
    read from that partition.

    All that I said above was just a suggestion, so you might want to try.
    As about automatic GC, you could also look at the options available with
    JVM. :)

    Hope this helps.

    --
    Anshum Gupta
    Naukri Labs
    On Tue, Jun 10, 2008 at 7:15 PM, Tom wrote:

    Thanks for reacting Anshum,

    I also posted this on the Compass forum, but I was not sure where the
    responsibility was. But your information did give me a good start. From the
    Compass' RAMDirectoryStore source code: "uses
    org.apache.lucene.store.RAMDirectory"

    So appearantly I use two separate Lucene RamDirectory instances, one for
    each Compass/Lucene engine, which would mean that the data is garbage
    collected. Good!

    I also have the sheer impression that the id I postfix to the "ram://" is
    quite useless. I cannot find any location where that information is actually
    used to configure separate filesystems on a ram drive. As a comparison, the
    FSDirectoryStore has a "configure" method which actually uses the part
    behind the ://, but the RAMDirectoryStore only is created. In fact the
    RAMDirectoryStore does nothing else aside from being created; no deleteIndex
    is implemented, no cleanIndex, no nothing. It seems a miracle it works at
    all ;-) The only thing that is present is a delete-files loop in
    beforeCopyFrom.

    So I think I dare to assume that the memory is garbage collected.

    Thanks again.

    Tom






    Anshum wrote:
    Hi Tom, perhaps this is regarding Compass and you would have wanted to
    post
    this query there.
    In case you are looking and asking about running 2 instances of lucene on
    the machine with 2 separate indexes in RAM, could you specify :
    * Do you use tmpfs or do you load it using the RamDirectory class?
    * In case you use tmpfs, you'd have to clean it manually after releasing
    the
    hold of the engine (by stopping or closing the old index reader). The
    cleaning is done straight off by deleting the index or moving the index
    off
    the tmpfs after closing the index reader. Incase you happen to do it
    without
    closing the reader, though the dir won't be listed under the directory
    listing but it would still be using all of that space (which you would
    only
    realise if you run out of space i.e. have space constraints).
    * In the other case, if you are just using the OS cache mechanism or
    something like it, it would not get cleaned up but gradually be cleaned as
    and when the OS needs to allocate more memory (that is how caching works).

    --
    Anshum
    Naukri Labs!

    On Tue, Jun 10, 2008 at 2:53 PM, Tom wrote:


    I have not been able to find much information about this, hence this
    question.

    Currently I use Lucene through Compass with the data stored in RAM. The
    indexed information is updated daily and therefore I create a new
    Compass/Lucene combination every day, let it load the new data and then
    swap
    the active search engine (simply assigning a 'global' variable). So, all
    in
    memory.

    I'm not confident what this will do with the memory. In my setupo each
    Compass/Lucene engine gets a separate ram store based on a different id
    per
    engine:

    ...setSetting(CompassEnvironment.CONNECTION, "ram://" + id)

    And this seems to be working okay. However:
    1. Is this the correct way to have 2 engines running next to each other
    (while the second one is loading the new data)?
    2. After the engines are swapped, is the RAM store of the now not used
    engine automatically cleaned up? If not, how do I?

    Thanks for any help!

    Tom


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
    For additional commands, e-mail: java-user-help@lucene.apache.org




    --
    --
    The facts expressed here belong to everybody, the opinions to me.
    The distinction is yours to draw............

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupjava-user @
categorieslucene
postedJun 10, '08 at 9:24a
activeJun 11, '08 at 4:10a
posts4
users2
websitelucene.apache.org

2 users in discussion

Anshum: 2 posts Tom: 2 posts

People

Translate

site design / logo © 2022 Grokbase