Hello all,

I know a little about scale-out design, i.e. sharding the database across systems. Has anyone in this group tried a scale-up architecture? I think that to scale up we need to use 64-bit. How does Lucene perform on 64-bit? Could we use the full 8 GB of RAM?

Could anyone share their thoughts on this?

Regards
Ganesh

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

  • Pradeep Singh at Dec 13, 2010 at 6:47 am
    8GB is used on laptops. For servers you need more.
  • Ganesh at Dec 13, 2010 at 7:01 am
    Have you tried using Lucene on 64-bit with more than 8 GB of RAM?

    Regards
    Ganesh

  • Danil ŢORIN at Dec 13, 2010 at 8:16 am
    GC times on large heaps are pretty painful right now (I haven't tried
    the G1 collector; knowledgeable people, please advise).

    It also depends heavily on your index and query pattern, so you could
    improve it by using some -XX magic.

    My recommendation is to scale horizontally (split the index into shards);
    this way you'll be able to scale much more easily than by moving to an
    even beefier server.
    Initially, if your server is big enough, you may host all your shards
    on it, just in separate JVMs.

    If you are thinking of BIG indexes, you probably don't want to lose
    them, so you must also think about replication, standbys and so on.
    In my experience the overall cost (for the same availability) is lower
    when you use many smaller servers rather than a few large ones.
  • William Newport at Dec 13, 2010 at 11:33 am
    I've used 30-35gb heaps and it is painful.

  • Erick Erickson at Dec 13, 2010 at 1:30 pm
    Here's a great intro to the garbage collection options:
    http://www.lucidimagination.com/blog/2009/09/19/java-garbage-collection-boot-camp-draft/

    @Ganesh:
    The issue with 32-bit isn't really performance, it's that you can't allocate
    much of your memory to the JVM. So by definition your performance will tank
    much earlier with a 32-bit JVM, no matter what physical hardware you're
    running on, because you're constrained by how much of the physical memory
    you *can* use.

    Best
    Erick
  • Ganesh at Dec 15, 2010 at 8:42 am
    What is the advantage of going to 64-bit? People claim better performance and the ability to use more RAM.

    On a 32-bit OS, a JVM can handle 1 to 1.5 GB of RAM. On 64-bit, is a single JVM still unable to use more than 1.5 GB of RAM? What if we host multiple JVM instances on a single system?

    Please help me with some more ideas. We need to decide whether to scale out or scale up.

    Regards
    Ganesh



  • Toke Eskildsen at Dec 15, 2010 at 11:00 am

    On Wed, 2010-12-15 at 09:42 +0100, Ganesh wrote:
    > What is the advantage of going for 64 Bit.

    Larger maximum heap, more memory in the machine.

    > People claim performance and usage of more RAM.

    Yes, pointers normally take up 64bit on a 64bit machine. Depending on
    the application, the overhead can be anything from practically
    non-existing to close to 100%. You can set an option for the JVM to try
    and use smaller pointers on 64bit machines. This limits the maximum
    memory allocation in the JVM to 32GB, which seems like a fair compromise
    at this point in time.
    http://wikis.sun.com/display/HotSpotInternals/CompressedOops

    > In 32 Bit OS, JVM handles 1 to 1.5 GB of RAM then in case
    > of 64 Bit, Single JVM cannot use more than 1.5 GB RAM?

    Say what? When running on a 64bit, the JVM heap limit is normally the
    system's per-process memory limit. For Linux this is generally well
    above any real world hardware. For Windows it seems like you need to
    enable something:
    http://msdn.microsoft.com/en-us/library/aa366778%28v=vs.85%29.aspx
    (note: I have no experience with 64bit Windows)

    > Please help me with some more ideas. We need to design whether
    > to scale out or scale up.

    Maybe you could describe your vision in more detail? What scale are you
    looking at? How large is your index in GB, how many documents, how fast
    do you need the searcher to respond, are you doing any sorting or
    faceting (and do you facet on a few unique values or things like title
    or author)?

    As a general rule of thumb, it makes little sense to try to get a single
    machine to handle billions of documents with large faceting, but it seems
    silly to distribute 10GB of index with 1 million documents. As always,
    your mileage may vary.

    For the record, our current index is 40GB/9 million records. We're doing
    sorting on title and faceting on 15 fields, of which 2 have 4-6
    million unique values. This runs on a single machine (okay, 2, but they
    are mirrored) with 6GB of RAM and it works fine with sub-second response
    times (normally <300ms AFAIR). Our experimental setup can get by with
    1.2GB and would thus not require 64bit.
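
The 32GB ceiling mentioned above for smaller pointers can be sanity-checked with a little arithmetic. The sketch below assumes 32-bit compressed references and HotSpot's default 8-byte object alignment; the constants and the function name are illustrative assumptions, not something stated in the thread.

```python
# Illustrative arithmetic only, not a JVM API.
# Assumption: compressed references are 32 bits wide and each one addresses
# an object aligned on an 8-byte boundary (the HotSpot default alignment).
REFERENCE_BITS = 32
OBJECT_ALIGNMENT_BYTES = 8
GIB = 1024 ** 3


def max_compressed_heap_bytes(reference_bits=REFERENCE_BITS,
                              alignment_bytes=OBJECT_ALIGNMENT_BYTES):
    """Largest heap addressable with scaled (compressed) references."""
    return (2 ** reference_bits) * alignment_bytes


if __name__ == "__main__":
    print(max_compressed_heap_bytes() // GIB)  # 32
```

Under these assumptions, a heap that stays below roughly 32GB can keep the smaller pointers; a larger heap falls back to full 64-bit references and pays the pointer overhead described in the post.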


  • Ganesh at Dec 16, 2010 at 6:00 am
    Thanks for your information.

    My current stats:
    250 GB of data, 40 GB of index, and 60 million records are working fine with 1 GB of RAM. We store a minimal amount of data in the index and sort on date. Even on a single system, the database is sharded.

    We are planning to build a hosted solution. These stats will increase at least 10 times in 2 - 3 years. I plan to use 64-bit, with 8 - 10 GB of RAM allocated to the JVM. Will I face any issues? Scaling up is attractive because backup and maintenance would be easy. If I scale out, I may require at least 5 systems.

    Any thoughts?

    Regards
    Ganesh


  • Toke Eskildsen at Dec 16, 2010 at 10:44 am

    On Thu, 2010-12-16 at 06:59 +0100, Ganesh wrote:
    > 250 GB of data, 40 GB of Index Size, 60 million records is
    > working fine with 1 GB RAM. We are storing minmal amount
    > of data in index. We are doing sorting on Date. Even in
    > single system, the database are shard.

    Looking back in the list, I see that you're sharding on weeks with 50+
    weeks in the index.

    > build hosted solution. This stats will
    > increase by minimum 10 times in 2 - 3 years. I plan to use
    > 64 Bit, with 8 - 10 GB RAM allocated to JVM.

    When making a conservative estimate and multiplying with 10, you must
    remember to do the same for the system memory available for disk cache.

    If your shards are searched sequentially, you could measure the response
    time for a single shard (after warm-up and with different queries), then
    create a test shard by merging 10 shards and measure the response time for
    that. Subtracting the two numbers (to remove the overhead of the
    front-end layer) and multiplying by 50 should give you a rough
    estimate of the performance of an upscaled setup.

    Another measurement suggestion: divide the current performance of the
    full setup by the performance of a single shard, then multiply the
    performance of a single shard created by merging 10 shards by that number.

    Regards,
    Toke Eskildsen
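
The two back-of-the-envelope estimates described in this post can be written down as a short sketch. The function names and the sample timings are hypothetical, chosen only to illustrate the arithmetic; the shard count of 50 comes from the "50+ weeks" remark above.

```python
# A rough sketch of the two estimates from the post; all timings are made up.

def estimate_by_difference(single_shard_ms, merged_shard_ms, shard_count=50):
    """First suggestion: subtract the single-shard time from the time of a
    shard built by merging 10 shards (removing front-end overhead), then
    multiply by the number of shards."""
    return (merged_shard_ms - single_shard_ms) * shard_count


def estimate_by_ratio(full_setup_ms, single_shard_ms, merged_shard_ms):
    """Second suggestion: scale the merged-shard time by the ratio of the
    current full setup to a single shard."""
    return (full_setup_ms / single_shard_ms) * merged_shard_ms


if __name__ == "__main__":
    # Hypothetical measurements, in milliseconds.
    single, merged, full = 20, 35, 400
    print(estimate_by_difference(single, merged))   # 750
    print(estimate_by_ratio(full, single, merged))  # 700.0
```

Both numbers are only rough guides. As the post notes, the disk cache has to grow along with the index for either estimate to hold.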


  • Ganesh at Dec 20, 2010 at 7:40 am
    I have done some benchmarking, and based on that I estimate the RAM requirement at 3 - 4 GB. My question is whether to go for 64-bit or scale out with 3 systems.

    Regards
    Ganesh

  • Simon Willnauer at Dec 20, 2010 at 8:41 am

    On Mon, Dec 20, 2010 at 8:39 AM, Ganesh wrote:
    > I have done some benchmarking and based on that my estimate of RAM requirement would be 3 - 4 GB. My question is to go for 64 bit or scale out with 3 systems?

    What keeps you from moving to 64-bit? If you have those RAM requirements
    for the Java heap, I don't see much of a choice. You could try some of the
    address space extensions that are around, but I don't think it's worth
    it. Any reason why you hesitate?

    simon
  • Ganesh at Dec 22, 2010 at 5:02 am
    Hello Simon,

    I am not hesitant to move to 64-bit. I need a suggestion on whether to move to 64-bit (scale up) or scale out with multiple systems. I have started investigating 64-bit, and I want to know about its performance and whether anyone in this group has already tried it.

    Regards
    Ganesh

    ----- Original Message -----
    From: "Simon Willnauer" <simon.willnauer@googlemail.com>
    To: <java-user@lucene.apache.org>
    Sent: Monday, December 20, 2010 2:11 PM
    Subject: Re: Re: Scale up design

    On Mon, Dec 20, 2010 at 8:39 AM, Ganesh wrote:
    I have done some benchmarking and based on that my estimate of RAM requirement would be 3 - 4 GB. My question is to go for 64 bit or scale out with 3 systems?
    What keeps you from moving to 64bit, I mean if you have those RAM req.
    for JAVA HeapSpace I don't see much of a choice. You could try some
    address space extensions which are around but I don't think its worth
    it. Any reason why you hesitate?

    simon
    Regards
    Ganesh

    ----- Original Message -----
    From: "Toke Eskildsen" <te@statsbiblioteket.dk>
    To: <java-user@lucene.apache.org>
    Sent: Thursday, December 16, 2010 4:20 PM
    Subject: Re: Re: Scale up design

    On Thu, 2010-12-16 at 06:59 +0100, Ganesh wrote:
    250 GB of data, 40 GB of Index Size, 60 million records is
    working fine with 1 GB RAM. We are storing minmal amount
    of data in index. We are doing sorting on Date. Even in
    single system, the database are shard.
    Looking back in the list, I see that you're sharding on weeks with 50+
    weeks in the index.
    build hosted solution. This stats will
    increase by minimum 10 times in 2 - 3 years. I plan to use
    64 Bit, with 8 - 10 GB RAM allocated to JVM.
    When making a conservative estimate and multiplying with 10, you must
    remember to do the same for the system memory available for disk cache.

    If your shards are searched sequentially, you could measure the response
    time for a single shard (after warm up and with different queries), then
    create a test-shard by merging 10 shards and measure response-time for
    that. Subtracting the two numbers (to remove the overhead of the
    front-end layer) and multiplying with 50 should give you a rough
    estimate for the performance of an upscaled setup.

    Another measurement suggestion: Divide the current performance of the
    full setup with the performance of a single shard, then multiply the
    performance of a single created by merging 10 shards with that number.

    Regards,
    Toke Eskildsen


  • Danil ŢORIN at Dec 22, 2010 at 7:14 am
    There are no noticeable performance gains or losses when moving to 64 bit,
    assuming it is exactly the same hardware (just a 64-bit OS), the same index and
    a reasonable amount of Java heap
    (keep in mind that if you had 2 GB on 32 bit you'll need almost 3 GB on
    64 bit due to the larger pointer representation).

    But once your index grows, the heap grows, and GC pauses might be quite
    unpredictable on such large heaps.
    That's why most people will recommend scaling out instead of scaling up
    for 10-fold growth.

    But it depends a lot on how your index is structured, what kind of
    queries you run, how often you update your index, and so on.

    YMMV...the only way to find out what works for you is to try it.
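
    If you do try it, it helps to look at how much time the JVM spends in garbage collection while your test queries run. Here is a minimal sketch using the standard java.lang.management API. The class name is only illustrative, and the workload placeholder in main is an assumption you would replace with your own search loop.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcTimeReport {

    /** Sums the accumulated collection time (ms) over all collectors. */
    public static long totalGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long timeMs = gc.getCollectionTime(); // -1 if the collector does not report it
            if (timeMs > 0) {
                total += timeMs;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalGcTimeMs();

        // Placeholder: run your own search workload here.

        long after = totalGcTimeMs();
        System.out.println("GC time during workload: " + (after - before) + " ms");

        // Per-collector totals since JVM start.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
    }
}
```

    This only gives accumulated totals, not the length of individual pauses. For those, running the JVM with -verbose:gc and reading the log is the simpler route.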
    On Wed, Dec 22, 2010 at 07:01, Ganesh wrote:
    Hello Simon,

    I don't hesitate to move to 64 bit. I would like a suggestion on whether to move to 64 bit (scale up) or scale out with multiple systems. I have started investigating 64 bit, and I want to know about its performance and whether anyone in this group has already tried using it.

    Regards
    Ganesh

  • Ganesh at Dec 22, 2010 at 7:38 am
    Thanks. I am going to try it on 64 bit. I will post an update in a day or two.

    Do I need to compile the Lucene and analyzer code with a 64-bit JVM?
    Do I need to use MMapDirectory on 64 bit?

    Any other tips for 64 bit?

    Regards
    Ganesh

  • Findbestopensource at Dec 22, 2010 at 9:49 am
    Do I need to compile the Lucene and analyzer code in 64 bit JVM?
    You don't need to recompile. Java bytecode is platform-independent, so just run your existing jars on a 64-bit JVM on a 64-bit OS.
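
    To confirm that your jars are really running on a 64-bit JVM with the heap you expect, something like the following sketch can help. The class name is only illustrative, and the "sun.arch.data.model" property is HotSpot-specific, so it may be missing on other JVMs.

```java
public class JvmInfo {

    /** Returns "32" or "64" on HotSpot JVMs, or "unknown" if the property is absent. */
    public static String dataModel() {
        // HotSpot-specific system property; other JVMs may not define it.
        return System.getProperty("sun.arch.data.model", "unknown");
    }

    /** Maximum heap the JVM will attempt to use, in megabytes. */
    public static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("os.arch    : " + System.getProperty("os.arch"));
        System.out.println("data model : " + dataModel());
        System.out.println("max heap   : " + maxHeapMb() + " MB");
    }
}
```

    Run it with the same -Xmx setting you plan to use for the search application and check that the reported maximum heap matches.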

    Regards
    Aditya
    www.findbestopensource.com


  • Steven A Rowe at Dec 22, 2010 at 4:37 pm

    On 12/22/2010 at 2:38 AM, Ganesh wrote:
    Any other tips targeting 64 bit?
    If memory usage is an issue, you might consider using HotSpot's "compressed oops" option:

    <http://wikis.sun.com/display/HotSpotInternals/CompressedOops>
    <http://blog.juma.me.uk/2008/10/14/32-bit-or-64-bit-jvm-how-about-a-hybrid/>

    Benson Margulies has written that the memory savings from using "compressed oops" aren't necessarily free; the option can impact performance:

    <http://lists.apple.com/archives/java-dev/2010/Apr/msg00157.html>
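
    The option is switched on with the HotSpot flag -XX:+UseCompressedOops. To check whether a running JVM actually has it enabled, a sketch like the one below may help. It assumes a HotSpot/OpenJDK JVM on Java 7 or later (for getPlatformMXBean), and the class name is only illustrative.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class CompressedOopsCheck {

    /** Returns "true", "false", or "unknown" if the option cannot be read. */
    public static String compressedOopsSetting() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unknown";
            }
            // Throws IllegalArgumentException if the JVM does not know the option,
            // for example on a 32-bit JVM.
            return bean.getVMOption("UseCompressedOops").getValue();
        } catch (RuntimeException e) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("UseCompressedOops = " + compressedOopsSetting());
    }
}
```

    Comparing heap usage and query latency with the flag on and off, on your own index, is the only reliable way to see whether the trade-off mentioned above matters for you.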

    Steve
  • Whtiandike at May 21, 2011 at 3:28 pm
    jhnyt

    Sent from my iPad


Discussion Overview
group: java-user
categories: lucene
posted: Dec 13, '10 at 6:26a
active: May 21, '11 at 3:28p
posts: 18
users: 10
website: lucene.apache.org
