FAQ
Hello all,

Could anyone guide me on the various ways we could scale out?

1. Index: Add data to the nodes in round-robin fashion.
Search: Query all the nodes and cluster the results using Carrot2.

2. Horizontal partitioning with a shared-nothing architecture.
Index: Split the data by userid and index a subset of users' data on each node.
Search: Have a mapper application that knows which node each userid is mapped to, and redirect the search traffic to the corresponding node.

Which one is best? Have any of you tried either of these approaches? Please share your thoughts.

Regards
Ganesh
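
The "mapper" in option 2 could look roughly like the sketch below. It is only an illustration: the class name UserNodeDirectory, the node names, and the choice to place unseen users round-robin are assumptions, not something the post specifies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical mapper for option 2: remembers which node holds each user's data. */
final class UserNodeDirectory {
    private final List<String> nodes;                          // node names are placeholders
    private final Map<Long, String> userToNode = new HashMap<>();
    private int next = 0;                                      // round-robin cursor for unseen users

    UserNodeDirectory(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("need at least one node");
        }
        this.nodes = new ArrayList<>(nodes);
    }

    /** Returns the node for a user; an unseen user is assigned to the next node in turn. */
    synchronized String nodeFor(long userId) {
        return userToNode.computeIfAbsent(userId, id -> {
            String node = nodes.get(next);
            next = (next + 1) % nodes.size();
            return node;
        });
    }

    public static void main(String[] args) {
        UserNodeDirectory dir = new UserNodeDirectory(Arrays.asList("node-a", "node-b"));
        System.out.println(dir.nodeFor(42L)); // first unseen user
        System.out.println(dir.nodeFor(7L));  // second unseen user
        System.out.println(dir.nodeFor(42L)); // same node as the first call
    }
}
```

Both indexing and search traffic for a user would go to the node returned by nodeFor, so a search touches one node instead of all of them as in option 1.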

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


  • Anshum at Jan 21, 2011 at 6:35 am
    Hi Ganesh,
    I'd suggest that if you have a particular dimension/field on which you can
    shard your data so that the query/data split is predictable, that would
    be a good way to scale out. For example, if your users are equally
    active/searched, you may want to split their data on a simple mod of
    some numeric (auto-increment) userid.
    This works well in normal cases, as long as your partitioning is
    predictable.

    --
    Anshum Gupta
    http://ai-cafe.blogspot.com
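
A minimal sketch of the "simple mod of a numeric userid" idea, assuming a fixed number of shards. ModShardRouter and shardFor are made-up names for illustration.

```java
/** Sketch of mod-based sharding on a numeric user id. */
final class ModShardRouter {
    private final int shardCount;

    ModShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    /** floorMod keeps the result in [0, shardCount) even for negative ids. */
    int shardFor(long userId) {
        return (int) Math.floorMod(userId, (long) shardCount);
    }

    public static void main(String[] args) {
        ModShardRouter router = new ModShardRouter(4);
        for (long id = 1; id <= 8; id++) {
            System.out.println("user " + id + " -> shard " + router.shardFor(id));
        }
    }
}
```

One trade-off of this design: the mapping depends on the shard count, so changing the number of shards later reassigns many existing ids to a different shard.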

  • Ganesh at Feb 4, 2011 at 4:55 am
    I have the same idea. I could shard based on a field, but there are two practical difficulties.

    1. If a normal user logs in, the results can be fetched from the corresponding search server. But if an admin user logs in, he may need to see all data, so the query has to be issued across all servers and the results consolidated.

    2. Consider a scenario where I am sharding by user. I have a single search server handling 1000 members. As memory consumption is high, I add one more search server. New users can use the second server, but what about the old users? Their data will still be added to server1. How do I address this? Is rebuilding the index the only way?

    Could anyone share their experience of how they solved scale-out problems?

    Regards
    Ganesh
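
For the admin case in point 1, the consolidation step might be sketched as follows. It assumes each search server returns a list of scored hits. The Hit record and the mergeTopN helper are hypothetical and not part of any Lucene API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Sketch: consolidate per-server hit lists into one global top-N list. */
final class ShardResultMerger {
    /** Hypothetical hit record: a document id plus the score its server assigned. */
    static final class Hit {
        final String docId;
        final float score;

        Hit(String docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /** Merges the hit lists returned by each server and keeps the best topN by score. */
    static List<Hit> mergeTopN(List<List<Hit>> perServerHits, int topN) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perServerHits) {
            all.addAll(hits);
        }
        // Highest score first.
        all.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(all.subList(0, Math.min(topN, all.size())));
    }

    public static void main(String[] args) {
        List<Hit> server1 = Arrays.asList(new Hit("s1-doc1", 0.9f), new Hit("s1-doc2", 0.2f));
        List<Hit> server2 = Arrays.asList(new Hit("s2-doc1", 0.5f));
        for (Hit h : mergeTopN(Arrays.asList(server1, server2), 2)) {
            System.out.println(h.docId + " " + h.score);
        }
    }
}
```

Scores computed against separate indexes are not strictly comparable, because term statistics differ from index to index, so a merge like this is an approximation unless the servers share those statistics.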


    ----- Original Message -----
    From: "Anshum" <anshumg@gmail.com>
    To: <java-user@lucene.apache.org>
    Sent: Friday, January 21, 2011 12:04 PM
    Subject: Re: Scale out design patterns

  • Toke Eskildsen at Feb 4, 2011 at 6:25 am

    On Fri, 2011-02-04 at 05:54 +0100, Ganesh wrote:
    > 2. Consider a scenario where I am sharding by user. I have a single search server handling 1000 members. As memory consumption is high, I add one more search server. New users can use the second server, but what about the old users? Their data will still be added to server1. How do I address this? Is rebuilding the index the only way?

    You can move old users by reindexing their data on the new server and
    deleting it from the old one. That is only a partial modification, not
    a full rebuild.

    If you are about to move a whole lot of users, you can copy the old
    index, delete all documents from the copy, except the ones that are to
    be moved, then merge the pruned index with the new one.
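
The copy, prune, and merge steps can be sketched on plain in-memory lists, as below. This is a simulation, not Lucene code: Doc, prunedCopy, and moveUsers are hypothetical names, and the final removal from the old index is carried over from the reindexing suggestion above. In Lucene the steps would likely map onto IndexWriter.deleteDocuments and IndexWriter.addIndexes, which should be checked against the version in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** In-memory simulation of moving a batch of users from an old index to a new one. */
final class ShardSplitSketch {
    /** Hypothetical document: the owning user plus some indexed text. */
    static final class Doc {
        final long userId;
        final String text;

        Doc(long userId, String text) {
            this.userId = userId;
            this.text = text;
        }
    }

    /** Copies the old index, then deletes every document not owned by a moved user. */
    static List<Doc> prunedCopy(List<Doc> oldIndex, Set<Long> movedUsers) {
        List<Doc> copy = new ArrayList<>(oldIndex);          // step 1: copy the old index
        copy.removeIf(d -> !movedUsers.contains(d.userId));  // step 2: prune the copy
        return copy;
    }

    /** Merges the pruned copy into the new index and drops the moved users from the old one. */
    static void moveUsers(List<Doc> oldIndex, List<Doc> newIndex, Set<Long> movedUsers) {
        newIndex.addAll(prunedCopy(oldIndex, movedUsers));     // step 3: merge into the new index
        oldIndex.removeIf(d -> movedUsers.contains(d.userId)); // then remove them from the old index
    }

    public static void main(String[] args) {
        List<Doc> oldIndex = new ArrayList<>(Arrays.asList(
                new Doc(1L, "mail from user 1"),
                new Doc(2L, "mail from user 2"),
                new Doc(3L, "mail from user 3")));
        List<Doc> newIndex = new ArrayList<>();
        moveUsers(oldIndex, newIndex, new HashSet<>(Arrays.asList(2L, 3L)));
        System.out.println("old index size: " + oldIndex.size());
        System.out.println("new index size: " + newIndex.size());
    }
}
```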



Discussion Overview
group: java-user
categories: lucene
posted: Jan 21, '11 at 5:22a
active: Feb 4, '11 at 6:25a
posts: 4
users: 3
website: lucene.apache.org

3 users in discussion

Ganesh: 2 posts Toke Eskildsen: 1 post Anshum: 1 post
