Hi all,
How are you?

I am new to HBase, and all I have been doing for the past week is reading
the information that exists on the web.
My goal is to master HBase from the system administrator's point of view and
to set up a Cloudera HBase cluster relying on HDFS storage (production on
Amazon Web Services).
Hadoop HDFS is running on EC2 Large instances (2 processing units and 7.5 GB
RAM each, 3 data nodes).
I am about to have 4 tables in HBase, and I was wondering what the best
practice is for my situation.
How many HRegionServers should I use? Will Large Amazon EC2 instances
be enough?

I am also confused about the -ROOT- and .META. regions, and about
the process of a client approaching HBase.
Where are these regions stored? How are they structured (rows,
columns)?
A client first approaches ZooKeeper and asks for the -ROOT- region
location? What happens next?
Please elaborate as much as you can.

Thanks and Best Regards,
*Ronen Itkin*

<http://www.taykey.com/>

  • Doug Meil at Sep 13, 2011 at 12:24 pm
    Hi there-

    Regarding EC2, see this in the HBase book:

    http://hbase.apache.org/book.html#trouble.ec2

    Regarding ROOT/META, see this in the HBase book:

    http://hbase.apache.org/book.html#arch.catalog
  • Ronen Itkin at Sep 13, 2011 at 3:24 pm
    Hi,

    Thanks for the answer!
    Another question: what should I take into account if I decide to run
    HRegionServers on separate servers and not on the HDFS datanodes?

    Thanks!


  • Buttler, David at Sep 13, 2011 at 9:40 pm
    If you do that, all data access will go over the network. Amazon's internal network is very busy, and you might see a lot of delays in processing data. This would be partially alleviated if you could run enough region servers to keep your entire table in memory in the block cache, but that is not a typical scenario, and it will not help at all with writes, as they must be flushed to disk (in the WAL) before the writes complete.

    Dave

    -----Original Message-----
    From: Ronen Itkin
    Sent: Tuesday, September 13, 2011 8:24 AM
    To: user@hbase.apache.org
    Subject: Re: HBase best practice and Regions confusion

    Hi,

    Thanks for the answer!
    Another question: what should I take into account if I decide to run
    HRegionServers on separate servers and not on the HDFS datanodes?

    Thanks!


  • Eric Charles at Sep 13, 2011 at 4:16 pm

    On 13/09/11 05:26, Doug Meil wrote:
    > Hi there-
    >
    > Regarding EC2, see this in the HBase book...
    >
    > http://hbase.apache.org/book.html#trouble.ec2

    By the way, there is also the Whirr project (http://whirr.apache.org/),
    which makes it possible to deploy HBase on Amazon without trouble.

    Would it make sense to add a section in the book for this? I can submit
    a patch.

    > Regarding ROOT/META, see this in the HBase book
    >
    > http://hbase.apache.org/book.html#arch.catalog

    http://ofps.oreilly.com/titles/9781449396107/adminapi.html could also help.

    From my understanding, -ROOT- and .META. are system tables, although
    they are persisted just like any other user table (via
    store/memstore/hfile). You can even update them at your own risk.

    The client goes via ZooKeeper to find -ROOT- and uses -ROOT- to find
    the location of the appropriate region of .META. Finally, .META. is used
    to find the location of the user-space region of the target table.

    So .META. can span multiple regions; this is foreseen in the process. What
    I'm not sure about is whether -ROOT- can span multiple regions (I still
    have to look in the code). If that is the case, ZooKeeper should have
    multiple entries. I guess the expected size of -ROOT- is not that large,
    so in most cases it can reside in one region?

    Thx.




    --
    Eric
    http://about.echarles.net
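
A rough way to put numbers on David Buttler's block cache remark: the sketch below estimates how many region servers it would take for the combined block cache to hold a whole table. The function name, the 50 GB table, the 4 GB heap, and the 25% cache fraction are made-up illustration values, not figures from this thread; in HBase the share of heap given to the block cache is set by `hfile.block.cache.size`. The estimate ignores cache overhead and uneven region distribution, so it is a lower bound at best, and as noted in the reply it says nothing about write latency.

```python
import math


def region_servers_for_cached_table(table_gb, heap_gb_per_server, block_cache_fraction):
    """Estimate how many region servers are needed so that their combined
    block cache is at least as large as the table.

    All three inputs are assumptions supplied by the caller:
      table_gb             -- size of the table data to keep cached
      heap_gb_per_server   -- JVM heap of one region server
      block_cache_fraction -- share of the heap used for the block cache
    """
    if table_gb < 0:
        raise ValueError("table_gb must not be negative")
    cache_per_server = heap_gb_per_server * block_cache_fraction
    if cache_per_server <= 0:
        raise ValueError("block cache per server must be positive")
    # Round up: a partially filled server is still a whole server.
    return math.ceil(table_gb / cache_per_server)


if __name__ == "__main__":
    # Hypothetical numbers: 50 GB table, 4 GB heap, 25% of heap for the cache.
    print(region_servers_for_cached_table(50, 4, 0.25))  # 50
```

With these hypothetical inputs each server caches about 1 GB, so 50 servers would be needed, which illustrates why keeping an entire table in the block cache is not a typical scenario.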

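The lookup chain Eric Charles describes (ZooKeeper, then -ROOT-, then .META., then the user region) can be sketched as a toy model. This is not HBase client code: the server names, the two hypothetical .META. regions, the table names, and the helper functions are invented for illustration. Catalog row keys are simplified to (table, start row) pairs, and -ROOT- is assumed to be a single region, in line with Eric's guess.

```python
from bisect import bisect_right


def find_region(entries, key):
    """Return the (start_key, value) entry whose region contains `key`.

    `entries` must be sorted by start key. A region covers every key from
    its start key up to (but not including) the next region's start key.
    """
    start_keys = [start for start, _ in entries]
    idx = bisect_right(start_keys, key) - 1
    if idx < 0:
        raise KeyError(key)
    return entries[idx]


# Step 1 data: ZooKeeper only knows which server hosts -ROOT-.
ZOOKEEPER = {"/hbase/root-region-server": "rs1"}

# Step 2 data: -ROOT- maps .META. region start keys to .META. regions.
ROOT = [
    (("", ""), "meta-region-a"),
    (("orders", ""), "meta-region-b"),
]

# Step 3 data: each .META. region maps user region start keys to servers.
META = {
    "meta-region-a": [(("events", ""), "rs2"), (("events", "m"), "rs3")],
    "meta-region-b": [
        (("orders", ""), "rs2"),
        (("users", ""), "rs1"),
        (("users", "n"), "rs3"),
    ],
}


def locate(table, row):
    """Follow ZooKeeper -> -ROOT- -> .META. -> user region for (table, row)."""
    root_server = ZOOKEEPER["/hbase/root-region-server"]
    _, meta_region = find_region(ROOT, (table, row))
    (region_table, _), server = find_region(META[meta_region], (table, row))
    if region_table != table:
        # The nearest preceding region belongs to a different table.
        raise KeyError(f"unknown table: {table}")
    return root_server, meta_region, server


if __name__ == "__main__":
    print(locate("users", "bob"))  # ('rs1', 'meta-region-b', 'rs1')
```

Each step is a "last start key not greater than the requested key" search, which is why a sorted structure with binary search is used here. A real client would also cache the locations it finds so that most requests skip the catalog lookups entirely.
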
Discussion Overview
group: user @
categories: hbase, hadoop
posted: Sep 13, '11 at 10:16a
active: Sep 13, '11 at 9:40p
posts: 5
users: 4
website: hbase.apache.org
