Dear Philip and all,

I have successfully installed CM 3 on 20 cloud servers (Ubuntu 10.04) from
Rackspace, and they are all in good health.

I ran a simple HBase Thrift program that does random row reads against the
cluster.

But I am seeing severe performance problems: it handles only 100
requests/sec, and it doesn't scale.

I tried the same program on 10 nodes and then on 20 nodes, and saw the same
performance problems.

Could anyone advise me on what to do, and on what might be causing this poor
performance?

I found this article from Hortonworks:
http://docs.hortonworks.com/CURRENT/index.htm#About_Hortonworks_Data_Platform/Hardware_Recommendations_For_Apache_Hadoop.htm

*"One way to quickly deploy Hadoop cluster, is to opt for “cloud trials” or
use virtual infrastructure. Hortonworks makes the distribution available
through Hortonworks Data Platform (HDP). HDP can be easily installed in
public and private clouds using Whirr, Microsoft Azure, and Amazon Web
Services. For more details, contact the Hortonworks Support Team.*

*However, note that cloud services and virtual infrastructures are not
architected for Hadoop. Hadoop and HBase deployments in this case, might
experience poor performance due to virtualization and suboptimal I/O
architecture."*

What do you think about the bolded part? Is it right?

Kindly reply ASAP.

Thanks,


  • Harsh J at Sep 26, 2012 at 3:42 pm
    Hi Dalia,

    Can you shed more light on what your client is actually doing? Are the
    random reads all coming from one machine/client instance, or from several
    in parallel?

    More machines don't mean faster reads if you have only a single client, of
    course. You're still capped at what that single client can do.
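
To illustrate the point about a single client being the cap, here is a minimal sketch in Python. It does not talk to HBase or Thrift at all: `fake_read` is a hypothetical stand-in that simply sleeps for an assumed 10 ms round trip, and `measure_throughput` is an illustrative helper, not part of any library. The only thing it shows is that with blocking reads, throughput follows the number of clients issuing requests in parallel rather than the number of servers behind them.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_read(latency_s):
    """Hypothetical stand-in for one blocking random read.

    It only sleeps to mimic the round-trip latency; no cluster is contacted.
    """
    time.sleep(latency_s)


def measure_throughput(num_clients, total_requests, latency_s=0.01):
    """Issue total_requests blocking reads from num_clients parallel
    threads and return the achieved requests per second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_clients) as pool:
        # Each worker thread plays the role of one independent client.
        list(pool.map(fake_read, [latency_s] * total_requests))
    elapsed = time.perf_counter() - start
    return total_requests / elapsed


if __name__ == "__main__":
    # The 10 ms latency is an assumption chosen for illustration.
    for clients in (1, 4, 16):
        rps = measure_throughput(clients, total_requests=200)
        print(f"{clients:2d} clients: {rps:7.1f} requests/sec")
```

With one client the rate cannot exceed 1 / latency no matter how many servers sit behind it; adding parallel clients is what raises it, up to whatever the servers can actually sustain.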

    --
    Harsh J
  • Dalia Hassan at Nov 22, 2012 at 10:11 am
    Dear Harsh,

    My client calls some functions through HBase.

    Could you advise me on how to improve the performance of my system, for
    reads or aggregate functions, as I add more nodes?

    Thanks for your help :D
  • Dalia Hassan at Nov 22, 2012 at 10:14 am
    Harsh J wrote: "The random reads are all from one machine/client instance
    or several in parallel?"

    I don't understand the question. I implemented a REST server. I made a
    REST client: using Apache Bench, I called that REST server and sent 1000,
    2000, 3000, etc. requests. But on average it still handled only about 100
    requests per second.
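
One possible reading of a rate that stays flat regardless of the request count is that only one request was in flight at a time (Apache Bench sends one request at a time unless a concurrency level is given with `-c`). The thread does not say which concurrency was used, so this is an assumption. Under that assumption, Little's law gives the arithmetic below; the function names are made up for this sketch.

```python
def throughput_ceiling(requests_in_flight, latency_s):
    """Upper bound on requests/sec when every request takes latency_s
    seconds and at most requests_in_flight are outstanding at once
    (Little's law: throughput = concurrency / latency)."""
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    return requests_in_flight / latency_s


def implied_latency(requests_in_flight, observed_rps):
    """Per-request latency implied by an observed rate at a given
    concurrency (the same relation, rearranged)."""
    if observed_rps <= 0:
        raise ValueError("observed_rps must be positive")
    return requests_in_flight / observed_rps


if __name__ == "__main__":
    # Assumption: one request in flight, 100 requests/sec observed.
    latency = implied_latency(1, 100)
    print(f"implied latency per request: {latency * 1000:.1f} ms")
    # The total number of requests sent does not appear in the formula,
    # so sending 1000, 2000 or 3000 requests gives the same rate.
    for in_flight in (1, 10, 50):
        ceiling = throughput_ceiling(in_flight, latency)
        print(f"{in_flight:2d} in flight: at most {ceiling:.0f} requests/sec")
```

If this is what happened, the number to vary in the benchmark is the concurrency level rather than the total request count. The ceilings are upper bounds only; the REST server or the cluster may saturate well before them.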

Discussion Overview
group: cm-users
categories: hadoop
posted: Sep 26, '12 at 3:22p
active: Nov 22, '12 at 10:14a
posts: 4
users: 2
website: cloudera.com
irc: #hadoop

2 users in discussion

Dalia Hassan: 3 posts Harsh J: 1 post
