Make hbase more less of a fainting lily when running beside a hogging tasktracker
---------------------------------------------------------------------------------

Key: HBASE-1044
URL: https://issues.apache.org/jira/browse/HBASE-1044
Project: Hadoop HBase
Issue Type: Bug
Reporter: stack

From IRC (with some clarifying text added; it finishes on a good point made by jgray):
{code}
18:53 < St^Ack> The coupling in a MR cluster is looser than it is in hbase cluster
18:53 < St^Ack> TTs only need report in every ten minutes and a task can fail and be restarted
18:54 < St^Ack> Whereas with hbase, it must report in at least every two minutes (IIRC) and cannot 'redo' lost edit -- no chance of a redo
18:54 < St^Ack> So, your MR job can be rougher about getting to the finish line; messier.
18:55 < tim_s> yeah.
18:55 < St^Ack> If MR is running on same nodes as those hosting hbase, then it can rob resources from hbase in a way that damages hbase but not TT
18:56 < St^Ack> So, maybe we need to look at the hbase config; make it more tolerant when its running beside a hogging TT job
18:57 < tim_s> hmm, that would be lovely
18:57 < St^Ack> Need to look at HDFS; see how 'fragile' it is too; hbase should be at least that 'fragile'
18:57 < jgray> yeah, most issues we see come from resource issues on shared hdfs/tt/rs nodes
18:57 < tim_s> so is it common to host hbase elsewhere?
18:57 < St^Ack> Let me make an issue on it because this is common failure case for hbase (Setup hbase then run your old MR job as though nothing has changed -- then surprise when the little hbase lady faints)
18:57 < jgray> tim_s: currently, no. common practice is shared
18:58 < St^Ack> ... and its better if shared -- locality benefits
18:58 < tim_s> would that be a good idea though? cause I don't really need to have hadoop local I guess.
18:58 < tim_s> ahh
18:58 < apurtell> we share also
...
18:59 < jgray> beyond locality, sharing makes sense as hdfs and hbase nodes have different requirements... hdfs being heaviest on io (where hbase has no use), hbase heavy in memory, TTs vary greatly but most often heavy in cpu/io
{code}
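
The two-minute figure above is the lease a region server must keep renewed with the master; on a node where a map/reduce task is hogging cpu and io, renewals arrive late and the master expires the server. Below is a minimal sketch of the kind of loosening discussed, done against the plain Hadoop Configuration API. The property names (hbase.master.lease.period, hbase.regionserver.msginterval) are my assumptions for HBase of this era and may differ by version; in practice the overrides would live in hbase-site.xml on every node.
{code}
// Sketch only: relax HBase timeouts on a cluster shared with a busy
// tasktracker.  Property names are assumptions for ~0.19-era HBase; check
// hbase-default.xml for your version before relying on them.
import org.apache.hadoop.conf.Configuration;

public class LoosenHBaseTimeouts {
  public static void main(String[] args) {
    Configuration conf = new Configuration();

    // Lease the master grants a region server; assumed default is 120000 ms,
    // the "two minutes" from the chat.  Triple it so a briefly starved
    // region server is not declared dead.
    conf.setLong("hbase.master.lease.period", 6 * 60 * 1000L);

    // How often a region server reports in to the master; keep it short so
    // the master still gets frequent news while the node is healthy.
    conf.setLong("hbase.regionserver.msginterval", 3 * 1000L);

    System.out.println("hbase.master.lease.period = "
        + conf.getLong("hbase.master.lease.period", -1));
    System.out.println("hbase.regionserver.msginterval = "
        + conf.getLong("hbase.regionserver.msginterval", -1));
  }
}
{code}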


  • stack (JIRA) at Dec 13, 2008 at 9:43 pm
    [ https://issues.apache.org/jira/browse/HBASE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack reassigned HBASE-1044:
    ----------------------------

    Assignee: stack
  • stack (JIRA) at Dec 13, 2008 at 9:43 pm
    [ https://issues.apache.org/jira/browse/HBASE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HBASE-1044:
    -------------------------

    Fix Version/s: 0.19.0
    Summary: Make hbase less of a fainting lily when running beside a hogging tasktracker (was: Make hbase more less of a fainting lily when running beside a hogging tasktracker)

    For 0.19.0, at a minimum make sure that our timeouts are at least as generous as the HDFS defaults.
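
    A hedged sketch of what "at least as generous as the HDFS defaults" could mean in code: compute how long the namenode tolerates a silent datanode and compare it with the master-side region server lease. heartbeat.recheck.interval and dfs.heartbeat.interval are the HDFS properties of this era (assumed defaults of 5 minutes and 3 seconds); hbase.master.lease.period is the same assumption as above.
    {code}
    // Sketch: warn if HBase gives up on a region server sooner than HDFS
    // gives up on a datanode.  Property names/defaults are assumptions for
    // the Hadoop/HBase of this era and may differ in other versions.
    import org.apache.hadoop.conf.Configuration;

    public class CompareWithHdfsDefaults {
      public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Namenode marks a datanode dead after roughly
        // 2 * heartbeat.recheck.interval + 10 * dfs.heartbeat.interval.
        long recheckMs = conf.getLong("heartbeat.recheck.interval", 5 * 60 * 1000L);
        long heartbeatMs = conf.getLong("dfs.heartbeat.interval", 3L) * 1000L;
        long hdfsDeadNodeMs = 2 * recheckMs + 10 * heartbeatMs;   // ~10.5 minutes

        // Master-side region server lease; assumed default is 120000 ms.
        long hbaseLeaseMs = conf.getLong("hbase.master.lease.period", 120000L);

        System.out.println("HDFS allows a datanode ~" + hdfsDeadNodeMs
            + " ms of silence; HBase allows a region server " + hbaseLeaseMs + " ms");
        if (hbaseLeaseMs < hdfsDeadNodeMs) {
          System.out.println("HBase is the more fragile of the two here.");
        }
      }
    }
    {code}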
  • stack (JIRA) at Dec 17, 2008 at 6:59 am
    [ https://issues.apache.org/jira/browse/HBASE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack resolved HBASE-1044.
    --------------------------

    Resolution: Invalid

    I reviewed the datanode and namenode configurations. The datanode heartbeats every 3 seconds and retries 3 times. It has no notion of shutting itself down if it cannot connect to its master, as we do.

    Closing as invalid. When ZK is in the mix, I think we'll be less fragile.
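
    For contrast, a sketch (not lifted from either code base; all names are hypothetical) of the behavioral difference described above: a datanode that misses its master simply retries forever, while a region server of this era shuts itself down once its lease is gone.
    {code}
    // Illustration of the two recovery styles only; not datanode or
    // region server source.  sendHeartbeat() is a stand-in for the real RPC.
    public class HeartbeatStyles {

      // HDFS datanode style: miss a heartbeat, log, sleep, try again forever.
      static void datanodeStyle() throws InterruptedException {
        while (true) {
          if (!sendHeartbeat()) {
            System.out.println("heartbeat failed, retrying in 3s");
          }
          Thread.sleep(3000L);
        }
      }

      // 0.19-era region server style: once the master has expired the lease
      // there is no rejoining with the same identity, so give up and exit.
      static void regionServerStyle() throws InterruptedException {
        while (true) {
          if (!sendHeartbeat()) {
            System.out.println("lease lost, shutting down");
            return;
          }
          Thread.sleep(3000L);
        }
      }

      static boolean sendHeartbeat() {
        return Math.random() > 0.1;   // fake a heartbeat that sometimes fails
      }

      public static void main(String[] args) throws InterruptedException {
        regionServerStyle();
      }
    }
    {code}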
