Grokbase Groups HBase dev July 2012
Hi,

FYI, I created a set of JIRAs in HDFS related to HBase MTTR, or to recovery alone.

HDFS-3706: Add the possibility to mark a node as 'low priority' for
reads in the DFSClient
HDFS-3705: Add the possibility to mark a node as 'low priority' for
writes in the DFSClient
HDFS-3704: In the DFSClient, add the node to the dead list when the
ipc.Client call fails
HDFS-3703: Decrease the datanode failure detection time
HDFS-3702: Add an option for NOT writing the blocks locally if there
is a datanode on the same box as the client
HDFS-3701: HDFS may miss the final block when reading a file opened
for writing if one of the datanodes is dead
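The JIRA titles do not say how a 'low priority' mark would be applied, so the following is only a rough sketch of one possible reading: a marked node stays usable but is tried after every unmarked replica. The class and method names (`ReplicaOrdering`, `deprioritize`) and the use of plain strings for node names are hypothetical and are not part of the DFSClient.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; this is not DFSClient code.
final class ReplicaOrdering {
    private ReplicaOrdering() {}

    /**
     * Returns a new list in which nodes marked as low priority are moved
     * to the end. The relative order inside each group is preserved, so a
     * low-priority node is still available as a last resort.
     */
    static List<String> deprioritize(List<String> replicas, Set<String> lowPriority) {
        List<String> preferred = new ArrayList<>();
        List<String> demoted = new ArrayList<>();
        for (String node : replicas) {
            if (lowPriority.contains(node)) {
                demoted.add(node);
            } else {
                preferred.add(node);
            }
        }
        preferred.addAll(demoted);
        return preferred;
    }
}
```

With replicas `dn1, dn2, dn3` and `dn1` marked as low priority, the helper returns `dn2, dn3, dn1`.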


N.


  • Michael Stack at Jul 23, 2012 at 5:00 pm

    On Mon, Jul 23, 2012 at 1:15 PM, N Keywal wrote:
    Hi,

    FYI, I created a set of JIRAs in HDFS related to HBase MTTR, or to recovery alone.

    HDFS-3706: Add the possibility to mark a node as 'low priority' for
    reads in the DFSClient
    HDFS-3705: Add the possibility to mark a node as 'low priority' for
    writes in the DFSClient
    HDFS-3704: In the DFSClient, add the node to the dead list when the
    ipc.Client call fails
    HDFS-3703: Decrease the datanode failure detection time
    HDFS-3702: Add an option for NOT writing the blocks locally if there
    is a datanode on the same box as the client
    HDFS-3701: HDFS may miss the final block when reading a file opened
    for writing if one of the datanodes is dead

    Thanks for doing the above.

    What about your idea where you'd like to have different socket
    timeouts dependent on what we're doing? Or Todd's idea of being able
    to try a DN replica and, if it's taking too long to read, move on to
    the next one quickly? To do this, do you think we'd need to get our
    fingers into the DFSClient in other areas?
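The "try a replica, and move on if it is too slow" idea can be sketched in plain Java, outside of any HDFS code. This is only an illustration under assumptions: each replica read is modelled as a `Callable`, and the names `ReplicaFailover`, `readFromFirstResponsive`, and `perReplicaTimeoutMs` are hypothetical. The thread does not describe how the DFSClient would actually do this.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; each Callable stands in for "read this block from one replica".
final class ReplicaFailover {
    private ReplicaFailover() {}

    /**
     * Tries the replica reads in order. A read that fails, or that does not
     * finish within perReplicaTimeoutMs, is abandoned and the next replica
     * is tried. Throws IOException if no replica answers in time.
     */
    static <T> T readFromFirstResponsive(List<Callable<T>> replicaReads,
                                         long perReplicaTimeoutMs) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            Exception last = null;
            for (Callable<T> read : replicaReads) {
                Future<T> pending = pool.submit(read);
                try {
                    return pending.get(perReplicaTimeoutMs, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    pending.cancel(true); // interrupt the slow read and move on
                    last = e;
                } catch (ExecutionException e) {
                    last = e;             // the read itself failed; try the next replica
                }
            }
            throw new IOException("no replica answered in time", last);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Note that abandoning a slow read this way is only reasonable because a read has no side effects on the datanode; the same shortcut would not be safe for writes.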

    Good stuff,
    St.Ack
  • N Keywal at Jul 23, 2012 at 6:46 pm
    There's a bunch of things in this area, yes:
    - A big one would be to split the connect timeout from the read
    timeout. Todd created a jira for HDFS, and it could make a huge
    difference imho. Read timeouts are not really clear: they can come
    from GC pauses, maybe bugs, or network issues, and retrying is risky
    because you don't really know what has already been done. Connect
    timeouts, on the other hand, mean a dead process or a big network
    issue only; even under crazy GC the socket gets connected. And if the
    connect fails, you can retry. So splitting the two would be useful:
    we could go to the next DN quickly. But to be really useful, we will
    need the DFSClient to do it (I still have my doc on dfs timeouts to
    send, will do it soon).
    - I need to review some cases on pure functional split in HBase; there
    is some existing stuff already there that I need to understand.
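To make the connect/read split concrete, here is a minimal sketch using only `java.net.Socket`, which already lets the two timeouts be set independently. It is not DFSClient code: the class name `SplitTimeouts`, the method `connectToFirstReachable`, and the timeout values in the usage note are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;

// Hypothetical sketch of separate connect and read timeouts; not DFSClient code.
final class SplitTimeouts {
    private SplitTimeouts() {}

    /**
     * Tries each datanode address in order with a short connect timeout, so a
     * dead process or unreachable host is skipped quickly. The first socket
     * that connects gets the (longer) read timeout applied and is returned.
     */
    static Socket connectToFirstReachable(List<InetSocketAddress> datanodes,
                                          int connectTimeoutMs,
                                          int readTimeoutMs) throws IOException {
        IOException last = null;
        for (InetSocketAddress dn : datanodes) {
            Socket socket = new Socket();
            try {
                socket.connect(dn, connectTimeoutMs); // fails fast; safe to retry elsewhere
                socket.setSoTimeout(readTimeoutMs);   // reads may stall on GC, so allow more time
                return socket;
            } catch (IOException e) {
                last = e;
                try {
                    socket.close();
                } catch (IOException ignored) {
                    // nothing useful to do if close fails on an unconnected socket
                }
            }
        }
        throw new IOException("could not connect to any datanode", last);
    }
}
```

A caller might use something like `connectToFirstReachable(nodes, 500, 60000)`: a failed connect costs at most half a second before the next DN is tried, while an established connection still tolerates long pauses on reads. The numbers are placeholders, not recommendations from the thread.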
