Hi guys,

I am trying to improve HDFS performance by adding RDMA functionality to
its code, but I cannot find any documentation or help on this topic apart
from some information about the Sockets Direct Protocol (SDP) and allocated
buffers. Since RDMA is the premier protocol and is highly preferred for
distributed file systems, why is it not preferred in the case of HDFS, and
what factors make it difficult? Any suggestions, papers, or ideas on how to
approach this would be appreciated. I am using the OFED distribution for
RDMA.

Thanks

  • Christopher Smith at Feb 22, 2011 at 5:16 am

    On Fri, Feb 18, 2011 at 10:32 AM, Rajat Sharma wrote:
    Hi guys,

    I am trying to improve HDFS performance by adding RDMA functionality to
    its code, but I cannot find any documentation or help on this topic apart
    from some information about the Sockets Direct Protocol (SDP) and allocated
    buffers. Since RDMA is the premier protocol and is highly preferred for
    distributed file systems, why is it not preferred in the case of HDFS, and
    what factors make it difficult? Any suggestions, papers, or ideas on how to
    approach this would be appreciated. I am using the OFED distribution for
    RDMA.

    HDFS really serves a different purpose from other distributed
    filesystems. I'd argue that RDMA is a premature optimization that
    would introduce a lot of complexity into the code base. If you are
    using it right, much of your data traffic with HDFS doesn't go over
    the network at all.

    --
    Chris
  • Brian Bockelman at Feb 22, 2011 at 1:35 pm

    On Feb 18, 2011, at 12:32 PM, Rajat Sharma wrote:

    Hi guys,

    I am trying to improve HDFS performance by adding RDMA functionality to
    its code, but I cannot find any documentation or help on this topic apart
    from some information about the Sockets Direct Protocol (SDP) and allocated
    buffers. Since RDMA is the premier protocol and is highly preferred for
    distributed file systems, why is it not preferred in the case of HDFS, and
    what factors make it difficult? Any suggestions, papers, or ideas on how to
    approach this would be appreciated. I am using the OFED distribution for
    RDMA.

    Thanks

    Hi,

    It's an overused analogy, but HDFS is a freight train (high throughput) while the file systems you are thinking of are more like race cars (high performance).

    While HDFS provides decent latency, it's not optimized for latency. RDMA would provide little benefit on the type of hardware HDFS is designed for, and it would introduce a plethora of headaches.

    Brian
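
The freight-train point above can be illustrated with a back-of-the-envelope model. The sketch below is not taken from the thread: it assumes a transfer costs a fixed latency plus size divided by bandwidth, and every figure in it (link speed, the two latency values, the request sizes) is an illustrative assumption, not a measurement of HDFS or of any RDMA stack. It also holds bandwidth equal on both paths and ignores CPU overhead, so it isolates the latency effect only.

```python
# Toy model: time for one transfer = fixed latency + size / bandwidth.
# All numbers are illustrative assumptions, not measurements.

BANDWIDTH = 100e6            # assumed link throughput in bytes/s (same for both paths)
SOCKET_LATENCY = 100e-6      # assumed per-transfer latency over plain sockets, seconds
RDMA_LATENCY = 5e-6          # assumed per-transfer latency over RDMA, seconds

SMALL_REQUEST = 4 * 1024             # a 4 KiB read, typical of latency-bound workloads
LARGE_BLOCK = 64 * 1024 * 1024       # a 64 MiB bulk read, assumed block-sized


def transfer_time(size_bytes, latency_s, bandwidth=BANDWIDTH):
    """Seconds needed to move size_bytes in a single transfer."""
    return latency_s + size_bytes / bandwidth


def speedup(size_bytes):
    """Ratio of socket time to RDMA time under the assumptions above."""
    return (transfer_time(size_bytes, SOCKET_LATENCY)
            / transfer_time(size_bytes, RDMA_LATENCY))


if __name__ == "__main__":
    for label, size in (("4 KiB request", SMALL_REQUEST),
                        ("64 MiB block", LARGE_BLOCK)):
        print(f"{label}: speedup from lower latency = {speedup(size):.5f}x")
```

Under these assumed numbers the small request gets several times faster, while the block-sized read barely changes because the size/bandwidth term dominates. That is the sense in which lower latency alone helps a throughput-oriented system very little.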

Discussion Overview
group: common-dev @
categories: hadoop
posted: Feb 18, '11 at 6:34p
active: Feb 22, '11 at 1:35p
posts: 3
users: 3
website: hadoop.apache.org...
irc: #hadoop
