Hi guys,
I asked this question earlier but did not get any response, so I am posting
it again. I hope somebody can point me to the right documentation:

When you run hadoop fs -copyFromLocal, or use the API to call fs.write()
(where the FileSystem fs is HDFS), does the client write to the local
filesystem first before writing to HDFS?

I read that it writes to the local filesystem until the block size is
reached and only then writes to HDFS.
Wouldn't the HDFS client choke on writes to the local filesystem if several
such fs -copyFromLocal commands were running at once? I thought that at
least with fs.write(), where you provide a byte array, it should not write
to the local filesystem?

In some places I read that the HDFS client and the datanode communicate
through RPC/sockets. In that case, do they also write to the local
filesystem, or is the data just held in an in-memory buffer and written
directly to HDFS?
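
To make the two possibilities concrete, here is a toy sketch in plain Java
of what I mean. It is not Hadoop code and makes no claim about what the HDFS
client actually does; the class names (PacketBufferingStream,
LocalSpoolingStream) and the in-memory list standing in for the datanode
connection are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model A: buffer in memory, ship small packets as they fill. */
class PacketBufferingStream extends OutputStream {
    private final int packetSize;
    private final List<byte[]> sent; // stands in for a socket to a datanode
    private final ByteArrayOutputStream buf = new ByteArrayOutputStream();

    PacketBufferingStream(int packetSize, List<byte[]> sent) {
        this.packetSize = packetSize;
        this.sent = sent;
    }

    @Override public void write(int b) {
        buf.write(b);
        if (buf.size() >= packetSize) flushPacket();
    }

    private void flushPacket() {
        if (buf.size() > 0) {
            sent.add(buf.toByteArray());
            buf.reset();
        }
    }

    @Override public void close() { flushPacket(); }
}

/** Hypothetical model B: spool to a local temp file until a full block exists. */
class LocalSpoolingStream extends OutputStream {
    private final long blockSize;
    private final List<byte[]> sent; // stands in for a socket to a datanode
    private Path spool;
    private OutputStream out;
    private long written;

    LocalSpoolingStream(long blockSize, List<byte[]> sent) throws IOException {
        this.blockSize = blockSize;
        this.sent = sent;
        openSpool();
    }

    private void openSpool() throws IOException {
        spool = Files.createTempFile("block-", ".tmp");
        out = Files.newOutputStream(spool);
        written = 0;
    }

    @Override public void write(int b) throws IOException {
        out.write(b);
        if (++written >= blockSize) {
            shipBlock();
            openSpool();
        }
    }

    // Read the spooled block back from local disk and hand it to the sink.
    private void shipBlock() throws IOException {
        out.close();
        if (written > 0) sent.add(Files.readAllBytes(spool));
        written = 0;
        Files.deleteIfExists(spool);
    }

    @Override public void close() throws IOException { shipBlock(); }
}

class WriteModelsDemo {
    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10];
        List<byte[]> packets = new ArrayList<>();
        List<byte[]> blocks = new ArrayList<>();
        try (OutputStream a = new PacketBufferingStream(4, packets);
             OutputStream b = new LocalSpoolingStream(8, blocks)) {
            a.write(data);
            b.write(data);
        }
        // prints "3 packets, 2 blocks"
        System.out.println(packets.size() + " packets, " + blocks.size() + " blocks");
    }
}
```

In model B every byte hits the local disk twice (written once, read back
once) before it goes anywhere, which is why I would expect several
concurrent copies to hurt. In model A nothing touches the local disk at all.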

Could somebody point me to some docs or code that explain how fs
-copyFromLocal and fs.write() work? Do they write to the local filesystem
until the block size is reached and then write to HDFS, or write directly to

Thanks in advance,

group: mapreduce-user @
posted: May 31, '11 at 11:57p
active: May 31, '11 at 11:57p

Mapred Learn: 1 post