Hi guys,
I asked this question earlier but did not get any response, so I am posting
it again. I hope somebody can point me to the right documentation:
When you run hadoop fs -copyFromLocal, or use the API to call fs.write()
(when the FileSystem fs is HDFS), does it write to the local filesystem
first before writing to HDFS?
I read that it writes to the local filesystem until the block size is
reached and only then writes to HDFS.
Wouldn't the HDFS client choke on those local filesystem writes if multiple
such fs -copyFromLocal commands are running at once? I thought that at least
with fs.write(), if you provide a byte array, it should not write to the
local filesystem?
In some places I read that the HDFS client and the datanode communicate
through RPC/sockets. Do they also write to the local filesystem in this
case, or is it just a buffer in memory that they write directly to HDFS?
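
To make the "buffer in memory" alternative concrete, here is a toy Java
sketch of what I mean. It is NOT HDFS client code, and I am not claiming the
real client works this way (that is exactly my question). The class name
ChunkedMemoryStream, the chunk size, and the in-memory list standing in for
the remote side are all made up for illustration. Bytes are collected in a
memory buffer and handed off each time a full chunk is reached, without ever
touching the local disk:

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: hypothetical class, not part of Hadoop.
class ChunkedMemoryStream extends OutputStream {
    private final int chunkSize; // hypothetical chunk size in bytes
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    // Stands in for "the other side" (e.g. a socket to a remote node).
    private final List<byte[]> sentChunks = new ArrayList<>();

    ChunkedMemoryStream(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    @Override
    public void write(int b) {
        buffer.write(b); // kept in memory, nothing goes to local disk
        if (buffer.size() == chunkSize) {
            sendChunk(); // hand off as soon as one chunk is full
        }
    }

    @Override
    public void write(byte[] data) {
        for (byte b : data) {
            write(b);
        }
    }

    @Override
    public void close() {
        if (buffer.size() > 0) {
            sendChunk(); // flush the final, possibly partial, chunk
        }
    }

    private void sendChunk() {
        sentChunks.add(buffer.toByteArray());
        buffer.reset();
    }

    List<byte[]> sentChunks() {
        return sentChunks;
    }

    public static void main(String[] args) {
        ChunkedMemoryStream out = new ChunkedMemoryStream(4);
        out.write(new byte[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10});
        out.close();
        System.out.println(out.sentChunks().size()); // prints 3 (4 + 4 + 2 bytes)
    }
}
```

The other alternative I describe above would replace the in-memory buffer
with a temporary file on the local filesystem that is only shipped once it
reaches the block size.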
Could somebody point me to some doc/code where I can find out how fs
-copyFromLocal and fs.write() work? Do they write to the local filesystem
until the block size is reached and then write to HDFS, or do they write
directly to HDFS?
Thanks in advance,
-JJ