Hi Matthew,
of course, you can copy it directly to HDFS and vice versa. Use
IOUtils (org.apache.hadoop.io.IOUtils) like this:

  // conf is your Hadoop Configuration
  FileSystem fileSystem = FileSystem.get(conf);   // org.apache.hadoop.fs.FileSystem

  // "in" and "out" are the streams; "out" is in this example the HDFS output stream
  IOUtils.copyBytes(in, out, fileSystem.getConf());
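
A slightly fuller sketch of the same idea, in case it helps: it listens on a
socket and streams whatever arrives straight into an HDFS file, without
touching the local file system first. The class name, port number and HDFS
path below are only placeholders, so adapt them to your setup.

  import java.io.InputStream;
  import java.io.OutputStream;
  import java.net.ServerSocket;
  import java.net.Socket;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.IOUtils;

  public class SocketToHdfs {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();      // picks up core-site.xml / hdfs-site.xml
      FileSystem fs = FileSystem.get(conf);          // the configured (HDFS) file system

      ServerSocket server = new ServerSocket(12345); // placeholder port
      Socket socket = server.accept();               // wait for the sending node

      InputStream in = socket.getInputStream();
      OutputStream out = fs.create(new Path("/user/matthew/input/data.txt")); // placeholder path

      IOUtils.copyBytes(in, out, conf);              // copies until EOF and closes both streams
      socket.close();
      server.close();
    }
  }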

hope this helps,
sebastian

Quoting Matthew John <tmatthewjohn1988@gmail.com>:
Hi all,

I have been working with MapReduce and HDFS for some time. The procedure I
normally follow is:

1) copy the input file from the local file system into HDFS

2) run the MapReduce job

3) copy the output file from HDFS back to the local file system
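
For reference, steps 1 and 3 above correspond roughly to the following
FileSystem calls (the same thing the "hadoop fs -copyFromLocal" and
"-copyToLocal" shell commands do); all paths here are made-up examples:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class CopyInAndOut {
    public static void main(String[] args) throws Exception {
      FileSystem fs = FileSystem.get(new Configuration());

      // step 1: local file system -> HDFS
      fs.copyFromLocalFile(new Path("/tmp/input.txt"),
                           new Path("/user/matthew/input/input.txt"));

      // ... step 2: run the MapReduce job on /user/matthew/input ...

      // step 3: HDFS -> local file system
      fs.copyToLocalFile(new Path("/user/matthew/output/part-00000"),
                         new Path("/tmp/output.txt"));
    }
  }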

But I feel that steps 1 and 3 add a lot of overhead to the entire process!

My queries are:

1) I am getting the files onto the local file system by establishing a socket
connection with another node. Can I ensure that the data arriving at the
Hadoop node is written directly to HDFS, instead of going through the local
file system and then performing a copyFromLocal?

2) Can I copy the reduce output (the final output file) directly to the local
file system instead of writing it into HDFS (and effectively across different
HDFS nodes), so that I can minimize the overhead? I expect this to take much
less time than copying to HDFS and then performing a copyToLocal. Finally, I
should be able to send this file back to another node using socket
communication.

Looking forward to your suggestions!

Thanks,

Matthew John
