You can use this pseudo code for loading data into HDFS.

import java.io.File;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

/**
 * @author: Arockia Doss S
 * @emailto: doss@intellipowerhive.com
 * @url: http://www.intellipowerhive.com, http://www.dossinfotech.com
 * @comments: You can use and modify this code for your own use.
 * @About this: The code below works on the hadoop-0.19.0 platform.
 * If you want to test this code, you have to set the Hadoop
 * libraries in your class path.
 * You need to set a few parameters before running it (such as
 * Hadoop path, host, users).
 */

public class HadoopConfiguration {
    // Absolute path of the Hadoop installation
    private static final String CLUSTERPATH = "/home/hadoop-0.19.0/";
    private static final String SITEFILE = "conf/hadoop-site.xml";
    private static final String DEFAULTFILE = "conf/hadoop-default.xml";
    // Hadoop NameNode host (fill this in before running)
    private static final String HADOOPHOST = "";
    // Hadoop root user and its users list (value for hadoop.job.ugi)
    private static final String HOSTUSERS = "root,doss";
    private static Configuration conf = new Configuration();
    private static DistributedFileSystem dfs = new DistributedFileSystem();

    public HadoopConfiguration() throws java.lang.Exception {
        Path sitepath = new Path(CLUSTERPATH + SITEFILE);
        Path defaultpath = new Path(CLUSTERPATH + DEFAULTFILE);
        // Assumption: the archived post builds these two paths but the lines
        // that use them are missing; registering them is the likely intent.
        conf.addResource(defaultpath);
        conf.addResource(sitepath);
        getConf().set("hadoop.job.ugi", HOSTUSERS);
        dfs.initialize(new URI("hdfs://" + HADOOPHOST + ":9000/"), conf);
    }

    public static Configuration getConf() {
        return conf;
    }

    public static void main(String[] args) {
        try {
            HadoopConfiguration h = new HadoopConfiguration();
            FileSystem fs = FileSystem.get(h.getConf());

            // Copy sample.xls to HDFS. The local file is still there after copying.
            // NOTE: the destination argument is cut off in the archived post.
            fs.copyFromLocalFile(new Path("/home/sample.xls"), new

            // Move sample.doc to HDFS. The local file is gone after moving.
            // NOTE: the destination argument is cut off in the archived post.
            fs.moveFromLocalFile(new Path("/home/sample.doc"), new

            // List the files under /home/xls in HDFS.
            FileStatus[] fileStatus = fs.listStatus(new Path("/home/xls"));
            for (int i = 0; i < fileStatus.length; i++) {
                Path path = fileStatus[i].getPath();
                // Assumption: the loop body is missing in the archived post;
                // printing each path is the likely intent.
                System.out.println(path);
            }
        } catch (java.lang.Exception e) {
            // Assumption: the handler body is missing in the archived post.
            e.printStackTrace();
        }
    }
}

shwitzu wrote:
Thanks for Responding.

I read about HDFS and understood how it works. I also installed Hadoop
on Windows using Cygwin, tried a sample driver code, and made sure it

But my concern is: given the problem statement, how should I proceed

Could you please give me some clues, pseudo code, or a design?

Thanks in anticipation.

Doss_IPH wrote:
First and foremost, you need to understand the Hadoop platform.
Currently, I am working on a real-time application using Hadoop. I think
that Hadoop will fit your requirements.
Hadoop is mainly for three things:
1. Scalability: no limit on storage.
2. Processing petabytes of data in distributed parallel mode.
3. Fault tolerance (automatic block replication): recovering data from
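
To make the block replication point a bit more concrete, here is a toy sketch. It only assumes that a file is split into fixed-size blocks and that each block is stored on several nodes. The 64 MB block size, the replication factor of 3, the 4-node cluster, and the round-robin placement are all assumptions chosen for illustration; real HDFS uses its own placement policy, and the class BlockReplicationSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of block splitting and replica placement (not the real HDFS policy). */
final class BlockReplicationSketch {

    /** Number of fixed-size blocks needed to hold a file (rounds up). */
    static long blockCount(long fileBytes, long blockBytes) {
        return (fileBytes + blockBytes - 1) / blockBytes;
    }

    /** Round-robin placement: replica r of block b goes to node (b + r) % nodes. */
    static List<List<Integer>> placeReplicas(long blocks, int replication, int nodes) {
        if (replication > nodes) {
            throw new IllegalArgumentException("need at least as many nodes as replicas");
        }
        List<List<Integer>> placement = new ArrayList<>();
        for (long b = 0; b < blocks; b++) {
            List<Integer> holders = new ArrayList<>();
            for (int r = 0; r < replication; r++) {
                holders.add((int) ((b + r) % nodes));
            }
            placement.add(holders);
        }
        return placement;
    }

    /** True if every block still has a replica on some node other than failedNode. */
    static boolean survivesFailure(List<List<Integer>> placement, int failedNode) {
        for (List<Integer> holders : placement) {
            boolean alive = false;
            for (int node : holders) {
                if (node != failedNode) {
                    alive = true;
                }
            }
            if (!alive) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blocks = blockCount(200 * mb, 64 * mb);              // 200 MB file, 64 MB blocks
        List<List<Integer>> placement = placeReplicas(blocks, 3, 4);
        System.out.println("blocks: " + blocks);
        System.out.println("placement: " + placement);
        System.out.println("survives loss of node 2: " + survivesFailure(placement, 2));
    }
}
```

With these assumed numbers the 200 MB file needs 4 blocks, and because each block sits on three different nodes, losing any single node still leaves two copies of every block.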

shwitzu wrote:
Hello Sir!

I am new to Hadoop. I have a project based on web services. I have my
information in 4 databases with different files in each one of them:
say, images in one, video in another, documents, etc. My task is to
develop a web service which accepts a keyword from the client, processes
the request, and sends the actual requested file back to the user. Now I
have to use the Hadoop Distributed File System in this project.

I have the following questions:

1) How should I start with the design?
2) Should I upload all the files and create Map, Reduce and Driver code,
and once I run my application, will it automatically go to the file
system and get the results back to me?
3) How do I handle the binary data? I want to store binary format data
using MTOM in my database.

Please let me know how I should proceed. I don't know much about
Hadoop and I am searching for some help. It would be great if you could
assist me. Thanks again.
View this message in context: http://www.nabble.com/Need-Info-tp25901902p26003660.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.

Discussion Overview
group: common-user @
posted: Oct 15, '09 at 1:50a
active: Oct 29, '09 at 11:42a


