FAQ
Hello!

I am trying to transfer data from a remote node's filesystem to HDFS. But
somehow, it's not working!

***********************************************************************
I have a 7-node cluster. Its config file (hadoop-site.xml) is as follows:

<property>
  <name>fs.default.name</name>
  <value>hdfs://nikhilname:50130</value>
</property>

<property>
  <name>dfs.http.address</name>
  <value>nikhilname:50070</value>
</property>

To keep this from getting too lengthy, I am sending you just the important
tags. So here, nikhilname is the namenode. I have specified its IP in /etc/hosts.

************************************************************************



**************************************************************************
Then, here is the 8th machine (the client, or remote node), which has this config file:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://nikhilname:50130</value>
  </property>

  <property>
    <name>dfs.http.address</name>
    <value>nikhilname:50070</value>
  </property>

</configuration>

Here, I have pointed fs.default.name at the namenode.
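
In case it helps, the client could also be pointed at the namenode explicitly,
instead of relying on hadoop-site.xml being picked up from the classpath. A
minimal sketch (the class name ConnectCheck is made up for illustration; the
URI mirrors the fs.default.name value above):

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class ConnectCheck
{
    public static void main(String[] args) throws IOException
    {
        Configuration conf = new Configuration();
        // Pass the namenode URI directly instead of relying on
        // fs.default.name from hadoop-site.xml on the classpath:
        FileSystem hdfs = FileSystem.get(URI.create("hdfs://nikhilname:50130"), conf);
        System.out.println("Connected to: " + hdfs.getUri());
    }
}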

**********************************************************


**********************************************************
Then, here is the code that simply tries to copy a file from the local
filesystem (of the remote node) and place it into HDFS, thereby leading to
replication.

The path is /home/hadoop/Desktop/test.java (on the remote node).
I want to place it in HDFS (/user/hadoop).

package data.pkg;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class Try
{
    public static void main(String[] args)
    {
        Configuration conf_hdfs = new Configuration();
        Configuration conf_remote = new Configuration();

        try
        {
            FileSystem hdfs_filesystem = FileSystem.get(conf_hdfs);
            FileSystem remote_filesystem = FileSystem.getLocal(conf_remote);

            String in_path_name = remote_filesystem + "/home/hadoop/Desktop/test.java";
            Path in_path = new Path(in_path_name);

            String out_path_name = hdfs_filesystem + "";
            Path out_path = new Path("/user/hadoop");

            FileUtil.copy(remote_filesystem, in_path, hdfs_filesystem,
                          out_path, false, false, conf_hdfs);

            System.out.println("Done...!");
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }
}

********************************************************


********************************************************
But the following are the errors I am getting after its execution:

java.io.FileNotFoundException: File org.apache.hadoop.fs.LocalFileSystem@<address>/home/hadoop/Desktop/test.java does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:420)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:244)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:192)
at data.pkg.Try.main(Try.java:103)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
******************************************************************


Briefly, what I have done so far:


-> Got instances of both filesystems.
-> Passed the paths appropriately.
-> Taken care of proxy issues.
-> Placed the file at /home/hadoop/Desktop/test.java on the remote node.

Also, can you tell me the difference between LocalFileSystem and
RawLocalFileSystem?

Thank you,


--
Regards!
Sugandha


  • Todd Lipcon at Jun 11, 2009 at 3:33 pm

    On Thu, Jun 11, 2009 at 7:01 AM, Sugandha Naolekar wrote:

    Hello!

    I am trying to transfer data from a remote node's filesystem to HDFS. But
    somehow, it's not working!
    First, thanks for the good context and pasting of all of the relevant bits!

    ***********************************************************************
    I have a 7-node cluster. Its config file (hadoop-site.xml) is as follows:

    <property>
      <name>fs.default.name</name>
      <value>hdfs://nikhilname:50130</value>
    </property>

    <property>
      <name>dfs.http.address</name>
      <value>nikhilname:50070</value>
    </property>

    To keep this from getting too lengthy, I am sending you just the important
    tags. So here, nikhilname is the namenode. I have specified its IP in /etc/hosts.

    ************************************************************************



    **************************************************************************
    Then, here is the 8th machine (the client, or remote node), which has this
    config file:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

    <!-- Put site-specific property overrides in this file. -->

    <configuration>

      <property>
        <name>fs.default.name</name>
        <value>hdfs://nikhilname:50130</value>
      </property>

      <property>
        <name>dfs.http.address</name>
        <value>nikhilname:50070</value>
      </property>

    </configuration>

    Here, I have pointed fs.default.name at the namenode.

    **********************************************************


    **********************************************************
    Then, here is the code that simply tries to copy a file from the local
    filesystem (of the remote node) and place it into HDFS, thereby leading to
    replication.

    The path is /home/hadoop/Desktop/test.java (on the remote node).
    I want to place it in HDFS (/user/hadoop).

    package data.pkg;

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FileUtil;
    import org.apache.hadoop.fs.Path;

    public class Try
    {
        public static void main(String[] args)
        {
            Configuration conf_hdfs = new Configuration();
            Configuration conf_remote = new Configuration();
    No need to have two different Configuration objects, but fine.
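            // (Sketch: a single Configuration can serve both filesystems,
            //  e.g. FileSystem.get(conf_hdfs) and FileSystem.getLocal(conf_hdfs)
            //  with the same conf object, as in the follow-up post below.)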

            try
            {
                FileSystem hdfs_filesystem = FileSystem.get(conf_hdfs);
                FileSystem remote_filesystem = FileSystem.getLocal(conf_remote);

                String in_path_name = remote_filesystem + "/home/hadoop/Desktop/test.java";
                Path in_path = new Path(in_path_name);

                String out_path_name = hdfs_filesystem + "";
    Your issues are here. You don't need to do this concatenation - simply use
    "/home/hadoop/Desktop/test.java" and "/user/hadoop/test.java" to construct
    the Path objects. They'll resolve to the right Filesystems by virtue of your
    passing them to FileUtil.copy.
                Path out_path = new Path("/user/hadoop");

                FileUtil.copy(remote_filesystem, in_path, hdfs_filesystem,
                              out_path, false, false, conf_hdfs);

                System.out.println("Done...!");
            }
            catch (IOException e)
            {
                e.printStackTrace();
            }
        }
    }
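
    In other words, a corrected sketch of the two Path constructions and the
    copy call (same variable names as above; note the destination here is
    /user/hadoop/test.java rather than the bare directory) would be:

        Path in_path = new Path("/home/hadoop/Desktop/test.java");
        Path out_path = new Path("/user/hadoop/test.java");

        FileUtil.copy(remote_filesystem, in_path, hdfs_filesystem,
                      out_path, false, false, conf_hdfs);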

    ********************************************************


    ********************************************************
    But the following are the errors I am getting after its execution:

    java.io.FileNotFoundException: File org.apache.hadoop.fs.LocalFileSystem@<address>/home/hadoop/Desktop/test.java does not exist.
    Notice here that the filename created is the concatenation of a Java
    stringification (File ...LocalFileSystem@<address>) with your path;
    concatenating an object onto a String calls its toString(). This
    obviously is not found.

    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:420)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:244)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:192)
    at data.pkg.Try.main(Try.java:103)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
    at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
    ******************************************************************


    Briefly, what I have done so far:


    -> Got instances of both filesystems.
    -> Passed the paths appropriately.
    -> Taken care of proxy issues.
    -> Placed the file at /home/hadoop/Desktop/test.java on the remote node.

    Also, can you tell me the difference between LocalFileSystem and
    RawLocalFileSystem?
    Raw is not checksummed, whereas Local is.
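
    To make that concrete: LocalFileSystem is a checksummed wrapper around
    RawLocalFileSystem, so writes through it also produce a hidden .crc file
    next to the data. A minimal sketch (the class name ChecksumDemo and the
    /tmp paths are just for illustration):

        import java.io.IOException;

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FSDataOutputStream;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.LocalFileSystem;
        import org.apache.hadoop.fs.Path;

        public class ChecksumDemo
        {
            public static void main(String[] args) throws IOException
            {
                Configuration conf = new Configuration();

                // Checksummed: also writes /tmp/.checksummed.txt.crc
                LocalFileSystem local = FileSystem.getLocal(conf);
                FSDataOutputStream out = local.create(new Path("/tmp/checksummed.txt"));
                out.writeBytes("hello");
                out.close();

                // Raw: the same bytes, but no .crc companion file
                FileSystem raw = local.getRawFileSystem();
                FSDataOutputStream rawOut = raw.create(new Path("/tmp/raw.txt"));
                rawOut.writeBytes("hello");
                rawOut.close();
            }
        }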

    -Todd
  • Sugandha Naolekar at Jun 12, 2009 at 6:54 am
    Hello!

    ****************************************************************
    Following is the code that's not working:

    package data.pkg;

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FileUtil;
    import org.apache.hadoop.fs.Path;

    public class Try
    {
        public static void main(String[] args)
        {
            Configuration conf_hdfs = new Configuration();

            try
            {
                FileSystem hdfs_filesystem = FileSystem.get(conf_hdfs);
                FileSystem remote_filesystem = FileSystem.getLocal(conf_hdfs);

                Path in_path = new Path("/home/hadoop/Desktop/test.java");
                Path out_path = new Path("/user/hadoop");

                FileUtil.copy(remote_filesystem, in_path, hdfs_filesystem,
                              out_path, false, false, conf_hdfs);

                System.out.println("Done...!");
            }
            catch (IOException e)
            {
                e.printStackTrace();
            }
        }
    }

    ************************************************************


    What I am trying to do is simply copy a file from a remote node (not part
    of the master-slave config) to HDFS (a cluster of 7 nodes).

    But it's throwing the errors as follows:

    *********************************

    File /home/hadoop/Desktop/test.java does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:420)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:244)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:192)
    at data.pkg.Try.main(Try.java:24)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
    at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
    ******************************************************************************

    --
    Regards!
    Sugandha
