FAQ
I'm attempting to load data into Hadoop (version 0.17.1) from a
non-datanode machine in the cluster. I can run jobs, and copyFromLocal works
fine, but when I try to use distcp I get the error below. I don't understand
what the error means; can anyone help?
Thanks

blue:hadoop-0.17.1 mdidomenico$ time bin/hadoop distcp -overwrite file:///Users/mdidomenico/hadoop/1gTestfile /user/mdidomenico/1gTestfile
08/09/07 23:56:06 INFO util.CopyFiles: srcPaths=[file:/Users/mdidomenico/hadoop/1gTestfile]
08/09/07 23:56:06 INFO util.CopyFiles: destPath=/user/mdidomenico/1gTestfile1
08/09/07 23:56:07 INFO util.CopyFiles: srcCount=1
With failures, global counters are inaccurate; consider running with -i
Copy failed: org.apache.hadoop.ipc.RemoteException: java.io.IOException:
/tmp/hadoop-hadoop/mapred/system/job_200809072254_0005/job.xml: No such file or directory
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:215)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:149)
    at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1155)
    at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1136)
    at org.apache.hadoop.mapred.JobInProgress.<init>(JobInProgress.java:175)
    at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:1755)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)

    at org.apache.hadoop.ipc.Client.call(Client.java:557)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212)
    at $Proxy1.submitJob(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy1.submitJob(Unknown Source)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:758)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
    at org.apache.hadoop.util.CopyFiles.copy(CopyFiles.java:604)
    at org.apache.hadoop.util.CopyFiles.run(CopyFiles.java:743)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.util.CopyFiles.main(CopyFiles.java:763)


  • Aaron Kimball at Sep 8, 2008 at 11:05 pm
    It is likely that your mapred.system.dir and/or fs.default.name settings are
    incorrect on the non-datanode machine that you are launching the task from.
    These two settings (in your conf/hadoop-site.xml file) must match the
    settings on the cluster itself.

    - Aaron
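
    (For reference, a minimal client-side conf/hadoop-site.xml sketch of the two
    settings named above; the hostname and port are placeholders, and the real
    values must be copied from the cluster's own configuration:)

    <property>
      <name>fs.default.name</name>
      <!-- hypothetical address; use the cluster's real namenode host:port -->
      <value>hdfs://namenode.example.com:9000</value>
    </property>
    <property>
      <name>mapred.system.dir</name>
      <!-- must be the exact path the JobTracker itself is configured with -->
      <value>/hadoop/mapred/system</value>
    </property>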

    On Sun, Sep 7, 2008 at 8:58 PM, Michael Di Domenico wrote:
    [quoted text trimmed]
  • Michael Di Domenico at Sep 9, 2008 at 4:42 pm
    I'm not sure that's the issue; I basically tarred up the hadoop directory
    from the cluster and copied it over to the non-datanode. But I do agree I've
    likely got a setting wrong, since I can run distcp from the namenode and it
    works fine. The question is which one.
    On Mon, Sep 8, 2008 at 7:04 PM, Aaron Kimball wrote:
    [quoted text trimmed]
  • Michael Di Domenico at Sep 9, 2008 at 5:03 pm
    A little more digging, and it appears I cannot run distcp as someone other
    than hadoop on the namenode:

    /tmp/hadoop-hadoop/mapred/system/job_200809091231_0005/job.xml

    Looking at the directory from the error above, the "system" directory does
    not exist on the namenode; I only have a "local" directory.
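
    (A quick way to check what actually exists, assuming the default
    hadoop.tmp.dir layout of /tmp/hadoop-${user.name}; the paths below are
    taken from the error above:)

    # on the namenode, as the user running the JobTracker
    ls /tmp/hadoop-hadoop/mapred
    # and the same path on the DFS side, in case mapred.system.dir resolves there
    bin/hadoop dfs -ls /tmp/hadoop-hadoop/mapred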
    On Tue, Sep 9, 2008 at 12:41 PM, Michael Di Domenico wrote:
    [quoted text trimmed]
  • Michael Di Domenico at Sep 9, 2008 at 5:14 pm
    Manually creating the "system" directory gets me past the first error, but
    now I get this. I'm not necessarily sure it's a step forward, though, because
    the map task never shows up in the jobtracker.
    [[email protected] hadoop-0.17.1]$ bin/hadoop distcp "file:///home/mdidomenico/1gTestfile" "1gTestfile"
    08/09/09 13:12:06 INFO util.CopyFiles: srcPaths=[file:/home/mdidomenico/1gTestfile]
    08/09/09 13:12:06 INFO util.CopyFiles: destPath=1gTestfile
    08/09/09 13:12:07 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
    08/09/09 13:12:07 INFO dfs.DFSClient: Abandoning block blk_5758513071638050362
    08/09/09 13:12:13 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
    08/09/09 13:12:13 INFO dfs.DFSClient: Abandoning block blk_1691495306775808049
    08/09/09 13:12:17 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
    08/09/09 13:12:17 INFO dfs.DFSClient: Abandoning block blk_1027634596973755899
    08/09/09 13:12:19 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
    08/09/09 13:12:19 INFO dfs.DFSClient: Abandoning block blk_4535302510016050282
    08/09/09 13:12:23 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
    08/09/09 13:12:23 INFO dfs.DFSClient: Abandoning block blk_7022658012001626339
    08/09/09 13:12:25 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
    08/09/09 13:12:25 INFO dfs.DFSClient: Abandoning block blk_-4509681241839967328
    08/09/09 13:12:29 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
    08/09/09 13:12:29 INFO dfs.DFSClient: Abandoning block blk_8318033979013580420
    08/09/09 13:12:31 WARN dfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
    08/09/09 13:12:31 WARN dfs.DFSClient: Error Recovery for block blk_-4509681241839967328 bad datanode[0]
    08/09/09 13:12:35 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
    08/09/09 13:12:35 INFO dfs.DFSClient: Abandoning block blk_2848354798649979411
    08/09/09 13:12:41 WARN dfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
    08/09/09 13:12:41 WARN dfs.DFSClient: Error Recovery for block blk_2848354798649979411 bad datanode[0]
    Exception in thread "Thread-0" java.util.ConcurrentModificationException
        at java.util.TreeMap$PrivateEntryIterator.nextEntry(Unknown Source)
        at java.util.TreeMap$KeyIterator.next(Unknown Source)
        at org.apache.hadoop.dfs.DFSClient.close(DFSClient.java:217)
        at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:214)
        at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1324)
        at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:224)
        at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:209)
    08/09/09 13:12:41 INFO dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
    08/09/09 13:12:41 INFO dfs.DFSClient: Abandoning block blk_9189111926428577428

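    (Editorial aside: "Could not read from stream" inside createBlockOutputStream
    typically means the client never completed a connection to a datanode's
    data-transfer port, so a connectivity check is worth trying. A sketch from
    the workstation, assuming the default dfs.datanode.address port of 50010 and
    a hypothetical datanode hostname:)

    # host and port are assumptions; substitute a real datanode and its configured dfs.datanode.address
    telnet datanode1.example.com 50010
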
    On Tue, Sep 9, 2008 at 1:03 PM, Michael Di Domenico wrote:
    [quoted text trimmed]
  • Michael Di Domenico at Sep 9, 2008 at 5:48 pm
    Apparently the fix for my original error is that hadoop is set up for a
    single local machine out of the box, and I had to change these directories

    <property>
      <name>mapred.local.dir</name>
      <value>/hadoop/mapred/local</value>
    </property>
    <property>
      <name>mapred.system.dir</name>
      <value>/hadoop/mapred/system</value>
    </property>
    <property>
      <name>mapred.temp.dir</name>
      <value>/hadoop/mapred/temp</value>
    </property>

    to live in HDFS instead of under "hadoop.tmp.dir".
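
    (One caveat worth noting: a change like this only takes effect once the
    MapReduce daemons are restarted, e.g. with the stock 0.17 scripts:)

    bin/stop-mapred.sh
    bin/start-mapred.sh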

    So now distcp works as a non-hadoop user, and mapred works as a non-hadoop
    user from the namenode. However, from a workstation I now get this:

    blue:hadoop-0.17.1 mdidomenico$ bin/hadoop distcp "file:///Users/mdidomenico/hadoop/1gTestfile" "1gTestfile-1"
    08/09/09 13:44:19 INFO util.CopyFiles: srcPaths=[file:/Users/mdidomenico/hadoop/1gTestfile]
    08/09/09 13:44:19 INFO util.CopyFiles: destPath=1gTestfile-1
    08/09/09 13:44:20 INFO util.CopyFiles: srcCount=1
    08/09/09 13:44:22 INFO mapred.JobClient: Running job: job_200809091332_0004
    08/09/09 13:44:23 INFO mapred.JobClient: map 0% reduce 0%
    08/09/09 13:44:31 INFO mapred.JobClient: Task Id : task_200809091332_0004_m_000000_0, Status : FAILED
    java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
        at org.apache.hadoop.util.CopyFiles$CopyFilesMapper.close(CopyFiles.java:527)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)

    08/09/09 13:44:50 INFO mapred.JobClient: Task Id : task_200809091332_0004_m_000000_1, Status : FAILED
    java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
        at org.apache.hadoop.util.CopyFiles$CopyFilesMapper.close(CopyFiles.java:527)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)

    08/09/09 13:45:07 INFO mapred.JobClient: Task Id : task_200809091332_0004_m_000000_2, Status : FAILED
    java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
        at org.apache.hadoop.util.CopyFiles$CopyFilesMapper.close(CopyFiles.java:527)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)

    08/09/09 13:45:26 INFO mapred.JobClient: map 100% reduce 100%
    With failures, global counters are inaccurate; consider running with -i
    Copy failed: java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1062)
        at org.apache.hadoop.util.CopyFiles.copy(CopyFiles.java:604)
        at org.apache.hadoop.util.CopyFiles.run(CopyFiles.java:743)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.util.CopyFiles.main(CopyFiles.java:763)


    On Tue, Sep 9, 2008 at 1:14 PM, Michael Di Domenico wrote:
    [quoted text trimmed]
  • Michael Di Domenico at Sep 9, 2008 at 6:17 pm
    Looking in the tasktracker log, I see the output below. The file does exist
    on my local workstation, but it does not exist on the namenode/datanodes in
    my cluster. So it begs the question: have I misunderstood the use of distcp,
    or is there still something wrong?

    I'm looking for something that will read a file from my workstation and load
    it into the dfs, but instead of going through the namenode like
    copyFromLocal seems to do, I'd like it to load the data via the datanodes
    directly. If distcp doesn't do it this way, is there anything that will?

    2008-09-09 14:00:54,418 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
    2008-09-09 14:00:54,662 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 0
    2008-09-09 14:00:54,894 INFO org.apache.hadoop.util.CopyFiles: FAIL 1gTestfile : java.io.FileNotFoundException: File file:/Users/mdidomenico/hadoop/1gTestfile does not exist.
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:402)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:242)
        at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:274)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:380)
        at org.apache.hadoop.util.CopyFiles$CopyFilesMapper.copy(CopyFiles.java:366)
        at org.apache.hadoop.util.CopyFiles$CopyFilesMapper.map(CopyFiles.java:493)
        at org.apache.hadoop.util.CopyFiles$CopyFilesMapper.map(CopyFiles.java:268)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)

    2008-09-09 14:01:03,950 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
    java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
        at org.apache.hadoop.util.CopyFiles$CopyFilesMapper.close(CopyFiles.java:527)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)

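    (Editorial aside: the tasktracker-side FileNotFoundException is consistent
    with how distcp works: the copy runs as map tasks on the cluster's
    tasktracker nodes, so a file:/// source must be readable at that same path
    on every node that might run the map. For a single file sitting on a client
    workstation, the plain shell copy is likely the right tool anyway; as far as
    I know the client's DFSClient already streams block data directly to the
    datanodes, with the namenode handling only metadata. A sketch:)

    # run on the workstation; streams the file's blocks straight to the datanodes
    blue:hadoop-0.17.1 mdidomenico$ bin/hadoop dfs -copyFromLocal /Users/mdidomenico/hadoop/1gTestfile /user/mdidomenico/1gTestfile
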

    On Tue, Sep 9, 2008 at 1:47 PM, Michael Di Domenico wrote:
    [quoted text trimmed]
