dfs fail to Unable to create new block
hi,

I am encountering a problem when running a Hadoop job with a relatively large dataset (about 400 MB) in a single-node Hadoop environment.
The error says that DFS failed to create a new block, yet the physical disk is large enough. Is there any reason for this failure? Is there a limit on the amount of disk space a job can occupy? The following is a snippet of the exception stack. Thanks for your attention.

Regards,
Jianmin


2009-07-28 18:00:31,757 INFO org.apache.hadoop.mapred.Merger: Merging 1 sorted segments
2009-07-28 18:00:31,792 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1272809137 bytes
2009-07-28 18:01:06,521 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
2009-07-28 18:01:06,521 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_2149418359249628613_12378
2009-07-28 18:01:12,578 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
2009-07-28 18:01:12,578 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-4276450909968435375_12378
2009-07-28 18:01:18,581 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
2009-07-28 18:01:18,581 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_1370666846409896923_12378
2009-07-28 18:01:24,584 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
2009-07-28 18:01:24,584 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-8117322104093252360_12378
2009-07-28 18:01:30,621 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2781)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)

2009-07-28 18:01:30,622 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-8117322104093252360_12378 bad datanode[0] nodes == null
2009-07-28 18:01:30,622 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/data/segment/dat_4_8" - Aborting...
2009-07-28 18:01:30,635 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.io.EOFException
at java.io.DataInputStream.readByte(DataInputStream.java:250)
at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
at org.apache.hadoop.io.Text.readString(Text.java:400)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2837)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2762)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)


2009-07-28 18:01:30,645 INFO org.apache.hadoop.mapred.TaskRunner: Runnning cleanup for the task
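
(As a sanity check on the disk-space angle, capacity as DFS sees it and capacity of the local filesystem can be listed with something like the commands below; a minimal sketch, where /path/to/dfs/data is just a placeholder for whatever dfs.data.dir is set to in the configuration.)

    # Report configured / used / remaining capacity per datanode
    hadoop dfsadmin -report

    # Check the local filesystem backing the datanode's block storage;
    # /path/to/dfs/data stands in for your dfs.data.dir setting
    df -h /path/to/dfs/data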


  • Jason Venner at Jul 28, 2009 at 12:37 pm
    Looks like a possible communication failure with your DataNode, possibly
    out of file descriptors or some networking issue? What version of Hadoop
    are you running?

    2009-07-28 18:01:30,622 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/data/segment/dat_4_8" - Aborting...
    2009-07-28 18:01:30,635 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
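
    To tell the two apart, the file descriptor side can be checked on Linux
    with something like this (a rough sketch; the pid file path assumes the
    default HADOOP_PID_DIR of /tmp, and <user> stands for whoever started
    the daemons):

        # Limit the daemons inherited from the shell that started them
        ulimit -n

        # Approximate count of descriptors the datanode holds right now
        lsof -p $(cat /tmp/hadoop-<user>-datanode.pid) | wc -l

        # "Too many open files" in the datanode log is the usual smoking gun
        grep -i "too many open files" $HADOOP_HOME/logs/*datanode*.log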

    --
    Pro Hadoop, a book to guide you from beginner to hadoop mastery,
    http://www.amazon.com/dp/1430219424?tag=jewlerymall
    www.prohadoopbook.com a community for Hadoop Professionals
  • Jianmin Woo at Jul 29, 2009 at 3:19 am
    Thanks a lot for your response, Venner.

    I am running hadoop-0.20 on a standalone machine. It seems that the
    DataNode is still running. How can I check whether it is a file descriptor
    issue or a network issue?

    Thanks,
    Jianmin
  • Jason Venner at Jul 30, 2009 at 2:56 pm
    The default number of file descriptors is usually quite small: 256 on
    Solaris or 1024 on Linux.

    Simply try raising it (your favorite search engine will have instructions
    for your operating system), then restart your cluster and your job. If the
    problem goes away, that is the simplest confirmation that file descriptors
    were the cause.
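
    On Linux, for example, something along these lines (a sketch only;
    <hadoop-user> stands for whichever account runs the daemons, and the
    limit value is arbitrary):

        # Persistently, per user, via /etc/security/limits.conf:
        <hadoop-user>  soft  nofile  16384
        <hadoop-user>  hard  nofile  16384

        # Or just for the Hadoop daemons, by adding this near the top of
        # conf/hadoop-env.sh before restarting them:
        ulimit -n 16384

        # Verify from a fresh login shell
        ulimit -n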

Discussion Overview
group: common-user
categories: hadoop
posted: Jul 28, '09 at 10:15a
active: Jul 30, '09 at 2:56p
posts: 4
users: 2 (Jason Venner: 2 posts, Jianmin Woo: 2 posts)
website: hadoop.apache.org...
irc: #hadoop
