Hi all

I am trying to write multiple files to HDFS from each mapper. Each mapper
parses the text input and writes its output to 16 different files
(rather than the single file a typical OutputFormat produces). However, this
leads to either a java.io.EOFException from DataInputStream or a
java.io.IOException reporting "Bad connect ack".

Currently, each slave node runs at most 3 mappers, and each mapper writes
16 files simultaneously (about 3 minutes per map task). Replication is 3.
The Hadoop version is 0.20.2. The writer I use is SequenceFile.Writer.
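
To make the setup concrete, here is a minimal sketch of the fan-out pattern described above. It is only an illustration: a java.io.StringWriter stands in for each SequenceFile.Writer so the sketch runs without a cluster, and the class name FanOutSketch, the method names, and the hash-based routing rule are all assumptions, not the actual job code. The point it shows is that all 16 writers stay open for the whole life of the map task.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;

public class FanOutSketch {
    // 16 output files per mapper, as in the setup described above.
    static final int NUM_OUTPUTS = 16;

    // Each StringWriter is a stand-in for one open SequenceFile.Writer.
    private final StringWriter[] writers = new StringWriter[NUM_OUTPUTS];

    FanOutSketch() {
        // All writers are opened up front and held open together.
        for (int i = 0; i < NUM_OUTPUTS; i++) {
            writers[i] = new StringWriter();
        }
    }

    // Hypothetical routing rule: pick an output by the key's hash.
    static int outputIndex(String key) {
        return (key.hashCode() & Integer.MAX_VALUE) % NUM_OUTPUTS;
    }

    // Append one record to the output selected by the key.
    void write(String key, String value) {
        writers[outputIndex(key)].write(key + "\t" + value + "\n");
    }

    // Return everything written so far to one output.
    String contents(int index) {
        return writers[index].toString();
    }

    // Close every writer at the end of the task.
    void close() {
        for (StringWriter w : writers) {
            try {
                w.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

In the real job each stand-in would be an HDFS output stream, so one running mapper holds 16 streams open at once.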

These exceptions can be reduced or eliminated if I reduce the map task
capacity or reduce dfs.replication to 2.
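
A rough back-of-the-envelope estimate of the write load, using the numbers above, may help show why both changes have a similar effect. This is a sketch under stated assumptions: every slave node runs its full set of mappers, each open file has one write pipeline with as many DataNodes as the replication factor, and the pipelines are spread evenly over the cluster. The class and method names are hypothetical, and the estimate does not establish that this load is the actual cause of the exceptions.

```java
public class WritePipelineLoad {
    // Output streams one slave node holds open:
    // concurrent mappers on the node times files per mapper.
    static int openStreamsPerNode(int mappersPerNode, int filesPerMapper) {
        return mappersPerNode * filesPerMapper;
    }

    // Average number of incoming write connections per DataNode, assuming
    // every node is fully loaded and pipelines are spread evenly.
    static int avgWriteConnectionsPerDataNode(int mappersPerNode,
                                              int filesPerMapper,
                                              int replication) {
        return openStreamsPerNode(mappersPerNode, filesPerMapper) * replication;
    }

    public static void main(String[] args) {
        // Current setup: 3 mappers, 16 files each, replication 3.
        System.out.println(avgWriteConnectionsPerDataNode(3, 16, 3)); // 144
        // Replication reduced to 2.
        System.out.println(avgWriteConnectionsPerDataNode(3, 16, 2)); // 96
        // Map task capacity reduced to 2, replication kept at 3.
        System.out.println(avgWriteConnectionsPerDataNode(2, 16, 3)); // 96
    }
}
```

Under these assumptions, lowering the replication to 2 or the map task capacity to 2 each cuts the estimated average from 144 to 96 concurrent write connections per DataNode.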

How can I avoid these exceptions if I want to write multiple files in my
OutputFormat? Any suggestions would be much appreciated!

Here are the exceptions:

java.io.EOFException
at java.io.DataInputStream.readByte(DataInputStream.java:250)
at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
at org.apache.hadoop.io.Text.readString(Text.java:400)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2913)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2838)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2114)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2300)

java.io.IOException: Bad connect ack with firstBadLink 192.168.22.1:55610
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2915)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2838)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2114)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2300)

Thanks!
-
Regards
Yuting

Group: common-user
Category: hadoop
Posted: Jul 12, '10 at 4:34p