Hi,
I am trying to upload a large number of files (on the order of 100K) through map-reduce.
These files are a product of our map phase.
However, most of the tasks just time out.
After taking a jstack of these map tasks, I typically see:
"DataStreamer for file some_file block blk_-395146032100195880" prio=10
tid=0x0a112c00 nid=0x4f58 in Object.wait() [0x8d4d4000..0x8d4
d50d0]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1577)
- locked <0xb193d508> (a java.util.LinkedList)

"ResponseProcessor for block blk_4324263046489005957" prio=10
tid=0x0a112000 nid=0x4e69 runnable [0x8d067000..0x8d0671d0]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.DataInputStream.readFully(DataInputStream.java:178)
at java.io.DataInputStream.readLong(DataInputStream.java:399)
at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:1726)


It looks like these threads remain in the above state for a very long time, and eventually the task times out.

I am not sure if uploading 100K files through map-reduce is a good idea, but I have to do it somewhere (external dependencies).
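For context, each map task essentially streams local files into DFS, roughly like the sketch below. This is a simplified, hypothetical version, not my actual code: the class name, paths, and buffer size are made up, and the reporter.progress() call in the copy loop is just the usual way of keeping a long write from tripping mapred.task.timeout.

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical mapper: each input record names one local file to copy into DFS.
public class UploadMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  private JobConf conf;
  private FileSystem dfs;

  public void configure(JobConf job) {
    conf = job;
    try {
      dfs = FileSystem.get(job);               // destination DFS
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    Path src = new Path(value.toString());     // source file named in the record
    Path dst = new Path("/uploads", src.getName());   // made-up target directory

    InputStream in = src.getFileSystem(conf).open(src);
    OutputStream out = dfs.create(dst);
    byte[] buf = new byte[64 * 1024];
    try {
      int n;
      while ((n = in.read(buf)) > 0) {
        out.write(buf, 0, n);
        reporter.progress();                   // tell the TaskTracker we are alive
      }
    } finally {
      in.close();
      out.close();
    }
    output.collect(new Text(src.toString()), new Text(dst.toString()));
  }
}

The intent of the reporter.progress() call is only to keep the TaskTracker from killing the task (governed by mapred.task.timeout) while a single large file is being written.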

I would really appreciate any help/suggestions/pointers to avoid these
timeouts.

BTW, I am on Hadoop 0.16

Thanks
- Sagar
