Hi,

In the past few weeks we evaluated and partially migrated from Hadoop
0.20.203.0 to 0.22.0. Most stuff works fine locally and simple jobs do well on
the cluster. However, the most essential part of Nutch, the fetcher, seems to
be very unstable on 0.22.0. In every crawl I can now be almost certain that at
least some mappers mysteriously freeze and eventually time out. Other mappers
are killed straight away or after a few minutes because of OOM errors. Memory
consumption is also a lot higher on 0.22.0.

Right now we have three setups: an old 0.20.203 cluster, plus the unstable
0.22.0 and a 0.20.205 that both run on the same new cluster. When we run
identical jobs on all three, 0.22.0 almost always fails, eating RAM and
occasionally freezing a mapper. Stack traces of those mappers show that all
threads are blocked, and sometimes we see jstack unable to print deadlocks (null).
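
For cases where jstack cannot report deadlocks from the outside, one option is to ask the JVM itself. The following is only a minimal sketch using the standard java.lang.management API; the class name ThreadStateSummary and its two helper methods are made up for illustration and are not part of Hadoop or Nutch. It inspects only the JVM it runs in, so it would have to be called from inside the task JVM (for example from a watchdog thread), which is an assumption about how it could be wired in.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop or Nutch.
public class ThreadStateSummary {

    // Counts the live threads of the current JVM by state (RUNNABLE, BLOCKED, WAITING, ...).
    public static Map<Thread.State, Integer> countByState(ThreadMXBean mx) {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // false, false: skip locked-monitor and locked-synchronizer details.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    // Returns how many threads are part of a deadlock cycle, or 0 if none is found.
    public static int deadlockedCount(ThreadMXBean mx) {
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock is detected
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        System.out.println("thread states: " + countByState(mx));
        System.out.println("deadlocked threads: " + deadlockedCount(mx));
    }
}
```

If most threads show up as BLOCKED while the deadlocked count stays at 0, that would hint at contention on a lock held by a stuck thread rather than a classic lock cycle, though this sketch alone cannot confirm that.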

I tried many settings for 0.22.0 and very conservative settings for Nutch, such
as few threads to spare resources (which are actually abundant), but I cannot
seem to find the issue. The fetcher job still uses the old mapred API.

I'd like to present a better issue report but I don't know which component in
all this mess is actually responsible. It looks like the tasktracker but I'm
unsure.

If anyone can point us in the right direction so we can find the issue and
assist in fixing it, that would be great.

Thanks

Group: mapreduce-user @
Category: hadoop
Posted: Dec 23, '11 at 2:19p

Markus Jelsma: 1 post