Hi all,

I'm running Hadoop 0.19.2 in EC2, and running into an occasional problem with
the task tracker count returned by getTaskTrackers().

The call to getTaskTrackers() is being made in the job jar's main
function, before the job starts running. I need the result to control
some aspects of my job, for example setting the number of reduce tasks
to be exactly equal to the number of servers, which should be equal to
the number of task trackers.

Every so often (currently < 5% of the time) the call to getTaskTrackers() will
return a value less than expected - e.g. 2 instead of 6. This happens
even when ClusterStatus.getJobTrackerState() returns State.RUNNING.

I'm assuming the problem is that some of the task trackers are taking
extra time to spin up. I saw HADOOP-5337
(https://issues.apache.org/jira/browse/HADOOP-5337), which seems
related, though that's for restarts vs. initial startup.

Given that the JobTracker waits for slaves to self-report, there
doesn't seem to be a totally reliable, automatic solution to this
issue, but I thought I'd ask to see if there's something I'm missing.
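
One possible workaround, offered only as a rough sketch, is to poll from
the job's main function until the expected number of task trackers has
reported in, and to give up after a timeout. The names TrackerWait,
TrackerCounter and waitForTrackers below are hypothetical, as are the
timeout and poll interval; the interface stands in for the cluster query
so the sketch compiles without Hadoop on the classpath.

```java
import java.io.IOException;

/** Hypothetical helper: polls until enough task trackers have reported in. */
final class TrackerWait {

    /** Stand-in for the cluster query (an assumption, not a Hadoop type). */
    interface TrackerCounter {
        int count() throws IOException;
    }

    /**
     * Polls the counter until it reports at least `expected` trackers or
     * the timeout elapses. Returns the last observed count; the caller
     * still has to decide what to do if it is below `expected`.
     */
    static int waitForTrackers(TrackerCounter counter, int expected,
                               long timeoutMillis, long pollMillis)
            throws IOException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        int seen = counter.count();
        while (seen < expected && System.currentTimeMillis() < deadline) {
            Thread.sleep(pollMillis);   // give slow slaves time to self-report
            seen = counter.count();
        }
        return seen;
    }
}
```

In a real job the counter would presumably wrap
new JobClient(conf).getClusterStatus().getTaskTrackers(), and the
returned value would be passed to conf.setNumReduceTasks(). This only
narrows the window; it still relies on knowing the expected cluster size
up front.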


-- Ken

Ken Krugler
+1 530-210-6378
e l a s t i c w e b m i n i n g

Discussion Overview
group: common-user @
posted: Apr 16, '10 at 5:00p
active: Apr 16, '10 at 5:00p
