Consider NLineInputFormat. -C
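A minimal sketch of a driver using NLineInputFormat, assuming the newer org.apache.hadoop.mapreduce API; the ResizeMapper here is only a placeholder for the actual fetch-and-resize logic, and 10 lines per split is just an illustrative value:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ResizeDriver {

    // Placeholder mapper: each map() call receives one URL line from the input file.
    public static class ResizeMapper
            extends Mapper<LongWritable, Text, Text, NullWritable> {
        @Override
        protected void map(LongWritable offset, Text url, Context context)
                throws IOException, InterruptedException {
            // ... download url, resize image, store result ...
            context.write(url, NullWritable.get());
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "image-resize");
        job.setJarByClass(ResizeDriver.class);

        // NLineInputFormat creates one split per N input lines, so a single
        // 10,000-URL file well under the HDFS block size still fans out to
        // ~1,000 map tasks with N = 10.
        job.setInputFormatClass(NLineInputFormat.class);
        NLineInputFormat.setNumLinesPerSplit(job, 10);
        NLineInputFormat.addInputPath(job, new Path(args[0]));

        job.setMapperClass(ResizeMapper.class);
        job.setNumReduceTasks(0); // map-only job
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}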
On Fri, Dec 4, 2009 at 5:34 PM, Ted Xu wrote:
Hi Daniel,

I think there are better solutions, but simply chopping the input file into
pieces (e.g. 10 URLs per file) should work; a rough sketch is below.
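A rough sketch of such a pre-processing step in plain Java (the file names urls.txt and chunk-N.txt are just placeholders); each output chunk then becomes its own input file, and hence its own map task:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.PrintWriter;

public class SplitUrls {
    public static void main(String[] args) throws Exception {
        int perFile = 10;           // URLs per output file
        int line = 0, chunk = 0;
        PrintWriter out = null;
        try (BufferedReader in = new BufferedReader(new FileReader("urls.txt"))) {
            String url;
            while ((url = in.readLine()) != null) {
                // Start a new chunk file every perFile lines.
                if (line % perFile == 0) {
                    if (out != null) out.close();
                    out = new PrintWriter("chunk-" + (chunk++) + ".txt");
                }
                out.println(url);
                line++;
            }
        } finally {
            if (out != null) out.close();
        }
    }
}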

2009/12/4 Daniel Garcia <daniel@danielgarcia.info>
Hello!
I'm trying to rewrite an image resizing program in terms of
map/reduce. The problem I see is that the job is not broken up into small
enough tasks. If I only have one input file with 10,000 URLs (the file is much
smaller than the HDFS block size), how can I ensure that the job is distributed
amongst all the nodes? In other words, how can I ensure that the task size is
small enough that all nodes process a proportional share of the input?
Regards,
Daniel
Best Regards,

Ted Xu
