Hi everyone,


Currently I have a MapReduce program that sorts input records and maps/reduces them to output records, each tagged with priority information. So far the program runs on 1 master node and 3 datanodes.

From a test run I got data like the following:

--------------------------------------
number of records: 1,000,000
time to process: 100 seconds
input bytes: 20 MB
number of datanodes: 3
--------------------------------------


I am wondering whether I can make an assumption like: given 2,000,000 records, the program would finish in roughly 200 seconds?
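To make that assumption concrete, here is the rough back-of-the-envelope model I have in mind, written as a small Python sketch. The 10-second fixed overhead and the purely linear per-record cost are assumptions of mine, not measured values:

# Rough scaling model: a fixed per-job overhead (job setup, task scheduling)
# plus a per-record cost that is divided across the datanodes.
# The 10-second overhead below is a placeholder, not a measured value.

def estimate_time(records, datanodes, per_record_s, overhead_s=10.0):
    """Estimated job time assuming linear scaling in records and datanodes."""
    return overhead_s + (records * per_record_s) / datanodes

# Calibrate the per-record cost from the measured run:
# 1,000,000 records on 3 datanodes took 100 s (treating 10 s of that as overhead).
per_record_s = (100.0 - 10.0) * 3 / 1_000_000

# Extrapolate to 2,000,000 records on the same 3 datanodes.
print(estimate_time(2_000_000, 3, per_record_s))  # about 190 s, not exactly 200 s

Under this model, doubling the input slightly less than doubles the total time, because the fixed overhead is paid only once; if that overhead really is small relative to the whole job, the "200 seconds" guess should be roughly right.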

Any kind of feasibility or scalability analysis would be helpful, as it is important for the analysis in my master's thesis.

Any idea is well appreciated!

Thanks,

-Kun
