Mar 13, 2010 at 8:22 am
In our tests of hadoop-0.20.1 on 10 nodes, we find that the setup period
gets longer as more jobs are submitted. I don't understand why a map task
is needed for setup. Why can't the JobTracker, or a single thread, take
over this work?
2010/3/11 Jeff Zhang <email@example.com>
I looked at the source code, and it seems it is the JobTracker that
initiates the setup and cleanup tasks.
And why do you think the setup and cleanup phases consume a lot of time?
Actually, the time cost depends on the OutputCommitter.
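As a rough illustration of the OutputCommitter point, here is a toy model
(not Hadoop code). It assumes a hypothetical JobCommitter interface standing
in for the job-level setupJob/cleanupJob hooks of Hadoop's OutputCommitter;
the class names, the log strings, and runJob are made up for illustration.
It only shows that the work done in the setup and cleanup phases is whatever
the committer chooses to do. It does not model the cost of scheduling those
tasks on a TaskTracker.

```java
import java.util.ArrayList;
import java.util.List;

public class SetupCleanupSketch {

    // Hypothetical stand-in for the job-level hooks of an output committer.
    public interface JobCommitter {
        void setupJob(List<String> log);
        void cleanupJob(List<String> log);
    }

    // Assumption: mimics a file-based committer that prepares a temporary
    // output directory before the job and removes it afterwards.
    public static class FileLikeCommitter implements JobCommitter {
        public void setupJob(List<String> log) {
            log.add("setup: create _temporary");
        }
        public void cleanupJob(List<String> log) {
            log.add("cleanup: delete _temporary");
        }
    }

    // A committer with nothing to prepare or tidy up.
    public static class NoOpCommitter implements JobCommitter {
        public void setupJob(List<String> log) { }
        public void cleanupJob(List<String> log) { }
    }

    // Runs setup, then the map tasks, then cleanup, recording each step.
    public static List<String> runJob(JobCommitter committer, int mapTasks) {
        List<String> log = new ArrayList<>();
        committer.setupJob(log);
        for (int i = 0; i < mapTasks; i++) {
            log.add("map " + i);
        }
        committer.cleanupJob(log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(runJob(new FileLikeCommitter(), 2));
        System.out.println(runJob(new NoOpCommitter(), 2));
    }
}
```

With the no-op committer the two extra phases record no work at all, which
is the sense in which the cost depends on the committer.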
On Thu, Mar 11, 2010 at 11:04 AM, Min Zhou wrote:
Why do Hadoop jobs need setup and cleanup phases, which consume a
lot of time? Why couldn't we achieve this the way a distributed RDBMS
does, with a master process coordinating all slave nodes through sockets?
I think it would save plenty of time if there were no setups and
cleanups. What's Hadoop's philosophy on this?
My research interests are distributed systems, parallel computing and
bytecode-based virtual machines.