Given the goal of having shared data accessible across the Map instances,
can someone please explain some of the differences between using:
- setNumTasksToExecutePerJvm() and then having statically declared
data initialised in Mapper.configure(); and
- a MultithreadedMapRunner?

Regards,
Shane

On Wed, Nov 26, 2008 at 6:41 AM, Doug Cutting wrote:
> tim robertson wrote:
>> Thanks Alex - this will allow me to share the shapefile, but I need to
>> "one time only per job per jvm" read it, parse it and store the
>> objects in the index.
>> Is the Mapper.configure() the best place to do this? E.g. will it
>> only be called once per job?
>
> In 0.19, with HADOOP-249, all tasks from a job can be run in a single JVM.
> So, yes, you could access a static cache from Mapper.configure().

Doug
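
The static-cache idea in the quoted reply can be sketched roughly as below. This is a plain-Java stand-in, not Hadoop code: the class SharedIndexSketch, the loadIndex() helper, and the map contents are all hypothetical, and configure() here only mirrors the role that Mapper.configure() plays. It assumes the expensive load should happen at most once per JVM, however many tasks run in it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Plain-Java stand-in for a mapper class; NOT the real org.apache.hadoop.mapred.Mapper.
class SharedIndexSketch {

    // Shared by every task instance that runs in this JVM.
    // volatile so that reader threads see the fully built map.
    private static volatile Map<String, String> index;

    // Counts how often the expensive load actually ran (illustration only).
    private static final AtomicInteger LOADS = new AtomicInteger();

    // Hypothetical loader standing in for "read and parse the shapefile".
    private static Map<String, String> loadIndex() {
        LOADS.incrementAndGet();
        Map<String, String> m = new HashMap<>();
        m.put("example-key", "example-shape");
        return Collections.unmodifiableMap(m);
    }

    // Plays the role of Mapper.configure(): invoked once per task,
    // but the load itself happens at most once per JVM.
    static synchronized void configure() {
        if (index == null) {
            index = loadIndex();
        }
    }

    // Read-only lookup; safe for concurrent callers because the map is never mutated.
    static String lookup(String key) {
        return index.get(key);
    }

    static int loadCount() {
        return LOADS.get();
    }

    public static void main(String[] args) {
        // Simulate three tasks of one job being run in the same JVM.
        for (int task = 0; task < 3; task++) {
            configure();
        }
        System.out.println("loads = " + loadCount()); // prints "loads = 1"
    }
}
```

On the difference asked about above: with JVM reuse, tasks run one after another in the same JVM, so a static field like this is populated by the first task and simply found by later ones. With a MultithreadedMapRunner, several threads call map() on a single Mapper instance within one task, so whatever is shared must be safe for concurrent access; keeping the index read-only after loading, as in this sketch, is one way to get that.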

Discussion Overview
group: common-user @
categories: hadoop
posted: Nov 25, '08 at 7:10p
active: Dec 1, '08 at 6:42a
posts: 11
users: 6
website: hadoop.apache.org...
irc: #hadoop