If read-only constants are enough, then you can define static finals
in some class that all of your code can see. They won't really be global,
since each JVM loads its own copy, but since you never change them, that
won't matter.
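A minimal sketch of the static-finals approach, assuming the class ships in the job jar so every task JVM can load it. The class name JobConstants and its fields are hypothetical, picked only for illustration; nothing here is a Hadoop API.

```java
// Hypothetical constants holder; the name and values are illustrative only.
final class JobConstants {
    // Each JVM that loads this class gets its own copy of these fields.
    // They never change, so every mapper and reducer sees the same values.
    static final String FIELD_SEPARATOR = "\t";
    static final int MAX_FIELDS = 3;

    private JobConstants() {}  // no instances; this class only holds constants

    // Example helper that any mapper or reducer could call.
    // Splits a line into at most MAX_FIELDS pieces.
    static String[] splitFields(String line) {
        return line.split(FIELD_SEPARATOR, MAX_FIELDS);
    }
}
```

A mapper or reducer would just refer to JobConstants.FIELD_SEPARATOR directly. Nothing is sent between machines; the values travel with the compiled code.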
If you just want global status indicators, then look at what the reporter
If you really want read/write global variables, then you have a real
problem. In fact, that is the shared-memory emulation problem all over
again, and that is what map-reduce is intended to sidestep. Such programs
can often be rewritten so that you have an extra map-reduce step, or so
that you have additional input that gets sorted out to the mapper or
reducer that needs the values.
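To make the "additional input" idea concrete, here is a rough sketch in plain Java that only simulates the shuffle; it does not use the Hadoop API. The value that would have been a global (a per-key threshold here) is fed in as extra records, tagged so that it sorts ahead of the ordinary data for the same key. All names (SideInputSketch, Rec, SIDE, DATA) are made up for the example, and it assumes exactly one side record per key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class SideInputSketch {
    static final int SIDE = 0;  // tag for the value that would have been a global
    static final int DATA = 1;  // tag for ordinary records

    // One record as it would look after the map phase.
    static final class Rec {
        final String key;
        final int tag;
        final int value;
        Rec(String key, int tag, int value) {
            this.key = key;
            this.tag = tag;
            this.value = value;
        }
    }

    // Stand-in for the shuffle: group by key, then sort each group by tag
    // so the side record reaches the reducer before the data records.
    static Map<String, List<Rec>> shuffle(List<Rec> mapped) {
        Map<String, List<Rec>> groups = new TreeMap<>();
        for (Rec r : mapped) {
            groups.computeIfAbsent(r.key, k -> new ArrayList<>()).add(r);
        }
        for (List<Rec> group : groups.values()) {
            group.sort((a, b) -> Integer.compare(a.tag, b.tag));
        }
        return groups;
    }

    // Reducer: read the threshold from the first record, then count the
    // data values that exceed it.
    static int reduce(List<Rec> group) {
        Rec first = group.get(0);
        if (first.tag != SIDE) {
            throw new IllegalStateException("no side value for key " + first.key);
        }
        int count = 0;
        for (Rec r : group.subList(1, group.size())) {
            if (r.tag == DATA && r.value > first.value) {
                count++;
            }
        }
        return count;
    }

    // Runs the whole simulated job over both inputs.
    static Map<String, Integer> run(List<Rec> sideInput, List<Rec> data) {
        List<Rec> mapped = new ArrayList<>(data);
        mapped.addAll(sideInput);  // appended last: the ordering comes from the sort
        Map<String, Integer> out = new TreeMap<>();
        for (Map.Entry<String, List<Rec>> e : shuffle(mapped).entrySet()) {
            out.put(e.getKey(), reduce(e.getValue()));
        }
        return out;
    }
}
```

The point of the design is that the reducer never looks anything up in shared state. Whatever it needs arrives in its own input, in a known position.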
If you really, really can't restate your program in this fashion, then you
probably don't have a problem that is suitable for map-reduce. You might be
able to make use of something like HBase to give you database-like
operations, but you may just have a different kind of problem. You might be
surprised at what a wide variety of problems are amenable to map-reduce.

What is it that makes you want these global variables?
On 10/12/07 5:09 PM, "James Yu" wrote:
What is the best practice if I DO need to have some global variables
accessible to ALL mappers and ALL reducers which are distributed? Is there
On 10/12/07, Owen O'Malley wrote:
On Oct 11, 2007, at 9:54 PM, James Yu wrote:
I put all user global variables in a class I called MyGlobals.
Since map/reduce is distributed in general, you should be careful of
using global variables. I find it to be better practice to keep all
of the state variables in either the Mapper or Reducer itself to
remind myself that it is _not_ shared between Mappers, Reducers, and
the launching program.
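A small sketch of that practice, with the state held in instance fields rather than in a shared globals class. CountingMapperSketch is a hypothetical stand-in written in plain Java; it does not implement the Hadoop Mapper interface.

```java
// Hypothetical mapper-like class; not the Hadoop Mapper interface.
final class CountingMapperSketch {
    // State lives in the instance, so it is obvious that each mapper
    // has its own copy and nothing is shared across tasks.
    private final String needle;
    private long matches = 0;

    CountingMapperSketch(String needle) {
        this.needle = needle;
    }

    // Called once per input line; counts lines containing the needle.
    void map(String line) {
        if (line.contains(needle)) {
            matches++;
        }
    }

    long matches() {
        return matches;
    }
}
```

Two instances fed different lines end up with different counts, which is the same situation two mappers on different machines are in.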