On Nov 24, 2008, at 9:49 AM, Steve Loughran wrote:
> Scott Whitecross wrote:
>> Thanks Brian. So you have had luck w/ log4j?
> We grab logs off machines by not using log4j and routing to our own
> logging infrastructure that can feed events to other boxes via RMI
> and queues. This stuff slots in behind commons-logging, with a
> custom commons-logging bridge specified on the command line. To get
> this into Hadoop I had to patch hadoop.jar and remove the properties
> file that bound it only to log4j. The central receiver/SPOF logs
> events by sent time and received time and can store all results into
> text files intermixed for post-processing. It's good for testing,
> but on a big production cluster you'd want something more robust and [...]
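For anyone curious how a bridge like Steve's gets "specified on the command line": commons-logging lets you pick the LogFactory implementation with a system property, so a custom one can be slotted in at JVM launch. A minimal sketch, where com.example.logging.RemoteLogFactory is a made-up class name standing in for whatever custom bridge you'd write:

```shell
# Sketch: point commons-logging at a hypothetical custom LogFactory
# (com.example.logging.RemoteLogFactory is an assumed name, not a real class).
# Adding it to HADOOP_OPTS makes the Hadoop scripts pass it to each JVM.
export HADOOP_OPTS="$HADOOP_OPTS -Dorg.apache.commons.logging.LogFactory=com.example.logging.RemoteLogFactory"
echo "$HADOOP_OPTS"
```

As Steve notes, this only takes effect once the log4j binding baked into hadoop.jar's properties file is out of the way.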
Sounds like a cool setup, but it might be a little much for Scott's
purposes (trying to debug a single Map phase...). Scott, I have been
able to add new log4j loggers successfully, but in Hadoop code, not in
an M-R task. If you try things in local mode, you're guaranteed to
have the same JVM, so the configuration should be loaded the same way.
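To make that concrete: with the 0.x-era setting mapred.job.tracker=local in hadoop-site.xml, the task runs inside the client JVM, so a logger added to conf/log4j.properties should take effect for your mapper too. A minimal sketch, where com.example.MyMapper is a hypothetical mapper class:

```properties
# conf/log4j.properties addition (sketch; the logger name is hypothetical).
# Hadoop's stock log4j.properties already defines the appenders, so one
# line is enough to turn up logging for your own class in local mode.
log4j.logger.com.example.MyMapper=DEBUG
```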
Then again, I might be putting words into Scott's mouth: maybe he does
indeed want to scale this way up and turn it into a "logging
infrastructure".
Scott, did you have any luck debugging the job through the wiki
document on debugging MapReduce? I'd make sure to start there before
you take too much of a detour into log4j-land.