Apr 10, 2009 at 7:24 pm
On Apr 10, 2009, at 2:06 PM, Todd Lipcon wrote:
On Fri, Apr 10, 2009 at 12:03 PM, Brian Bockelman <firstname.lastname@example.org> wrote:
0.19.1 with a few convenience patches (mostly, they improve logging
local file system researchers can play around with our data
I'm curious about this. Could you elaborate a bit on what kind of
you're logging? I'm interested in what FS metrics you're looking at
you instrumented the code.
No clue what they're doing *with* the data, but I know what we've
applied to HDFS to get the data. We apply both of these patches:
http://issues.apache.org/jira/browse/HADOOP-5222
https://issues.apache.org/jira/browse/HADOOP-5625
Together, these add the duration and offset to each read. Each read is then
logged through the HDFS audit mechanisms. We've been pulling the logs
through the web interface and putting them back into HDFS, then
processing them (actually, today we've been playing with log
collection via Chukwa).
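
For anyone who wants to try the processing step, here is a minimal sketch of pulling per-read records out of such a log and totaling them per client. The line layout (comma-separated "key: value" fields named op, cliID, bytes, offset, duration) is an assumption made for illustration, as are the function names; check it against what your patched build actually writes before relying on it.

```python
import re
from collections import defaultdict

# Assumed layout: comma-separated "key: value" fields, optionally preceded
# by a logger prefix. A value must be followed by a comma or end of line.
FIELD_RE = re.compile(r"(\w+): (\S+?)(?=,|$)")


def parse_read(line):
    """Return a dict describing one read, or None if the line is not a read."""
    fields = dict(FIELD_RE.findall(line.strip()))
    if fields.get("op") != "HDFS_READ":
        return None
    try:
        return {
            "client": fields["cliID"],
            "bytes": int(fields["bytes"]),
            "offset": int(fields["offset"]),
            # Unit is whatever the patched logger records; we only sum it.
            "duration": int(fields["duration"]),
        }
    except (KeyError, ValueError):
        # Skip entries that lack a field or carry a non-numeric value.
        return None


def summarize(lines):
    """Total read count, bytes, and duration per client ID."""
    totals = defaultdict(lambda: {"reads": 0, "bytes": 0, "duration": 0})
    for line in lines:
        record = parse_read(line)
        if record is None:
            continue
        entry = totals[record["client"]]
        entry["reads"] += 1
        entry["bytes"] += record["bytes"]
        entry["duration"] += record["duration"]
    return dict(totals)
```

Feeding the result of summarize() into an accounting database is then a matter of one row per client per log file.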
There is a student who is looking at our cluster's I/O access
patterns, and there are a few folks who design metadata
caching algorithms and love to see application traces. Personally,
I'm interested in hooking the logfiles up to our I/O accounting system
so I can keep historical records of transfers and compare them to our
other file systems.