Hello Friends,
I am seeing Hadoop log timestamps and file timestamps that do not match the
system time. This problem was discussed on the mailing list earlier, but
no solution was posted:
http://www.mail-archive.com/[email protected]/msg00310.html
I am wondering what is causing this behaviour and how to fix it.
As you can see below, the system time is in UTC but Hadoop is showing EST.
[[email protected]~]$ date
Mon Jan 10 21:54:17 UTC 2011
[[email protected] ~]$ hadoop jar $HADOOP_HOME/*examples.jar teragen 1000 tera-dir
Generating 1000 using 2 maps with step of 500
11/01/10 16:50:33 INFO mapred.JobClient: Running job: job_201101061958_2590
11/01/10 16:50:34 INFO mapred.JobClient: map 0% reduce 0%
11/01/10 16:50:40 INFO mapred.JobClient: map 100% reduce 0%
11/01/10 16:50:40 INFO mapred.JobClient: Job complete: job_201101061958_2590
11/01/10 16:50:40 INFO mapred.JobClient: Counters: 6
11/01/10 16:50:40 INFO mapred.JobClient: Job Counters
11/01/10 16:50:40 INFO mapred.JobClient: Launched map tasks=2
11/01/10 16:50:40 INFO mapred.JobClient: FileSystemCounters
As a possible workaround I tried adding -Duser.timezone in
conf/hadoop-env.sh, but that works only on my local system, not on the cluster.
Why is Hadoop reporting the time in EST? Where does Hadoop read the time from?
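In case it helps with diagnosis, here is a minimal sketch (not Hadoop code) of
a check that could be run on each node. It assumes the log timestamps follow
the JVM's default time zone, which the JVM takes from the user.timezone
property if it is set and otherwise from the OS settings. The class name
TimeZoneCheck and the format helper are made up for illustration, and the date
pattern is only chosen to resemble the log lines above.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper, not part of Hadoop: prints which time zone this JVM uses.
public class TimeZoneCheck {

    // Formats an instant in the given zone, using a pattern that resembles
    // the timestamps in the JobClient output (assumed, not taken from Hadoop).
    static String format(Date when, TimeZone zone) {
        SimpleDateFormat fmt = new SimpleDateFormat("yy/MM/dd HH:mm:ss");
        fmt.setTimeZone(zone);
        return fmt.format(when);
    }

    public static void main(String[] args) {
        // null here means no -Duser.timezone was passed to this JVM
        System.out.println("user.timezone = " + System.getProperty("user.timezone"));
        System.out.println("JVM default   = " + TimeZone.getDefault().getID());

        Date now = new Date();
        System.out.println("default zone  : " + format(now, TimeZone.getDefault()));
        System.out.println("UTC           : " + format(now, TimeZone.getTimeZone("UTC")));
    }
}
```

Running it with and without -Duser.timezone=UTC on a cluster node should show
whether the option actually reaches the JVM there.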
Any help is greatly appreciated.
Thanks in advance,
Ravi