FAQ
I am interested in Hadoop usage and investigating its internal mechanisms. I hope I can contribute to this community and learn more from all of you.

Thanks!
starlee


  • Gang Luo at Mar 31, 2010 at 3:51 am
    Hi all,
    I found a directory "_log/history/..." under the output directory of a MapReduce job. Are the files in that directory log files? Is the information there sufficient to let me figure out which nodes the job ran on? Besides, not every job has such a directory. Are there settings controlling this? Or are there other ways to get the nodes my job runs on?

    Thanks,
    -Gang
  • Abhishek sharma at Mar 31, 2010 at 5:16 pm
    Gang,

    In the log/history directory, two files are created for each job: one
    XML file that records the configuration, and another file that contains
    the log entries. These log entries have all the information about the
    individual map and reduce tasks related to a job--which nodes they ran
    on, duration, input size, etc.

    A single log/history directory is created by Hadoop and files related
    to all the jobs executed are stored there.
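
    For example, to figure out which nodes the tasks ran on, you could scan the per-job history file for the tracker/host fields. Here is a minimal sketch; it assumes the 0.20-style plain-text history format with KEY="VALUE" pairs such as TRACKER_NAME="..." or HOSTNAME="..." on the task attempt lines, so please verify the exact keys against your Hadoop version:

        import java.io.BufferedReader;
        import java.io.FileReader;
        import java.util.TreeSet;
        import java.util.regex.Matcher;
        import java.util.regex.Pattern;

        // Prints the distinct tracker/host names found in a job history file
        // passed as the first command-line argument.
        public class HistoryHosts {
            public static void main(String[] args) throws Exception {
                Pattern host = Pattern.compile("(?:TRACKER_NAME|HOSTNAME)=\"([^\"]+)\"");
                TreeSet<String> hosts = new TreeSet<String>();
                BufferedReader in = new BufferedReader(new FileReader(args[0]));
                String line;
                while ((line = in.readLine()) != null) {
                    Matcher m = host.matcher(line);
                    while (m.find()) {
                        hosts.add(m.group(1));  // collect each tracker/host name once
                    }
                }
                in.close();
                for (String h : hosts) {
                    System.out.println(h);
                }
            }
        }

    You could run this on a history file after copying it out of HDFS with "hadoop fs -get".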

    Abhishek


  • Gang Luo at Mar 31, 2010 at 6:44 pm
    Thanks Abhishek,
    but I observe that some of my job outputs have no such _log directory. Actually, I ran a script which launches 100+ jobs, and I didn't find the log for any of the outputs. Any ideas?

    Thanks,
    -Gang




  • Amareshwari Sri Ramadasu at Apr 1, 2010 at 4:27 am
    Along with the JobTracker maintaining history in ${hadoop.log.dir}/logs/history, in branch 0.20 the job history is also available in a user location. The user location can be specified via the configuration "hadoop.job.history.user.location". By default, if nothing is specified for the configuration, the history will be created in the output directory of the job. The user history can be disabled by specifying the value "none" for the configuration.

    Gang, if you are not seeing the history for some of your jobs, there could be a couple of reasons.

    1. Your job does not have an output directory. In that case you can specify a different location for the user history (see the sketch below).
    2. Job history got disabled because of some problem with the job's configuration. You can check the JobTracker logs and verify whether the history got disabled.
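
    For illustration, the property could be set in the job driver. This is a minimal sketch using the old mapred API; the path below is just an example, not from this thread:

        import org.apache.hadoop.mapred.JobConf;

        public class HistoryLocationExample {
            public static void main(String[] args) {
                JobConf conf = new JobConf();
                // Write the per-job user history to a fixed directory
                // instead of the job's output directory (example path).
                conf.set("hadoop.job.history.user.location", "/user/gang/job-history");
                // Or disable the user-side history entirely:
                // conf.set("hadoop.job.history.user.location", "none");
                // ... set input/output paths, mapper and reducer, then submit the job as usual.
            }
        }

    The same property can also be passed on the command line with -D hadoop.job.history.user.location=... if the job driver uses ToolRunner/GenericOptionsParser.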

    Thanks
    Amareshwari



