Hue uses a plugin on the JobTracker to access the MapReduce logs. The plugin
exposes a Thrift interface:

https://github.com/cloudera/hue/blob/master/desktop/libs/hadoop/java/if/jobtracker.thrift
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Installation-Guide/cdh4ig_topic_15_4.html#topic_15_4_2_unique_1

Romain

On Fri, Aug 2, 2013 at 3:02 AM, wrote:

Hi All,

As far as I know, when we submit an Oozie job we can retrieve the job log
through the Oozie web services API (e.g. GET /oozie/v1/job/job-3?show=log).
However, the logs I get that way are the Oozie job logs. From Hue, I can click
"view logs" to drill down to the syslog, which contains all the MapReduce logs.

Is there any way for me to get the TaskTracker log showing all the MapReduce
logs? Does anyone have an idea? I am planning to query the MapReduce logs in
real time.

Appreciate your help. Thanks.

Regards,
CYea
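
For reference, the Oozie call mentioned in the question can be prepared from C# by building the request URL first. This is only a minimal sketch: OozieLogUrl is a hypothetical helper, and the host and port (localhost:11000) are assumptions to replace with your own Oozie server address. Only the path and the show=log parameter come from the question.

```csharp
using System;

// Hypothetical helper; not part of Oozie or Hue.
public static class OozieLogUrl
{
    // Builds the URL for GET /oozie/v1/job/<jobId>?show=log.
    // baseUrl (scheme, host and port) is an assumption supplied by the caller.
    public static Uri Build(string baseUrl, string jobId)
    {
        var builder = new UriBuilder(baseUrl)
        {
            Path = "/oozie/v1/job/" + Uri.EscapeDataString(jobId),
            Query = "show=log"
        };
        return builder.Uri;
    }

    public static void Main()
    {
        // Prints the URL that a GET request would be sent to.
        Console.WriteLine(Build("http://localhost:11000", "job-3").AbsoluteUri);
    }
}
```

As the question notes, this endpoint returns the Oozie job log, not the MapReduce task logs.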

--

---
You received this message because you are subscribed to the Google Groups
"CDH Users" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to cdh-user+unsubscribe@cloudera.org.
For more options, visit
https://groups.google.com/a/cloudera.org/groups/opt_out.


  • Chiewyea at Aug 4, 2013 at 2:33 pm
    Hi Romain,

    Thanks for your reply. Can I use the same plugin and call the Thrift
    interface myself? I still don't know how to call the Thrift interface
    from a program. Could you shed some light on this if possible? Thanks a
    lot.

    Regards,
    CYea

  • Chiewyea at Aug 5, 2013 at 3:40 am
    Hi Romain,

    I went through the Apache Thrift documentation and learned a little about
    it ("The Apache Thrift software framework, for scalable cross-language
    services development, combines a software stack with a code generation
    engine to build services that work efficiently and seamlessly between C++,
    Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript,
    Node.js, Smalltalk, OCaml and Delphi and other languages"). Does this mean
    that I can generate C# code from the Thrift file and then put the
    generated code in the Hadoop environment? Would I then need to modify the
    property below to point to the source file that I created?

    <property>
       <name>mapred.jobtracker.plugins</name>
       <value>org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin</value>
       <description>Comma-separated list of jobtracker plug-ins to be activated.</description>
    </property>


    Appreciate your help. Thanks.

    Regards,
    CYea

  • Abraham Elmahrek at Aug 5, 2013 at 4:25 am
    Hey There,

    With Thrift, there are servers and clients. Hue comes with the server code
    in the form of a jar. Setting the mapred.jobtracker.plugins property to
    "org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin" makes the JobTracker
    use this jar to start a Thrift service (server).

    The server and client communicate using a language-agnostic protocol.
    Thus, all you need to do is generate client "stubs" and then call these
    stub methods from whatever program you are writing. Take a look at these
    two guides:
    http://diwakergupta.github.io/thrift-missing-guide/
    https://alireza-noori.com/blog/programming/thriftpart-six-implementing-c-client/

    Here is what your C# code might look like (ripped directly from the link
    above):

    using System;
    using Thrift.Transport;
    using Thrift.Protocol;
    using Calculator;

    namespace CalculatorClient
    {
        class Program
        {
            static void Main(string[] args)
            {
                try
                {
                    // Host and port of the Thrift server to connect to.
                    var socket = new TSocket("localhost", 9888);
                    var transport = new TBufferedTransport(socket);
                    var protocol = new TBinaryProtocol(transport);
                    // CalculatorService.Client is the generated client stub.
                    var client = new CalculatorService.Client(protocol);
                    transport.Open();
                    // Remote call made through the stub.
                    Console.WriteLine(client.add(2, 3));
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.Message);
                }
            }
        }
    }

    Notice how a socket, transport, protocol, and Client are instantiated. The
    Client should be generated for you.

    -Abe

  • Chiewyea at Aug 5, 2013 at 9:21 am
    Hi Abe,

    Thanks for your clear explanation. I will try it again using the reference
    provided. Appreciate your help.

    Regards,
    CYea

  • Chiewyea at Aug 6, 2013 at 10:49 am
    Hi All,

    I am able to generate the C# client stubs and connect to the JobTracker
    Thrift server. However, when I get a job or task, I cannot find a function
    that returns the log. Can anyone tell me which function of the Thrift
    interface I should call to query the logs for each task? Please see the
    attached picture (circled in red) for the logs that I want to retrieve.

    Thanks a lot !!

    Regards,
    CYea

  • Abraham Elmahrek at Aug 6, 2013 at 4:26 pm
    Hey There,

    You should be able to use "getTask", which returns an object of type
    "ThriftTaskStatus". You're looking for the member "diagnosticInfo" in that
    object.

    -Abe

  • Chiewyea at Aug 7, 2013 at 7:50 am
    Hi Abe,

    Thanks for your help. However, when my code reads "diagnosticInfo", the
    value is empty, with no logs inside. Is there anything I need to set in
    order to get the logs for a job? Below is the C# code that I wrote:

    var socket = new TSocket("172.16.104.151", 9290);
    var transport = new TBufferedTransport(socket);
    var protocol = new TBinaryProtocol(transport);
    var client = new Jobtracker.Client(protocol);
    transport.Open();

    RequestContext ctx = new RequestContext();
    ctx.ConfOptions = new Dictionary<string, string>();
    ctx.ConfOptions.Add("effective_user", "root");

    ThriftJobID thriftJobID = new ThriftJobID();
    thriftJobID.AsString = "job_201308051202_0057";
    thriftJobID.JobID = 57;
    thriftJobID.JobTrackerID = "201308051202";

    var job = client.getJob(ctx, thriftJobID);

    ThriftTaskInProgressList thriftTaskInProgressList = job.Tasks;
    List<ThriftTaskInProgress> taskList = thriftTaskInProgressList.Tasks;
    ThriftTaskStatus thriftTaskStatus = new ThriftTaskStatus();

    taskList.ForEach(l =>
    {
        Dictionary<string, ThriftTaskStatus> thriftTaskStatusDic = l.TaskStatuses;

        foreach (KeyValuePair<string, ThriftTaskStatus> status in thriftTaskStatusDic)
        {
            thriftTaskStatus = status.Value;
            Console.WriteLine("======>>" + thriftTaskStatus.DiagnosticInfo); // No logs shown here
        }
    });


    Any idea why "diagnosticInfo" is empty? Besides that, I have another
    question. When I call getAllJob, it always returns only the 5 latest jobs.
    Can we configure it to keep more than 5 jobs? Please refer to the attached
    picture: when I try to retrieve the job, I hit an error because only the
    latest 5 jobs are kept.

    Thanks a lot !!

    Regards,
    CYea
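
    The ThriftJobID fields that are set by hand in the code above could be
    derived from the job ID string instead. This is a minimal sketch, assuming
    IDs always look like job_<JobTrackerID>_<number> as in the example;
    JobIdParts is a hypothetical helper, not part of the generated Thrift
    stubs.

    ```csharp
    using System;
    using System.Globalization;

    // Hypothetical helper; not part of the generated Thrift stubs.
    public static class JobIdParts
    {
        // Splits an ID of the assumed form "job_<jobTrackerId>_<number>".
        public static void Parse(string jobId, out string jobTrackerId, out int jobNumber)
        {
            string[] parts = jobId.Split('_');
            if (parts.Length != 3 || parts[0] != "job")
            {
                throw new FormatException("Unexpected job ID format: " + jobId);
            }
            jobTrackerId = parts[1];
            // Leading zeros are dropped by the integer parse.
            jobNumber = int.Parse(parts[2], CultureInfo.InvariantCulture);
        }

        public static void Main()
        {
            string trackerId;
            int number;
            Parse("job_201308051202_0057", out trackerId, out number);
            Console.WriteLine(trackerId + " " + number);
        }
    }
    ```

    The two outputs can then be assigned to thriftJobID.JobTrackerID and
    thriftJobID.JobID, with the original string going into
    thriftJobID.AsString.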

    On Wednesday, August 7, 2013 12:26:05 AM UTC+8, abe wrote:

    Hey There,

    You should be able to use "getTask", which returns an object of type
    "ThriftTaskStatus". You're looking for the member "diagnosticInfo" in that
    object.

    -Abe

    On Tue, Aug 6, 2013 at 3:49 AM, <chie...@gmail.com <javascript:>> wrote:

    Hi All,

    I am able to generate c# client stubs and connect to job tracker thrift
    server. However, when i get the job or task, I am not able to find a
    function that i can get the log. Can anyone let me know which function from
    thrift interface to query the logs for each task? Please find attached
    picture (circles in red) for the logs that i want to retrieve.

    Thanks a lot !!

    Regards,
    CYea
    On Monday, August 5, 2013 5:20:57 PM UTC+8, chie...@gmail.com wrote:

    Hi Abe,

    Thanks for your clear explanation. I will try it again using the
    reference provided. Appreciate your help.

    Regards,
    CYea

    On Monday, August 5, 2013 12:25:11 PM UTC+8, abe wrote:

    Hey There,

    With thrift, there are servers and clients. Hue comes with the server
    code in the form of a jar. The "org.apache.ahdoop.thriftfs.**ThriftJobTrackerPlugin"
    property can use this jar to start a Thrift service (server).

    The server and client communicate using a language agnostic protocol.
    Thus, all you need to do is generate client "stubs". Then, in what ever
    program you are writing, call these stub methods. Take a look at
    http://diwakergupta.github.**io/thrift-missing-guide/<http://diwakergupta.github.io/thrift-missing-guide/>and
    https://alireza-noori.com/**blog/programming/thriftpart-**
    six-implementing-c-client/<https://alireza-noori.com/blog/programming/thriftpart-six-implementing-c-client/>
    .

    Here is what your C# code might look like (ripped directly from link
    above):
    using System;
    using Thrift.Transport;
    using Thrift.Protocol;
    using Calculator;

    namespace CalculatorClient
    {
    class Program
    {
    static void Main(string[] args)
    {
    try
    {
    var socket = new TSocket("localhost", 9888);
    var transport = new TBufferedTransport(socket);
    var protocol = new TBinaryProtocol(transport);
    var client = new CalculatorService.Client(**protocol);
    transport.Open();
    Console.WriteLine(client.add(**2, 3));
    }
    catch (Exception ex)
    {
    Console.WriteLine(ex.Message);
    }
    }
    }
    }

    Notice how a socket, transport, protocol, and Client are instantiated.
    The Client should be generated for you.

    -Abe

    On Sun, Aug 4, 2013 at 8:40 PM, wrote:

    Hi Romain,

    I go through apache thrift and get to know a little about thrift (The
    Apache Thrift software framework, for scalable cross-language services
    development, combines a software stack with a code generation engine to
    build services that work efficiently and seamlessly between C++, Java,
    Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js,
    Smalltalk, OCaml and Delphi and other languages). So is this means that I
    am able to generate C# code from the thrift file and then put the generated
    code in hadoop environment. Then i will need to modify the below to point
    to the source file that i created?

    <property>
    <name>mapred.jobtracker.**plugins</name>
    <value>*org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin*</value>
    <description>Comma-separated list of jobtracker plug-ins to be activated.</description>
    </property>


    Appreciate your help. Thanks.

    Regards,
    CYea

  • Romain Rigaux at Aug 8, 2013 at 4:26 pm
    Diagnostic info is most of the time empty. Try 'stdout', 'stderr',
    'syslog'.

    About seeing only the 5 latest jobs, do you see all of them in the Hadoop JT
    UI?

    Romain

    On Wed, Aug 7, 2013 at 12:50 AM, wrote:

    Hi Abe,

    Thanks for your help. However, when my code tries to read "diagnosticInfo",
    the value is empty, with no logs inside. Is there anything I need to set in
    order to get the log from the job? Below is the C# code that I developed:

    var socket = new TSocket("172.16.104.151", 9290);
    var transport = new TBufferedTransport(socket);
    var protocol = new TBinaryProtocol(transport);
    var client = new Jobtracker.Client(protocol);
    transport.Open();

    RequestContext ctx = new RequestContext();
    ctx.ConfOptions = new Dictionary<string, string>();
    ctx.ConfOptions.Add("effective_user", "root");

    ThriftJobID thriftJobID = new ThriftJobID();
    thriftJobID.AsString = "job_201308051202_0057";
    thriftJobID.JobID = 57;
    thriftJobID.JobTrackerID = "201308051202";

    var job = client.getJob(ctx, thriftJobID);

    ThriftTaskInProgressList thriftTaskInProgressList = job.Tasks;
    List<ThriftTaskInProgress> taskList = thriftTaskInProgressList.Tasks;
    ThriftTaskStatus thriftTaskStatus = new ThriftTaskStatus();

    taskList.ForEach(l =>
    {
        Dictionary<string, ThriftTaskStatus> thriftTaskStatusDic = l.TaskStatuses;

        foreach (KeyValuePair<string, ThriftTaskStatus> status in thriftTaskStatusDic)
        {
            thriftTaskStatus = status.Value;
            Console.WriteLine("======>>" + thriftTaskStatus.DiagnosticInfo); // No logs shown here
        }
    });
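
The three ThriftJobID fields set in the snippet above (AsString, JobID, JobTrackerID) all come from the same job ID string. A minimal Python sketch of that split, assuming the job_<jobtracker id>_<sequence number> layout seen in the example; parse_job_id is a hypothetical helper, not part of Hue or of the generated Thrift stubs:

```python
def parse_job_id(job_id):
    """Split an MR1 job ID string into the pieces ThriftJobID expects.

    Assumes the layout job_<jobtracker id>_<sequence number>.
    """
    parts = job_id.split("_")
    if len(parts) != 3 or parts[0] != "job":
        raise ValueError("unexpected job ID format: %r" % job_id)
    _, tracker_id, sequence = parts
    return {
        "as_string": job_id,           # maps to ThriftJobID.AsString
        "job_tracker_id": tracker_id,  # maps to ThriftJobID.JobTrackerID
        "job_id": int(sequence),       # maps to ThriftJobID.JobID
    }


fields = parse_job_id("job_201308051202_0057")
print(fields["job_tracker_id"], fields["job_id"])
```

Deriving all three values from one string avoids the fields drifting apart when the job ID is changed by hand.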


    Any idea why "diagnosticInfo" is empty? Besides that, I have another
    question. When I call getAllJob, it always returns only the 5 latest jobs. Can we
    set it to keep more than 5 jobs? Please refer to the attached picture: when I
    try to retrieve the job, it hits an error because only the
    latest 5 jobs are kept.

    Thanks a lot !!

    Regards,
    CYea

    On Wednesday, August 7, 2013 12:26:05 AM UTC+8, abe wrote:

    Hey There,

    You should be able to use "getTask", which returns an object of type
    "ThriftTaskStatus". You're looking for the member "diagnosticInfo" in that
    object.

    -Abe

    On Tue, Aug 6, 2013 at 3:49 AM, wrote:

    Hi All,

    I am able to generate C# client stubs and connect to the job tracker Thrift
    server. However, when I get the job or task, I cannot find a
    function that returns the log. Can anyone tell me which function of the
    Thrift interface queries the logs for each task? Please see the attached
    picture (circled in red) for the logs that I want to retrieve.

    Thanks a lot !!

    Regards,
    CYea
    On Monday, August 5, 2013 5:20:57 PM UTC+8, chie...@gmail.com wrote:

    Hi Abe,

    Thanks for your clear explanation. I will try it again using the
    reference provided. Appreciate your help.

    Regards,
    CYea

    On Monday, August 5, 2013 12:25:11 PM UTC+8, abe wrote:

    Hey There,

    With Thrift, there are servers and clients. Hue comes with the server
    code in the form of a jar. The "org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin"
    property can use this jar to start a Thrift service (server).

    The server and client communicate using a language-agnostic protocol.
    Thus, all you need to do is generate client "stubs". Then, in whatever
    program you are writing, call these stub methods. Take a look at
    http://diwakergupta.github.io/thrift-missing-guide/ and
    https://alireza-noori.com/blog/programming/thriftpart-six-implementing-c-client/.

    Here is what your C# code might look like (ripped directly from link
    above):
    using System;
    using Thrift.Transport;
    using Thrift.Protocol;
    using Calculator;

    namespace CalculatorClient
    {
        class Program
        {
            static void Main(string[] args)
            {
                try
                {
                    var socket = new TSocket("localhost", 9888);
                    var transport = new TBufferedTransport(socket);
                    var protocol = new TBinaryProtocol(transport);
                    var client = new CalculatorService.Client(protocol);
                    transport.Open();
                    Console.WriteLine(client.add(2, 3));
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.Message);
                }
            }
        }
    }

    Notice how a socket, transport, protocol, and Client are instantiated.
    The Client should be generated for you.

    -Abe

  • Chiewyea at Aug 9, 2013 at 2:55 pm
    Hi Romain,

    By the way, how can I get the "stdout", "stderr" and "syslog" from the job
    tracker Thrift interface? I couldn't find these fields in the ThriftTaskStatus
    object. Could you give me some ideas? Thanks for the help.
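
One possible route, offered as an assumption rather than something confirmed in this thread: on MR1 the TaskTracker's web UI serves each task attempt's logs over HTTP, so a client could fetch them from there instead of looking for a field in ThriftTaskStatus. A minimal Python sketch that only builds such a URL; the /tasklog path, the attemptid parameter, the default port 50060, the host name and the attempt ID below are all assumptions to verify against your own TaskTracker UI:

```python
from urllib.parse import urlencode, urlunparse


def tasklog_url(tasktracker_host, attempt_id, port=50060):
    """Build the (assumed) TaskTracker log page URL for one task attempt."""
    query = urlencode({"attemptid": attempt_id})
    netloc = "%s:%d" % (tasktracker_host, port)
    # urlunparse takes (scheme, netloc, path, params, query, fragment).
    return urlunparse(("http", netloc, "/tasklog", "", query, ""))


# Hypothetical host and attempt ID, for illustration only.
url = tasklog_url("tasktracker-host", "attempt_201308051202_0057_m_000000_0")
print(url)
```

Opening the resulting URL in a browser first is a quick way to confirm that the page exists before fetching it from code.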

    I do see all of the jobs in the job tracker, but when I tried to view a log,
    it said older logs may have been cleaned up by the task tracker. Could we set
    the job tracker logs to be cleaned up after one day?

    Thanks a lot.


    Regards,
    CYea

  • Romain Rigaux at Aug 22, 2013 at 11:19 pm
    Yes, when using CM, you should update the config in CM or with the safety
    valve as the actual files are regenerated when restarting the services:

    http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Free/4.5.4/Cloudera-Manager-Free-Edition-User-Guide/cmfeug_topic_5_3.html#topic_5_3_1_unique_1__title_85_unique_5

    Could it be related to this?
       <property>
         <name>mapred.userlog.retain.hours</name>
         <value>24</value>
       </property>

    Romain


    On Thu, Aug 15, 2013 at 9:12 PM, wrote:

    Hi Romain,

    Initially I modified mapred-site.xml under the
    /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/conf directory, but
    the changes did not take effect. After that I modified the mapreduce1
    configuration in the Cloudera Manager interface (see the attached picture),
    and only then did the changes take effect.

    So now I am able to get the status of jobs based on the count I set in
    mapred.jobtracker.completeuserjobs.maximum. Currently the
    mapred-site.xml and core-site.xml in my environment are as follows.

    mapred-site.xml
    <?xml version="1.0" encoding="UTF-8"?>

    <!--Autogenerated by Cloudera CM on 2013-06-28T04:18:07.430Z-->
    <configuration>
    <property>
    <name>mapred.job.tracker</name>
    <value>MYRNDSVRVM311:8021</value>
    </property>
    <property>
    <name>mapred.job.tracker.http.address</name>
    <value>0.0.0.0:50030</value>
    </property>
    <property>
    <name>mapreduce.job.counters.max</name>
    <value>120</value>
    </property>
    <property>
    <name>mapred.output.compress</name>
    <value>false</value>
    </property>
    <property>
    <name>mapred.output.compression.type</name>
    <value>BLOCK</value>
    </property>
    <property>
    <name>mapred.output.compression.codec</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec</value>
    </property>
    <property>
    <name>mapred.map.output.compression.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
    </property>
    <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
    </property>
    <property>
    <name>zlib.compress.level</name>
    <value>DEFAULT_COMPRESSION</value>
    </property>
    <property>
    <name>io.sort.factor</name>
    <value>64</value>
    </property>
    <property>
    <name>io.sort.record.percent</name>
    <value>0.05</value>
    </property>
    <property>
    <name>io.sort.spill.percent</name>
    <value>0.8</value>
    </property>
    <property>
    <name>mapred.reduce.parallel.copies</name>
    <value>10</value>
    </property>
    <property>
    <name>mapred.submit.replication</name>
    <value>2</value>
    </property>
    <property>
    <name>mapred.reduce.tasks</name>
    <value>2</value>
    </property>
    <property>
    <name>mapred.userlog.retain.hours</name>
    <value>24</value>
    </property>
    <property>
    <name>io.sort.mb</name>
    <value>69</value>
    </property>
    <property>
    <name>mapred.child.java.opts</name>
    <value> -Xmx292094945</value>
    </property>
    <property>
    <name>mapred.job.reuse.jvm.num.tasks</name>
    <value>1</value>
    </property>
    <property>
    <name>mapred.map.tasks.speculative.execution</name>
    <value>false</value>
    </property>
    <property>
    <name>mapred.reduce.tasks.speculative.execution</name>
    <value>false</value>
    </property>
    <property>
    <name>mapred.reduce.slowstart.completed.maps</name>
    <value>0.8</value>
    </property>
    <property>
    <name>jobtracker.thrift.address</name>
    <value>0.0.0.0:9290</value>
    </property>
    <property>
    <name>mapred.jobtracker.plugins</name>
    <value>org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin</value>
    <description>Comma-separated list of jobtracker plug-ins to be
    activated.</description>
    </property>
    </configuration>

    core-site.xml
    <?xml version="1.0" encoding="UTF-8"?>

    <!--Autogenerated by Cloudera CM on 2013-06-28T04:18:07.431Z-->
    <configuration>
    <property>
    <name>fs.defaultFS</name>
    <value>hdfs://MYRNDSVRVM311:8020</value>
    </property>
    <property>
    <name>fs.trash.interval</name>
    <value>1</value>
    </property>
    <property>
    <name>io.file.buffer.size</name>
    <value>65536</value>
    </property>
    <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.DeflateCodec,org.apache.hadoop.io.compress.SnappyCodec,org.apache.hadoop.io.compress.Lz4Codec</value>
    </property>
    <property>
    <name>hadoop.security.authentication</name>
    <value>simple</value>
    </property>
    <property>
    <name>hadoop.rpc.protection</name>
    <value>authentication</value>
    </property>
    <property>
    <name>hadoop.security.auth_to_local</name>
    <value>DEFAULT</value>
    </property>
    </configuration>
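
To check which values a particular *-site.xml file actually contains (for instance, whether an edit landed in the file under the conf directory or only in the copy Cloudera Manager generates), the <property> entries can be read into a dictionary. A minimal Python sketch, assuming the <configuration>/<property>/<name>/<value> layout of the listings above; load_hadoop_conf is a hypothetical helper and SAMPLE is abbreviated from the mapred-site.xml listing:

```python
import xml.etree.ElementTree as ET

# Abbreviated from the mapred-site.xml listing in this message.
SAMPLE = """<configuration>
  <property>
    <name>mapred.userlog.retain.hours</name>
    <value>24</value>
  </property>
  <property>
    <name>jobtracker.thrift.address</name>
    <value>0.0.0.0:9290</value>
  </property>
</configuration>
"""


def load_hadoop_conf(xml_text):
    """Return a {name: value} dict for every <property> in a Hadoop site file."""
    root = ET.fromstring(xml_text)
    conf = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            # A missing <value> element is stored as an empty string.
            conf[name.strip()] = (prop.findtext("value") or "").strip()
    return conf


conf = load_hadoop_conf(SAMPLE)
print(conf.get("mapred.userlog.retain.hours"))
print(conf.get("mapred.jobtracker.completeuserjobs.maximum", "<not set in this file>"))
```

Running the same helper against the file in the conf directory and against the configuration the running service reports makes it easier to see which copy a change ended up in.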


    Is Hue not getting its configuration from the
    /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/conf directory?
    Anyway, thanks a lot for the help.

    Regards,
    CYea
    To unsubscribe from this group and stop receiving emails from it, send an email to hue-user+unsubscribe@cloudera.org.

Discussion Overview
group: hue-user
categories: hadoop
posted: Aug 2, '13 at 4:21p
active: Aug 22, '13 at 11:19p
posts: 11
users: 3
website: cloudera.com
irc: #hadoop
