Reducer not firing
Dear All,

I am porting code from the old API to the new API (Context objects)
and running it on Hadoop 0.20.203.

Job job_first = new Job();

job_first.setJarByClass(My.class);
job_first.setNumReduceTasks(no_of_reduce_tasks);
job_first.setJobName("My_Job");

FileInputFormat.addInputPath(job_first, new Path(Input_Path));
FileOutputFormat.setOutputPath(job_first, new Path(Output_Path));

job_first.setMapperClass(Map_First.class);
job_first.setReducerClass(Reduce_First.class);

job_first.setMapOutputKeyClass(IntWritable.class);
job_first.setMapOutputValueClass(Text.class);

job_first.setOutputKeyClass(NullWritable.class);
job_first.setOutputValueClass(Text.class);

job_first.waitForCompletion(true);

The problem I am facing is that instead of emitting values to the
reducers, the mappers are writing their output directly to the
Output_Path, and the reducers are not processing anything.

As described in the online materials, both my map and reduce methods
use the context.write method to emit values.
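
For reference, the map side is shaped roughly like this (a sketch: the
real per-record logic is omitted, and the LongWritable/Text input types
assume the default TextInputFormat):

public static class Map_First extends Mapper<LongWritable, Text, IntWritable, Text>
{
    @Override
    public void map (LongWritable key, Text value, Context context)
        throws IOException, InterruptedException
    {
        // ... derive an IntWritable key for this record (placeholder) ...
        context.write(new IntWritable(0), value);
    }
}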

Please help. Thanks a lot in advance!!

Warm regards
Arko


  • Devaraj k at Apr 17, 2012 at 5:49 am
    Hi Arko,

    What is the value of 'no_of_reduce_tasks'?

    If the number of reduce tasks is 0, then the map tasks will write their output directly into the job output path.
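
    For instance, a map-only job can be configured like this (a minimal
    sketch reusing the names from your driver code) and will produce
    part-m-NNNNN files rather than part-r-NNNNN files:

    Job job_first = new Job();
    job_first.setJarByClass(My.class);

    // Zero reduce tasks: the shuffle/sort phase is skipped, any reducer
    // class is ignored, and each map task writes its output straight to
    // the job output path.
    job_first.setNumReduceTasks(0);

    FileInputFormat.addInputPath(job_first, new Path(Input_Path));
    FileOutputFormat.setOutputPath(job_first, new Path(Output_Path));

    job_first.waitForCompletion(true);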

    Thanks
    Devaraj

  • Arko Provo Mukherjee at Apr 17, 2012 at 8:38 am
    Hello,

    Many thanks for the reply.

    The 'no_of_reduce_tasks' is set to 2. I have a print statement just
    before the code I pasted earlier to verify that.

    Also, I can find two output files, part-r-00000 and part-r-00001, but
    they contain the values output by the Mapper logic.

    Please let me know what I can check further.

    Thanks a lot in advance!

    Warm regards
    Arko
  • Devaraj k at Apr 17, 2012 at 9:47 am
    Can you check the task attempt logs in your cluster to find out what is happening in the reduce phase? By default, task attempt logs are present in $HADOOP_LOG_DIR/userlogs/<job-id>/. There could be a bug in your reducer that is leading to this output.

    Thanks
    Devaraj

  • Kasi subrahmanyam at Apr 17, 2012 at 1:41 pm
    Could you comment out the line where you set the number of reducer
    tasks and observe the behaviour of the program?
    If you have already tried that, could you share the output?
  • Bejoy KS at Apr 17, 2012 at 2:03 pm
    Hi Arko
    From the naming of the output files, your job does have a reduce phase, but the reducer being used is the IdentityReducer instead of your custom reducer. That is why you are seeing the map output unchanged in the output files. You need to examine your code and logs to see why the IdentityReducer is being triggered.
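
    For reference, the default reduce in the new API's Reducer base class
    is essentially an identity pass-through, roughly (paraphrased from
    org.apache.hadoop.mapreduce.Reducer):

    protected void reduce(KEYIN key, Iterable<VALUEIN> values, Context context)
        throws IOException, InterruptedException
    {
        // Copies every (key, value) pair through unchanged, which is
        // exactly what you see when a custom reduce method fails to
        // override this one.
        for (VALUEIN value : values) {
            context.write((KEYOUT) key, (VALUEOUT) value);
        }
    }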

    Regards
    Bejoy KS

    Sent from handheld, please excuse typos.

  • Steven Willis at Apr 17, 2012 at 8:20 pm
    Try putting @Override before your reduce method to make sure you're overriding the method properly. You'll get a compile-time error if not.
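
    A minimal illustration of the mechanism, in plain Java (nothing
    Hadoop-specific about it; the class names are made up):

    class Base
    {
        protected void process(Iterable<String> items) { }
    }

    class Sub extends Base
    {
        @Override // compiles: the parameter type matches Base.process exactly
        protected void process(Iterable<String> items) { }

        // Annotating this one with @Override would be a compile-time error,
        // because Iterator is not Iterable: it overloads rather than
        // overrides, and without the annotation the compiler stays silent.
        protected void process(java.util.Iterator<String> items) { }
    }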



    -Steven Willis


  • Arko Provo Mukherjee at Apr 17, 2012 at 11:17 pm
    Hello,

    Thanks everyone for helping me. Here are my observations:

    Devaraj - I didn't find any bug in the log files. In fact, none of the
    print statements in my reducer are even appearing in the logs. I can
    share the syslogs if you want. I didn't paste them here so that the
    email doesn't get cluttered.

    Kasi - Thanks for the suggestion. I tried but got the same output.
    The system just created 1 reducer as my test data set is small.

    Bejoy - Can you please advise how I can pinpoint whether the
    IdentityReducer is being used or not.

    Steven - I tried compiling with your suggestion. However, if I put
    @Override on top of my reduce method, I get the following error:
    "method does not override or implement a method from a supertype"
    The code compiles without it. I do have an @Override on top of my map
    method though.
    public class Reduce_First extends Reducer<IntWritable, Text, NullWritable, Text>
    {
        public void reduce (IntWritable key, Iterator<Text> values, Context context)
            throws IOException, InterruptedException
        {
            while ( values.hasNext() )
            {
                // Process
            }
            // Finally emit
        }
    }

    Thanks a lot again!
    Warm regards
    Arko

  • George Datskos at Apr 18, 2012 at 12:01 am
    Arko,

    Change Iterator to Iterable
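
    That is, roughly (a sketch; the processing and emit logic are
    placeholders for your own):

    public class Reduce_First extends Reducer<IntWritable, Text, NullWritable, Text>
    {
        @Override // with Iterable this is now a real override, so it compiles
        public void reduce (IntWritable key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException
        {
            for (Text value : values)
            {
                // Process each value, then finally emit, e.g.:
                context.write(NullWritable.get(), value);
            }
        }
    }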


    George

  • Arko Provo Mukherjee at Apr 18, 2012 at 12:22 am
    Hello George,

    It worked. Thanks so much!! Bad typo while porting :(

    Thanks again to everyone who helped!!

    Warm regards
    Arko

