Null Pointer Exceptions in the mappers leading to lot of retries
----------------------------------------------------------------

Key: PIG-445
URL: https://issues.apache.org/jira/browse/PIG-445
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: types_branch
Reporter: Shravan Matthur Narayanamurthy
Assignee: Shravan Matthur Narayanamurthy


Even in jobs that complete successfully, usually those with a large data set, we see NPEs in the mappers that lead to task failures. However, the problem goes away on retries. It occurs at places where we access the reporter to report progress.
From the analysis, this should happen with jobs that use a combiner. The combiner is called whenever the mapper outputs a buffer full of data, so it is called multiple times during a single map task. In the Combiner.close() method we currently set the reporter to null, on the assumption that the combiner is called only after the entire map output has been produced.
The fix is to not set the reporter to null in the Combiner.close() method.
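
The failure mode described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not Pig's or Hadoop's actual code: the names ReporterLifecycleSketch, ProgressReporter, sharedReporter, Combiner and runMapTask are all hypothetical stand-ins, and the real patch (mq.patch) may differ in detail. It assumes the mapper and the combiner share one reporter reference within a task, and that the combiner is created, run and closed once per spilled buffer.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical simulation of the reporter lifecycle problem.
 * None of these types are Pig or Hadoop classes.
 */
final class ReporterLifecycleSketch {

    /** Minimal stand-in for a progress reporter. */
    interface ProgressReporter {
        void progress();
    }

    /** Reporter reference shared by the map and combine code paths of one task. */
    static ProgressReporter sharedReporter;

    /** Simulates one combiner invocation on a spilled buffer. */
    static final class Combiner {
        private final boolean clearReporterOnClose;

        Combiner(boolean clearReporterOnClose) {
            this.clearReporterOnClose = clearReporterOnClose;
        }

        void combine() {
            // The combiner reports progress while it works.
            sharedReporter.progress();
        }

        void close() {
            if (clearReporterOnClose) {
                // Behavior described as the bug: only safe if the
                // combiner runs once, after all map output exists.
                sharedReporter = null;
            }
            // The fix: leave the shared reporter untouched.
        }
    }

    /**
     * Runs a simulated map task that fills its buffer {@code spills} times,
     * invoking a combiner after each fill. Returns the number of progress
     * calls made; throws NullPointerException if the reporter was cleared
     * while the task was still running.
     */
    static int runMapTask(int spills, boolean clearReporterOnClose) {
        AtomicInteger calls = new AtomicInteger();
        sharedReporter = calls::incrementAndGet;
        for (int i = 0; i < spills; i++) {
            // The mapper reports progress while producing output.
            sharedReporter.progress();
            Combiner combiner = new Combiner(clearReporterOnClose);
            combiner.combine();
            combiner.close();
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("fixed: " + runMapTask(3, false) + " progress calls");
        try {
            runMapTask(3, true);
        } catch (NullPointerException e) {
            System.out.println("buggy: NullPointerException after the first combiner close");
        }
    }
}
```

In this sketch a task with a single spill completes even with the buggy close(), which matches the original assumption that the combiner runs only once. The NPE appears only when the buffer fills more than once, which is more likely with large inputs.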

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Shravan Matthur Narayanamurthy (JIRA) at Sep 22, 2008 at 8:26 pm
    [ https://issues.apache.org/jira/browse/PIG-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Shravan Matthur Narayanamurthy updated PIG-445:
    -----------------------------------------------

    Attachment: mq.patch
  • Olga Natkovich (JIRA) at Sep 22, 2008 at 9:33 pm
    [ https://issues.apache.org/jira/browse/PIG-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Olga Natkovich resolved PIG-445.
    --------------------------------

    Resolution: Fixed

    Patch committed. Thanks, Shravan.

Discussion Overview
Group: dev @
Categories: pig, hadoop
Posted: Sep 22, '08 at 7:57 pm
Active: Sep 22, '08 at 9:33 pm
Posts: 3
Users: 1
Website: pig.apache.org
