KeyFieldBasedPartitioner would lose data if the specified field does not exist
------------------------------------------------------------------------------

Key: HADOOP-6052
URL: https://issues.apache.org/jira/browse/HADOOP-6052
Project: Hadoop Core
Issue Type: Bug
Components: mapred
Affects Versions: 0.20.0
Reporter: Amar Kamat
Assignee: Amar Kamat
Fix For: 0.21.0


When using KeyFieldBasedPartitioner, if the record doesn't contain the specified field, endChar equals array.length, which throws an ArrayIndexOutOfBoundsException and loses that record!
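
The failure mode described above can be illustrated with a minimal sketch. It is not Hadoop code: the class `MissingFieldSketch` and the method `hashInclusive` are hypothetical, and the sketch assumes the hashing step treats both offsets as inclusive, so an end offset equal to the array length reads one byte past the end.

```java
/** Hypothetical stand-in for the partitioner's hashing step; not Hadoop code. */
final class MissingFieldSketch {

    /** Hashes bytes[start..end] with BOTH bounds inclusive (an assumption). */
    static int hashInclusive(byte[] bytes, int start, int end, int seed) {
        int hash = seed;
        for (int i = start; i <= end; i++) {
            hash = 31 * hash + bytes[i];
        }
        return hash;
    }

    public static void main(String[] args) {
        byte[] key = {'a', '\t', 'b'}; // a record with only two fields

        // A valid inclusive range covering the first field "a".
        System.out.println(hashInclusive(key, 0, 0, 0));

        try {
            // An end offset equal to key.length, as reported for a missing field.
            hashInclusive(key, 2, key.length, 0);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("out of bounds: this record would be lost");
        }
    }
}
```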

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

  • Amar Kamat (JIRA) at Jun 16, 2009 at 9:51 am
    [ https://issues.apache.org/jira/browse/HADOOP-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amar Kamat updated HADOOP-6052:
    -------------------------------

    Attachment: HADOOP-6052-v1.0.patch

    Attaching a fix. Incorporated Jothi's comments from HADOOP-5779. Result of test-patch
    [exec] +1 overall.
    [exec]
    [exec] +1 @author. The patch does not contain any @author tags.
    [exec]
    [exec] +1 tests included. The patch appears to include 3 new or modified tests.
    [exec]
    [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
    [exec]
    [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    [exec]
    [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
    [exec]
    [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
    [exec]
    [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

    Running ant test now.
  • Amar Kamat (JIRA) at Jun 17, 2009 at 8:00 am
    [ https://issues.apache.org/jira/browse/HADOOP-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720541#action_12720541 ]

    Amar Kamat commented on HADOOP-6052:
    ------------------------------------

    The following tests failed:
    ||Name||Type||Result||Resolution||
    |org.apache.hadoop.mapred.TestReduceFetch|FAILED|Rerun also failed|HADOOP-6029|
    |org.apache.hadoop.mapred.TestRunningTaskLimits|FAILED|Rerun passed|?|
    |org.apache.hadoop.mapred.TestTaskLimits|FAILED (timeout)|Rerun also failed|HADOOP-5993/HADOOP-6061|

    Looking at TestRunningTaskLimits, I see the following code:
    {code}

    JobConf jobConf = createWaitJobConf(mr, "job1", 20, 20);
    jobConf.setRunningMapLimit(5);
    jobConf.setRunningReduceLimit(3);

    // Submit the job
    RunningJob rJob = (new JobClient(jobConf)).submitJob(jobConf);

    // Wait 20 seconds for it to start up
    UtilsForTests.waitFor(20000);

    // Check the number of running tasks
    JobTracker jobTracker = mr.getJobTrackerRunner().getJobTracker();
    JobInProgress jip = jobTracker.getJob(rJob.getID());
    assertEquals(5, jip.runningMaps());
    assertEquals(3, jip.runningReduces());
    {code}
    I don't think waiting for 20 seconds is a good thing to do. In the logs I see that only one reducer was scheduled.
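
One common alternative to a fixed sleep is to poll for the expected state until a deadline passes. The sketch below is only an illustration of that idea in modern Java, not the change made for any Hadoop issue; `PollingWaitSketch` and `waitUntil` are hypothetical names, and the condition in `main` is a placeholder for a check such as comparing the number of running tasks.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper; not part of Hadoop's test utilities. */
final class PollingWaitSketch {

    /**
     * Re-checks the condition every pollMs milliseconds until it holds or
     * timeoutMs milliseconds have elapsed. Returns true if the condition held.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without seeing the expected state
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Placeholder condition: becomes true about 50 ms after start.
        boolean reached = waitUntil(
                () -> System.currentTimeMillis() - start >= 50, 2000, 10);
        System.out.println("condition reached: " + reached);
    }
}
```

With a helper like this, a test returns as soon as the expected state is observed and fails with a clear timeout otherwise, instead of asserting after an arbitrary delay.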

    Contrib tests passed except:
    ||Name||Type||Result||Resolution||
    |org.apache.hadoop.streaming.TestStreamingExitStatus|FAILED|Known issue|HADOOP-5906|
    |org.apache.hadoop.streaming.TestStreamingStderr|FAILED (timeout)|Known issue|HADOOP-6062|
    |org.apache.hadoop.mapred.TestCapacitySchedulerConf|FAILED|Second run passed after deleting capacity-scheduler.xml from conf|?|
  • Amar Kamat (JIRA) at Jun 17, 2009 at 9:06 am
    [ https://issues.apache.org/jira/browse/HADOOP-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720573#action_12720573 ]

    Amar Kamat commented on HADOOP-6052:
    ------------------------------------

    Opened HADOOP-6065 to address the failure of TestRunningTaskLimits.
  • Amar Kamat (JIRA) at Jun 17, 2009 at 1:28 pm
    [ https://issues.apache.org/jira/browse/HADOOP-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amar Kamat updated HADOOP-6052:
    -------------------------------

    Attachment: HADOOP-6052-v1.0-branch0.20.patch

    Attaching a patch for branch 0.20.
  • Devaraj Das (JIRA) at Jun 17, 2009 at 1:38 pm
    [ https://issues.apache.org/jira/browse/HADOOP-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720668#action_12720668 ]

    Devaraj Das commented on HADOOP-6052:
    -------------------------------------

    Sorry for commenting so late on this one. The check for (startChar < 0) should happen before endChar is evaluated, no? If startChar < 0, the endChar evaluation is redundant.
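
The ordering suggested here can be sketched as follows. This is a simplified illustration, not the attached patch: `GuardOrderSketch`, `startOffset`, `endOffset`, and `hashFields` are hypothetical names, fields are assumed to be 0-based and separated by a single byte, and the end offset is assumed to be inclusive.

```java
/** Hypothetical illustration of checking the start offset first; not Hadoop code. */
final class GuardOrderSketch {

    /** Offset of the first byte of the given 0-based field, or -1 if it is absent. */
    static int startOffset(byte[] key, int field, byte sep) {
        int start = 0;
        for (int f = 0; f < field; f++) {
            int i = start;
            while (i < key.length && key[i] != sep) {
                i++;
            }
            if (i == key.length) {
                return -1; // ran out of separators: the field does not exist
            }
            start = i + 1;
        }
        return start;
    }

    /** Inclusive offset of the last byte of the field that begins at start. */
    static int endOffset(byte[] key, int start, byte sep) {
        int i = start;
        while (i < key.length && key[i] != sep) {
            i++;
        }
        return i - 1;
    }

    /** Hashes the requested fields, skipping any field the key does not contain. */
    static int hashFields(byte[] key, int[] fields, byte sep) {
        int hash = 0;
        for (int field : fields) {
            int start = startOffset(key, field, sep);
            if (start < 0) {
                continue; // guard first: the end offset is never computed
            }
            int end = endOffset(key, start, sep);
            for (int i = start; i <= end; i++) {
                hash = 31 * hash + key[i];
            }
        }
        return hash;
    }

    public static void main(String[] args) {
        byte[] key = {'a', '\t', 'b'};
        System.out.println(hashFields(key, new int[] {0, 1}, (byte) '\t'));
        // Field 5 is missing, so it is skipped instead of throwing.
        System.out.println(hashFields(key, new int[] {0, 5}, (byte) '\t'));
    }
}
```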
  • Amar Kamat (JIRA) at Jun 18, 2009 at 4:43 am
    [ https://issues.apache.org/jira/browse/HADOOP-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721055#action_12721055 ]

    Amar Kamat commented on HADOOP-6052:
    ------------------------------------

    Opened HADOOP-6075 to address the TestTaskTrackerMemoryManager failure.
  • Amar Kamat (JIRA) at Jun 18, 2009 at 6:28 am
    [ https://issues.apache.org/jira/browse/HADOOP-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Amar Kamat updated HADOOP-6052:
    -------------------------------

    Attachment: HADOOP-6052-v1.1.patch

    Attaching a new patch incorporating Devaraj's comments. Running test-patch. Waiting for HADOOP-6076.
  • Amar Kamat (JIRA) at Jun 18, 2009 at 10:19 am
    [ https://issues.apache.org/jira/browse/HADOOP-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721167#action_12721167 ]

    Amar Kamat commented on HADOOP-6052:
    ------------------------------------

    Result of test-patch
    [exec] +1 overall.
    [exec]
    [exec] +1 @author. The patch does not contain any @author tags.
    [exec]
    [exec] +1 tests included. The patch appears to include 3 new or modified tests.
    [exec]
    [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
    [exec]
    [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    [exec]
    [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
    [exec]
    [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
    [exec]
    [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.


    Running ant test now.