Using both small split combination and temporary file compression on a query of ORDER BY may cause crash
--------------------------------------------------------------------------------------------------------

Key: PIG-1645
URL: https://issues.apache.org/jira/browse/PIG-1645
Project: Pig
Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Fix For: 0.8.0


The stack looks like the following:

java.lang.NullPointerException
    at java.util.Arrays.binarySearch(Arrays.java:2043)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:72)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:52)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:565)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:116)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:638)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:314)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1062)
    at org.apache.hadoop.mapred.Child.main(Child.java:211)
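
The top frame of the trace is Arrays.binarySearch, reached from WeightedRangePartitioner.getPartition. The sketch below is not Pig's code; it only illustrates one way a trace of this shape can arise, namely a binary search over a boundary array that was never populated. The class NullQuantilesDemo and the field quantiles are hypothetical names, and the thread does not say which reference was actually null.

```java
import java.util.Arrays;

class NullQuantilesDemo {
    // Hypothetical stand-in for a range partitioner's sorted boundary keys.
    // It stays null if the sampling step produced nothing to build it from.
    static Integer[] quantiles = null;

    // Maps a key to a partition index by binary search over the boundaries.
    static int getPartition(Integer key) {
        int idx = Arrays.binarySearch(quantiles, key); // throws NPE when quantiles is null
        // A negative result encodes the insertion point as -(point) - 1.
        return idx >= 0 ? idx : -(idx + 1);
    }

    public static void main(String[] args) {
        try {
            getPartition(42);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from binarySearch");
        }
        quantiles = new Integer[] {10, 20, 30};
        System.out.println(getPartition(25)); // prints 2
    }
}
```

Guarding against a null array inside the partitioner would only hide the symptom; the discussion that follows places the cause upstream, in the sample loaders.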

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Yan Zhou (JIRA) at Sep 23, 2010 at 5:56 pm
    [ https://issues.apache.org/jira/browse/PIG-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914128#action_12914128 ]

    Yan Zhou commented on PIG-1645:
    -------------------------------

    The problem is that both RandomSampleLoader and PoissonSampleLoader keep internal state from previous invocations. When split combination (PIG-1518) is on, that state should be reset each time the loader moves to a different underlying split within the same umbrella (combined) split.

    When temporary file compression is disabled, Pig's internal storage creates empty files, which the split combiner discards. The only non-empty split is then the only split processed, so the problem does not occur in that case.
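
A minimal sketch of the failure mode described above, assuming a sampler that reads several underlying splits in sequence under one combined split. It is not the PIG-1645 patch, which the thread does not show. The class SketchSampleLoader and its fields and methods are hypothetical, chosen only to illustrate per-split state that survives into the next split unless it is reset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

class SketchSampleLoader {
    private final int samplesPerSplit;
    private final boolean resetOnNewSplit; // false reproduces the stale-state behavior
    private Iterator<String> rows;
    private int taken = 0;            // per-split state
    private boolean finished = false; // per-split state

    SketchSampleLoader(int samplesPerSplit, boolean resetOnNewSplit) {
        this.samplesPerSplit = samplesPerSplit;
        this.resetOnNewSplit = resetOnNewSplit;
    }

    // Called once for each underlying split of a combined split.
    void prepareToRead(List<String> split) {
        rows = split.iterator();
        if (resetOnNewSplit) {
            taken = 0;
            finished = false;
        }
    }

    // Returns the next sampled row, or null when this split yields no more samples.
    String nextSample() {
        if (finished || taken >= samplesPerSplit || !rows.hasNext()) {
            finished = true;
            return null;
        }
        taken++;
        return rows.next();
    }

    // Reads every underlying split in order and collects the samples.
    static List<String> sampleAll(List<List<String>> combined, boolean reset) {
        SketchSampleLoader loader = new SketchSampleLoader(2, reset);
        List<String> out = new ArrayList<>();
        for (List<String> split : combined) {
            loader.prepareToRead(split);
            for (String s = loader.nextSample(); s != null; s = loader.nextSample()) {
                out.add(s);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // An empty underlying split comes first, followed by a non-empty one.
        List<List<String>> combined = Arrays.asList(
                Collections.<String>emptyList(), Arrays.asList("a", "b", "c"));
        System.out.println(sampleAll(combined, false)); // prints []
        System.out.println(sampleAll(combined, true));  // prints [a, b]
    }
}
```

Without the reset, the empty first split leaves the loader marked as finished, so the non-empty split contributes no samples. In this sketch the outcome depends on where the empty split lands in the combined list.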
  • Yan Zhou (JIRA) at Sep 24, 2010 at 4:09 pm
    [ https://issues.apache.org/jira/browse/PIG-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Yan Zhou updated PIG-1645:
    --------------------------

    Attachment: PIG-1645.patch

    test-core passed.

    test-patch results:

    [exec] -1 overall.
    [exec]
    [exec] +1 @author. The patch does not contain any @author tags.
    [exec]
    [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
    [exec] Please justify why no tests are needed for this patch.
    [exec]
    [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
    [exec]
    [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    [exec]
    [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
    [exec]
    [exec] -1 release audit. The applied patch generated 459 release audit warnings (more than the trunk's current 457 warnings).

    The scenario is truly a corner case. The following query *might* have caused the problem:

    A = load '/tmp/test/jsTst2.txt' as (fn, age:int);
    B = load '/tmp/test/sample.txt' as (fn, age:int);
    C = join A by fn, B by fn USING 'replicated';
    D = ORDER C BY B::age;
    dump D;

    where sample.txt has only one row, whose join key matches a single record in jsTst2.txt, and jsTst2.txt should be several HDFS blocks in size. Even so, the failure is random, as it depends on whether any of the logically empty files is placed in the first underlying split of the list of combined splits. Compute nodes' host names seem to play a role too. Local mode does not seem to reproduce the failure.

    The 2 release audit warnings are due to jdiff. No new file added.
  • Yan Zhou (JIRA) at Sep 24, 2010 at 4:09 pm
    [ https://issues.apache.org/jira/browse/PIG-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Yan Zhou updated PIG-1645:
    --------------------------

    Status: Patch Available (was: Open)
  • Yan Zhou (JIRA) at Sep 24, 2010 at 5:07 pm
    [ https://issues.apache.org/jira/browse/PIG-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914541#action_12914541 ]

    Yan Zhou commented on PIG-1645:
    -------------------------------

    The possibility of failure also depends upon the block distribution since the split combination makes use of that info.
  • Thejas M Nair (JIRA) at Sep 24, 2010 at 5:49 pm
    [ https://issues.apache.org/jira/browse/PIG-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914556#action_12914556 ]

    Thejas M Nair commented on PIG-1645:
    ------------------------------------

    +1
  • Yan Zhou (JIRA) at Sep 25, 2010 at 4:00 am
    [ https://issues.apache.org/jira/browse/PIG-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Yan Zhou updated PIG-1645:
    --------------------------

    Status: Resolved (was: Patch Available)
    Resolution: Fixed

    Patch committed to both trunk and the 0.8 branch.

Discussion Overview
group: dev @
categories: pig, hadoop
posted: Sep 23, '10 at 1:51a
active: Sep 25, '10 at 4:00a
posts: 7
users: 1
website: pig.apache.org
