I'm using Pig 0.6.0, and the fix for bug PIG-619 is causing a performance
issue with some of my jobs. In Pig 0.3.0 a fix was added to create an empty
slice for any file with a zero file length. In some cases this can cause a
number of unneeded map tasks to run. I tried to duplicate the problem
described in PIG-619 on Pig 0.6.0 running on Hadoop 0.20.2 by removing the
change and running the scenarios in the issue, but wasn't able to duplicate
the problem. I have a couple of questions I'm hoping someone can answer.
Does anybody know whether the underlying issue was fixed by the move to
Hadoop 0.20.2, so that the PIG-619 change could simply be removed? The fix
was added in Pig 0.3.0 which ran on Hadoop
Can I change the code to add just one empty slice in the case where all the
files are empty, instead of one empty slice for each empty file?
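
To make that second question concrete, here is a rough sketch of the change
I have in mind. It is not Pig's actual slicing code: the Slice and FileInfo
classes and the planSlices method are hypothetical stand-ins, and it assumes
one slice per non-empty file just to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EmptySliceSketch {

    // Hypothetical stand-in for a slice: a file path plus a byte range.
    static final class Slice {
        final String path;
        final long start;
        final long length;

        Slice(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }
    }

    // Hypothetical stand-in for the file metadata the slicer would see.
    static final class FileInfo {
        final String path;
        final long length;

        FileInfo(String path, long length) {
            this.path = path;
            this.length = length;
        }
    }

    // Skips zero-length files, but emits a single empty slice when every
    // input file is empty, so the job still gets at least one slice.
    static List<Slice> planSlices(List<FileInfo> files) {
        List<Slice> slices = new ArrayList<Slice>();
        String firstEmptyPath = null;

        for (FileInfo file : files) {
            if (file.length == 0) {
                // Remember the first empty file; do not add a slice yet.
                if (firstEmptyPath == null) {
                    firstEmptyPath = file.path;
                }
                continue;
            }
            // Simplification: real code would split by block size.
            slices.add(new Slice(file.path, 0, file.length));
        }

        // Only fall back to an empty slice if nothing else was produced.
        if (slices.isEmpty() && firstEmptyPath != null) {
            slices.add(new Slice(firstEmptyPath, 0, 0));
        }
        return slices;
    }

    public static void main(String[] args) {
        List<FileInfo> allEmpty = Arrays.asList(
                new FileInfo("part-00000", 0),
                new FileInfo("part-00001", 0));
        List<FileInfo> mixed = Arrays.asList(
                new FileInfo("part-00000", 0),
                new FileInfo("part-00001", 128));

        System.out.println("all empty: " + planSlices(allEmpty).size());
        System.out.println("mixed: " + planSlices(mixed).size());
    }
}
```

With two empty inputs this produces one empty slice rather than two, and
with a mix of empty and non-empty inputs the empty files produce no slices
at all. Whether that second case is safe is part of what I'm asking.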
If I could duplicate the problem I would feel better about making the