Hi,

I came across this patch
(https://issues.apache.org/jira/browse/PIG-1518), which adds multi-file input format support from Pig 0.8 onwards.

A patch is also available for Pig 0.7. I was wondering whether anyone has tried the patch with Pig 0.7 and could share any notes on the performance improvements it brings.

Thanks,
-Rohini


  • Mridul Muralidharan at Oct 30, 2010 at 5:28 am
    It would be a tradeoff between data locality and the number of tasks
    executed. In some of our experiments it performed much worse (I don't have
    actual numbers, but it was in the 2x ballpark, IIRC). Of course, ours was
    a highly constrained and specialized experiment anyway!

    On the other hand, the reduction in the number of tasks can be
    extremely useful for job times. In particular, in environments where
    quotas are enforced on the number of tasks, or on the number of files
    (for map-only output), the benefits can be pretty good.


    I have yet to look at the patch in detail, but from what I recall,
    performance could be improved by being more intelligent about
    clustering splits based on the 'locations' returned for the combined
    multiple-split, to ensure maximal data locality for the contained
    splits.
    I am not sure whether that made it into the final version.

    Regards,
    Mridul

    On Friday 29 October 2010 05:31 PM, Uppuluri, Rohini wrote:
    Hi,

    I came across this patch
    (https://issues.apache.org/jira/browse/PIG-1518), which adds multi-file input format support from Pig 0.8 onwards.

    A patch is also available for Pig 0.7. I was wondering whether anyone has tried the patch with Pig 0.7 and could share any notes on the performance improvements it brings.

    Thanks,
    -Rohini
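
For readers following the thread, the locality-aware clustering idea mentioned in the reply can be illustrated with a small sketch. This is not the code from PIG-1518, and it does not use any Pig or Hadoop API; the function name `combine_splits`, the tuple layout, and the size limit are all hypothetical, chosen only to show the tradeoff. It assumes each split reports a list of preferred hosts and that grouping by the first listed host is an acceptable simplification.

```python
from collections import defaultdict


def combine_splits(splits, max_combined_size):
    """Greedily pack small splits into combined splits, one host at a time.

    splits: list of (name, size, hosts) tuples (hypothetical layout).
    Returns a list of dicts with the chosen host, member names, and total size.
    """
    # Bucket each split under its first preferred host (a simplification;
    # a real implementation could consider every replica location).
    by_host = defaultdict(list)
    for name, size, hosts in splits:
        by_host[hosts[0] if hosts else None].append((name, size))

    combined = []
    for host, items in by_host.items():
        current, current_size = [], 0
        for name, size in items:
            # Start a new combined split once the size limit would be exceeded.
            if current and current_size + size > max_combined_size:
                combined.append({"host": host, "splits": current, "size": current_size})
                current, current_size = [], 0
            current.append(name)
            current_size += size
        if current:
            combined.append({"host": host, "splits": current, "size": current_size})
    return combined


# Five small splits spread over two hosts, with a limit of 100 size units.
splits = [
    ("a", 40, ["h1", "h2"]),
    ("b", 40, ["h1"]),
    ("c", 40, ["h2"]),
    ("d", 40, ["h1"]),
    ("e", 40, ["h2"]),
]
combined = combine_splits(splits, max_combined_size=100)
for group in combined:
    print(group["host"], group["splits"], group["size"])
```

In this toy input, five map tasks become three, and every combined split contains only splits that share a preferred host. Packing purely by size, without looking at hosts, could reduce the task count further but would mix splits from different hosts, which is the locality cost described in the reply.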
