Hi,
I am running a Pig query on around 500 GB of input data.
The block size is 128 MB and the split size is the default 128 MB.
I have specified 16 reducers, and around 3800 mappers are running.

I observe that the shuffle phase takes a long time to complete,
approximately 25 minutes per job.
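
For a sense of scale, the figures above can be turned into a rough estimate of the shuffle workload. This is only a back-of-the-envelope sketch: it assumes one map task per 128 MB split and, purely for illustration, that the map output is about as large as the input (the post does not say how much data the mappers emit). The variable names are made up for this example.

```python
# Rough shuffle arithmetic for the job described in the post.
# Assumption (not from the post): map output is about as large as the input.

GIB = 1024 ** 3
MIB = 1024 ** 2

input_bytes = 500 * GIB      # "around 500 GB" of input
split_bytes = 128 * MIB      # split size equals the 128 MB block size
num_reducers = 16            # as specified in the post
observed_mappers = 3800      # "around 3800 mappers"

# One map task per input split (ceiling division).
estimated_mappers = -(-input_bytes // split_bytes)

# Every reducer fetches its partition from every map output.
shuffle_fetches = observed_mappers * num_reducers

# Data each reducer would pull if map output is roughly input-sized.
gib_per_reducer = input_bytes / num_reducers / GIB

print(f"estimated mappers : {estimated_mappers}")
print(f"shuffle fetches   : {shuffle_fetches}")
print(f"GiB per reducer   : {gib_per_reducer:.2f}")
```

The estimate of 4000 map tasks is close to the roughly 3800 observed. Under the stated assumption, each of the 16 reducers would have to copy about 31 GiB spread over 3800 separate map outputs, so the shuffle time depends heavily on how much data the mappers actually emit and on how many reducers share it.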

Can anyone suggest how I can bring down the shuffling time? Is there any
property that I can tweak to improve performance?

Thanks & Regards,
Austin

Discussion Overview
group: user @
categories: pig, hadoop
posted: Mar 13, '12 at 12:25p
active: Mar 14, '12 at 9:12p
posts: 5
users: 2
website: pig.apache.org
