Grokbase Groups Pig dev March 2013
This is a little different from how we've done such things before, but how
about a project to get Pig to run on Spark (aka Spork)? The Twitter Pig
folks have some code we'd love to share that got us halfway there; it was
looking pretty promising (if anyone is curious, it's the "spork" branch on
my GitHub fork of Pig: )

On Thu, Mar 21, 2013 at 2:05 PM, Prasanth J wrote:

One more idea for GSoC project.

YSmart uses correlations between multiple MR jobs to reduce the number of
MR jobs generated. I remember Dmitriy bringing this up earlier. The
techniques specified in this paper (Input, Job Flow, Transit correlations)
have been patched into Hive. If Pig doesn't use these optimizations, then I
think it would be good to have them in Pig as well.

Here is the link to the paper

I think this can be a good candidate project for GSoC.

-- Prasanth
On Mar 21, 2013, at 3:51 PM, Olga Natkovich wrote:

+1 on that

From: Russell Jurney <>
To: "" <>
Sent: Thursday, March 21, 2013 11:54 AM
Subject: Re: Put a "Google summer of code 2013" cwiki page

Make Grunt use Antlr - high priority one for me. Once Grunt uses Antlr,
macros will flourish.

On Wed, Mar 20, 2013 at 6:25 PM, Daniel Dai wrote:

Feel free to add more projects that could fit in the timeline of a
student summer project.

I remember there are several projects we discussed in our last meetup:
* Allow Pig to use Hive UDFs. Alan, do we have a ticket for that?
* A general framework for Pig performance tests. Rohini, do we have a


Russell Jurney

Discussion Overview
group: dev @
categories: pig, hadoop
posted: Mar 21, '13 at 1:25a
active: Apr 24, '13 at 10:34p


