Feb 20, 2011 at 5:15 am
Twitter uses Pig for analyzing log data. The use cases are wide-ranging,
from performing statistical analysis on the results of feature A/B tests, to
examining usage patterns on the website and the wider platform, to building
background models for trending topics.
You can look online for slide decks from me (they should be on the Yahoo dev blog
Alan linked to) and from Kevin Weil; those should have some additional details.
On Sat, Feb 19, 2011 at 7:28 PM, Alan Gates wrote:
There have been talks given at the Bay Area HUGs about how people use Pig.
I know for example Yahoo Mail did one on how it uses Pig for spam
detection. Presentations for those talks are posted to Yahoo's Hadoop blog: http://developer.yahoo.com/blogs/hadoop/
On Feb 19, 2011, at 1:12 PM, Charles Gonçalves wrote:
I'm working on my MSc now, using Pig/Hadoop to process logs.
I'm basically using it to do some characterization for a traffic analysis
of some of the largest media groups in Brazil.
One of my dissertation chapters will cover case studies where that
environment (Pig/Hadoop) is needed because large amounts of data are
difficult to handle.
I wonder if someone could help me by pointing me to works (academic,
technical reports, or whatever) or even talk (privately or not) about their
work and how Pig/Hadoop helped with it.
I will gladly put the results of that chapter on the Pig wiki later!
Thanks in advance!
Charles Ferreira Gonçalves
http://homepages.dcc.ufmg.br/~charles/
UFMG - ICEx - Dcc
Cel.: 55 31 87741485
Tel.: 55 31 34741485
Lab.: 55 31 34095840