Grokbase Groups Pig user October 2008
I propose that Pig develop a standard set of benchmark queries that can
be run from release to release to measure Pig's (hopefully improving)
performance over time. This would be similar in nature to Hadoop's
GridMix (see
http://svn.apache.org/viewvc/hadoop/core/tags/release-0.17.1/src/test/gridmix/
and http://developer.yahoo.com/blogs/hadoop/). This set should be
relatively small (probably under 10 queries), but it should cover the
range of operations that Pig users perform.

So, if you have queries that you think would be good candidates and that
you can share (or obfuscate and then share), please do so. In addition
to the query, please give some idea of the type of data it runs over.
In particular, we need to know how much data there is, how many fields
it has, and the cardinality and distribution of any fields used as a
group, cogroup, or sort key.
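
For anyone unsure how to summarize a key field, here is a minimal Python sketch of one way to do it on a sample of the data. The function name describe_key and the top_n parameter are made up for illustration and are not part of Pig or Hadoop; it assumes the sampled key values fit in memory.

```python
from collections import Counter


def describe_key(values, top_n=3):
    """Summarize a key column: row count, cardinality, and skew.

    `values` is any iterable of key values (hypothetical sample data).
    """
    counts = Counter(values)
    total = sum(counts.values())
    top = counts.most_common(top_n)
    return {
        "rows": total,                # number of records seen
        "cardinality": len(counts),   # number of distinct key values
        "top_keys": top,              # most frequent keys with their counts
        # fraction of all rows covered by the top_n keys (a rough skew measure)
        "top_share": sum(c for _, c in top) / total if total else 0.0,
    }


# Tiny made-up sample standing in for a group/sort key column.
sample = ["a", "b", "a", "c", "a", "b"]
summary = describe_key(sample, top_n=2)
print(summary)
```

A high top_share with a small top_n suggests a skewed key, which is the kind of detail that matters when a benchmark query groups or sorts on that field.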

Thanks.

Posted Oct 2, '08 at 7:59p
Alan Gates: 1 post

People

Translate

site design / logo © 2023 Grokbase