Grokbase Groups Pig user October 2010
Alan means: return a tuple containing a single bag of many tuples. (Don't
try to make Pig work with a loader that returns a bag instead of a tuple;
you'll be up to your neck in the visitor pattern in no time if you
start heading in that direction.)
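
To make the shape concrete, here is a minimal sketch of "one tuple holding one bag of many tuples". It is not Pig code: plain Java lists stand in for Pig's Tuple and DataBag so that it runs without Pig on the classpath, and the class name, the parseToBag helper, and the comma-separated input are all assumptions made up for illustration. In a real loader the same structure would presumably be built with Pig's TupleFactory and BagFactory inside getNext().

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in sketch: List<Object> models a Pig Tuple and List<List<Object>>
// models a Pig DataBag. None of these names come from the thread.
class BagInTupleSketch {

    // Hypothetical parser: each comma-separated field of the input line
    // becomes its own one-field inner tuple, collected into one bag.
    static List<List<Object>> parseToBag(String line) {
        List<List<Object>> bag = new ArrayList<>();
        for (String field : line.split(",")) {
            List<Object> inner = new ArrayList<>();
            inner.add(field.trim());
            bag.add(inner);
        }
        return bag;
    }

    // Models getNext(): returns ONE outer tuple whose only field is the bag,
    // so the "return a single tuple" contract is still honoured.
    static List<Object> getNext(String line) {
        List<Object> outer = new ArrayList<>();
        outer.add(parseToBag(line));
        return outer;
    }

    public static void main(String[] args) {
        // One input line, one returned tuple, three inner tuples in its bag.
        System.out.println(getNext("a, b, c"));
    }
}
```

On the Pig Latin side, applying FLATTEN to that bag field in a FOREACH ... GENERATE statement should then turn each inner tuple into its own record.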

The alternative is to change what constitutes a record for your loader:
use a different InputFormat/RecordReader to produce the records as
needed, instead of one that feeds you lines.
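
A rough sketch of that second idea, assuming the reader can buffer: one physical line is split into several logical records, and the caller receives them one at a time. This is plain Java rather than a Hadoop RecordReader; the class name, the nextRecord method, and the ";" delimiter are hypothetical stand-ins. A real implementation would do the equivalent work behind the RecordReader's nextKeyValue()/getCurrentValue() methods.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Stand-in for a custom record reader: wraps a source of lines and hands
// out one logical record per call, even when a line holds several.
class SplittingReaderSketch {
    private final Iterator<String> lines;
    // Records already split from the current line but not yet handed out.
    private final Deque<String> pending = new ArrayDeque<>();

    SplittingReaderSketch(Iterator<String> lines) {
        this.lines = lines;
    }

    // Returns the next logical record, or null once the input is exhausted.
    String nextRecord() {
        // Refill the buffer from the next line(s) until a record is available.
        while (pending.isEmpty() && lines.hasNext()) {
            for (String part : lines.next().split(";")) {
                if (!part.isEmpty()) {
                    pending.add(part);
                }
            }
        }
        return pending.poll();
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a;b", "c");
        SplittingReaderSketch reader = new SplittingReaderSketch(input.iterator());
        for (String rec = reader.nextRecord(); rec != null; rec = reader.nextRecord()) {
            System.out.println(rec);
        }
    }
}
```

With this arrangement the loader's getNext() can keep returning exactly one tuple per record, because the splitting has already happened upstream of it.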

On Thu, Oct 28, 2010 at 8:36 AM, John Hui wrote:
I looked into returning a DataBag as an option.  The problem is that the
Loader interface requires me to return a Tuple object:

public Tuple getNext() throws IOException {

but DataBag is not a subclass of Tuple, so this means I would need to
change Pig's internal code for my loader to return a bag of tuples.
Right?

On Wed, Oct 27, 2010 at 6:00 PM, John Hui wrote:

Hi Pig Users,

I am currently writing a UDF loader.  In one of my use cases, one line in
the input stream results in multiple tuples.  Has anyone encountered or
solved this issue on their end?

As the code is currently structured, the getNext method returns only a
single Tuple, but I want it to return a List<Tuple>.  Let me know if there
are use cases out there like mine; I am coding it up to return
List<Tuple>, which is more flexible than returning only one tuple.



Discussion Overview
group: user @
categories: pig, hadoop
posted: Oct 27, '10 at 10:39p
active: Oct 28, '10 at 3:52p


