[ https://issues.apache.org/jira/browse/PIG-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627261#action_12627261 ]
Arthur Zwiegincew commented on PIG-390:
---------------------------------------
Here's a workaround I'm using:
package com.cooliris.analytics;

import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.Tuple;

/**
 * Implements a UNIONALL Pig function. It accepts a tuple of the format <unused, {bag-1}, {bag-2}, {bag-3}, ...>
 * and outputs the set of tuples corresponding to UNION bag-1, bag-2, bag-3, ... This is intended as a workaround
 * for bug PIG-390 ("Union doesn't work").
 *
 * Instead of:
 *     combined = UNION data1, data2, data3;
 * ...do the following:
 *     cg_combined = COGROUP data1 BY 1, data2 BY 1, data3 BY 1;
 *     combined = FOREACH cg_combined GENERATE FLATTEN(com.cooliris.analytics.UNIONALL(*));
 *
 * @author [email protected]
 */
public class UNIONALL extends EvalFunc<DataBag> {
    @Override
    public void exec(Tuple input, DataBag output) throws IOException {
        // Field 0 is the unused COGROUP key; fields 1..n are the bags to concatenate.
        for (int i = 1; i < input.arity(); ++i) {
            for (Tuple nested : input.getBagField(i)) {
                output.add(nested);
            }
        }
    }
}
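
For anyone who wants to check the logic without Pig on the classpath, here is a minimal sketch of the same concatenation. It assumes plain Java lists can stand in for Pig tuples and bags; UnionAllSketch, unionAll and row are hypothetical names for illustration only, not part of the Pig API. The rows are the ones from the sample files in the issue description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class UnionAllSketch {
    // Hypothetical helper: builds one row (a stand-in for a Pig tuple).
    static List<Object> row(Object... fields) {
        return Arrays.asList(fields);
    }

    // Stand-in for UNIONALL.exec: field 0 of the cogrouped tuple is the
    // unused group key, fields 1..n are bags (lists of rows) to concatenate.
    static List<List<Object>> unionAll(List<Object> cogrouped) {
        List<List<Object>> output = new ArrayList<>();
        for (int i = 1; i < cogrouped.size(); ++i) {
            @SuppressWarnings("unchecked")
            List<List<Object>> bag = (List<List<Object>>) cogrouped.get(i);
            output.addAll(bag);
        }
        return output;
    }

    public static void main(String[] args) {
        List<List<Object>> data = Arrays.asList(row(1, 1), row(2, 1), row(3, 10));
        List<List<Object>> data2 = Arrays.asList(row(4, 20), row(5, 20));

        // COGROUP ... BY 1 puts every row under the single key 1.
        List<Object> cogrouped = Arrays.<Object>asList(1, data, data2);

        // prints [[1, 1], [2, 1], [3, 10], [4, 20], [5, 20]]
        System.out.println(unionAll(cogrouped));
    }
}
```

Grouping every input by the same constant is what makes this work: the COGROUP yields a single tuple holding one bag per input, so concatenating the bags gives the rows UNION should have produced.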
Union doesn't work
------------------
Key: PIG-390
URL: https://issues.apache.org/jira/browse/PIG-390
Project: Pig
Issue Type: Bug
Environment: Mac OS X
Reporter: Arthur Zwiegincew
data files:
$ cat ~/tmp/data
1 1
2 1
3 10
$ cat ~/tmp/data-2
4 20
5 20
pig script:
data = load '/Users/arthur/tmp/data' as (x, y);
data2 = load '/Users/arthur/tmp/data-2' as (x, y);
both = union data, data2;
dump both;
result:
(4, 20)
(5, 20)