[ https://issues.apache.org/jira/browse/PIG-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Viraj Bhat updated PIG-1586:
----------------------------
Description:
I have a Pig script as a template:
{code}
register Countwords.jar;
A = $INPUT;
B = FOREACH A GENERATE
examples.udf.SubString($0,0,1),
$1 as num;
C = GROUP B BY $0;
D = FOREACH C GENERATE group, SUM(B.num);
STORE D INTO $OUTPUT;
{code}
I attempt parameter substitution using the following shell script:
{code}
#!/bin/bash
java -cp ~/pig-svn/trunk/pig.jar:$HADOOP_CONF_DIR org.apache.pig.Main -r -file sub.pig \
-param INPUT="(foreach (COGROUP(load '/user/viraj/dataset1' USING PigStorage() AS (word:chararray,num:int)) by (word),(load '/user/viraj/dataset2' USING PigStorage() AS (word:chararray,num:int)) by (word)) generate flatten(examples.udf.CountWords(\\$0,\\$1,\\$2)))" \
-param OUTPUT="\'/user/viraj/output\' USING PigStorage()"
{code}
The substituted script that comes out is:
{code}
register Countwords.jar;
A = (foreach (COGROUP(load '/user/viraj/dataset1' USING PigStorage() AS (word:chararray,num:int)) by (word),(load '/user/viraj/dataset2' USING PigStorage() AS (word:chararray,num:int)) by (word)) generate flatten(examples.udf.CountWords(runsub.sh,,)));
B = FOREACH A GENERATE
examples.udf.SubString($0,0,1),
$1 as num;
C = GROUP B BY $0;
D = FOREACH C GENERATE group, SUM(B.num);
STORE D INTO /user/viraj/output;
{code}
The shell substitutes $0 (with the script name, runsub.sh) and $1 and $2 (both empty) before passing the argument to java.
a) Is there a workaround for this?
b) Is this a Pig parameter problem?
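Regarding (a), the shell side of the problem can be reproduced without Pig. The following is a minimal sketch, assuming the script runs under bash; the variable names are hypothetical and exist only for this demonstration. It compares three ways of quoting the same text and shows which ones keep a literal $0. Only $0 is used here; $1 and $2 are expanded by the same rules.

```shell
#!/bin/bash
# Hypothetical demo (not part of the original script): compare how bash
# treats three quoting styles before any program receives the argument.

# Double quotes with "\\$0": the "\\" collapses to a single backslash and
# $0 is then expanded to the script name, so the literal "$0" is lost.
double_backslash="CountWords(\\$0)"

# Double quotes with "\$0": the backslash escapes the dollar sign, so the
# literal text "$0" survives.
single_backslash="CountWords(\$0)"

# Single quotes: bash performs no expansion inside them.
single_quoted='CountWords($0)'

printf '%s\n' "$double_backslash" "$single_backslash" "$single_quoted"
```

If this holds, writing \$0 instead of \\$0 inside the double-quoted -param value should at least get a literal $0 past the shell. Whether Pig's own parameter substitution then leaves that $0 untouched is not verified here, which is question (b).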
Viraj
was:
I have a Pig script as a template:
{code}
register Countwords.jar;
A = $INPUT;
B = FOREACH A GENERATE
examples.udf.SubString($0,0,1),
$1 as num;
C = GROUP B BY $0;
D = FOREACH C GENERATE group, SUM(B.num);
STORE D INTO $OUTPUT;
{code}
I attempt to do Parameter substitutions using the following:
Using Shell script:
{code}
#!/bin/bash
java -cp ~/pig-svn/trunk/pig.jar:$HADOOP_CONF_DIR org.apache.pig.Main -r -file sub.pig \
-param INPUT="(foreach (COGROUP(load '/user/viraj/dataset1' USING PigStorage() AS (word:chararray,num:int)) by (word),(load '/user/viraj/dataset2' USING PigStorage() AS (word:chararray,num:int)) by (word)) generate flatten(examples.udf.CountWords(\\$0,\\$1,\\$2)))" \
-param OUTPUT="\'/user/viraj/output\' USING PigStorage()"
{code}
register Countwords.jar;
A = (foreach (COGROUP(load '/user/viraj/dataset1' USING PigStorage() AS (word:chararray,num:int)) by (word),(load '/user/viraj/dataset2' USING PigStorage() AS (word:chararray,num:int)) by (word)) generate flatten(examples.udf.CountWords(runsub.sh,,)));
B = FOREACH A GENERATE
examples.udf.SubString($0,0,1),
$1 as num;
C = GROUP B BY $0;
D = FOREACH C GENERATE group, SUM(B.num);
STORE D INTO /user/viraj/output;
{code}
The shell substitutes the $0 before passing it to java.
a) Is there a workaround for this?
b) Is this a Pig parameter problem?
Viraj
Parameter substitution using -param option runs into problems when substituting entire pig statements in a shell script (maybe this is a bash problem)
----------------------------------------------------------------------------------------------------------------------------------------------------
Key: PIG-1586
URL: https://issues.apache.org/jira/browse/PIG-1586
Project: Pig
Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Viraj Bhat
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.