Grokbase Groups Pig user April 2010
Guys, I have a row containing a map

'id','data', {((1,2)), ((2,3)), ((4,5))}

What is the expected behavior when I flatten on that bag? I had expected it
to result in

'id','data', (1,2)
'id','data', (2,3)
'id','data', (4,5)


But it appears to me that the result of applying FLATTEN to that bag is this
instead:

'id','data', 1,2
'id','data', 2,3
'id','data', 4,5


The latter is what Cloudera's current CDH2 returns, and I've seen the
former behavior on other versions of Pig.

Which is the correct behavior by design?

What will pig 0.6 do when it is released?

thanks!
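
To make the two candidate behaviors concrete, here is a small model in plain Python. It is only a sketch of the semantics, not Pig code: Python tuples stand in for Pig tuples, a list stands in for the bag, and the helper names `flatten_one_level` and `flatten_all_levels` are made up for illustration. The first helper reproduces the result expected in the question, the second reproduces the result observed on CDH2.

```python
from typing import Any, Iterator, List, Tuple

Row = Tuple[Any, ...]


def flatten_one_level(prefix: Row, bag: List[Row]) -> Iterator[Row]:
    """Emit one row per tuple in the bag, splicing in that tuple's fields once."""
    for tup in bag:
        yield prefix + tup


def flatten_all_levels(prefix: Row, bag: List[Row]) -> Iterator[Row]:
    """Emit one row per tuple in the bag, unwrapping nested tuples completely."""

    def unwrap(value: Any) -> Iterator[Any]:
        # Recurse into tuples; yield every non-tuple value as-is.
        if isinstance(value, tuple):
            for item in value:
                yield from unwrap(item)
        else:
            yield value

    for tup in bag:
        yield prefix + tuple(unwrap(tup))


# The bag from the question: each element is a tuple whose single field
# is itself a tuple, i.e. ((1,2)) in Pig's notation.
bag = [((1, 2),), ((2, 3),), ((4, 5),)]
prefix = ("id", "data")

one_level = list(flatten_one_level(prefix, bag))
all_levels = list(flatten_all_levels(prefix, bag))

print(one_level)   # rows keep the inner tuple: ('id', 'data', (1, 2)), ...
print(all_levels)  # rows lose the inner tuple: ('id', 'data', 1, 2), ...
```

The only difference between the two helpers is whether the inner tuple survives as a single field, which is exactly the difference between the two outputs listed in the question.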

  • Hc busy at Apr 2, 2010 at 6:32 pm
    doh!!!! s/map/bag/g

    I seem to get maps and bags mixed up for some reason...

  • Zaki rahaman at Apr 2, 2010 at 6:42 pm
    If I'm not mistaken, the output is the expected behavior. Flatten should
    unnest bags. I'm assuming your statement is something like FOREACH ...
    GENERATE field1, field2, FLATTEN(bag1) which would 'duplicate' the first two
    fields of a tuple for every tuple in the nested bag.


  • Zaki rahaman at Apr 2, 2010 at 6:43 pm
    Stupid question but are you sure your bag has the dual sets of parentheses?
    (And if I may ask, why is that the case?)
  • Hc busy at Apr 2, 2010 at 7:38 pm
    Yeah, I'm sure it has nested tuples. Pig doesn't natively support
    constructing tuples inline, so something like

    h = foreach g generate ((x,y,z)), (x), ((((x))))

    doesn't work, but I have a UDF that does that (don't ask why), and
    I've seen it print the double pair of parens when I DUMP the relation.

    Our Hadoop guys here say it's CDH2 and that the "upgrade" was just a
    re-installation of CDH2 ("same jars"). But my script certainly started
    doing weird things all of a sudden, now that FLATTEN unnests that bag
    all the way through.

    I'd support the former behavior as well, because that seems to match my
    reading of the documentation on the behavior of FLATTEN.



    Has anybody else had this problem with recent cloudera/pig versions?


    thnx!!

  • Hc busy at Apr 2, 2010 at 8:50 pm
    Yeah, you have to implement the outputSchema() method on the UDF in order
    to make the contents of the tuple visible. There's a nice example in the
    UDF Manual:

    http://hadoop.apache.org/pig/docs/r0.6.0/udf.html

    Search for 'package myudf' until you find it.


    On Fri, Apr 2, 2010 at 12:52 PM, Russell Jurney wrote:

    Not sure if this is exactly the same, but when I've created tuples within
    tuples in UDFs (to preserve order of pairs), from bag input, Pig has
    allowed it - but I can't work with that data in subsequent steps.
  • Hc busy at Apr 2, 2010 at 9:33 pm
    Okay guys, some details after some digging. We've got this version of Pig
    from CDH2 installed:

    hadoop-pig-0.5.0+11.1-1

    The patches that they applied on top of 0.5.0 are listed here:

    http://archive.cloudera.com/cdh/2/pig-0.5.0+11.1.CHANGES.txt

    The patches listed there don't seem to deal with FLATTEN in any way.

    Any suggestions?



  • Hc busy at Apr 2, 2010 at 9:34 pm
    The hadoop version:

    hadoop-0.20-0.20.1+169.68-1
  • Russell Jurney at Apr 2, 2010 at 10:30 pm
    Thanks. I did so, but I probably did it wrong. Couldn't make it work.
