Hey Ankur,

Zebra's TableLoader works only with data that was written out using Zebra's
TableStorer. So you need to write the data with TableStorer first, then
load it with TableLoader and do the merge join.
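
A minimal sketch of that two-step flow, for illustration only. The jar path, the table paths ('my_input_zebra', 'other_input_zebra'), the second input 'other_input', and the column name id are all hypothetical. It assumes the text input is first read with PigStorage and sorted with ORDER BY before being stored, and that an empty storage hint ('') is acceptable to TableStorer. The quoting of the merge keyword has differed between Pig releases, so check the reference for your version.

```pig
-- Assumption: adjust the jar location to your Zebra build.
register /path/to/zebra.jar;

-- Step 1: read the plain-text input, sort it, and write it as a Zebra table.
raw_left = load 'my_input' using PigStorage() as (id:int);
sorted_left = order raw_left by id;
store sorted_left into 'my_input_zebra'
    using org.apache.hadoop.zebra.pig.TableStorer('');

-- Same conversion for the (hypothetical) other side of the join.
raw_right = load 'other_input' using PigStorage() as (id:int);
sorted_right = order raw_right by id;
store sorted_right into 'other_input_zebra'
    using org.apache.hadoop.zebra.pig.TableStorer('');

-- Step 2 (run as a separate script, after step 1 has finished):
-- load the Zebra tables as sorted and merge-join them.
A = load 'my_input_zebra'
    using org.apache.hadoop.zebra.pig.TableLoader('', 'sorted');
B = load 'other_input_zebra'
    using org.apache.hadoop.zebra.pig.TableLoader('', 'sorted');
C = join A by id, B by id using "merge";
store C into 'my_output';
```

Pointing TableLoader at the Zebra-written directories rather than the raw text directory is what should avoid the missing .btschema error, since that file is part of what Zebra writes when it stores a table.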

Ashutosh
On Tue, Jul 19, 2011 at 14:28, Ankur Jain wrote:
Hi all,

I'm trying to do a map-side-only merge join [1] in Pig using Zebra's
TableLoader. (My data allows a merge join.) However, I am unable to use
TableLoader. Even a simple script that loads a table and just stores it back
doesn't work -

----
A = load 'my_input' using org.apache.hadoop.zebra.pig.TableLoader('', 'sorted');
store A into 'my_output';
----


'my_input' is an input directory containing a single file with just one column -
---
1
2
3
---

The error I get is -

"ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal
error. Failed to find deleted column groupsjava.io.IOException: BT Schema
file doesn't exist: *file:/......./my_input/.btschema*"


I have tried specifying the schema using the 'AS' clause and the DESCRIBE
statement as well, but I get the same error. Is the .btschema file
required? Is there any documentation available on its format? (I tried
comma-separated column names with and without type info.)


I am also willing to work with any other loader that satisfies the merge
join constraints. Thanks in anticipation.


Regards,
Ankur


[1] http://pig.apache.org/docs/r0.8.0/piglatin_ref1.html#Merge+Joins
