I want a way to see the schema of a relation defined in a Pig script
from outside of Pig, ideally without having to connect to Hadoop to
do so.
For example, take an arbitrary script:
a = LOAD 'blah' AS (one:int, two:chararray, three:int);
b = FOREACH a GENERATE one, two;
Ideally, I want a way to get the result of DESCRIBE b; but from outside of
Pig.
One ugly way I can think of would be to create a temporary copy of the
script, get rid of any STORE and DUMP statements, append DESCRIBE b;, run
it locally, and then keep only the DESCRIBE output.
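
To make that workaround concrete, here is a rough sketch in Python of just the script-rewriting step. The function name make_describe_script is hypothetical (my own, not part of Pig), and the approach is an assumption on my part: it naively splits on semicolons, so it will break on semicolons inside string literals or comments. It does not run Pig at all.

```python
import re

# STORE and DUMP statements, matched case-insensitively at the start of a statement.
_EXEC_STATEMENT = re.compile(r"^\s*(STORE|DUMP)\b", re.IGNORECASE)


def make_describe_script(script: str, alias: str) -> str:
    """Return a copy of a Pig script without STORE/DUMP statements,
    with 'DESCRIBE <alias>;' appended as the last statement.

    Hypothetical helper: splits naively on ';', so semicolons inside
    quoted strings or comments are not handled.
    """
    # Split into statements and trim surrounding whitespace/newlines.
    statements = [s.strip() for s in script.split(";")]
    # Drop empty fragments and any STORE/DUMP statements.
    kept = [s for s in statements if s and not _EXEC_STATEMENT.match(s)]
    kept.append(f"DESCRIBE {alias}")
    # Re-terminate every statement with a semicolon, one per line.
    return ";\n".join(kept) + ";\n"


if __name__ == "__main__":
    example = """
    a = LOAD 'blah' AS (one:int, two:chararray, three:int);
    b = FOREACH a GENERATE one, two;
    STORE b INTO 'out';
    """
    print(make_describe_script(example, "b"))
```

The rewritten text would then be saved to a temporary file and handed to Pig, which is the part I am asking about.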
I was hoping there might be a nicer way to do it. If not, how do I run
that sort of thing locally, forcing Pig not to go to my Hadoop cluster?
I appreciate your help.
Jon