Hi All,
I have a custom loader that returns a set of fields after
reading a log line. One of the fields returned is of type DataType.MAP.
My question is: how can I set the data types for this map's (key, value)
pairs? In my script I try to generate a record from the keys and values of
this map and get the following error:
java.io.IOException: Unknown type Unknown
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:178)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
Here is my script:
raw = LOAD 'myfile' USING myUdf.MyCustomLoader();
filtered = FILTER raw BY (ARG_MAP#'key' IS NOT NULL);
-- MySplit splits the "|"-separated value stored under 'key' and puts the
-- splits into another map with String (key, value) pairs, which it returns.
-- Each split is keyed by 'field0', 'field1', ... 'fieldn', where n is the
-- number of splits.
entry = FOREACH filtered GENERATE A, B, myUdf.MySplit(ARG_MAP#'key', '|') AS FIELDS;
-- Here another tuple is generated from the values in the map.
result = FOREACH entry GENERATE A, B, FIELDS#'field0' AS CLIENT_ID,
    FIELDS#'field1' AS CHANNEL_ID, FIELDS#'field2' AS OTHER_ID;
STORE result INTO 'location' USING PigStorage();
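The splitting described in the script comments could look roughly like this in plain Java. This is only a sketch under assumptions: the class name MySplitSketch and the method splitToMap are hypothetical, the Pig EvalFunc wrapper is left out, and the real MySplit may well differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in for the splitting logic inside MySplit (not the real UDF).
class MySplitSketch {

    // Splits 'value' on the literal 'delimiter' and keys each piece as
    // "field0", "field1", ... in order of appearance.
    static Map<String, String> splitToMap(String value, String delimiter) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        if (value == null) {
            return fields; // nothing to split
        }
        // Pattern.quote is needed because "|" is a regex metacharacter;
        // the -1 limit keeps trailing empty fields instead of dropping them.
        String[] parts = value.split(Pattern.quote(delimiter), -1);
        for (int i = 0; i < parts.length; i++) {
            fields.put("field" + i, parts[i]);
        }
        return fields;
    }

    public static void main(String[] args) {
        // prints {field0=c42, field1=ch7, field2=x9}
        System.out.println(splitToMap("c42|ch7|x9", "|"));
    }
}
```

In this sketch both keys and values are Java Strings, so FIELDS#'field0' would correspond to the first piece of the original value.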
Any help here is appreciated.
TIA
-Ankur