I'm having an issue where I get a 'ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to java.lang.String' when passing in something of type chararray to REGEX_EXTRACT.
A = load '/path/to/some/data' ....
where A has a schema of something like ( f1:chararray, .... )
B = foreach A generate REGEX_EXTRACT( f1, <the regex>, 1 ) as regex_extract;
This gives me the above error.
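For concreteness, here is a minimal sketch of the shape of the script. The path, the single-field schema, and the regex are placeholders made up for illustration, not my real ones, and I have not confirmed that this stripped-down version reproduces the error on its own:

```pig
-- Hypothetical path and one-field schema; the real load statement has more fields.
A = LOAD '/tmp/example_input' AS (f1:chararray);

-- Print the schema Pig believes A has, to confirm f1 is declared as chararray.
DESCRIBE A;

-- Hypothetical regex: capture group 1 is the first run of digits in f1.
B = FOREACH A GENERATE REGEX_EXTRACT(f1, '([0-9]+)', 1) AS regex_extract;

DUMP B;
```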
Now, the kicker is that if f1 is of type bytearray (i.e. the schema is ( f1:bytearray, ..... )), this works as expected.
What gives? Am I using REGEX_EXTRACT wrong? Is this a bug?
My understanding is that chararray is supposed to be used for things that are Strings, which is why I find the 'cannot be cast to java.lang.String' exception a bit funky. I've looked through the REGEX_EXTRACT source and the Javadoc pertaining to DataTypes without being able to crack this.
Any help and information is appreciated!
Thanks for your time,