Disclaimer: I'm still learning my way around the Pig and Hadoop internals,
so this question is aimed at better understanding them and some of the Pig
design choices...
Is there a reason why in Pig we are restricted to a set of types (roughly
corresponding to types in Java), instead of having an abstract type like
Hadoop's Writable or WritableComparable? I got to thinking about this while
looking at the Algebraic interface. In Hadoop, if you want to have some
crazy intermediate objects, you can do that easily as long as they are
serializable (i.e. Writable, or WritableComparable if they are used as keys,
since keys get sorted in the shuffle on the way to the reducer). In fact, in
Hadoop there is no notion of some special class of objects which we work
with -- everything is simply Writable or WritableComparable. In Pig we are
more limited, and I was wondering why that needs to be the case. Is there
any reason why we can't have abstract types at the same level as String or
Integer? My guess is that it has to do with how these objects are treated
internally, but beyond that I am not sure.
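
To make the Hadoop side concrete, here is a rough sketch of the kind of
intermediate object I have in mind. It is only an illustration: so that it
compiles without Hadoop on the classpath, SimpleWritable is a hypothetical
stand-in that mirrors the two methods of org.apache.hadoop.io.Writable, and
PartialAverage and WritableSketch are made-up names, not anything from Pig
or Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Small driver: serialize a partial result, read it back, merge it.
final class WritableSketch {
    static byte[] toBytes(SimpleWritable w) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            w.write(out);
        }
        return buffer.toByteArray();
    }

    static PartialAverage fromBytes(byte[] bytes) throws IOException {
        PartialAverage result = new PartialAverage();
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(bytes))) {
            result.readFields(in);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        PartialAverage a = new PartialAverage(2, 10.0);
        // Round trip through bytes, as would happen between map and reduce.
        PartialAverage b = fromBytes(toBytes(new PartialAverage(3, 20.0)));
        a.merge(b);
        System.out.println(a.value()); // prints 6.0
    }
}

// Hypothetical stand-in for Hadoop's Writable (same two method signatures).
interface SimpleWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// An arbitrary intermediate object: the partial state of an average.
final class PartialAverage implements SimpleWritable {
    long count;
    double sum;

    PartialAverage() {}

    PartialAverage(long count, double sum) {
        this.count = count;
        this.sum = sum;
    }

    void merge(PartialAverage other) {
        count += other.count;
        sum += other.sum;
    }

    double value() {
        return count == 0 ? 0.0 : sum / count;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(count);
        out.writeDouble(sum);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        count = in.readLong();
        sum = in.readDouble();
    }
}
```

The point is that the framework only needs write and readFields; it never
has to know what a PartialAverage is.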
Thanks for helping me think about this.
Jon