Oh, gosh, well, that makes me uneasy, since I was intending to actually use
this in production.

Is there something in particular about this class that makes it not
intended for real-world use? Performance? The way it's written (e.g. it
still depends on old APIs)?

Is there a loader you suggest I look at using instead that has been more


Dmitriy Ryaboy wrote:
Perhaps I should've documented that better.
That class is *not intended for real use*. As far as I know, it's never been
used by anyone for anything in production.
It's a demo of how one would go about writing a real SequenceFileLoader for
whatever internal stuff you are using. Feel free to replace anything that
makes sense for you in your implementation.


On Mon, Sep 27, 2010 at 1:23 PM, Zach Bailey wrote:
Hey folks,

Not sure if this has been discussed already or if this is due to some
limitation in pig, hadoop, or java - but is there a particular reason the
PiggyBank SequenceFileLoader doesn't support the BytesWritable type for
sequence file keys/values?


Looking at the code, it maps the pig-specific DataByteArray class to the
pig type "bytearray" - I don't understand this choice. Why use a
pig-specific class here (which is not very friendly for a mixed pig/non-pig
hadoop ecosystem)?

In fact, if you look at the SequenceFileLoader code you will see something
that looks very strange:

protected Object translateWritableToPigDataType(Writable w, byte dataType) {
    switch(dataType) {
        case DataType.CHARARRAY: return ((Text) w).toString();
        case DataType.BYTEARRAY: return ((DataByteArray) w).get();  // <-- this line
        case DataType.INTEGER: return ((IntWritable) w).get();
        case DataType.LONG: return ((LongWritable) w).get();
        case DataType.FLOAT: return ((FloatWritable) w).get();
        case DataType.DOUBLE: return ((DoubleWritable) w).get();
        case DataType.BYTE: return ((ByteWritable) w).get();
    }

    return null;
}

This code smells - the method takes a Writable, which makes sense, but
then for the BYTEARRAY type it casts it to a DataByteArray, which
doesn't implement Writable! WTF, mate?
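
To illustrate why a cast like this can compile and still blow up, here is a minimal sketch. It deliberately uses stand-in types (StubWritable, StubText, StubDataByteArray are hypothetical names, not the real Hadoop or Pig classes) so that it runs without either library on the classpath. The only assumption carried over from the discussion is that the byte-array class does not implement the Writable interface.

```java
// Stand-ins only: these are NOT the real Hadoop or Pig classes.
interface StubWritable {}                       // plays the role of Hadoop's Writable

class StubText implements StubWritable {        // plays the role of a genuine Writable
    private final String s;
    StubText(String s) { this.s = s; }
    @Override public String toString() { return s; }
}

class StubDataByteArray {                       // like the class in question: no Writable
    private final byte[] data;
    StubDataByteArray(byte[] data) { this.data = data; }
    byte[] get() { return data; }
}

class CastSmellDemo {
    // Mirrors the shape of the BYTEARRAY branch quoted above.
    static Object translateByteArray(StubWritable w) {
        // The compiler accepts this: a subclass of a non-final class could,
        // in principle, implement the interface. The check happens at runtime.
        return ((StubDataByteArray) w).get();
    }

    // Returns true when the cast is rejected at runtime.
    static boolean failsAtRuntime(StubWritable w) {
        try {
            translateByteArray(w);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsAtRuntime(new StubText("abc")));
    }
}
```

If the real classes relate to each other the same way, any value that actually comes out of a sequence file would hit the ClassCastException path, so the branch could never succeed.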

I'm going to try my hand at switching this to use BytesWritable instead and
see what explodes.
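
One thing that tends to explode in a switch like this: Hadoop's BytesWritable hands back its whole backing array from getBytes(), and that array can be longer than the number of valid bytes reported by getLength(). The sketch below shows the copy-the-valid-prefix step using a stand-in class (StubBytesWritable and BytesWritableSketch are hypothetical names, and the growth policy in set() is made up purely to produce padding), so it runs without Hadoop or Pig on the classpath. It is an illustration of the pitfall, not the PiggyBank implementation.

```java
import java.util.Arrays;

// Stand-in only: mimics the buffer-plus-length layout of Hadoop's BytesWritable.
class StubBytesWritable {
    private byte[] buffer = new byte[0];
    private int length = 0;

    void set(byte[] src) {
        // Grow with slack, so the backing array can be longer than the payload.
        if (src.length > buffer.length) {
            buffer = new byte[src.length * 3 / 2 + 1];
        }
        System.arraycopy(src, 0, buffer, 0, src.length);
        length = src.length;
    }

    byte[] getBytes() { return buffer; }    // whole backing array, padding included
    int getLength() { return length; }      // number of valid bytes
}

class BytesWritableSketch {
    // Copy only the valid prefix and leave the padding behind.
    static byte[] toPigBytes(StubBytesWritable bw) {
        return Arrays.copyOf(bw.getBytes(), bw.getLength());
    }

    public static void main(String[] args) {
        StubBytesWritable bw = new StubBytesWritable();
        bw.set(new byte[] {1, 2, 3});
        System.out.println(bw.getBytes().length + " backing bytes, "
                + toPigBytes(bw).length + " valid");
    }
}
```

In a real loader, the BYTEARRAY branch would presumably cast to BytesWritable, make the same trimmed copy, and wrap the result in a new DataByteArray so Pig sees its own bytearray type.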


Discussion Overview
group: user @
categories: pig, hadoop
posted: Sep 27, '10 at 8:30p
active: Sep 28, '10 at 12:11a


