Let me first give some context. I would like to store a datum serialized
with a BinaryEncoder without having to store a schema alongside it (as the
DataFileWriter does). Instead, I have created a container record that stores
a unique ID for the schema version and a payload field of type "bytes". This
gives me a self-describing data object (for example, to place in a cell in
HBase) without the overhead of a schema per object. (Perhaps there is a
better way to do this; if so, please let me know.)

The code looks something like this:

// Wrap the datum's serialized bytes (datumBits) in a container record
// that also carries the schema id.
GenericRecord container = new GenericData.Record(containerSchema);
writer.setSchema(containerSchema);
container.put(CONTAINER_SCHEMA_ID_FIELD,
    datum.getSchema().getProp(SCHEMA_ID_PROPERTY));
container.put(CONTAINER_PAYLOAD_FIELD,
    ByteBuffer.wrap(datumBits.toByteArray()));

// Point the reused encoder at a fresh stream and serialize the container.
ByteArrayOutputStream containerBits = new ByteArrayOutputStream();
encoder.init(containerBits);
writer.write(container, encoder);
encoder.flush();
containerBits.flush();
containerBits.close();

I am trying to reuse an encoder by calling init() to re-initialize it.
Perhaps this is what creates the problem. If I create a new encoder each
time, everything works fine. However, if I only call init(), the
OutputStream for the encoder is reset but the OutputStream for the
SimpleByteWriter within the encoder is not. This seems to be the cause,
because when the encoder is flushed it does not write the bytes held in
the ByteWriter. Perhaps the init() method is not supposed to be used this
way, but it would be nice not to have to create a new encoder each time.

Can you please let me know if the above looks right, and advise me on the
best way to do the serialization?

Thanks,
Dev


On Tue, Jan 18, 2011 at 4:14 AM, Scott Carey wrote:

BinaryEncoder buffers data; you may have to call flush() to see it in the
output stream.
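
The effect described above can be reproduced without Avro. The sketch below is only an analogy: it uses the JDK's BufferedOutputStream as a stand-in for a buffering encoder (the class name FlushDemo and the helper visibleBytes are made up for illustration), and shows that bytes written through a buffer do not reach the underlying stream until flush() is called.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; BufferedOutputStream stands in for a buffering encoder.
final class FlushDemo {

    /** Returns {bytes visible in the sink before flush, bytes visible after flush}. */
    static int[] visibleBytes(byte[] data) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // The 8192-byte buffer is much larger than the data, so the write stays buffered.
        BufferedOutputStream buffered = new BufferedOutputStream(sink, 8192);
        try {
            buffered.write(data, 0, data.length);
            int before = sink.size();   // nothing has reached the sink yet
            buffered.flush();           // push the buffered bytes through
            int after = sink.size();
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = visibleBytes(new byte[] {0, -128, -128, -128});
        // prints "before flush: 0, after flush: 4"
        System.out.println("before flush: " + r[0] + ", after flush: " + r[1]);
    }
}
```

If the output stream looks empty right after a write, inspecting it only after flush() is the first thing to check.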


On 1/17/11 4:53 AM, "Devajyoti Sarkar" wrote:

Hi,

I am just beginning to use Avro, so I apologize if this is a silly
question.

I would like to set a field of type "bytes" in Java. I am assuming that all
I need to do is wrap a byte[] in a ByteBuffer to set the value.
Unfortunately, that does not seem to work. I am using a BinaryEncoder, and
when I look at its output, it has not written any of the bytes that were in
the array. The first four values of the array are 0, -128, -128, -128.

Is it because Java uses 8-bit signed bytes while the Avro spec calls for
8-bit unsigned bytes in a field of type "bytes"? If so, how does one convert
Java bytes to the kind accepted by Avro?

Thanks in advance.

Dev
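
On the signed-versus-unsigned question above: a Java byte and an unsigned 8-bit value share the same bit pattern, so no conversion is needed before wrapping the array in a ByteBuffer. The sketch below uses only the JDK (the class name UnsignedBytesDemo and the helper toUnsigned are made up for illustration) and shows how the sample values 0, -128, -128, -128 read when interpreted as unsigned.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical demo class; it only reinterprets the bytes, it does not change them.
final class UnsignedBytesDemo {

    /** Reads the remaining bytes of buf as unsigned values (0..255) without moving its position. */
    static int[] toUnsigned(ByteBuffer buf) {
        ByteBuffer view = buf.duplicate();   // independent position, same content
        int[] out = new int[view.remaining()];
        for (int i = 0; i < out.length; i++) {
            out[i] = view.get() & 0xFF;      // mask off sign extension
        }
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer payload = ByteBuffer.wrap(new byte[] {0, -128, -128, -128});
        // prints "[0, 128, 128, 128]"
        System.out.println(Arrays.toString(toUnsigned(payload)));
    }
}
```

The signed view is only how Java prints the value; the bits stored in the buffer are the same octets either way.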
