Grokbase Groups Avro user May 2011
I am writing a Hadoop application whose values are objects called Records
which are serialized using Avro. (I specify a Serialization class for the
Records via the io.serializations property.)

I now need to expand my application so that instead of just a Record I have
a more complicated data structure, call it an Augmented Record. Say that an
Augmented Record contains an integer N in addition to the Record, so now the
value looks like (N, Record). Adding an integer field to the Record schema
just to support this one Hadoop process would be a hack, but I also can't
create a Writable (IntWritable, Record) object because Record uses its own
Avro serialization scheme and so is not Writable. What I want to do is
basically create a new schema of the form [Integer: N, Record: R], where the
Record schema is read in dynamically. Can I dynamically nest schemas in this
manner? If not, what is the best approach to serializing an Augmented
Record?

Thanks.


  • Scott Carey at May 16, 2011 at 10:22 pm
    You can dynamically create a record for this job:

    Schema.createRecord( … )
    create a field with the int,
    create a field with the Record,
    put these in a List,
    call setFields() on the record.

    Use that record for the job.

    The result is a record with two fields, the int and the nested Record.
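
The steps above can be sketched in Java roughly as follows. This is a minimal sketch, not a definitive implementation: the class name `AugmentedSchemas`, the record name `AugmentedRecord`, the namespace `example.hadoop`, and the field names `n` and `record` are placeholders that do not come from the thread, and the four-argument `Schema.Field` constructor is assumed to be available in the Avro version in use.

```java
import java.util.Arrays;
import org.apache.avro.Schema;

public class AugmentedSchemas {
    /**
     * Wraps an existing record schema in a new two-field record (n, record).
     * All names used here are hypothetical placeholders for this sketch.
     */
    public static Schema augment(Schema recordSchema) {
        // Create an empty, named record schema; the fields are attached below.
        Schema augmented = Schema.createRecord(
            "AugmentedRecord", null, "example.hadoop", false);

        // Field holding the integer N (no doc string, no default value).
        Schema.Field n = new Schema.Field(
            "n", Schema.create(Schema.Type.INT), null, null);

        // Field holding the nested Record, using whatever schema was loaded at runtime.
        Schema.Field record = new Schema.Field(
            "record", recordSchema, null, null);

        // setFields() may be called only once on a record schema.
        augmented.setFields(Arrays.asList(n, record));
        return augmented;
    }
}
```

Because the nested schema is passed in as a `Schema` object, it can be loaded at job setup time, for example with `new Schema.Parser().parse(...)`, before the augmented schema is built.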

    (In reply to W.P. McNeill's post of 5/16/11, 3:10 PM, above.)
  • Sudharsan Sampath at May 17, 2011 at 4:37 am
    Hi,

    You can create a record with the integer and the record itself as fields and
    use this as the record for the job. Your schema would look something like
    the following, where the complete original record schema is given as the
    type of the second field.

    {
    "name" : "augmentedRecord",
    "type" : "record",
    "fields" : [{
    "name" : "index",
    "type" : "int"
    },{
    "name" : "actualRecord",
    "type" : <<your original schema>>
    }]
    }

    - Sudhan S
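
As a rough illustration of the JSON approach, the sketch below splices a dynamically loaded record schema into the `actualRecord` field's type, parses the result, and round-trips one value through Avro binary encoding. The class name `AugmentedRecordJson` and its method names are hypothetical; the field names `index` and `actualRecord` follow the schema in this reply. It assumes the original schema is a named record whose name differs from `augmentedRecord`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class AugmentedRecordJson {
    /** Builds the augmented schema by embedding the original schema's JSON as a field type. */
    public static Schema augment(Schema recordSchema) {
        String json = "{\"name\": \"augmentedRecord\", \"type\": \"record\", \"fields\": ["
            + "{\"name\": \"index\", \"type\": \"int\"},"
            + "{\"name\": \"actualRecord\", \"type\": " + recordSchema.toString() + "}"
            + "]}";
        return new Schema.Parser().parse(json);
    }

    /** Serializes the pair (index, record) using Avro binary encoding. */
    public static byte[] write(Schema augmented, int index, GenericRecord record)
            throws IOException {
        GenericRecord value = new GenericData.Record(augmented);
        value.put("index", index);
        value.put("actualRecord", record);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(augmented).write(value, encoder);
        encoder.flush(); // the encoder buffers, so flush before reading the bytes
        return out.toByteArray();
    }

    /** Deserializes bytes produced by write() with the same augmented schema. */
    public static GenericRecord read(Schema augmented, byte[] bytes) throws IOException {
        return new GenericDatumReader<GenericRecord>(augmented)
            .read(null, DecoderFactory.get().binaryDecoder(bytes, null));
    }
}
```

How this plugs into a Hadoop job depends on the Serialization class registered under io.serializations, which the thread does not show; the sketch covers only schema construction and encoding.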

Discussion Overview
group: user @
categories: avro
posted: May 16, '11 at 10:10p
active: May 17, '11 at 4:37a
posts: 3
users: 3
website: avro.apache.org
irc: #avro
