Daemon crash with Avro backed tables
Hi Guys,

I have an Avro-backed table. Hive and the Avro tools jar can read the files,
and Impala can describe the table. However, selecting from the table in
Impala causes several daemons to crash.

I1021 11:01:18.022570 8623 status.cc:44] Failed to parse file schema:
Invalid JSON float in json_t_to_avro_value_helper
     @ 0x83af7d (unknown)
     @ 0x922a00 (unknown)
     @ 0x92309b (unknown)
     @ 0x95e44d (unknown)
     @ 0x910a8f (unknown)
     @ 0x90a680 (unknown)
     @ 0x9a36c4 (unknown)
     @ 0x3681c07851 (unknown)
     @ 0x36818e811d (unknown)
I1021 11:01:18.030833 5229 progress-updater.cc:56] Query
9c4f2e4eebf1c7a9:811b8dc272d75e8a: 6% Complete (1951 out of 29457)


My schema is


{
   "type" : "record",
   "name" : "points",
   "fields" : [ {
     "name" : "c1",
     "type" : [ "double", "null" ],
     "default" : null
   }, {
     "name" : "c2",
     "type" : [ "string", "null" ],
     "default" : null
   }, {
     "name" : "c3",
     "type" : [ "string", "null" ],
     "default" : null
   }, {
     "name" : "c4",
     "type" : [ "string", "null" ],
     "default" : null
   }, {
     "name" : "c5",
     "type" : [ "double", "null" ],
     "default" : null
   }, {
     "name" : "c6",
     "type" : [ "double", "null" ],
     "default" : null
   }, {
     "name" : "c7",
     "type" : [ "string", "null" ],
     "default" : null
   }, {
     "name" : "c8",
     "type" : [ "string", "null" ],
     "default" : null
   }, {
     "name" : "c9",
     "type" : [ "double", "null" ],
     "default" : null
   }, {
     "name" : "c10",
     "type" : [ "double", "null" ],
     "default" : null
   }, {
     "name" : "c11",
     "type" : [ "double", "null" ],
     "default" : null
   }, {
     "name" : "c12",
     "type" : [ "double", "null" ],
     "default" : null
   }, {
     "name" : "c13",
     "type" : [ "double", "null" ],
     "default" : null
   }, {
     "name" : "c14",
     "type" : [ "double", "null" ],
     "default" : null
   }, {
     "name" : "c15",
     "type" : [ "double", "null" ],
     "default" : null
   }, {
     "name" : "c16",
     "type" : [ "double", "null" ],
     "default" : null
   }, {
     "name" : "c17",
     "type" : [ "double", "null" ],
     "default" : null
   }, {
     "name" : "c18",
     "type" : [ "double", "null" ],
     "default" : null
   }, {
     "name" : "id1",
     "type" : "int"
   }, {
     "name" : "id2",
     "type" : "int"
   }, {
     "name" : "root_id",
     "type" : "string"
   } ]
}


Describing the table in Impala works. The table is partitioned by columns that
are not in the Avro files (Flume creates the directories).

Query: describe points
Query finished, fetching results ...
+------------+--------+-------------------+
| name       | type   | comment           |
+------------+--------+-------------------+
| c1         | double | from deserializer |
| c2         | string | from deserializer |
| c3         | string | from deserializer |
| c4         | string | from deserializer |
| c5         | double | from deserializer |
| c6         | double | from deserializer |
| c7         | string | from deserializer |
| c8         | string | from deserializer |
| c9         | double | from deserializer |
| c10        | double | from deserializer |
| c11        | double | from deserializer |
| c12        | double | from deserializer |
| c13        | double | from deserializer |
| c14        | double | from deserializer |
| c15        | double | from deserializer |
| c16        | double | from deserializer |
| c17        | double | from deserializer |
| c18        | double | from deserializer |
| id1        | int    | from deserializer |
| id2        | int    | from deserializer |
| root_id    | string | from deserializer |
| deployment | string |                   |
| date_id    | int    |                   |
| hour       | int    |                   |
| q_strategy | string |                   |
| q_fund     | string |                   |
| q_expiry   | string |                   |
+------------+--------+-------------------+
Returned 27 row(s) in 29.33s

To unsubscribe from this group and stop receiving emails from it, send an email to impala-user+unsubscribe@cloudera.org.


  • Nong Li at Oct 21, 2013 at 4:50 pm
    Thanks for letting us know. This looks like an issue with our handling of
    some Avro schemas. What version of Impala are you running?

    I've filed https://issues.cloudera.org/browse/IMPALA-635 to track the issue.

  • Andrew Stevenson at Oct 21, 2013 at 5:03 pm
    Version 1.1.1

    Regards Andrew
  • Skye Wanderman-Milne at Oct 21, 2013 at 7:11 pm
    Hi Andrew,

    I think the problem is that, according to the Avro spec
    (http://avro.apache.org/docs/current/spec.html), "Default values for union
    fields correspond to the first schema in the union", i.e. the default
    value must match the first type in the union. This seems like an
    unnecessary constraint in most cases though, and Impala shouldn't crash
    due to a bad schema, so we'll resolve this issue as soon as possible.

    As a workaround for now, try switching the order of your union types, e.g.,
    use ["null", "double"] instead of ["double", "null"].

    Thanks,
    Skye
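
The workaround above can be sketched as a small script. This is only an illustration, not something posted in the thread: the function names `fields_with_misplaced_null` and `move_null_first` are hypothetical, and the script only rewrites the schema JSON. It checks just the null-default case discussed here and does not touch existing data files.

```python
import copy
import json


def fields_with_misplaced_null(schema):
    """Names of fields whose default is null but whose union does not start with "null"."""
    names = []
    for field in schema.get("fields", []):
        ftype = field.get("type")
        if (
            isinstance(ftype, list)
            and ftype
            and "default" in field
            and field["default"] is None
            and ftype[0] != "null"
        ):
            names.append(field["name"])
    return names


def move_null_first(schema):
    """Return a copy of the schema with "null" moved to the front of those unions."""
    fixed = copy.deepcopy(schema)
    bad = set(fields_with_misplaced_null(fixed))
    for field in fixed.get("fields", []):
        if field["name"] in bad and "null" in field["type"]:
            others = [t for t in field["type"] if t != "null"]
            field["type"] = ["null"] + others
    return fixed


if __name__ == "__main__":
    # A two-field excerpt of the schema from the original post.
    schema = json.loads("""
    {
      "type": "record",
      "name": "points",
      "fields": [
        {"name": "c1", "type": ["double", "null"], "default": null},
        {"name": "id1", "type": "int"}
      ]
    }
    """)
    print(fields_with_misplaced_null(schema))
    print(json.dumps(move_null_first(schema), indent=2))
```

Fields without a union type or without a default, such as id1, are left unchanged. The relative order of the non-null branches is preserved.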



Posted to impala-user on Oct 21, 2013 at 9:28 am; last active Oct 21, 2013 at 7:11 pm (4 posts, 3 users).