Hi,
I have installed Impala 0.6 and CDH 4.2 and set up a cluster with three
data nodes and a namenode. First, I created a table stored as TEXTFILE in
Hive and loaded about 150 million rows into it. I could query the data in
both Hive and impala-shell without any errors, but the queries were too slow
(described at
https://groups.google.com/a/cloudera.org/forum/#!topic/impala-user/loyjtuUaLfI).
To speed things up, I created a second table stored as RCFILE in Hive and
loaded the same data into it. But when I query the RCFILE table, all the
impalad processes running on the data nodes crash. The output of impalad
and impala-shell follows:

----------------------impalad:
.......
13/03/14 21:07:15 INFO DataNucleus.MetaData: Listener found initialisation for persistable class org.apache.hadoop.hive.metastore.model.MDatabase
13/03/14 21:07:15 INFO metastore.HiveMetaStore: 0: get_all_databases
13/03/14 21:07:15 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x0000003281e8905e, pid=15047, tid=140609687783168
#
# JRE version: 6.0_31-b04
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.6-b01 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libc.so.6+0x8905e] unsigned long+0xe
#
# An error report file with more information is saved as:
# /root/hs_err_pid15047.log
#
# If you would like to submit a bug report, please visit:
# http://java.sun.com/webapps/bugreport/crash.jsp
#

----------------impala-shell:

$ impala-shell
Welcome to the Impala shell. Press TAB twice to see a list of available commands.

Copyright (c) 2012 Cloudera, Inc. All rights reserved.

(Build version: Impala v0.6 (720f93c) built on Sat Feb 23 18:52:43 PST 2013)
[Not connected] > connect cloudera-host3
Connected to cloudera-host3:21000
[cloudera-host3:21000] > select count(*) from mpos_gb.gb_xdr_1;
Query: select count(*) from mpos_gb.gb_xdr_1
Unknown Exception : [Errno 104] Connection reset by peer
Query aborted, unable to fetch data
[Not connected] >
--------------------------------------------------------------------------------------------

What is causing this, and what should I do?

I need your help. Thanks!

  • Marcel Kornacker at Mar 14, 2013 at 3:41 pm
    We recently found some bugs in Impala's RCFile scanner, and you may
    well be hitting those. We will have fixes in the next version, 0.7,
    which will be released this coming week; you should give that a try.

    Marcel
  • Zjp... at Mar 15, 2013 at 2:23 am
    Hi Marcel, thanks for your response!
    I will try the next version of Impala. I hope it performs
    better!

    Thanks!


Discussion Overview
group: impala-user
categories: hadoop
posted: Mar 14, '13 at 3:35p
active: Mar 15, '13 at 2:23a
posts: 3
users: 2
website: cloudera.com
irc: #hadoop

2 users in discussion

Zjp...: 2 posts Marcel Kornacker: 1 post
