Hi there,

I installed impala from the packages and have the state store and impalad
running.

I loaded the tab1.csv file into HDFS under /user/hive, as shown in the
documentation.
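
For reference, the load was essentially the tutorial steps, something along
these lines (the file ends up under the warehouse path that shows up in the
error below):

  hdfs dfs -mkdir -p /user/hive/warehouse/tab1
  hdfs dfs -put tab1.csv /user/hive/warehouse/tab1/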

When I use impala-shell to run a select *, I get the following:

[localhost:21000] > select * from tab1;
ERROR: Failed to open HDFS file
hdfs://localhost:8020/user/hive/warehouse/tab1/tab1.csv
Error(255): Unknown error 255
ERROR: Invalid query handle

The impalad logs show:

12/12/24 04:57:21 INFO service.Frontend: createExecRequest for query show
tables
12/12/24 04:57:21 INFO service.JniFrontend:
12/12/24 04:57:21 INFO service.JniFrontend: returned TQueryExecRequest2:
TExecRequest(stmt_type:DDL, sql_stmt:show tables,
request_id:TUniqueId(hi:-2866767995086157791, lo:-6585957689921696349),
query_options:TQueryOptions(abort_on_error:false, max_errors:0,
disable_codegen:false, batch_size:0, return_as_ascii:true, num_nodes:0,
max_scan_range_length:0, num_scanner_threads:0, max_io_buffers:0,
allow_unsupported_formats:false, partition_agg:false),
ddl_exec_request:TDdlExecRequest(ddl_type:SHOW_TABLES, database:default),
result_set_metadata:TResultSetMetadata(columnDescs:[TColumnDesc(columnName:name,
columnType:STRING)]))
12/12/24 04:57:25 INFO metastore.HiveMetaStore: 2: Shutting down the object
store...
12/12/24 04:57:25 INFO HiveMetaStore.audit: ugi=scoulibaly
ip=unknown-ip-addr cmd=Shutting down the object store...
12/12/24 04:57:25 INFO metastore.HiveMetaStore: 2: Metastore shutdown
complete.
12/12/24 04:57:25 INFO HiveMetaStore.audit: ugi=scoulibaly
ip=unknown-ip-addr cmd=Metastore shutdown complete.
12/12/24 04:57:25 INFO metastore.HiveMetaStore: 2: get_all_databases
12/12/24 04:57:25 INFO HiveMetaStore.audit: ugi=scoulibaly
ip=unknown-ip-addr cmd=get_all_databases
12/12/24 04:57:25 INFO metastore.HiveMetaStore: 2: Opening raw store with
implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
12/12/24 04:57:25 INFO metastore.ObjectStore: ObjectStore, initialize called
12/12/24 04:57:25 INFO metastore.ObjectStore: Initialized ObjectStore
12/12/24 04:57:25 INFO metastore.HiveMetaStore: 2: get_tables: db=default
pat=*
12/12/24 04:57:25 INFO HiveMetaStore.audit: ugi=scoulibaly
ip=unknown-ip-addr cmd=get_tables: db=default pat=*
12/12/24 04:57:30 INFO service.Frontend: createExecRequest for query select
* from tab1
12/12/24 04:57:30 INFO metastore.HiveMetaStore: 2: get_table : db=default
tbl=tab1
12/12/24 04:57:30 INFO HiveMetaStore.audit: ugi=scoulibaly
ip=unknown-ip-addr cmd=get_table : db=default tbl=tab1
12/12/24 04:57:30 INFO metastore.HiveMetaStore: 2: get_config_value:
name=hive.exec.default.partition.name
defaultValue=__HIVE_DEFAULT_PARTITION__
12/12/24 04:57:30 INFO HiveMetaStore.audit: ugi=scoulibaly
ip=unknown-ip-addr cmd=get_config_value:
name=hive.exec.default.partition.name
defaultValue=__HIVE_DEFAULT_PARTITION__
12/12/24 04:57:30 INFO metastore.HiveMetaStore: 2: get_fields:
db=defaulttbl=tab1
12/12/24 04:57:30 INFO HiveMetaStore.audit: ugi=scoulibaly
ip=unknown-ip-addr cmd=get_fields: db=defaulttbl=tab1
12/12/24 04:57:30 INFO metastore.HiveMetaStore: 2: get_table : db=default
tbl=tab1
12/12/24 04:57:30 INFO HiveMetaStore.audit: ugi=scoulibaly
ip=unknown-ip-addr cmd=get_table : db=default tbl=tab1
12/12/24 04:57:30 INFO metastore.HiveMetaStore: 2: get_partitions :
db=default tbl=tab1
12/12/24 04:57:30 INFO HiveMetaStore.audit: ugi=scoulibaly
ip=unknown-ip-addr cmd=get_partitions : db=default tbl=tab1
12/12/24 04:57:31 INFO service.JniFrontend: Plan Fragment 0
UNPARTITIONED
EXCHANGE (1)
TUPLE IDS: 0

Plan Fragment 1
RANDOM
STREAM DATA SINK
EXCHANGE ID: 1
UNPARTITIONED

SCAN HDFS table=default.tab1 (0)
TUPLE IDS: 0

12/12/24 04:57:31 INFO service.JniFrontend: returned TQueryExecRequest2:
TExecRequest(stmt_type:QUERY, sql_stmt:select * from tab1,
request_id:TUniqueId(hi:5327623243124263422, lo:-4992899153864226192),
query_options:TQueryOptions(abort_on_error:false, max_errors:0,
disable_codegen:false, batch_size:0, return_as_ascii:true, num_nodes:0,
max_scan_range_length:0, num_scanner_threads:0, max_io_buffers:0,
allow_unsupported_formats:false, partition_agg:false),
query_exec_request:TQueryExecRequest(desc_tbl:TDescriptorTable(slotDescriptors:[TSlotDescriptor(id:0,
parent:0, slotType:INT, columnPos:0, byteOffset:4, nullIndicatorByte:0,
nullIndicatorBit:1, slotIdx:1, isMaterialized:true), TSlotDescriptor(id:1,
parent:0, slotType:BOOLEAN, columnPos:1, byteOffset:1, nullIndicatorByte:0,
nullIndicatorBit:0, slotIdx:0, isMaterialized:true), TSlotDescriptor(id:2,
parent:0, slotType:DOUBLE, columnPos:2, byteOffset:8, nullIndicatorByte:0,
nullIndicatorBit:2, slotIdx:2, isMaterialized:true), TSlotDescriptor(id:3,
parent:0, slotType:TIMESTAMP, columnPos:3, byteOffset:16,
nullIndicatorByte:0, nullIndicatorBit:3, slotIdx:3, isMaterialized:true)],
tupleDescriptors:[TTupleDescriptor(id:0, byteSize:32, numNullBytes:1,
tableId:0)], tableDescriptors:[TTableDescriptor(id:0, tableType:HDFS_TABLE,
numCols:4, numClusteringCols:0,
hdfsTable:THdfsTable(hdfsBaseDir:hdfs://localhost:8020/user/hive/warehouse/tab1,
partitionKeyNames:[], nullPartitionKeyValue:__HIVE_DEFAULT_PARTITION__,
partitions:{-1=THdfsPartition(lineDelim:10, fieldDelim:44,
collectionDelim:44, mapKeyDelim:44, escapeChar:0, fileFormat:TEXT,
partitionKeyExprs:[], blockSize:0, compression:NONE),
1=THdfsPartition(lineDelim:10, fieldDelim:44, collectionDelim:44,
mapKeyDelim:44, escapeChar:0, fileFormat:TEXT, partitionKeyExprs:[],
blockSize:0, compression:NONE)}), tableName:tab1, dbName:default)]),
fragments:[TPlanFragment(plan:TPlan(nodes:[TPlanNode(node_id:1,
node_type:EXCHANGE_NODE, num_children:0, limit:-1, row_tuples:[0],
nullable_tuples:[false], compact_data:false)]),
output_exprs:[TExpr(nodes:[TExprNode(node_type:SLOT_REF, type:INT,
num_children:0, slot_ref:TSlotRef(slot_id:0))]),
TExpr(nodes:[TExprNode(node_type:SLOT_REF, type:BOOLEAN, num_children:0,
slot_ref:TSlotRef(slot_id:1))]), TExpr(nodes:[TExprNode(node_type:SLOT_REF,
type:DOUBLE, num_children:0, slot_ref:TSlotRef(slot_id:2))]),
TExpr(nodes:[TExprNode(node_type:SLOT_REF, type:TIMESTAMP, num_children:0,
slot_ref:TSlotRef(slot_id:3))])],
partition:TDataPartition(type:UNPARTITIONED, partitioning_exprs:[])),
TPlanFragment(plan:TPlan(nodes:[TPlanNode(node_id:0,
node_type:HDFS_SCAN_NODE, num_children:0, limit:-1, row_tuples:[0],
nullable_tuples:[false], compact_data:false,
hdfs_scan_node:THdfsScanNode(tuple_id:0))]),
output_sink:TDataSink(type:DATA_STREAM_SINK,
stream_sink:TDataStreamSink(dest_node_id:1,
output_partition:TDataPartition(type:UNPARTITIONED,
partitioning_exprs:[]))), partition:TDataPartition(type:RANDOM,
partitioning_exprs:[]))], dest_fragment_idx:[0],
per_node_scan_ranges:{0=[TScanRangeLocations(scan_range:TScanRange(hdfs_file_split:THdfsFileSplit(path:hdfs://localhost:8020/user/hive/warehouse/tab1/tab1.csv,
offset:0, length:193, partition_id:1)),
locations:[TScanRangeLocation(server:THostPort(hostname:127.0.0.1,
ipaddress:127.0.0.1, port:50010), volume_id:0)])]},
query_globals:TQueryGlobals(now_string:2012-12-24 04:57:31.000000050)),
result_set_metadata:TResultSetMetadata(columnDescs:[TColumnDesc(columnName:id,
columnType:INT), TColumnDesc(columnName:col_1, columnType:BOOLEAN),
TColumnDesc(columnName:col_2, columnType:DOUBLE),
TColumnDesc(columnName:col_3, columnType:TIMESTAMP)]))
hdfsOpenFile(hdfs://localhost:8020/user/hive/warehouse/tab1/tab1.csv):
FileSystem#open((Lorg/apache/hadoop/fs/Path;I)Lorg/apache/hadoop/fs/FSDataInputStream;)
error:
java.lang.IllegalArgumentException: Wrong FS:
hdfs://localhost:8020/user/hive/warehouse/tab1/tab1.csv, expected:
hdfs://localhost:20500
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:547)
at
org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:169)
at
org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:245)
at
org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:78)
12/12/24 05:02:32 INFO service.Frontend: createExecRequest for query select
* from tab1

I'm scratching my head over the following questions:
- which user is supposed to start the state store and the impalad?
- why is there no init.d script for these daemons?
- how can I get out of this issue? Why on earth is it looking for
something local?

Thank you!

Sekine

--


  • Harsh J at Dec 31, 2012 at 5:15 am
    I believe you may have an incorrect fs.defaultFS value in your
    /etc/hadoop/conf/core-site.xml, perhaps set to hdfs://localhost:20500
    (a bad port) instead of simply hdfs://localhost.
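
    For reference, the corrected property in /etc/hadoop/conf/core-site.xml
    would look something like this (the 8020 port matches the NameNode
    address in the paths from your logs, so treat the exact value as
    illustrative):

      <!-- default filesystem URI used by HDFS clients, including impalad -->
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:8020</value>
      </property>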

    --
    Harsh J

  • Marcel Kornacker at Dec 31, 2012 at 7:38 pm
    Sekine, how did you install Impala? Did you use Cloudera Manager?

    Also, are you able to run that same query through Hive?
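
    A quick way to try that from the shell would be something like:

      hive -e 'select * from tab1;'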

    Marcel

  • Henry Robinson at Feb 27, 2013 at 6:38 pm
    Thanks for the update Sékine!

    When you see problems with our documentation, would you mind filing a bug
    at http://issues.cloudera.org/browse/IMPALA? That way we'll be able to
    track and fix it.

    Henry
    On 27 February 2013 02:27, Sékine Coulibaly wrote:

    This post is quite old, but I wanted to get back and provide some
    information.

    As suggested by Harsh, the port was wrong; in my case it was supposed to
    be 8020.
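
    A quick way to double-check what filesystem URI the client configuration
    actually points at (assuming a stock /etc/hadoop/conf layout) is
    something like:

      hdfs getconf -confKey fs.defaultFS
      # should print hdfs://localhost:8020 once core-site.xml is corrected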

    Let me point out that the Impala documentation seems to present
    command-line snippets using wrong port numbers (especially when starting
    the state store and the impalad).

    I did install Impala manually since, when I started testing, I never
    managed to use CM on a VMware virtual machine. I guess I'll give it a try
    now!

    Thanks for your help


    --
    Henry Robinson
    Software Engineer
    Cloudera
    415-994-6679
  • Sékine Coulibaly at Feb 27, 2013 at 7:19 pm
    Doing it right now! ;)


