We're using Hive and Impala with Flume NG, which writes gzip-compressed
sequence files to HDFS. In Impala v0.1 this worked fine; in 0.3 it
doesn't, and queries fail with "ERROR: java.lang.RuntimeException:
Compressed file not supported:".

Hive tables were created with:
CREATE EXTERNAL TABLE IF NOT EXISTS ....
PARTITIONED BY (day STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS SEQUENCEFILE LOCATION '...';


Hive accesses the data just fine, and as I said, Impala v0.1 was fine as
well. Other than upgrading and restarting statestore and impalad, nothing
has changed on my end since v0.1. We're not running Cloudera Manager, but
we are on CDH 4.1.2.

As far as I can tell
https://github.com/cloudera/impala/commit/e14c238b63dfb175db26861b2a681468fe8c1b11#L110R360 is
what caused this by restricting the available compression formats to LZO.

Backtrace from a simple query:
I0107 19:11:52.957919 20015 impala-server.cc:863] query(): query=select count(*) FROM logs_dash_unicorn
I0107 19:11:52.962441 20015 status.cc:36] java.lang.RuntimeException: Compressed file not supported: hdfs://10.10.0.211/flume/logs/dash-unicorn/day=2013-01-07/dash-unicorn-bigdata-01-0.1357572901365.gz
    at com.cloudera.impala.catalog.HdfsTable.getBlockMetadata(HdfsTable.java:379)
    at com.cloudera.impala.planner.HdfsScanNode.getScanRangeLocations(HdfsScanNode.java:130)
    at com.cloudera.impala.service.Frontend.createExecRequest(Frontend.java:278)
    at com.cloudera.impala.service.JniFrontend.createExecRequest(JniFrontend.java:86)

https://ccp.cloudera.com/display/IMPALA10BETADOC/Appendix+A+-+Compression+Support suggests
that there should be gzip support for sequence files.
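
Since the failing file is named with a .gz suffix, it may help to confirm what the file actually contains: a SequenceFile that uses GzipCodec internally, or a plain gzip stream. The sketch below is only a rough diagnostic, not anything taken from Impala or Flume. It reads the first bytes of a local copy of the file and decodes the SequenceFile header as I understand the Hadoop on-disk layout (magic "SEQ", a version byte, key and value class names, two compression flags, then the codec class name). The function names read_vint, read_text and describe_header are made up for this example.

```python
import io
import struct

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of any raw gzip stream
SEQ_MAGIC = b"SEQ"         # first three bytes of a Hadoop SequenceFile


def read_vint(stream):
    """Decode a Hadoop variable-length integer (WritableUtils-style)."""
    first = struct.unpack("b", stream.read(1))[0]
    if first >= -112:
        return first                      # small values fit in one byte
    negative = first < -120
    extra = (-120 - first) if negative else (-112 - first)
    value = 0
    for byte in stream.read(extra):       # big-endian payload bytes
        value = (value << 8) | byte
    return ~value if negative else value


def read_text(stream):
    """Read a vint-length-prefixed UTF-8 string (used for class names)."""
    length = read_vint(stream)
    return stream.read(length).decode("utf-8")


def describe_header(stream):
    """Return a dict describing the container format found in the stream."""
    magic = stream.read(3)
    if magic[:2] == GZIP_MAGIC:
        return {"container": "raw gzip stream"}
    if magic != SEQ_MAGIC:
        return {"container": "unknown"}
    version = stream.read(1)[0]
    info = {
        "container": "SequenceFile",
        "version": version,
        "key_class": read_text(stream),
        "value_class": read_text(stream),
    }
    # Assumption: flag layout follows the SequenceFile version history
    # (compression flag after v2, block flag from v4, codec name from v5).
    compressed = version > 2 and stream.read(1) == b"\x01"
    block_compressed = version >= 4 and stream.read(1) == b"\x01"
    info["compressed"] = compressed
    info["block_compressed"] = block_compressed
    info["codec"] = read_text(stream) if compressed and version >= 5 else None
    return info
```

To use it, copy the first few hundred bytes of the file to local disk (for example with hadoop fs -cat piped through head -c 256), open that file in binary mode, and pass it to describe_header. If the result reports a SequenceFile with a gzip codec, the data matches what the compression appendix describes, and the error would more likely come from how the file is classified than from its contents.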

Something obvious I'm missing, or might this be a regression?

Thanks,
Noah

Posted to impala-user by Noah Lorang, Jan 8, '13 at 4:58p.
