Are you able to run ANALYZE TABLE <table name> COMPUTE STATISTICS on
a text-based table? If that also fails, it is most likely a
configuration problem. Please make sure you have made the
hive-site.xml config changes suggested in the Impala docs:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_performance.html
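As a rough sketch of that sanity check, something like the following could be run from the Hive shell. The name my_text_table is only a placeholder for any existing text-format table, and the exact stats shown will depend on your Hive version:

```sql
-- my_text_table is a placeholder for an existing text-format table.
ANALYZE TABLE my_text_table COMPUTE STATISTICS;

-- If the job succeeds, the table parameters should now include
-- basic stats such as numRows and totalSize.
DESCRIBE FORMATTED my_text_table;
```

If this fails on a plain text table too, the problem is probably in the Hive stats configuration rather than anything Parquet-specific.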
If this problem only impacts Parquet, it would be helpful to get:
1) The log file referenced in the Hive output:
/tmp/root/root_20130816153030_624d33c1-2bd5-4a20-b120-a45f7bf8e967.log
2) The output from executing "DESCRIBE FORMATTED <table name>"
Note that you will also need to make the change Nong mentioned in order
to query the table in Hive (including computing table stats), and then
change back to the Impala metadata once you have computed table and
column stats. This will be fixed in the v1.1.1 release, which we are
finishing up as I write.
Nong - "To make the table readable by Hive, the metadata needs to be:
SerDe: parquet.hive.serde.ParquetHiveSerDe
InputFormat: parquet.hive.DeprecatedParquetInputFormat
OutputFormat: parquet.hive.DeprecatedParquetOutputFormat
"
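The round trip described above might look roughly like this in the Hive shell. This is a sketch under assumptions, not a verified procedure: my_parquet_table, col_a and col_b are placeholders, the ALTER TABLE syntax accepted can vary between Hive versions, and the original Impala metadata values are deliberately left as placeholders to be copied from your own DESCRIBE FORMATTED output.

```sql
-- 1. Record the current (Impala) SerDe, InputFormat and OutputFormat first.
DESCRIBE FORMATTED my_parquet_table;

-- 2. Switch to the Hive-readable metadata quoted from Nong.
ALTER TABLE my_parquet_table SET SERDE 'parquet.hive.serde.ParquetHiveSerDe';
ALTER TABLE my_parquet_table SET FILEFORMAT
  INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
  OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat';

-- 3. Compute table stats, then column stats (col_a, col_b are placeholders).
ANALYZE TABLE my_parquet_table COMPUTE STATISTICS;
ANALYZE TABLE my_parquet_table COMPUTE STATISTICS FOR COLUMNS col_a, col_b;

-- 4. Restore the values recorded in step 1. These are placeholders;
--    fill them in from your own step 1 output before running.
-- ALTER TABLE my_parquet_table SET SERDE '<original SerDe>';
-- ALTER TABLE my_parquet_table SET FILEFORMAT
--   INPUTFORMAT '<original InputFormat>'
--   OUTPUTFORMAT '<original OutputFormat>';
```

Saving the step 1 output before altering anything matters, because it is the only record of what to restore in step 4.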
Thanks,
Lenni
Software Engineer - Cloudera
On Tue, Aug 20, 2013 at 9:54 PM, Andrew Stevenson wrote:
Does anybody have any ideas? Since statistics improve performance, it's disappointing that Impala can't benefit from them with Parquet tables.
To unsubscribe from this group and stop receiving emails from it, send an email to impala-user+unsubscribe@cloudera.org.