*Your query has the following error(s):*
AnalysisException: Failed to load metadata for table: comb_prod_hier CAUSED
BY: TableLoadingException: Failed to load metadata for table:
comb_prod_hier CAUSED BY: RuntimeException: Compressed file not supported
without compression input format:
hdfs://nameservice1/user/hive/warehouse/comb_prod_hier/part-m-00000.lzo
[root@ahad]# rpm -qa | grep impala
impala-lzo-1.0.1-1.gplextras.p0.84.el5
impala-1.0.1-1.p0.888.el5
impala-shell-1.0-1.p0.819.el5
impala-lzo-debuginfo-1.0.1-1.gplextras.p0.84.el5
hue-impala-2.2.0+189-1.cdh4.2.0.p0.8.el5
Please advise.
On Monday, July 8, 2013 6:40:13 PM UTC-5, Venkata Gattala wrote:
Did you make this work? I have the same setup in CM for LZO compression, and
it works great when I query in Hive or in Hue Beeswax. If I run the same SQL
select in Impala, it throws an error, as you know. Can you please guide me
through the steps to make this work?
Thanks a million for your help
Deepak Gattala
On May 13, 2013 3:33 PM, "Mark" wrote:
Got it. There is a "Deploy Configuration" option in the Cluster actions
dropdown that will update the local config files on all of the nodes. Sweet!
On May 13, 2013, at 1:01 PM, Alex Behm wrote:
As far as I know CM should configure the machines in such a way that your
shell should pick up the proper configuration.
I'm not an expert on CM, so I've added scm-users@cloudera. They may be
able to provide you with more details.
Cheers,
Alex
On Fri, May 10, 2013 at 8:48 AM, Mark wrote:
Thanks for the help. I'll try this sometime today and let you know how
it works out.
Would you mind clarifying some things for me, though? It seems
everything configured via CM doesn't apply when running any command-line
tools from individual nodes, correct? Is there any reason CM doesn't modify
the appropriate configuration files on the cluster? Will I need to modify
each node's local configuration files for these command-line tools to work,
or will it be sufficient to modify just the one I'm issuing the command
from? Is there any wrapper script I can use that will effectively use the
configurations stored in CM?
Thanks
On May 9, 2013, at 7:23 PM, Alex Behm wrote:
Here is another idea.
Since you configured LZO support via CM, maybe you can follow the steps
here to make sure the "hadoop" command that you are running from the shell
is picking up the proper configuration.
https://ccp.cloudera.com/display/express37/Generating+Client+Configuration
Basically, you can use CM to export its config files to a .zip file.
Then you unzip those configs on your client (where you want to run the
"hadoop" command from) and set an environment variable HADOOP_CONF_DIR to
point to those configs.
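A minimal sketch of those last two steps on a client node (the directory path here is made up; point it at wherever you actually unzipped the CM export):

```shell
# Stand-in for the directory where the CM-exported client configs were
# unzipped; replace with your real path (the zip contents vary by CM version).
mkdir -p "$HOME/cm-client-config"

# Point the hadoop CLI at those configs for this shell session.
export HADOOP_CONF_DIR="$HOME/cm-client-config"
echo "hadoop will read configs from: $HADOOP_CONF_DIR"

# Any subsequent "hadoop" command run from this shell now uses those configs,
# e.g.: hadoop fs -ls /
```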
This should ensure that the "hadoop" command is picking up the proper
configs. Can you try that and see if it resolves the issue?
Let me know if you need help with a specific step in that process.
Cheers,
Alex
On Thu, May 9, 2013 at 5:02 PM, Mark wrote:
I have made all those changes via CM in Hue, and I've confirmed that
LZO support does indeed work when running tasks through the Oozie workflow
manager. I've also confirmed that Impala works over LZO (unindexed) files
via Hue.
Seems like everything through the traditional command line hasn't been
configured with LZO support though. This is exactly what I'm running into
with this:
https://groups.google.com/a/cloudera.org/forum/?fromgroups=#!topic/cdh-user/hPKf5C-0yaM
Do I need to configure the cluster via CM and on the filesystem?
On May 9, 2013, at 4:55 PM, Alex Behm wrote:
Hi Mark,
the error indicates that the job was not able to find an applicable
compression codec based on the ".lzo" file extension. The job consults the
hadoop configuration files to determine such codecs.
To run the indexer, it is required that you follow the steps in:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_lzo.html
In particular, you should make the changes to the core-site.xml
configuration files. Those changes are also required to make the indexer
run.
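For reference, the core-site.xml entries that register the LZO codecs typically look like the sketch below; the exact codec list should mirror what your cluster already configures, with the two LZO classes appended:

```xml
<!-- Sketch only: keep whatever codecs your cluster already lists and
     append the LZO ones from the hadoop-lzo (gplextras) package. -->
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
```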
Cheers,
Alex
On Thu, May 9, 2013 at 1:45 PM, Marcel Kornacker wrote:
Mark, you might also want to take a look at our documentation about
how to use lzo-compressed files with Impala:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_lzo.html
On Thu, May 9, 2013 at 4:22 PM, Alex Behm <alex.behm@cloudera.com>
wrote:
Hi Mark,
you can point the LZO indexer to an HDFS directory and the indexer job will
traverse all sub-directories. It will look for files with the ".lzo"
extension and create a ".index" file for each such file.
For example, suppose you had the following HDFS directory structure for
managing your "searches" table partitioned by "day":
/searches/1/file_a.lzo
/searches/1/file_b.lzo
/searches/2/file_a.lzo
/searches/3/file_a.lzo
You can point the LZO indexer to the "/searches/" directory and it will
create appropriate index files resulting in:
/searches/1/file_a.lzo
/searches/1/file_a.index
/searches/1/file_b.lzo
/searches/1/file_b.index
/searches/2/file_a.lzo
/searches/2/file_a.index
/searches/3/file_a.lzo
/searches/3/file_a.index
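The indexer run itself is a single MapReduce job; a sketch is below. The jar path is a guess (the hadoop-lzo jar name varies by version and install location), so check what your gplextras package actually installed:

```shell
# Hypothetical jar location; on CDH nodes the hadoop-lzo jar usually lives
# under /usr/lib/hadoop/lib/. Adjust to the jar your gplextras package installed.
LZO_JAR=/usr/lib/hadoop/lib/hadoop-lzo.jar

if command -v hadoop >/dev/null 2>&1; then
  # DistributedLzoIndexer walks the directory tree and writes a .index
  # file next to every .lzo file it finds.
  hadoop jar "$LZO_JAR" com.hadoop.compression.lzo.DistributedLzoIndexer /searches
else
  echo "hadoop CLI not found; run this on a cluster node"
fi
```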
Hope it helps!
Cheers,
Alex
On Thu, May 9, 2013 at 11:37 AM, Mark <static.void.dev@gmail.com> wrote:
So I'm using an external table partitioned by day. Will I need to run
this indexer over each partition? I'm guessing so, since there is no single
table directory anywhere. Also, where do these index files get stored?
Thanks
On May 8, 2013, at 5:41 PM, Alex Behm <alex.behm@cloudera.com> wrote:
Sure, you can run the indexer on external tables.
You can follow the steps documented here:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_lzo.html
Basically, you need to install another software package, change a few
configuration options, and then run the indexer on whatever
tables you have.
The indexer is nothing but a Hadoop MapReduce job.
Cheers,
Alex
On Wed, May 8, 2013 at 3:56 PM, Mark <static.void.dev@gmail.com> wrote:
Thanks. Could you please explain the indexing process a bit further?
Can this be used on an external table?
On May 8, 2013, at 2:50 PM, Alex Behm <alex.behm@cloudera.com> wrote:
Hi Mark,
the error indicates that the table metadata obtained from the Hive
metastore is inconsistent with the ".lzo" file suffix, i.e., the
metadata says that the file format is not LZO-compressed text.
When creating the external table, did you use the following proper 'stored
as' clause?
STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
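Put together, a full DDL statement using that clause might look like the sketch below; the column list, delimiter, and LOCATION are illustrative assumptions, not from the thread:

```sql
-- Illustrative only: columns, delimiter, and LOCATION are assumptions.
CREATE EXTERNAL TABLE searches (line STRING)
PARTITIONED BY (day STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
  INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/searches';
```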
Btw, Impala is able to query unindexed LZO-compressed text files, but it
is strongly encouraged to index the data for performance reasons.
Cheers,
Alex
On Wed, May 8, 2013 at 2:01 AM, Harsh J <harsh@cloudera.com> wrote:
Best to ask Impala questions on the Impala user lists
(impala-user@cloudera.org), which I've added here.
Yes, Impala can query LZO tables iff they are also indexed. Have you
installed the gplextras packages required for this functionality, as
documented at the URL below?
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_lzo.html
On Wed, May 8, 2013 at 3:19 AM, StaticVoid <static.void.dev@gmail.com>
wrote:
Can you use Impala on an external table that has LZO compressed files?
Your query has the following error(s):
AnalysisException: Failed to load metadata for table: searches CAUSED BY:
TableLoadingException: Failed to load metadata for table: searches CAUSED
BY: RuntimeException: Compressed file not supported without compression
input format:
hdfs://hadoop-master.mycompany.com:8020/user/root/rails/archive/search/2013/05/06/part-r-00000.lzo
--
Harsh J