FAQ
I am still getting:


*Your query has the following error(s):*

AnalysisException: Failed to load metadata for table: comb_prod_hier CAUSED
BY: TableLoadingException: Failed to load metadata for table:
comb_prod_hier CAUSED BY: RuntimeException: Compressed file not supported
without compression input format:
hdfs://nameservice1/user/hive/warehouse/comb_prod_hier/part-m-00000.lzo



[root@ahad]# rpm -qa | grep impala
impala-lzo-1.0.1-1.gplextras.p0.84.el5
impala-1.0.1-1.p0.888.el5
impala-shell-1.0-1.p0.819.el5
impala-lzo-debuginfo-1.0.1-1.gplextras.p0.84.el5
hue-impala-2.2.0+189-1.cdh4.2.0.p0.8.el5



Please advise.
On Monday, July 8, 2013 6:40:13 PM UTC-5, Venkata Gattala wrote:

Did you make this work? I had the same setup in CM to do the LZO compression,
and it works great when I query in Hive or in Hue Beeswax. If I run the same
SQL select in Impala it throws an error, as you know. Can you please guide me
through the steps to make this work?

Thanks a million for your help
Deepak Gattala
On May 13, 2013 3:33 PM, "Mark" wrote:

Got it. There is a "Deploy Configuration" option in the Cluster actions
dropdown that will update the local config files on all of the nodes. Sweet!



On May 13, 2013, at 1:01 PM, Alex Behm wrote:

As far as I know, CM should configure the machines in such a way that your
shell should pick up the proper configuration.
I'm not an expert at CM, so I've added scm-users@cloudera. They may be able
to provide you with more details.

Cheers,

Alex

On Fri, May 10, 2013 at 8:48 AM, Mark wrote:

Thanks for the help. I'll try this sometime today and let you know how
it works out.

Would you mind clarifying some things for me, though? So it seems
everything configured via CM doesn't apply when running any command-line
tools from individual nodes, correct? Is there any reason CM doesn't modify
the appropriate configuration files on the cluster? Will I need to modify
each node's local configuration files for these command-line tools to work,
or will it be sufficient to modify just the one I'm issuing the command
from? Is there any wrapper script I can use that will effectively use the
configurations stored in CM?

Thanks

On May 9, 2013, at 7:23 PM, Alex Behm wrote:

Here is another idea.
Since you configured LZO support via CM, maybe you can follow the steps here
to make sure the "hadoop" command that you are running from the shell is
picking up the proper configuration.


https://ccp.cloudera.com/display/express37/Generating+Client+Configuration

Basically, you can use CM to export its config files to a .zip file.
Then you unzip those configs on your client (where you want to run the
"hadoop" command from) and set an environment variable HADOOP_CONF_DIR to
point to those configs.
This should ensure that the "hadoop" command is picking up the proper
configs. Can you try that and see if it resolves the issue?
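On a client node, those steps might look something like this (the zip filename and target directory are assumptions; use whatever path CM gave you for the exported configs):

```shell
# Unpack the client configuration exported from CM. The filename here is
# hypothetical; CM's download is typically a zip of the client configs.
unzip hdfs-clientconfig.zip -d /etc/hadoop/cm-client-conf

# Point the "hadoop" command at those configs for this shell session.
export HADOOP_CONF_DIR=/etc/hadoop/cm-client-conf

# Sanity check: this listing should now run against the CM-managed cluster
# configuration rather than any stale local files.
hadoop fs -ls /
```

Adding the `export` line to the shell profile makes it stick across sessions.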

Let me know if you need help with a specific step in that process.

Cheers,

Alex


On Thu, May 9, 2013 at 5:02 PM, Mark wrote:

I have made all those changes via CM in Hue, and I've confirmed that LZO
support does indeed work when running tasks through the Oozie workflow
manager. I've also confirmed that Impala works over LZO (unindexed) files
via Hue.

Seems like everything through the traditional command line hasn't been
configured with LZO support though. This is exactly what I'm running into
with this:
https://groups.google.com/a/cloudera.org/forum/?fromgroups=#!topic/cdh-user/hPKf5C-0yaM

Do I need to configure the cluster via CM and on the filesystem?



On May 9, 2013, at 4:55 PM, Alex Behm wrote:

Hi Mark,

the error indicates that the job was not able to find an applicable
compression codec based on the ".lzo" file extension. The job consults the
Hadoop configuration files to determine such codecs.
To run the indexer, you must follow the steps in:

http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_lzo.html

In particular, you should make the changes to the core-site.xml
configuration files. Those changes are also required to make the indexer
run.
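For reference, the kind of core-site.xml change the linked documentation describes is registering the LZO codec classes; a minimal sketch (verify the exact property values against the docs for your CDH version):

```xml
<!-- Register the LZO/LZOP codecs so the ".lzo" extension resolves to a codec. -->
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
```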

Cheers,

Alex

On Thu, May 9, 2013 at 1:45 PM, Marcel Kornacker wrote:

Mark, you might also want to take a look at our documentation about
how to use lzo-compressed files with Impala:

http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_lzo.html

On Thu, May 9, 2013 at 4:22 PM, Alex Behm <alex.behm@cloudera.com> wrote:
Hi Mark,

you can point the LZO indexer to an HDFS directory and the indexer job will
traverse all sub-directories. It will look for files with the ".lzo"
extension and create a ".index" file for each such file.
For example, suppose you had the following HDFS directory structure for
managing your "searches" table partitioned by "day":
/searches/1/file_a.lzo
/searches/1/file_b.lzo
/searches/2/file_a.lzo
/searches/3/file_a.lzo

You can point the LZO indexer to the "/searches/" directory and it will
create appropriate index files resulting in:
/searches/1/file_a.lzo
/searches/1/file_a.index
/searches/1/file_b.lzo
/searches/1/file_b.index
/searches/2/file_a.lzo
/searches/2/file_a.index
/searches/3/file_a.lzo
/searches/3/file_a.index
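The indexer run over that directory is a single job launch; a sketch, assuming the hadoop-lzo jar location of a typical GPL Extras install (check the actual path on your cluster):

```shell
# Index every .lzo file under /searches, recursing into the day
# sub-directories. DistributedLzoIndexer is the MapReduce indexer
# class shipped in the hadoop-lzo jar.
hadoop jar /usr/lib/hadoop/lib/hadoop-lzo.jar \
    com.hadoop.compression.lzo.DistributedLzoIndexer /searches
```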

Hope it helps!

Cheers,

Alex






On Thu, May 9, 2013 at 11:37 AM, Mark <static.void.dev@gmail.com> wrote:
So I'm using an external table with partitions by day. Will I need to run
this indexer over each partition? I'm guessing so since there is no "real"
table anywhere. Also, where do these index files get stored?

Thanks

On May 8, 2013, at 5:41 PM, Alex Behm <alex.behm@cloudera.com> wrote:
Sure, you can run the indexer on external tables.

You can follow the steps documented here:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_lzo.html
Basically, you need to install another software package, change a few
configuration options, and then run the indexer on whatever compressed LZO
tables you have.
The indexer is nothing but a Hadoop MapReduce job.

Cheers,

Alex




On Wed, May 8, 2013 at 3:56 PM, Mark <static.void.dev@gmail.com> wrote:
Thanks. Could you please explain the indexing process a bit further?
Can this be used on an external table?

On May 8, 2013, at 2:50 PM, Alex Behm <alex.behm@cloudera.com> wrote:
Hi Mark,

the error indicates that the table metadata obtained from the Hive
metastore is inconsistent with the ".lzo" file suffix, i.e., the table
metadata says that the file format is not LZO-compressed text.
When creating the external table, did you use the proper 'stored as'
clause?

STORED AS
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
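Put together, a create statement using that clause might look like this (the table name, columns, and location are hypothetical, for illustration only):

```sql
-- Sketch of an external table over LZO-compressed text files; adapt the
-- schema, delimiter, and LOCATION to your data.
CREATE EXTERNAL TABLE searches (
  query STRING,
  ts    BIGINT
)
PARTITIONED BY (day INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS
  INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/searches';
```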
Btw, Impala is able to query unindexed LZO-compressed text files, but
indexing the data is strongly encouraged for performance reasons.

Cheers,

Alex


On Wed, May 8, 2013 at 2:01 AM, Harsh J <harsh@cloudera.com> wrote:
Best to ask Impala questions on the Impala user lists
(impala-user@cloudera.org), which I've added here.

Yes, Impala can query LZO tables iff they are also indexed. Have you
installed the gplextras packages required for this functionality, as
documented at
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_lzo.html
On Wed, May 8, 2013 at 3:19 AM, StaticVoid <static.void.dev@gmail.com> wrote:
Can you use impala on an external table that has LZO compressed files?

Your query has the following error(s):

AnalysisException: Failed to load metadata for table: searches CAUSED BY:
TableLoadingException: Failed to load metadata for table: searches CAUSED BY:
RuntimeException: Compressed file not supported without compression input format:
hdfs://hadoop-master.mycompany.com:8020/user/root/rails/archive/search/2013/05/06/part-r-00000.lzo
--



--
Harsh J


  • Venkata Gattala at Jul 9, 2013 at 5:01 pm
    Oops, sorry: I had to restart the Impala services and it all worked. Also
    make sure the input and output formats are compatible, as explained by the
    experts in this post.

    Thank you all for your valuable inputs.
    Thanks
    Deepak Gattala

Discussion Overview
group: scm-users
categories: hadoop
posted: Jul 9, '13 at 4:57p
active: Jul 9, '13 at 5:01p
posts: 2
users: 1
website: cloudera.com
irc: #hadoop

1 user in discussion

Venkata Gattala: 2 posts

