I'm using JDBC to access an Impala server. Here are the steps:

    1. If the HDFS directory for data exists, delete the directory;
    2. Create the HDFS directory;
    3. Create the external Hive table if it doesn't exist;
    4. Create a partition if it doesn't exist;
    5. Run a MapReduce job to write data into the HDFS directory (could take
    20 mins);
    6. Run an Impala query against the partition.

I use the HDFS API in steps 1 and 2, and JDBC to impalad in steps 3, 4 and 6.
As I understand it, I don't need to run "refresh" because I create the table
and partition in Impala directly.

However, the Impala query in step 6 returned nothing in my JDBC code, and the
same query returned nothing in impala-shell as well. Once I ran refresh in
impala-shell, the query returned results.

Impala seems to remember the state from when the partition's directory was
empty. Does Impala really behave that way?

Is there a way I can show partitions in Impala?

Thanks.

Ben


  • Marcel Kornacker at May 22, 2013 at 11:24 pm

    On Wed, May 22, 2013 at 4:21 PM, Bewang Tech wrote:
    > Impala seems to remember the state from when the partition's directory
    > was empty. Does Impala really behave that way?

    Impala caches all metadata related to a table, which includes the file
    system metadata such as which files are part of a directory, the block
    replicas of a file, etc.
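
A minimal sketch of what this caching seems to imply for the workflow in the question: files written by the MapReduce job after the partition was created apparently stay invisible until the table's cached metadata is reloaded, which is what running refresh did in impala-shell. The class and method names (`ImpalaRefresh`, `refreshSql`, `refreshTable`) and the table name `logs` are hypothetical; only the `REFRESH` statement and the standard `java.sql` calls are taken as given.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.regex.Pattern;

public class ImpalaRefresh {
    // Accept only plain [db.]table identifiers so the name can be concatenated safely.
    private static final Pattern TABLE_NAME =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*(\\.[A-Za-z_][A-Za-z0-9_]*)?");

    /** Builds the REFRESH statement for one table (hypothetical helper). */
    public static String refreshSql(String table) {
        if (table == null || !TABLE_NAME.matcher(table).matches()) {
            throw new IllegalArgumentException("Unexpected table name: " + table);
        }
        return "REFRESH " + table;
    }

    /** Sends REFRESH for the table over an already open JDBC connection to impalad. */
    public static void refreshTable(Connection conn, String table) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.execute(refreshSql(table));
        }
    }

    public static void main(String[] args) {
        // "logs" is a placeholder table name; this only prints the statement.
        System.out.println(refreshSql("logs"));
    }
}
```

In the original step order, `refreshTable(conn, "logs")` would be called after the MapReduce job in step 5 has finished and before the query in step 6.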
  • Bewang Tech at May 23, 2013 at 5:16 pm
    Thanks Marcel.

    I adjusted my steps to create the partition after the MapReduce job
    finishes, and it works this time.

        1. If the HDFS directory for the data exists, delete the directory;
        2. Create the HDFS directory;
        3. Create the external Hive table if it doesn't exist;
        4. >> Drop the partition if it already exists;
        5. Run a MapReduce job to write data into the HDFS directory (could take
        20 mins);
        6. >> Create the partition;
        7. Run an Impala query against the partition.
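
The revised order of steps 4 to 6 could be sketched in Java roughly as follows. This is an illustration under assumptions, not the code actually used: the class `PartitionWorkflow`, the table `logs`, the partition column `day`, and the HDFS path are hypothetical, the MapReduce job is passed in as a `Runnable` placeholder, and the `IF EXISTS` clause assumes an Impala release that accepts it. The HDFS delete and create in steps 1 and 2 are left out.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

public class PartitionWorkflow {
    // Inputs are concatenated as-is, so they are assumed to be trusted values.

    /** Step 4: statement that drops the partition if it is already registered. */
    public static String dropPartitionSql(String table, String column, String value) {
        return "ALTER TABLE " + table
                + " DROP IF EXISTS PARTITION (" + column + "='" + value + "')";
    }

    /** Step 6: statement that registers the partition once its data files exist. */
    public static String addPartitionSql(String table, String column, String value,
                                         String location) {
        return "ALTER TABLE " + table
                + " ADD PARTITION (" + column + "='" + value + "')"
                + " LOCATION '" + location + "'";
    }

    /** Runs one statement on its own short-lived Statement object. */
    private static void execute(Connection conn, String sql) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.execute(sql);
        }
    }

    /** Steps 4 to 6: drop the partition, run the job, then add the partition. */
    public static void run(Connection conn, Runnable mapReduceJob, String table,
                           String column, String value, String location)
            throws SQLException {
        execute(conn, dropPartitionSql(table, column, value));
        mapReduceJob.run(); // placeholder: must block until the job has written its output
        execute(conn, addPartitionSql(table, column, value, location));
    }

    public static void main(String[] args) {
        // Placeholder names; this only prints the two statements.
        System.out.println(dropPartitionSql("logs", "day", "2013-05-22"));
        System.out.println(addPartitionSql("logs", "day", "2013-05-22",
                "/data/logs/2013-05-22"));
    }
}
```

Because the partition is only added after the job has finished, Impala first sees the directory when the files are already in place, which matches the behaviour reported above.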


Discussion Overview
group: impala-user
categories: hadoop
posted: May 22, 2013 at 11:24 pm
active: May 23, 2013 at 5:16 pm
posts: 3
users: 2
website: cloudera.com
irc: #hadoop
