Hi Alex,

Here's the link to the Impala doc that explains how to insert into a
partitioned table:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_langref_sql.html?scroll=insert_unique_1

Here's an example from the doc:

create table t1 (i int) partitioned by (x int, y string);

-- Select an INT column from another table.
-- All inserted rows will have the same x and y values, as specified
-- in the INSERT statement.
insert into t1 partition(x=10, y='a') select c1 from some_other_table;

-- Select two INT columns from another table.
-- All inserted rows will have the same y value, as specified in the
-- INSERT statement.
-- Values from c2 go into t1.x.
-- Any partitioning columns whose value is not specified are filled in
-- from the columns specified last in the SELECT list.
insert into t1 partition(x, y='b') select c1, c2 from some_other_table;

-- Select an INT and a STRING column from another table.
-- All inserted rows will have the same x value, as specified in the
-- INSERT statement.
-- Values from c3 go into t1.y.
insert into t1 partition(x=20, y) select c1, c3 from some_other_table;



On Mon, Jan 13, 2014 at 1:12 PM, Alan Choi wrote:

Hi Alex,

It'll be a multi-step process. First, you create a non-partitioned
external table (let's call it src_tbl) pointing to the CSV files. Then you
create a partitioned table (let's call it dst_tbl). After that, you have to run:

insert into dst_tbl select * from src_tbl;
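
The three steps above can be sketched end to end as follows. This is only a
sketch under assumptions: the column names (id, val), the HDFS path and the
comma delimiter are hypothetical, and the insert uses a PARTITION clause in
the style of the documentation example quoted earlier in the thread, so
that one partition is created per distinct value of "type".

```sql
-- Step 1: non-partitioned external table over the existing CSV files.
-- Column names (id, val) and the LOCATION path are hypothetical.
-- `type` is back-quoted in case it is treated as a reserved word.
CREATE EXTERNAL TABLE src_tbl (id INT, val STRING, `type` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/alex/csv_input';

-- Step 2: partitioned destination table. The partition column appears
-- only in PARTITIONED BY, not in the regular column list.
CREATE TABLE dst_tbl (id INT, val STRING)
PARTITIONED BY (`type` STRING);

-- Step 3: dynamic partition insert. No value is given for `type` in the
-- PARTITION clause, so it is filled from the last column of the SELECT
-- list, and rows with the same value land in the same partition.
INSERT INTO dst_tbl PARTITION (`type`)
SELECT id, val, `type` FROM src_tbl;
```

Listing the columns explicitly rather than using select * keeps the
partition column last in the SELECT list, which is where the unspecified
partition value is taken from.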

Thanks,
Alan


On Mon, Jan 13, 2014 at 3:10 AM, Alexander Schätzle <schaetzle.alexander@gmail.com> wrote:
Hi,

Is there a way to automatically partition a table according to the
distinct values of a column?
Let's say we have a column "type" and we want to partition the table such
that all entries with the same type are stored in one partition. The table
is generated by loading a CSV file. We want to load this file and generate
a table partitioned by the values of the column "type". We expected Impala
to do this automatically when the table is defined as partitioned by
"type", but it seems that we have to create the partitions on our own. This
is good if we want to control which values are stored together in a
partition, but there should be a kind of default behaviour that generates
partitions by distinct values, because I think this is something that is
often wanted (e.g. partition by year).
Is there a simple solution for that, or are we missing something?

Thx a lot in advance for clarification.

Best regards,
Alex

To unsubscribe from this group and stop receiving emails from it, send an
email to impala-user+unsubscribe@cloudera.org.
