You can add new files to the table's directory and Impala will read them
without any change to the table definition, although you will need to
issue a 'refresh' command before Impala picks them up.
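
As a rough sketch of that workflow (not a definitive recipe), the Python
below builds the two steps: copy the new file into the table's HDFS
directory with `hdfs dfs -put`, then run a REFRESH statement through
`impala-shell -q`. The file name, the warehouse path, the table name
`sales` and both helper functions are hypothetical, chosen only for
illustration. It also assumes that `hdfs` and `impala-shell` are on the
PATH and that impala-shell can reach an impalad with its default settings.

```python
import shlex
import subprocess


def build_append_commands(local_file, table_dir, table_name):
    """Return the two commands: copy the file into the table directory, then refresh."""
    # Step 1: put the new Parquet file next to the existing ones.
    put_cmd = ["hdfs", "dfs", "-put", local_file, table_dir]
    # Step 2: ask Impala to pick up the new file for this table.
    refresh_cmd = ["impala-shell", "-q", f"REFRESH {table_name}"]
    return [put_cmd, refresh_cmd]


def append_parquet_file(local_file, table_dir, table_name, dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    commands = build_append_commands(local_file, table_dir, table_name)
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True stops at the first failing step.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical file, directory and table names.
    append_parquet_file(
        "sales_2013-06-05.parquet",
        "/user/hive/warehouse/sales",
        "sales",
    )
```

With the default dry_run=True the script only prints the two commands, so
you can check them before running anything against a real cluster.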
Best,
Henry
On 5 June 2013 03:18, Nick Corbett wrote:
Hi
I have lots of data from various sources that I want to analyse using
Impala and, since most queries will only focus on a few columns, it makes
sense to use the Parquet column storage format. My question is how do I
deal with new data? For example, I might build a table containing all
my sales data. After one day, I will have new sales data that I need to add.
Do I have to rebuild the whole table, or can I add new Parquet files to
the same directory so that Impala 'sees' them as part of the same table?
Hope that makes sense
Nick
--
Henry Robinson
Software Engineer
Cloudera
415-994-6679