I also have a similar issue. I am on CDH 4.5 and I am trying to insert into a Parquet table from a raw table in Hive.
[ausgtmhadoop01:21000] > insert INTO gbl_sdr_aud_t.customer_product_append
SELECT * FROM gbl_sdr_aud_t.customer_product_inc;
Query: insert INTO gbl_sdr_aud_t.customer_product_append SELECT * FROM
gbl_sdr_aud_t.customer_product_inc
Query aborted.
ERRORS ENCOUNTERED DURING EXECUTION: Backend 1:Failed to close HDFS file:
hdfs://nameservice1/user/hive/warehouse/gbl_sdr_aud_t.db/customer_product_append/.impala_insert_staging/fe4e8a47162c631b_351ee8923ed672b9//.-122008101973105893-3827752448128611003_1027235387_dir/-122008101973105893-3827752448128611003_1531887124_data.0
Error(255): Unknown error 255
Failed to get info on temporary HDFS file:
hdfs://nameservice1/user/hive/warehouse/gbl_sdr_aud_t.db/customer_product_append/.impala_insert_staging/fe4e8a47162c631b_351ee8923ed672b9//.-122008101973105893-3827752448128611003_1027235387_dir/-122008101973105893-3827752448128611003_1531887124_data.0
Error(2): No such file or directory
Backend 4:Error seeking to 7516192768 in file:
hdfs://nameservice1/user/hive/warehouse/gbl_sdr_aud_t.db/customer_product_inc/part-m-00000
Error(255): Unknown error 255
On Thu, May 8, 2014 at 10:15 AM, Pengcheng Liu wrote:
Hello experts,
I have been working with Impala for a year, and the new Parquet format is really exciting.
I am running Impala version cdh5-1.3.0.
I have a data set of about 40 GB in Parquet (the raw data is 500 GB) with 20 partitions, but the data is not evenly distributed across the partitions.
When I set the block size to 1 GB, some of the files are split into multiple blocks, since they are larger than 1 GB.
The Impala query still works, but it prints a warning saying it cannot query Parquet files with multiple blocks.
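For what it's worth, a quick way to confirm which files actually span more than one block is to run fsck against them. This is just a sketch, using one of the partition files that shows up in the error message further down:
# list the block layout of a single Parquet file (path taken from the error below)
hdfs fsck /user/tablepar/201309/-r-00106.snappy.parquet -files -blocks -locations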
I saw some folks post a similar problem here, and one of the responses was to set the block size larger than the actual size of the file.
So I went ahead and tried that, using 10 GB as my HDFS file block size.
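For reference, re-uploading a single file with a 10 GB block size would look roughly like this (just a sketch; the path is one of the partition files from the error below, the /tmp/ staging location is arbitrary, and 10737418240 is 10 GB in bytes):
# pull one file down, then put it back with a 10 GB block size so it fits in one block
hdfs dfs -get /user/tablepar/201309/-r-00106.snappy.parquet /tmp/
hdfs dfs -rm /user/tablepar/201309/-r-00106.snappy.parquet
hdfs dfs -D dfs.blocksize=10737418240 -put /tmp/-r-00106.snappy.parquet /user/tablepar/201309/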
Now my query failed with this error message:
ERROR: Error seeking to 3955895608 in file:
hdfs://research-mn00.saas.local:8020/user/tablepar/201309/-r-00106.snappy.parquet
Error(22): Invalid argument
ERROR: Invalid query handle
Is this error due to the large block size I used? Are there any limits on the maximum block size we can create on HDFS?
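For reference, one way to check what the NameNode is configured to allow (assuming a Hadoop 2.x client; if the property is not set anywhere, the built-in default applies) is:
# print the configured ceiling on HDFS block size, if any
hdfs getconf -confKey dfs.namenode.fs-limits.max-block-size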
Thanks
Pengcheng