Hi All,
I have an Impala external table with about 10k columns. When I run
select count(1) on that table, it takes more than 10 minutes the first
time. After that, any select or aggregate returns in under a second.
I looked at the Impala server logs and found that "catalog.HdfsTable: load
table" is taking a lot of time.
When I refresh the Impala cache and run any query, the first query again
takes a long time; from then on queries are very fast.
Logs:
13/06/18 15:01:56 INFO service.Frontend: analyze query select count(1) from imp_ext_test
13/06/18 15:01:57 INFO catalog.HdfsTable: load table imp_ext_test
13/06/18 15:46:02 INFO catalog.HdfsTable: load partition block md for imp_ext_test
13/06/18 15:46:02 INFO catalog.HdfsTable: loaded partition PartitionBlockMetadata{#blocks=0, #filenames=0, totalStringLen=0}
13/06/18 15:46:02 INFO catalog.HdfsTable: loaded partition PartitionBlockMetadata{#blocks=16, #filenames=6, totalStringLen=420}
13/06/18 15:46:02 INFO catalog.HdfsTable: loaded disk ids for table default.imp_ext_test
13/06/18 15:46:02 INFO catalog.HdfsTable: 1
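
To show where the time goes, here is a minimal Python sketch that computes the gap between consecutive log lines. It is not part of Impala; the names LOG_LINES, parse_ts and gaps are made up for illustration, and it assumes the timestamp prefix is in yy/MM/dd HH:mm:ss form as in the excerpt above.

```python
from datetime import datetime

# A few lines copied from the log excerpt (wrapped lines joined).
LOG_LINES = [
    "13/06/18 15:01:56 INFO service.Frontend: analyze query select count(1) from imp_ext_test",
    "13/06/18 15:01:57 INFO catalog.HdfsTable: load table imp_ext_test",
    "13/06/18 15:46:02 INFO catalog.HdfsTable: load partition block md for imp_ext_test",
    "13/06/18 15:46:02 INFO catalog.HdfsTable: loaded disk ids for table default.imp_ext_test",
]


def parse_ts(line):
    """Parse the 17-character timestamp prefix (assumed yy/MM/dd HH:mm:ss)."""
    return datetime.strptime(line[:17], "%y/%m/%d %H:%M:%S")


def gaps(lines):
    """Return (line, seconds until the next line) for each line but the last."""
    stamps = [parse_ts(line) for line in lines]
    return [
        (lines[i], (stamps[i + 1] - stamps[i]).total_seconds())
        for i in range(len(lines) - 1)
    ]


if __name__ == "__main__":
    for line, seconds in gaps(LOG_LINES):
        # Print the gap first, then the message part after the timestamp.
        print(f"{seconds:8.0f}s  {line[18:]}")
```

On these lines, essentially all of the elapsed time falls between "load table imp_ext_test" and the next message; the later steps share the same timestamp.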
Could you please suggest how to reduce the "load table" time? As mentioned,
the table has about 10k columns.
Thanks