The huge CPU usage is more likely to be caused by a large number of files
or blocks. We're aware of the performance problem that shows up when the
catalog metadata is very large, and we're working on it. Stay tuned!
Thanks,
Alan
On Mon, Jan 13, 2014 at 5:29 PM, Sammy Yu wrote:
Hi,
I also see that catalogd is not behaving very nicely after this table
was found. It's eating up a lot of CPU; this is the output from top:
 PID USER   PR NI  VIRT  RES SHR S %CPU %MEM    TIME+ COMMAND
4894 impala 20  0 3008m 1.2g 23m S 83.1 16.8 17:37.46 catalogd
Is there a limit to the maximum number of partitions a table can handle in
Impala? If there is, could one create a table per partition instead? Would
having 30,000+ tables cause problems in Impala?
Thanks,
Sammy
To unsubscribe from this group and stop receiving emails from it, send an email to impala-user+unsubscribe@cloudera.org.
On Friday, January 10, 2014 7:39:07 PM UTC-8, Sammy Yu wrote:
Hi,
I'm running Impala 1.2.3 with an RCFile table with 38687 partitions
that was created from Hive. Afterwards, I refreshed the metadata,
compared the select count(1) results, and noticed that the results differed
(the Impala result was significantly smaller than the Hive one). I did further
investigation and determined that Impala was not considering some of my
later partitions.
The Hive show partitions results came back as expected. I tried using the
show table stats command in Impala, but I'm getting an error:
[ip-10-124-195-6.ec2.internal:21000] > SHOW TABLE STATS rcfile_3p;
Query: show TABLE STATS rcfile_3p
ERROR: IllegalArgumentException: Comparison method violates its general contract!
Thanks for your help.
Best,
Sammy
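
For anyone trying to pin down which partitions are missing in a case like the one described above, here is a minimal sketch in Python. It assumes the partition specs from each engine have already been dumped to text, one spec per line; how the Impala-side list is obtained is left open. The function names and the sample partition values are made up for illustration and are not taken from the thread.

```python
def read_partitions(lines):
    """Return the set of non-empty partition specs, one per input line."""
    return {line.strip() for line in lines if line.strip()}


def missing_partitions(hive_lines, impala_lines):
    """Return, sorted, the partitions that Hive lists but Impala does not."""
    hive = read_partitions(hive_lines)
    impala = read_partitions(impala_lines)
    return sorted(hive - impala)


# Hypothetical sample data; in practice these would be read from files.
hive_out = ["dt=2014-01-08", "dt=2014-01-09", "dt=2014-01-10", ""]
impala_out = ["dt=2014-01-08", "dt=2014-01-09"]

print(missing_partitions(hive_out, impala_out))  # ['dt=2014-01-10']
```

With tens of thousands of partitions, a set difference like this is easier to read than comparing two row counts, and it shows whether the gap really is confined to the later partitions.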