I am running a low-spec single-node cluster, and I have installed
Impala on it using Cloudera Manager.
Node configuration:
CPU: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz
RAM: 4GB
I have two tables: one with 15000 rows and 15 columns, and another with
15009732 rows and 2 columns.
Query:
select * from table1 JOIN table2 ON(table1.id = table2.id) limit 10
When I execute this join query with an output limit of 10, I get the
result in approximately 10 seconds.
Then I increased the output limit to 100 and the query took approximately
224 seconds to execute.
When I then executed the query again with limit 10, it never returned
results. I have filed a bug for this here:
https://issues.cloudera.org/browse/IMPALA-174
My question is: does it normally take this long for a simple join to
execute, or am I missing something?
PS:
The same operations on a single-machine MySQL take about 0.30 and 0.37
seconds.
-- abhishek