maximum number of threads per disk (the default value is 1).
I am not quite sure, but increasing this option might allow the column
chunks to be read in parallel, which would lower the query latency.
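
To make the guess concrete, here is a toy model in Python. It is only a sketch under assumptions that may not hold in Impala: every column chunk takes the same time to read, chunks are read in rounds of up to `threads_per_disk` at once, and extra threads do not contend for disk bandwidth. The function name and its parameters are made up for illustration. The 2-seconds-per-column figure comes from the timings in the question.

```python
import math


def estimated_scan_seconds(num_columns, seconds_per_column, threads_per_disk=1):
    """Toy latency estimate for reading `num_columns` column chunks.

    Assumption (not taken from Impala): chunks are read in rounds of up to
    `threads_per_disk` at a time, and each round costs `seconds_per_column`.
    """
    if num_columns < 0:
        raise ValueError("num_columns must be non-negative")
    if threads_per_disk < 1:
        raise ValueError("threads_per_disk must be at least 1")
    rounds = math.ceil(num_columns / threads_per_disk)
    return rounds * seconds_per_column


if __name__ == "__main__":
    # About 2 s per selected column, as reported in the question.
    for threads in (1, 2):
        for columns in (1, 2):
            seconds = estimated_scan_seconds(columns, 2.0, threads)
            print(f"threads={threads} columns={columns}: {seconds} s")
```

With one thread the model reproduces the 2 s and 4 s from the question. With two threads it predicts 2 s for both queries, but only if the disk can really serve two chunks at once.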
Am I right, experts?
Thanks
On Saturday, May 25, 2013 1:18:02 AM UTC+9, gerrard...@gmail.com wrote:
Hi,
When I run a select query on Parquet data with ~50 million rows and 10
columns, performance gets much worse as I select more columns.
Suppose the following query returns 3 rows:
select a from table where a = 12345;
This query returns in 2 seconds. Then if I query:
select a, b from table where a = 12345;
the query returns in 4 seconds, and so on. Is this expected behaviour, given
that Parquet is a columnar store? Is there a way to optimise this?