If you have only a single rotational disk, then you get only a single
scanner thread (AverageScannerThreadConcurrency = 1 in the profile). If the
data is all in the fs page cache, the query is CPU-bound, and with a single
thread that work is confined to a single core.

You can adjust the number of threads as mentioned in this email thread:

On Fri, Oct 25, 2013 at 11:26 AM, wrote:

Gotcha. Sounds like it's worth digging into then. Here are the two
profiles (from a single machine, not the cluster I was using before). The
first profile is taken during my load test, which takes about 60 seconds to
complete with 10 clients sending 10 requests each. The second profile is of a
single request with no load, which takes about 600ms.

To my untrained eye, under load, it seems that all the time is spent
waiting on data in the HDFS scan node. It also looks like Impala is
artificially restricting the I/O resources dedicated to the query by
limiting the number of scan threads assigned to 1. From my brief reading
of the comments in your disk io manager, this makes sense. I wonder,
however, if the constraint is too restrictive, and if there's a way for me
to tweak it through some setting. This is only running on a single disk,
but it's such a small amount of data, everything should be in the disk
cache, which explains why iostat is reporting such a low utilization number.
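
As a rough sanity check on the figures quoted above, the load-test time is about what you would expect if every request were processed one after another by a single scanner thread. The sketch below is only back-of-the-envelope arithmetic, not Impala code: the function name is hypothetical, and it assumes each request costs the same 600ms of CPU time as the unloaded request and that work divides evenly across threads.

```python
def estimated_wall_time_s(clients, requests_per_client, ms_per_request, scanner_threads=1):
    """Estimate wall-clock seconds if requests are CPU-bound and share a fixed thread pool.

    Assumption (not from the profile): every request costs the same CPU time
    as the unloaded request, and work divides evenly across threads.
    """
    total_requests = clients * requests_per_client
    total_cpu_ms = total_requests * ms_per_request
    return total_cpu_ms / scanner_threads / 1000


# Figures from the email: 10 clients x 10 requests, about 600 ms per request.
single_thread_estimate = estimated_wall_time_s(10, 10, 600)
print(single_thread_estimate)  # 60.0
```

The estimate of 60 seconds matches the reported load-test duration, which is consistent with the requests being serialized on one thread rather than waiting on the disk.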

Discussion Overview
group: impala-user
posted: Oct 24, '13 at 10:30p
active: Oct 25, '13 at 8:23p
2 users in discussion: Keith (2 posts), Greg Rahn (1 post)