If you only have a single rotational disk, then you only get a single
scanner thread (AverageScannerThreadConcurrency = 1 in the profile). If the
data is all in the fs page cache, the scan becomes a CPU problem, and that
CPU work is confined to a single core because there is only a single thread.
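
As a rough sanity check, the numbers from the quoted message below (10 clients, 10 requests each, about 600ms per unloaded request, about 60 seconds for the load test) can be compared against the two extremes. This is only a back-of-the-envelope sketch: it assumes every request costs the same as the unloaded one and ignores all other overhead.

```python
# Figures taken from the quoted message; everything else is simple arithmetic.
clients = 10
requests_per_client = 10
unloaded_latency_ms = 600      # one request with no load
observed_total_ms = 60_000     # the load test took about 60 seconds

total_requests = clients * requests_per_client

# Extreme 1: every request runs one after another (single scanner thread,
# single core), so the total is the sum of all individual latencies.
serialized_total_ms = total_requests * unloaded_latency_ms

# Extreme 2: the 10 clients are served fully in parallel, so the total is
# just the time one client needs for its own 10 requests.
parallel_total_ms = requests_per_client * unloaded_latency_ms

print(f"fully serialized: {serialized_total_ms / 1000:.0f} s")
print(f"fully parallel:   {parallel_total_ms / 1000:.0f} s")
print(f"observed:         {observed_total_ms / 1000:.0f} s")
```

The observed time sits at the fully serialized end, which is what you would expect if the requests are queuing behind one thread.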

You can adjust the number of threads as mentioned in this email thread:
https://groups.google.com/a/cloudera.org/d/msg/impala-user/CXRQon_CPR0/1fU09CqfNvEJ

On Fri, Oct 25, 2013 at 11:26 AM, wrote:

Gotcha. Sounds like it's worth digging into then. Here are the two
profiles (from a single machine, not the cluster I was using before). The
first profile is taken during my load test, which takes about 60 seconds to
complete with 10 clients sending 10 requests each. The second profile is
from a single request with no load, which takes about 600ms.

To my untrained eye, under load, it seems that all the time is spent
waiting on data in the hdfs scan node. It also looks like Impala is
artificially restricting the io resources dedicated to the query by
limiting the number of scan threads assigned to 1. From my brief reading
of the comments in your disk io manager, this makes sense. I wonder,
however, if the constraint is too restrictive, and if there's a way for me
to tweak it through some setting. This is only running on a single disk,
but it's such a small amount of data, everything should be in the disk
cache, which explains why iostat is reporting such a low utilization number.
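
To check the scanner thread count without reading a whole profile by eye, the counter can be pulled out of the saved profile text with a few lines of Python. This is a minimal sketch: the sample excerpt and the helper name `scanner_thread_concurrency` are made up for illustration, and the exact layout of a real profile may differ.

```python
import re

# Hypothetical excerpt for illustration only; a real profile is much longer
# and its exact layout may differ.
SAMPLE_PROFILE = """\
HDFS_SCAN_NODE (id=0):
   - AverageScannerThreadConcurrency: 1.00
"""

def scanner_thread_concurrency(profile_text):
    """Return every AverageScannerThreadConcurrency value found in the text."""
    # Accept either "name: value" or "name = value".
    pattern = re.compile(r"AverageScannerThreadConcurrency\s*[:=]\s*([0-9.]+)")
    return [float(value) for value in pattern.findall(profile_text)]

values = scanner_thread_concurrency(SAMPLE_PROFILE)
print(values)
```

A value close to 1 for every scan node would match the reading above that only one scan thread is doing the work.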

Discussion Overview
group: impala-user
categories: hadoop
posted: Oct 24, '13 at 10:30p
active: Oct 25, '13 at 8:23p
posts: 3
users: 2
website: cloudera.com
irc: #hadoop

2 users in discussion: Keith (2 posts), Greg Rahn (1 post)
