I've been running some tests on query throughput, and the results are
not what I expected. In short, even a few concurrent queries really
slow down Impala.

I have a test query that takes roughly 1 second to complete. If I run this
query from 10 different parallel processes 10 times each (for 100 total
queries), the whole thing takes about 80 seconds to run. That means it's
not running much faster than simply running these queries sequentially.
Furthermore, the per-query completion time spikes up to about 10 seconds
each. My setup is a 4-node cluster, and all queries are being issued to
the same impalad daemon (though presumably the resulting fragments are
being run elsewhere). iostat shows there's plenty of headroom on the
disks, and top says I have about 20% peak CPU use.
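
For anyone who wants to reproduce a test like the one described above, here is a minimal sketch. It is not the harness actually used for these numbers: `run_query` is a hypothetical stand-in (it only sleeps) that would need to be replaced with a real client call to impalad, and threads are used in place of separate processes on the assumption that the client side is I/O-bound. `summarize` uses Little's law to estimate mean latency, assuming every client always has a query in flight.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_query():
    """Hypothetical stand-in for issuing the test query to impalad."""
    time.sleep(0.01)  # placeholder only; replace with a real client call


def worker(queries_per_client, query_fn):
    """Run the query repeatedly and return each query's latency in seconds."""
    latencies = []
    for _ in range(queries_per_client):
        start = time.perf_counter()
        query_fn()
        latencies.append(time.perf_counter() - start)
    return latencies


def run_benchmark(clients=10, queries_per_client=10, query_fn=run_query):
    """Run `clients` workers concurrently; return (wall_seconds, latencies)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=clients) as pool:
        futures = [
            pool.submit(worker, queries_per_client, query_fn)
            for _ in range(clients)
        ]
        latencies = [lat for f in futures for lat in f.result()]
    wall = time.perf_counter() - start
    return wall, latencies


def summarize(total_queries, wall_seconds, single_query_seconds, clients):
    """Compare a concurrent run against a purely sequential estimate."""
    sequential = total_queries * single_query_seconds
    throughput = total_queries / wall_seconds
    return {
        "sequential_estimate_s": sequential,
        "speedup_vs_sequential": sequential / wall_seconds,
        "throughput_qps": throughput,
        # Little's law: mean latency = queries in flight / throughput
        "implied_mean_latency_s": clients / throughput,
    }


if __name__ == "__main__":
    wall, latencies = run_benchmark()
    print(f"wall time: {wall:.2f}s, max latency: {max(latencies):.3f}s")
    # Figures reported in the post: 100 queries, 80 s total, 1 s per query alone
    print(summarize(100, 80.0, 1.0, clients=10))
```

Plugging in the figures from the post (100 queries, 80 seconds of wall time, 1 second per query in isolation, 10 clients) gives a speedup of only 1.25x over the 100-second sequential estimate and an implied mean latency of 8 seconds, which is in the same range as the roughly 10 seconds per query observed.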

Since Impala was built as a faster version of Hive, I'll understand if
multiple concurrent queries aren't really a case it's designed to handle.
But before I abandon Impala as not suitable for my project, I want to make
sure this is expected behavior and not some sort of misconfiguration.

Keith


Posted to the impala-user group on Oct 24, '13 at 10:30p.
