I've been running some tests on query throughput, and the results have been
different from what I expected. In short, even a few concurrent queries really
slow down Impala.

I have a test query that takes roughly 1 second to complete. If I run this
query from 10 different parallel processes 10 times each (for 100 total
queries), the whole thing takes about 80 seconds to run. That means it's
not running much faster than simply running these queries sequentially.
Furthermore, the per-query completion time spikes to about 10 seconds.
My setup is a 4-node cluster, and all queries are being issued to
the same impalad daemon (though presumably the resulting fragments are
being run elsewhere). iostat shows there's plenty of headroom on the
disks, and top says I have about 20% peak CPU use.
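
For reference, a closed-loop load test of the kind described above can be sketched in Python as follows. This is only an illustrative harness, not the script used for these measurements: `run_load_test`, `closed_loop_latency`, and the `run_query` callable are hypothetical names, and `run_query` stands in for whatever client call actually submits the query to impalad. It uses threads rather than separate processes, which should be adequate as long as the client spends its time waiting on the server. The helper `closed_loop_latency` applies Little's law: with 10 clients issuing queries back to back and 100 queries finishing in 80 seconds, the mean per-query latency works out to 8 seconds, which is roughly in line with the spikes of about 10 seconds.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(run_query, clients=10, queries_per_client=10):
    """Issue queries from `clients` parallel workers, each running
    `queries_per_client` queries back to back (a closed loop).

    `run_query` is a hypothetical zero-argument callable that submits one
    query and blocks until it completes.
    Returns (total_wall_seconds, per_query_latencies_in_seconds).
    """
    def client_loop(_client_id):
        latencies = []
        for _ in range(queries_per_client):
            started = time.perf_counter()
            run_query()
            latencies.append(time.perf_counter() - started)
        return latencies

    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=clients) as pool:
        per_client = list(pool.map(client_loop, range(clients)))
    total = time.perf_counter() - started
    return total, [lat for client in per_client for lat in client]


def closed_loop_latency(total_seconds, clients, total_queries):
    """Mean per-query latency implied by Little's law for a closed loop:
    latency = clients / throughput, with throughput = queries / time."""
    throughput = total_queries / total_seconds
    return clients / throughput


if __name__ == "__main__":
    # Stand-in for a real query: sleeps 10 ms instead of talking to impalad.
    total, latencies = run_load_test(lambda: time.sleep(0.01),
                                     clients=4, queries_per_client=3)
    print(f"{len(latencies)} queries in {total:.2f}s, "
          f"mean latency {sum(latencies) / len(latencies):.3f}s")
    # Figures from the test described above: 100 queries, 10 clients, 80 s.
    print(closed_loop_latency(80, 10, 100))
```

If the queries ran fully in parallel, the total would be close to 10 seconds (10 rounds of a 1-second query); a total near 100 seconds would mean they were effectively serialized. The measured 80 seconds is much closer to the serialized case.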

Since Impala was built as a faster version of Hive, I'll understand if
multiple concurrent queries aren't really a case it's designed to handle.
But before I abandon Impala as not suitable for my project, I want to make
sure this is expected behavior and not some sort of misconfiguration.

Keith



  • Keith at Oct 25, 2013 at 6:26 pm
    Gotcha. Sounds like it's worth digging into then. Here are the two
    profiles (from a single machine, not the cluster I was using before). The
    first profile is taken during my load test, which takes about 60 seconds to
    complete with 10 clients sending 10 requests each. The second profile is
    from a single request with no load, which takes about 600ms.

    To my untrained eye, under load, it seems that all the time is spent
    waiting on data in the HDFS scan node. It also looks like Impala is
    artificially restricting the I/O resources dedicated to the query by
    limiting the number of scan threads assigned to 1. From my brief reading
    of the comments in your disk I/O manager, this makes sense. I wonder,
    however, if the constraint is too restrictive, and if there's a way for me
    to tweak it through some setting. This is only running on a single disk,
    but it's such a small amount of data that everything should be in the disk
    cache, which explains why iostat is reporting such low utilization.
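
One rough way to check that reading of the profile is to compare the scanner thread's wall-clock time with the time it reports actually doing work. The Python sketch below parses the duration strings used in the profile (for example `4s642ms` or `690.143us`) and computes that ratio for HDFS_SCAN_NODE (id=1) in the contended profile. The function names are hypothetical, and adding up raw read, decompression, and tuple materialization time is a simplifying assumption (the counters may overlap), so treat the result as an indicator rather than an exact breakdown.

```python
import re

# Seconds per unit for duration strings such as "4s642ms" or "690.143us".
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0,
                 "ms": 1e-3, "us": 1e-6, "ns": 1e-9}
# "ms" is listed before "m" so that "733ms" is not read as minutes.
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|us|ns|h|m|s)")


def parse_duration(text):
    """Convert a profile duration string to seconds."""
    parts = _PART.findall(text)
    # Reject input that the pattern did not consume completely.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(num) * _UNIT_SECONDS[unit] for num, unit in parts)


def busy_fraction(wall_clock, *work_timers):
    """Fraction of `wall_clock` covered by the sum of `work_timers`."""
    wall = parse_duration(wall_clock)
    if wall == 0:
        return 0.0
    return sum(parse_duration(t) for t in work_timers) / wall


if __name__ == "__main__":
    # Counters copied from HDFS_SCAN_NODE (id=1) in the contended profile:
    # ScannerThreadsTotalWallClockTime, TotalRawHdfsReadTime,
    # DecompressionTime, MaterializeTupleTime.
    fraction = busy_fraction("4s642ms", "1.600ms", "490.818us", "1.726ms")
    print(f"{fraction:.4%} of scanner wall-clock time is accounted for")
```

For that scan node the accounted-for work is a few milliseconds out of more than 4.6 seconds, well under 1%, which is consistent with the scanner thread spending nearly all of its time waiting rather than reading or decoding data.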


    *Contended query*

    PLAN FRAGMENT 0

      PARTITION: UNPARTITIONED

      8:TOP-N
    order by: SUM(wait_time) DESC
    limit: 10
    tuple ids: 8

      7:AGGREGATE
    output: SUM(<slot 97>), SUM(<slot 98>), SUM(<slot 99>), SUM(<slot 100>)
    group by: <slot 101>
    tuple ids: 8
      9:EXCHANGE

         tuple ids: 6

    PLAN FRAGMENT 1

      PARTITION: RANDOM

      STREAM DATA SINK

        EXCHANGE ID: 9

        UNPARTITIONED

      15:MERGE
    predicates: (<slot 109> = 55)
    tuple ids: 6
      6:SCAN HDFS

         table=pulse.user_action_ui_stats_10_minute_rollup #partitions=0 size=0B

         predicates: bucket <= '2013-10-23 02:30:00.000', bucket >= '2013-10-23 02:00:00.000'

         tuple ids: 5

    PLAN FRAGMENT 2

      PARTITION: RANDOM

      STREAM DATA SINK

        EXCHANGE ID: 9

        UNPARTITIONED

      14:MERGE
    predicates: (<slot 109> = 55)
    tuple ids: 6
      5:SCAN HDFS

         table=pulse.user_action_ui_stats_1_hour_rollup #partitions=0 size=0B

         predicates: bucket <= '2013-10-23 02:00:00.000', bucket >= '2013-10-23 00:00:00.000'

         tuple ids: 4

    PLAN FRAGMENT 3

      PARTITION: RANDOM

      STREAM DATA SINK

        EXCHANGE ID: 9

        UNPARTITIONED

      13:MERGE
    predicates: (<slot 109> = 55)
    tuple ids: 6
      4:SCAN HDFS

         table=pulse.user_action_ui_stats_10_minute_rollup #partitions=5 size=14.14KB

         predicates: bucket >= '2013-10-16 02:40:00.000', bucket <= '2013-10-16 03:00:00.000'

         tuple ids: 3

    PLAN FRAGMENT 4

      PARTITION: RANDOM

      STREAM DATA SINK

        EXCHANGE ID: 9

        UNPARTITIONED

      12:MERGE
    predicates: (<slot 109> = 55)
    tuple ids: 6
      3:SCAN HDFS

         table=pulse.user_action_ui_stats_1_hour_rollup #partitions=6 size=24.39KB

         predicates: bucket >= '2013-10-16 03:00:00.000', bucket <= '2013-10-16 06:00:00.000'

         tuple ids: 2

    PLAN FRAGMENT 5

      PARTITION: RANDOM

      STREAM DATA SINK

        EXCHANGE ID: 9

        UNPARTITIONED

      11:MERGE
    predicates: (<slot 109> = 55)
    tuple ids: 6
      2:SCAN HDFS

         table=pulse.user_action_ui_stats_6_hour_rollup #partitions=6 size=23.92KB

         predicates: bucket >= '2013-10-16 06:00:00.000', bucket <= '2013-10-17 00:00:00.000'

         tuple ids: 1

    PLAN FRAGMENT 6

      PARTITION: RANDOM

      STREAM DATA SINK

        EXCHANGE ID: 9

        UNPARTITIONED

      10:MERGE
    predicates: (<slot 109> = 55)
    tuple ids: 6
      1:SCAN HDFS

         table=pulse.user_action_ui_stats_1_day_rollup #partitions=10 size=73.38KB

         predicates: bucket >= '2013-10-17 00:00:00.000', bucket <= '2013-10-23 00:00:00.000'

         tuple ids: 0

    ----------------

        Query Timeline: 5s733ms

           - Start execution: 23.6ms (23.6ms)

           - Planning finished: 308.605ms (285.599ms)

           - Rows available: 5s704ms (5s396ms)

           - First row fetched: 5s716ms (11.531ms)

      ImpalaServer:

         - ClientFetchWaitTimer: 16.774ms

         - RowMaterializationTimer: 125.855us

      Execution Profile 174050aa9c33038e:b796d9549ff72bb5:(Active: 5s399ms, % non-child: 0.00%)

        Per Node Peak Memory Usage: clouderavm.vm:22000(321.73 MB)

         - FinalizationTimer: 0ns

        Coordinator Fragment:(Active: 4s650ms, % non-child: 0.00%)

           - AverageThreadTokens: 1.00

           - PeakMemoryUsage: 321.73 MB

           - RowsProduced: 10

          CodeGen:(Active: 255.743ms, % non-child: 5.50%)

             - CodegenTime: 8.211ms

             - CompileTime: 226.696ms

             - LoadTime: 29.45ms

             - ModuleFileSize: 75.62 KB

          SORT_NODE (id=8):(Active: 4s650ms, % non-child: 0.00%)

             - MemoryUsed: 0.00

             - RowsReturned: 10

             - RowsReturnedRate: 2.00 /sec

          AGGREGATION_NODE (id=7):(Active: 4s657ms, % non-child: 0.40%)

            ExecOption: Codegen Enabled

             - BuildBuckets: 1.02K (1024)

             - BuildTime: 96.21us

             - GetResultsTime: 3.648us

             - LoadFactor: 0.11

             - MemoryUsed: 47.34 KB

             - RowsReturned: 118

             - RowsReturnedRate: 25.00 /sec

          EXCHANGE_NODE (id=9):(Active: 4s639ms, % non-child: 99.76%)

             - BytesReceived: 28.71 KB

             - ConvertRowBatchTime: 8.850us

             - DataArrivalWaitTime: 4s639ms

             - DeserializeRowBatchTimer: 230.126us

             - FirstBatchArrivalWaitTime: 0ns

             - MemoryUsed: 0.00

             - RowsReturned: 321

             - RowsReturnedRate: 69.00 /sec

             - SendersBlockedTimer: 0ns

             - SendersBlockedTotalTimer(*): 0ns

        Averaged Fragment 1:(Active: 56.334us, % non-child: 0.00%)

          split sizes: min: 0.00 , max: 0.00 , avg: 0.00 , stddev: 0.00

          completion times: min:390.151ms max:390.151ms mean: 390.151ms
      stddev:0ns

          execution rates: min:0.00 /sec max:0.00 /sec mean:0.00 /sec
      stddev:0.00 /sec

          num instances: 1

           - AverageThreadTokens: 0.00

           - PeakMemoryUsage: 32.00 KB

           - RowsProduced: 0

          CodeGen:(Active: 76.159ms, % non-child: 100.00%)

             - CodegenTime: 161.270us

             - CompileTime: 57.357ms

             - LoadTime: 18.800ms

             - ModuleFileSize: 75.62 KB

          DataStreamSender (dst_id=9):(Active: 9.763us, % non-child: 17.33%)

             - BytesSent: 0.00

             - NetworkThroughput(*): 0.00 /sec

             - OverallThroughput: 0.00 /sec

             - SerializeBatchTime: 0ns

             - ThriftTransmitTime(*): 0ns

             - UncompressedRowBatchSize: 0.00

          MERGE_NODE (id=15):(Active: 54.277us, % non-child: 50.30%)

             - MemoryUsed: 0.00

             - RowsReturned: 0

             - RowsReturnedRate: 0

          HDFS_SCAN_NODE (id=6):(Active: 25.939us, % non-child: 46.05%)

             - BytesRead: 0.00

             - MemoryUsed: 0.00

             - NumDisksAccessed: 0

             - NumScannerThreadsStarted: 0

             - PerReadThreadRawHdfsThroughput: 0.00 /sec

             - RowsRead: 0

             - RowsReturned: 0

             - RowsReturnedRate: 0

             - ScanRangesComplete: 0

             - ScannerThreadsInvoluntaryContextSwitches: 0

             - ScannerThreadsTotalWallClockTime: 0ns

               - MaterializeTupleTime(*): 0ns

               - ScannerThreadsSysTime: 0ns

               - ScannerThreadsUserTime: 0ns

             - ScannerThreadsVoluntaryContextSwitches: 0

             - TotalRawHdfsReadTime(*): 0ns

             - TotalReadThroughput: 0.00 /sec

        Averaged Fragment 2:(Active: 55.684us, % non-child: 0.00%)

          split sizes: min: 0.00 , max: 0.00 , avg: 0.00 , stddev: 0.00

          completion times: min:273.697ms max:273.697ms mean: 273.697ms
      stddev:0ns

          execution rates: min:0.00 /sec max:0.00 /sec mean:0.00 /sec
      stddev:0.00 /sec

          num instances: 1

           - AverageThreadTokens: 0.00

           - PeakMemoryUsage: 32.00 KB

           - RowsProduced: 0

          CodeGen:(Active: 96.772ms, % non-child: 100.00%)

             - CodegenTime: 119.750us

             - CompileTime: 85.545ms

             - LoadTime: 11.225ms

             - ModuleFileSize: 75.62 KB

          DataStreamSender (dst_id=9):(Active: 8.534us, % non-child: 15.33%)

             - BytesSent: 0.00

             - NetworkThroughput(*): 0.00 /sec

             - OverallThroughput: 0.00 /sec

             - SerializeBatchTime: 0ns

             - ThriftTransmitTime(*): 0ns

             - UncompressedRowBatchSize: 0.00

          MERGE_NODE (id=14):(Active: 53.881us, % non-child: 59.62%)

             - MemoryUsed: 0.00

             - RowsReturned: 0

             - RowsReturnedRate: 0

          HDFS_SCAN_NODE (id=5):(Active: 20.680us, % non-child: 37.14%)

             - BytesRead: 0.00

             - MemoryUsed: 0.00

             - NumDisksAccessed: 0

             - NumScannerThreadsStarted: 0

             - PerReadThreadRawHdfsThroughput: 0.00 /sec

             - RowsRead: 0

             - RowsReturned: 0

             - RowsReturnedRate: 0

             - ScanRangesComplete: 0

             - ScannerThreadsInvoluntaryContextSwitches: 0

             - ScannerThreadsTotalWallClockTime: 0ns

               - MaterializeTupleTime(*): 0ns

               - ScannerThreadsSysTime: 0ns

               - ScannerThreadsUserTime: 0ns

             - ScannerThreadsVoluntaryContextSwitches: 0

             - TotalRawHdfsReadTime(*): 0ns

             - TotalReadThroughput: 0.00 /sec

        Averaged Fragment 3:(Active: 3s513ms, % non-child: 0.00%)

          split sizes: min: 14.14 KB, max: 14.14 KB, avg: 14.14 KB, stddev:
    0.00

          completion times: min:3s515ms max:3s515ms mean: 3s515ms stddev:0ns

          execution rates: min:4.02 KB/sec max:4.02 KB/sec mean:4.02 KB/sec
      stddev:0.00 /sec

          num instances: 1

           - AverageThreadTokens: 1.88

           - PeakMemoryUsage: 321.73 MB

           - RowsProduced: 0

          CodeGen:(Active: 74.897ms, % non-child: 2.13%)

             - CodegenTime: 112.853us

             - CompileTime: 67.468ms

             - LoadTime: 7.427ms

             - ModuleFileSize: 75.62 KB

          DataStreamSender (dst_id=9):(Active: 7.935us, % non-child: 0.00%)

             - BytesSent: 0.00

             - NetworkThroughput(*): 0.00 /sec

             - OverallThroughput: 0.00 /sec

             - SerializeBatchTime: 0ns

             - ThriftTransmitTime(*): 0ns

             - UncompressedRowBatchSize: 0.00

          MERGE_NODE (id=13):(Active: 3s513ms, % non-child: 0.00%)

             - MemoryUsed: 0.00

             - RowsReturned: 0

             - RowsReturnedRate: 0

          HDFS_SCAN_NODE (id=4):(Active: 3s513ms, % non-child: 100.00%)

             - AverageHdfsReadThreadConcurrency: 0.00

             - AverageScannerThreadConcurrency: 1.00

             - BytesRead: 21.64 KB

             - BytesReadLocal: 21.64 KB

             - BytesReadShortCircuit: 21.64 KB

             - DecompressionTime: 249.263us

             - MemoryUsed: 0.00

             - NumColumns: 0

             - NumDisksAccessed: 1

             - NumScannerThreadsStarted: 1

             - PerReadThreadRawHdfsThroughput: 30.62 MB/sec

             - RowsRead: 157

             - RowsReturned: 0

             - RowsReturnedRate: 0

             - ScanRangesComplete: 5

             - ScannerThreadsInvoluntaryContextSwitches: 0

             - ScannerThreadsTotalWallClockTime: 3s513ms

               - MaterializeTupleTime(*): 244.915us

               - ScannerThreadsSysTime: 0ns

               - ScannerThreadsUserTime: 4.0ms

             - ScannerThreadsVoluntaryContextSwitches: 89

             - TotalRawHdfsReadTime(*): 690.143us

             - TotalReadThroughput: 5.39 KB/sec

        Averaged Fragment 4:(Active: 3s922ms, % non-child: 0.00%)

          split sizes: min: 24.39 KB, max: 24.39 KB, avg: 24.39 KB, stddev:
    0.00

          completion times: min:3s924ms max:3s924ms mean: 3s924ms stddev:0ns

          execution rates: min:6.21 KB/sec max:6.21 KB/sec mean:6.21 KB/sec
      stddev:0.00 /sec

          num instances: 1

           - AverageThreadTokens: 2.00

           - PeakMemoryUsage: 321.73 MB

           - RowsProduced: 35

          CodeGen:(Active: 76.265ms, % non-child: 1.94%)

             - CodegenTime: 154.831us

             - CompileTime: 54.123ms

             - LoadTime: 22.141ms

             - ModuleFileSize: 75.62 KB

          DataStreamSender (dst_id=9):(Active: 149.730us, % non-child: 0.00%)

             - BytesSent: 3.29 KB

             - NetworkThroughput(*): 6.13 MB/sec

             - OverallThroughput: 21.43 MB/sec

             - SerializeBatchTime: 84.151us

             - ThriftTransmitTime(*): 523.266us

             - UncompressedRowBatchSize: 10.65 KB

          MERGE_NODE (id=12):(Active: 3s922ms, % non-child: 0.01%)

             - MemoryUsed: 0.00

             - RowsReturned: 35

             - RowsReturnedRate: 8.00 /sec

          HDFS_SCAN_NODE (id=3):(Active: 3s922ms, % non-child: 99.99%)

             - AverageHdfsReadThreadConcurrency: 0.00

             - AverageScannerThreadConcurrency: 1.00

             - BytesRead: 40.73 KB

             - BytesReadLocal: 40.73 KB

             - BytesReadShortCircuit: 40.73 KB

             - DecompressionTime: 290.322us

             - MemoryUsed: 0.00

             - NumColumns: 0

             - NumDisksAccessed: 1

             - NumScannerThreadsStarted: 1

             - PerReadThreadRawHdfsThroughput: 42.72 MB/sec

             - RowsRead: 622

             - RowsReturned: 40

             - RowsReturnedRate: 10.00 /sec

             - ScanRangesComplete: 6

             - ScannerThreadsInvoluntaryContextSwitches: 0

             - ScannerThreadsTotalWallClockTime: 3s922ms

               - MaterializeTupleTime(*): 639.232us

               - ScannerThreadsSysTime: 0ns

               - ScannerThreadsUserTime: 4.0ms

             - ScannerThreadsVoluntaryContextSwitches: 125

             - TotalRawHdfsReadTime(*): 930.993us

             - TotalReadThroughput: 10.17 KB/sec

        Averaged Fragment 5:(Active: 3s922ms, % non-child: 0.00%)

          split sizes: min: 23.92 KB, max: 23.92 KB, avg: 23.92 KB, stddev:
    0.00

          completion times: min:3s926ms max:3s926ms mean: 3s926ms stddev:0ns

          execution rates: min:6.09 KB/sec max:6.09 KB/sec mean:6.09 KB/sec
      stddev:0.00 /sec

          num instances: 1

           - AverageThreadTokens: 2.00

           - PeakMemoryUsage: 321.73 MB

           - RowsProduced: 142

          CodeGen:(Active: 39.648ms, % non-child: 1.01%)

             - CodegenTime: 309.800us

             - CompileTime: 35.695ms

             - LoadTime: 3.952ms

             - ModuleFileSize: 75.62 KB

          DataStreamSender (dst_id=9):(Active: 471.992us, % non-child: 0.01%)

             - BytesSent: 12.90 KB

             - NetworkThroughput(*): 21.90 MB/sec

             - OverallThroughput: 26.69 MB/sec

             - SerializeBatchTime: 400.861us

             - ThriftTransmitTime(*): 575.192us

             - UncompressedRowBatchSize: 42.57 KB

          MERGE_NODE (id=11):(Active: 3s922ms, % non-child: 0.01%)

             - MemoryUsed: 0.00

             - RowsReturned: 142

             - RowsReturnedRate: 36.00 /sec

          HDFS_SCAN_NODE (id=2):(Active: 3s921ms, % non-child: 99.98%)

             - AverageHdfsReadThreadConcurrency: 0.00

             - AverageScannerThreadConcurrency: 1.00

             - BytesRead: 39.79 KB

             - BytesReadLocal: 39.79 KB

             - BytesReadShortCircuit: 39.79 KB

             - DecompressionTime: 278.285us

             - MemoryUsed: 0.00

             - NumColumns: 0

             - NumDisksAccessed: 1

             - NumScannerThreadsStarted: 1

             - PerReadThreadRawHdfsThroughput: 44.64 MB/sec

             - RowsRead: 611

             - RowsReturned: 175

             - RowsReturnedRate: 44.00 /sec

             - ScanRangesComplete: 6

             - ScannerThreadsInvoluntaryContextSwitches: 0

             - ScannerThreadsTotalWallClockTime: 3s921ms

               - MaterializeTupleTime(*): 627.357us

               - ScannerThreadsSysTime: 0ns

               - ScannerThreadsUserTime: 4.0ms

             - ScannerThreadsVoluntaryContextSwitches: 127

             - TotalRawHdfsReadTime(*): 870.333us

             - TotalReadThroughput: 9.91 KB/sec

        Averaged Fragment 6:(Active: 4s649ms, % non-child: 0.00%)

          split sizes: min: 73.38 KB, max: 73.38 KB, avg: 73.38 KB, stddev:
    0.00

          completion times: min:4s656ms max:4s656ms mean: 4s656ms stddev:0ns

          execution rates: min:15.76 KB/sec max:15.76 KB/sec mean:15.76 KB/sec
      stddev:0.00 /sec

          num instances: 1

           - AverageThreadTokens: 2.00

           - PeakMemoryUsage: 321.73 MB

           - RowsProduced: 144

          CodeGen:(Active: 42.804ms, % non-child: 0.92%)

             - CodegenTime: 126.484us

             - CompileTime: 36.859ms

             - LoadTime: 5.943ms

             - ModuleFileSize: 75.62 KB

          DataStreamSender (dst_id=9):(Active: 5.757ms, % non-child: 0.12%)

             - BytesSent: 12.53 KB

             - NetworkThroughput(*): 61.45 MB/sec

             - OverallThroughput: 2.13 MB/sec

             - SerializeBatchTime: 172.616us

             - ThriftTransmitTime(*): 199.171us

             - UncompressedRowBatchSize: 42.73 KB

          MERGE_NODE (id=10):(Active: 4s643ms, % non-child: 0.01%)

             - MemoryUsed: 0.00

             - RowsReturned: 144

             - RowsReturnedRate: 31.00 /sec

          HDFS_SCAN_NODE (id=1):(Active: 4s643ms, % non-child: 99.86%)

             - AverageHdfsReadThreadConcurrency: 0.00

             - AverageScannerThreadConcurrency: 1.00

             - BytesRead: 133.28 KB

             - BytesReadLocal: 133.28 KB

             - BytesReadShortCircuit: 133.28 KB

             - DecompressionTime: 490.818us

             - MemoryUsed: 0.00

             - NumColumns: 0

             - NumDisksAccessed: 1

             - NumScannerThreadsStarted: 1

             - PerReadThreadRawHdfsThroughput: 81.32 MB/sec

             - RowsRead: 3.23K (3228)

             - RowsReturned: 166

             - RowsReturnedRate: 35.00 /sec

             - ScanRangesComplete: 10

             - ScannerThreadsInvoluntaryContextSwitches: 0

             - ScannerThreadsTotalWallClockTime: 4s642ms

               - MaterializeTupleTime(*): 1.726ms

               - ScannerThreadsSysTime: 0ns

               - ScannerThreadsUserTime: 8.0ms

             - ScannerThreadsVoluntaryContextSwitches: 210

             - TotalRawHdfsReadTime(*): 1.600ms

             - TotalReadThroughput: 25.97 KB/sec

        Fragment 1:

          Instance 174050aa9c33038e:b796d9549ff72bb7
    (host=clouderavm.vm:22000):(Active: 56.334us, % non-child: 0.00%)

             - AverageThreadTokens: 0.00

             - PeakMemoryUsage: 32.00 KB

             - RowsProduced: 0

            CodeGen:(Active: 76.159ms, % non-child: 100.00%)

               - CodegenTime: 161.270us

               - CompileTime: 57.357ms

               - LoadTime: 18.800ms

               - ModuleFileSize: 75.62 KB

            DataStreamSender (dst_id=9):(Active: 9.763us, % non-child: 17.33%)

               - BytesSent: 0.00

               - NetworkThroughput(*): 0.00 /sec

               - OverallThroughput: 0.00 /sec

               - SerializeBatchTime: 0ns

               - ThriftTransmitTime(*): 0ns

               - UncompressedRowBatchSize: 0.00

            MERGE_NODE (id=15):(Active: 54.277us, % non-child: 50.30%)

               - MemoryUsed: 0.00

               - RowsReturned: 0

               - RowsReturnedRate: 0

            HDFS_SCAN_NODE (id=6):(Active: 25.939us, % non-child: 46.05%)

              Hdfs split stats (<volume id>:<# splits>/<split lengths>):

              Hdfs Read Thread Concurrency Bucket:

              ExecOption: Codegen enabled: 0 out of 0

               - BytesRead: 0.00

               - MemoryUsed: 0.00

               - NumDisksAccessed: 0

               - NumScannerThreadsStarted: 0

               - PerReadThreadRawHdfsThroughput: 0.00 /sec

               - RowsRead: 0

               - RowsReturned: 0

               - RowsReturnedRate: 0

               - ScanRangesComplete: 0

               - ScannerThreadsInvoluntaryContextSwitches: 0

               - ScannerThreadsTotalWallClockTime: 0ns

                 - MaterializeTupleTime(*): 0ns

                 - ScannerThreadsSysTime: 0ns

                 - ScannerThreadsUserTime: 0ns

               - ScannerThreadsVoluntaryContextSwitches: 0

               - TotalRawHdfsReadTime(*): 0ns

               - TotalReadThroughput: 0.00 /sec

        Fragment 2:

          Instance 174050aa9c33038e:b796d9549ff72bb8
    (host=clouderavm.vm:22000):(Active: 55.684us, % non-child: 0.00%)

             - AverageThreadTokens: 0.00

             - PeakMemoryUsage: 32.00 KB

             - RowsProduced: 0

            CodeGen:(Active: 96.772ms, % non-child: 100.00%)

               - CodegenTime: 119.750us

               - CompileTime: 85.545ms

               - LoadTime: 11.225ms

               - ModuleFileSize: 75.62 KB

            DataStreamSender (dst_id=9):(Active: 8.534us, % non-child: 15.33%)

               - BytesSent: 0.00

               - NetworkThroughput(*): 0.00 /sec

               - OverallThroughput: 0.00 /sec

               - SerializeBatchTime: 0ns

               - ThriftTransmitTime(*): 0ns

               - UncompressedRowBatchSize: 0.00

            MERGE_NODE (id=14):(Active: 53.881us, % non-child: 59.62%)

               - MemoryUsed: 0.00

               - RowsReturned: 0

               - RowsReturnedRate: 0

            HDFS_SCAN_NODE (id=5):(Active: 20.680us, % non-child: 37.14%)

              Hdfs split stats (<volume id>:<# splits>/<split lengths>):

              Hdfs Read Thread Concurrency Bucket:

              ExecOption: Codegen enabled: 0 out of 0

               - BytesRead: 0.00

               - MemoryUsed: 0.00

               - NumDisksAccessed: 0

               - NumScannerThreadsStarted: 0

               - PerReadThreadRawHdfsThroughput: 0.00 /sec

               - RowsRead: 0

               - RowsReturned: 0

               - RowsReturnedRate: 0

               - ScanRangesComplete: 0

               - ScannerThreadsInvoluntaryContextSwitches: 0

               - ScannerThreadsTotalWallClockTime: 0ns

                 - MaterializeTupleTime(*): 0ns

                 - ScannerThreadsSysTime: 0ns

                 - ScannerThreadsUserTime: 0ns

               - ScannerThreadsVoluntaryContextSwitches: 0

               - TotalRawHdfsReadTime(*): 0ns

               - TotalReadThroughput: 0.00 /sec

        Fragment 3:

          Instance 174050aa9c33038e:b796d9549ff72bb9
    (host=clouderavm.vm:22000):(Active: 3s513ms, % non-child: 0.00%)

            Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:5/14.14
    KB

             - AverageThreadTokens: 1.88

             - PeakMemoryUsage: 321.73 MB

             - RowsProduced: 0

            CodeGen:(Active: 74.897ms, % non-child: 2.13%)

               - CodegenTime: 112.853us

               - CompileTime: 67.468ms

               - LoadTime: 7.427ms

               - ModuleFileSize: 75.62 KB

            DataStreamSender (dst_id=9):(Active: 7.935us, % non-child: 0.00%)

               - BytesSent: 0.00

               - NetworkThroughput(*): 0.00 /sec

               - OverallThroughput: 0.00 /sec

               - SerializeBatchTime: 0ns

               - ThriftTransmitTime(*): 0ns

               - UncompressedRowBatchSize: 0.00

            MERGE_NODE (id=13):(Active: 3s513ms, % non-child: 0.00%)

               - MemoryUsed: 0.00

               - RowsReturned: 0

               - RowsReturnedRate: 0

            HDFS_SCAN_NODE (id=4):(Active: 3s513ms, % non-child: 100.00%)

              Hdfs split stats (<volume id>:<# splits>/<split lengths>):
    0:5/14.14 KB

              Hdfs Read Thread Concurrency Bucket: 0:100% 1:0% 2:0% 3:0%

              File Formats: PARQUET/SNAPPY:65

              ExecOption: Codegen enabled: 0 out of 5

               - AverageHdfsReadThreadConcurrency: 0.00

               - AverageScannerThreadConcurrency: 1.00

               - BytesRead: 21.64 KB

               - BytesReadLocal: 21.64 KB

               - BytesReadShortCircuit: 21.64 KB

               - DecompressionTime: 249.263us

               - MemoryUsed: 0.00

               - NumColumns: 0

               - NumDisksAccessed: 1

               - NumScannerThreadsStarted: 1

               - PerReadThreadRawHdfsThroughput: 30.62 MB/sec

               - RowsRead: 157

               - RowsReturned: 0

               - RowsReturnedRate: 0

               - ScanRangesComplete: 5

               - ScannerThreadsInvoluntaryContextSwitches: 0

               - ScannerThreadsTotalWallClockTime: 3s513ms

                 - MaterializeTupleTime(*): 244.915us

                 - ScannerThreadsSysTime: 0ns

                 - ScannerThreadsUserTime: 4.0ms

               - ScannerThreadsVoluntaryContextSwitches: 89

               - TotalRawHdfsReadTime(*): 690.143us

               - TotalReadThroughput: 5.39 KB/sec

        Fragment 4:

          Instance 174050aa9c33038e:b796d9549ff72bba
    (host=clouderavm.vm:22000):(Active: 3s922ms, % non-child: 0.00%)

            Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:6/24.39
    KB

             - AverageThreadTokens: 2.00

             - PeakMemoryUsage: 321.73 MB

             - RowsProduced: 35

            CodeGen:(Active: 76.265ms, % non-child: 1.94%)

               - CodegenTime: 154.831us

               - CompileTime: 54.123ms

               - LoadTime: 22.141ms

               - ModuleFileSize: 75.62 KB

            DataStreamSender (dst_id=9):(Active: 149.730us, % non-child: 0.00%)

               - BytesSent: 3.29 KB

               - NetworkThroughput(*): 6.13 MB/sec

               - OverallThroughput: 21.43 MB/sec

               - SerializeBatchTime: 84.151us

               - ThriftTransmitTime(*): 523.266us

               - UncompressedRowBatchSize: 10.65 KB

            MERGE_NODE (id=12):(Active: 3s922ms, % non-child: 0.01%)

               - MemoryUsed: 0.00

               - RowsReturned: 35

               - RowsReturnedRate: 8.00 /sec

            HDFS_SCAN_NODE (id=3):(Active: 3s922ms, % non-child: 99.99%)

              Hdfs split stats (<volume id>:<# splits>/<split lengths>):
    0:6/24.39 KB

              Hdfs Read Thread Concurrency Bucket: 0:100% 1:0% 2:0% 3:0%

              File Formats: PARQUET/SNAPPY:78

              ExecOption: Codegen enabled: 0 out of 6

               - AverageHdfsReadThreadConcurrency: 0.00

               - AverageScannerThreadConcurrency: 1.00

               - BytesRead: 40.73 KB

               - BytesReadLocal: 40.73 KB

               - BytesReadShortCircuit: 40.73 KB

               - DecompressionTime: 290.322us

               - MemoryUsed: 0.00

               - NumColumns: 0

               - NumDisksAccessed: 1

               - NumScannerThreadsStarted: 1

               - PerReadThreadRawHdfsThroughput: 42.72 MB/sec

               - RowsRead: 622

               - RowsReturned: 40

               - RowsReturnedRate: 10.00 /sec

               - ScanRangesComplete: 6

               - ScannerThreadsInvoluntaryContextSwitches: 0

               - ScannerThreadsTotalWallClockTime: 3s922ms

                 - MaterializeTupleTime(*): 639.232us

                 - ScannerThreadsSysTime: 0ns

                 - ScannerThreadsUserTime: 4.0ms

               - ScannerThreadsVoluntaryContextSwitches: 125

               - TotalRawHdfsReadTime(*): 930.993us

               - TotalReadThroughput: 10.17 KB/sec

        Fragment 5:

          Instance 174050aa9c33038e:b796d9549ff72bbb
    (host=clouderavm.vm:22000):(Active: 3s922ms, % non-child: 0.00%)

            Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:6/23.92
    KB

             - AverageThreadTokens: 2.00

             - PeakMemoryUsage: 321.73 MB

             - RowsProduced: 142

            CodeGen:(Active: 39.648ms, % non-child: 1.01%)

               - CodegenTime: 309.800us

               - CompileTime: 35.695ms

               - LoadTime: 3.952ms

               - ModuleFileSize: 75.62 KB

            DataStreamSender (dst_id=9):(Active: 471.992us, % non-child: 0.01%)

               - BytesSent: 12.90 KB

               - NetworkThroughput(*): 21.90 MB/sec

               - OverallThroughput: 26.69 MB/sec

               - SerializeBatchTime: 400.861us

               - ThriftTransmitTime(*): 575.192us

               - UncompressedRowBatchSize: 42.57 KB

            MERGE_NODE (id=11):(Active: 3s922ms, % non-child: 0.01%)

               - MemoryUsed: 0.00

               - RowsReturned: 142

               - RowsReturnedRate: 36.00 /sec

            HDFS_SCAN_NODE (id=2):(Active: 3s921ms, % non-child: 99.98%)

              Hdfs split stats (<volume id>:<# splits>/<split lengths>):
    0:6/23.92 KB

              Hdfs Read Thread Concurrency Bucket: 0:100% 1:0% 2:0% 3:0%

              File Formats: PARQUET/SNAPPY:78

              ExecOption: Codegen enabled: 0 out of 6

               - AverageHdfsReadThreadConcurrency: 0.00

               - AverageScannerThreadConcurrency: 1.00

               - BytesRead: 39.79 KB

               - BytesReadLocal: 39.79 KB

               - BytesReadShortCircuit: 39.79 KB

               - DecompressionTime: 278.285us

               - MemoryUsed: 0.00

               - NumColumns: 0

               - NumDisksAccessed: 1

               - NumScannerThreadsStarted: 1

               - PerReadThreadRawHdfsThroughput: 44.64 MB/sec

               - RowsRead: 611

               - RowsReturned: 175

               - RowsReturnedRate: 44.00 /sec

               - ScanRangesComplete: 6

               - ScannerThreadsInvoluntaryContextSwitches: 0

               - ScannerThreadsTotalWallClockTime: 3s921ms

                 - MaterializeTupleTime(*): 627.357us

                 - ScannerThreadsSysTime: 0ns

                 - ScannerThreadsUserTime: 4.0ms

               - ScannerThreadsVoluntaryContextSwitches: 127

               - TotalRawHdfsReadTime(*): 870.333us

               - TotalReadThroughput: 9.91 KB/sec

        Fragment 6:

          Instance 174050aa9c33038e:b796d9549ff72bbc (host=clouderavm.vm:22000):(Active: 4s649ms, % non-child: 0.00%)

            Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:10/73.38 KB

             - AverageThreadTokens: 2.00

             - PeakMemoryUsage: 321.73 MB

             - RowsProduced: 144

            CodeGen:(Active: 42.804ms, % non-child: 0.92%)

               - CodegenTime: 126.484us

               - CompileTime: 36.859ms

               - LoadTime: 5.943ms

               - ModuleFileSize: 75.62 KB

            DataStreamSender (dst_id=9):(Active: 5.757ms, % non-child: 0.12%)

               - BytesSent: 12.53 KB

               - NetworkThroughput(*): 61.45 MB/sec

               - OverallThroughput: 2.13 MB/sec

               - SerializeBatchTime: 172.616us

               - ThriftTransmitTime(*): 199.171us

               - UncompressedRowBatchSize: 42.73 KB

            MERGE_NODE (id=10):(Active: 4s643ms, % non-child: 0.01%)

               - MemoryUsed: 0.00

               - RowsReturned: 144

               - RowsReturnedRate: 31.00 /sec

            HDFS_SCAN_NODE (id=1):(Active: 4s643ms, % non-child: 99.86%)

              Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:10/73.38 KB

              Hdfs Read Thread Concurrency Bucket: 0:100% 1:0% 2:0% 3:0%

              File Formats: PARQUET/SNAPPY:130

              ExecOption: Codegen enabled: 0 out of 10

               - AverageHdfsReadThreadConcurrency: 0.00

               - AverageScannerThreadConcurrency: 1.00

               - BytesRead: 133.28 KB

               - BytesReadLocal: 133.28 KB

               - BytesReadShortCircuit: 133.28 KB

               - DecompressionTime: 490.818us

               - MemoryUsed: 0.00

               - NumColumns: 0

               - NumDisksAccessed: 1

               - NumScannerThreadsStarted: 1

               - PerReadThreadRawHdfsThroughput: 81.32 MB/sec

               - RowsRead: 3.23K (3228)

               - RowsReturned: 166

               - RowsReturnedRate: 35.00 /sec

               - ScanRangesComplete: 10

               - ScannerThreadsInvoluntaryContextSwitches: 0

               - ScannerThreadsTotalWallClockTime: 4s642ms

                 - MaterializeTupleTime(*): 1.726ms

                 - ScannerThreadsSysTime: 0ns

                 - ScannerThreadsUserTime: 8.0ms

               - ScannerThreadsVoluntaryContextSwitches: 210

               - TotalRawHdfsReadTime(*): 1.600ms

               - TotalReadThroughput: 25.97 KB/sec
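
    To compare the two profiles side by side, it helps to turn the duration strings (for example 4s643ms or 146.823ms) into plain seconds. The snippet below is only a rough sketch: parse_duration is a hypothetical helper written for this post, not part of Impala, and it assumes durations use only the h, m, s, ms, us and ns suffixes that appear in these profiles.

```python
import re

# Seconds per unit suffix; assumed to cover every suffix seen in the profiles.
_UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0}

# Multi-letter suffixes come first so "ms" is not read as "m" followed by "s".
_PART = re.compile(r"(\d+(?:\.\d+)?)(ns|us|ms|h|m|s)")


def parse_duration(text):
    """Convert a profile duration such as '4s643ms' or '146.823ms' to seconds."""
    text = text.strip()
    parts = _PART.findall(text)
    # Reject input that is not made up entirely of <number><unit> pieces.
    if not parts or "".join(value + unit for value, unit in parts) != text:
        raise ValueError("unrecognized duration: %r" % text)
    return sum(float(value) * _UNIT_SECONDS[unit] for value, unit in parts)


if __name__ == "__main__":
    # HDFS_SCAN_NODE (id=1) active time, copied from the two profiles.
    under_load = parse_duration("4s643ms")
    single = parse_duration("146.823ms")
    print("scan node slowdown under load: %.1fx" % (under_load / single))
```

    The same helper can be pointed at any other timer in the profiles, such as ScannerThreadsTotalWallClockTime or DataArrivalWaitTime.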

    *Single query*


    PLAN FRAGMENT 0

      PARTITION: UNPARTITIONED

      8:TOP-N
    order by: SUM(wait_time) DESC
    limit: 10
    tuple ids: 8

      7:AGGREGATE
    output: SUM(<slot 97>), SUM(<slot 98>), SUM(<slot 99>), SUM(<slot 100>)
    group by: <slot 101>
    tuple ids: 8
      9:EXCHANGE

         tuple ids: 6

    PLAN FRAGMENT 1

      PARTITION: RANDOM

      STREAM DATA SINK

        EXCHANGE ID: 9

        UNPARTITIONED

      15:MERGE
    predicates: (<slot 109> = 55)
    tuple ids: 6
      6:SCAN HDFS

         table=pulse.user_action_ui_stats_10_minute_rollup #partitions=0 size=0B

         predicates: bucket <= '2013-10-23 02:30:00.000', bucket >= '2013-10-23 02:00:00.000'

         tuple ids: 5

    PLAN FRAGMENT 2

      PARTITION: RANDOM

      STREAM DATA SINK

        EXCHANGE ID: 9

        UNPARTITIONED

      14:MERGE
    predicates: (<slot 109> = 55)
    tuple ids: 6
      5:SCAN HDFS

         table=pulse.user_action_ui_stats_1_hour_rollup #partitions=0 size=0B

         predicates: bucket <= '2013-10-23 02:00:00.000', bucket >= '2013-10-23 00:00:00.000'

         tuple ids: 4

    PLAN FRAGMENT 3

      PARTITION: RANDOM

      STREAM DATA SINK

        EXCHANGE ID: 9

        UNPARTITIONED

      13:MERGE
    predicates: (<slot 109> = 55)
    tuple ids: 6
      4:SCAN HDFS

         table=pulse.user_action_ui_stats_10_minute_rollup #partitions=5 size=14.14KB

         predicates: bucket >= '2013-10-16 02:40:00.000', bucket <= '2013-10-16 03:00:00.000'

         tuple ids: 3

    PLAN FRAGMENT 4

      PARTITION: RANDOM

      STREAM DATA SINK

        EXCHANGE ID: 9

        UNPARTITIONED

      12:MERGE
    predicates: (<slot 109> = 55)
    tuple ids: 6
      3:SCAN HDFS

         table=pulse.user_action_ui_stats_1_hour_rollup #partitions=6 size=24.39KB

         predicates: bucket >= '2013-10-16 03:00:00.000', bucket <= '2013-10-16 06:00:00.000'

         tuple ids: 2

    PLAN FRAGMENT 5

      PARTITION: RANDOM

      STREAM DATA SINK

        EXCHANGE ID: 9

        UNPARTITIONED

      11:MERGE
    predicates: (<slot 109> = 55)
    tuple ids: 6
      2:SCAN HDFS

         table=pulse.user_action_ui_stats_6_hour_rollup #partitions=6 size=23.92KB

         predicates: bucket >= '2013-10-16 06:00:00.000', bucket <= '2013-10-17 00:00:00.000'

         tuple ids: 1

    PLAN FRAGMENT 6

      PARTITION: RANDOM

      STREAM DATA SINK

        EXCHANGE ID: 9

        UNPARTITIONED

      10:MERGE
    predicates: (<slot 109> = 55)
    tuple ids: 6
      1:SCAN HDFS

         table=pulse.user_action_ui_stats_1_day_rollup #partitions=10 size=73.38KB

         predicates: bucket >= '2013-10-17 00:00:00.000', bucket <= '2013-10-23 00:00:00.000'

         tuple ids: 0

    ----------------

        Query Timeline: 638.553ms

           - Start execution: 2.737ms (2.737ms)

           - Planning finished: 103.380ms (100.643ms)

           - Rows available: 570.380ms (466.999ms)

           - First row fetched: 629.132ms (58.751ms)

      ImpalaServer:

         - ClientFetchWaitTimer: 59.688ms

         - RowMaterializationTimer: 105.18us

      Execution Profile 624f4f947711d4be:1dd2e9d377914090:(Active: 466.139ms, % non-child: 0.00%)

        Per Node Peak Memory Usage: clouderavm.vm:22000(121.09 MB)

         - FinalizationTimer: 0ns

        Coordinator Fragment:(Active: 145.926ms, % non-child: 0.00%)

           - AverageThreadTokens: 0.00

           - PeakMemoryUsage: 121.09 MB

           - RowsProduced: 10

          CodeGen:(Active: 60.148ms, % non-child: 41.22%)

             - CodegenTime: 2.822ms

             - CompileTime: 55.2ms

             - LoadTime: 5.145ms

             - ModuleFileSize: 75.62 KB

          SORT_NODE (id=8):(Active: 145.920ms, % non-child: 0.00%)

             - MemoryUsed: 0.00

             - RowsReturned: 10

             - RowsReturnedRate: 68.00 /sec

          AGGREGATION_NODE (id=7):(Active: 148.413ms, % non-child: 9.56%)

            ExecOption: Codegen Enabled

             - BuildBuckets: 1.02K (1024)

             - BuildTime: 178.96us

             - GetResultsTime: 15.164us

             - LoadFactor: 0.11

             - MemoryUsed: 47.34 KB

             - RowsReturned: 118

             - RowsReturnedRate: 795.00 /sec

          EXCHANGE_NODE (id=9):(Active: 134.466ms, % non-child: 92.15%)

             - BytesReceived: 28.71 KB

             - ConvertRowBatchTime: 7.567us

             - DataArrivalWaitTime: 134.328ms

             - DeserializeRowBatchTimer: 170.203us

             - FirstBatchArrivalWaitTime: 0ns

             - MemoryUsed: 0.00

             - RowsReturned: 321

             - RowsReturnedRate: 2.39 K/sec

             - SendersBlockedTimer: 0ns

             - SendersBlockedTotalTimer(*): 0ns

        Averaged Fragment 1:(Active: 43.110us, % non-child: 0.00%)

          split sizes: min: 0.00 , max: 0.00 , avg: 0.00 , stddev: 0.00

          completion times: min:210.902ms max:210.902ms mean: 210.902ms stddev:0ns

          execution rates: min:0.00 /sec max:0.00 /sec mean:0.00 /sec stddev:0.00 /sec

          num instances: 1

           - AverageThreadTokens: 0.00

           - PeakMemoryUsage: 32.00 KB

           - RowsProduced: 0

          CodeGen:(Active: 32.27ms, % non-child: 100.00%)

             - CodegenTime: 123.159us

             - CompileTime: 28.400ms

             - LoadTime: 3.625ms

             - ModuleFileSize: 75.62 KB

          DataStreamSender (dst_id=9):(Active: 16.424us, % non-child: 38.10%)

             - BytesSent: 0.00

             - NetworkThroughput(*): 0.00 /sec

             - OverallThroughput: 0.00 /sec

             - SerializeBatchTime: 0ns

             - ThriftTransmitTime(*): 0ns

             - UncompressedRowBatchSize: 0.00

          MERGE_NODE (id=15):(Active: 41.449us, % non-child: 65.98%)

             - MemoryUsed: 0.00

             - RowsReturned: 0

             - RowsReturnedRate: 0

          HDFS_SCAN_NODE (id=6):(Active: 13.6us, % non-child: 30.17%)

             - BytesRead: 0.00

             - MemoryUsed: 0.00

             - NumDisksAccessed: 0

             - NumScannerThreadsStarted: 0

             - PerReadThreadRawHdfsThroughput: 0.00 /sec

             - RowsRead: 0

             - RowsReturned: 0

             - RowsReturnedRate: 0

             - ScanRangesComplete: 0

             - ScannerThreadsInvoluntaryContextSwitches: 0

             - ScannerThreadsTotalWallClockTime: 0ns

               - MaterializeTupleTime(*): 0ns

               - ScannerThreadsSysTime: 0ns

               - ScannerThreadsUserTime: 0ns

             - ScannerThreadsVoluntaryContextSwitches: 0

             - TotalRawHdfsReadTime(*): 0ns

             - TotalReadThroughput: 0.00 /sec

        Averaged Fragment 2:(Active: 54.689us, % non-child: 0.00%)

          split sizes: min: 0.00 , max: 0.00 , avg: 0.00 , stddev: 0.00

          completion times: min:169.188ms max:169.188ms mean: 169.188ms stddev:0ns

          execution rates: min:0.00 /sec max:0.00 /sec mean:0.00 /sec stddev:0.00 /sec

          num instances: 1

           - AverageThreadTokens: 0.00

           - PeakMemoryUsage: 32.00 KB

           - RowsProduced: 0

          CodeGen:(Active: 36.313ms, % non-child: 100.00%)

             - CodegenTime: 204.575us

             - CompileTime: 28.583ms

             - LoadTime: 7.728ms

             - ModuleFileSize: 75.62 KB

          DataStreamSender (dst_id=9):(Active: 28.423us, % non-child: 51.97%)

             - BytesSent: 0.00

             - NetworkThroughput(*): 0.00 /sec

             - OverallThroughput: 0.00 /sec

             - SerializeBatchTime: 0ns

             - ThriftTransmitTime(*): 0ns

             - UncompressedRowBatchSize: 0.00

          MERGE_NODE (id=14):(Active: 52.838us, % non-child: 73.23%)

             - MemoryUsed: 0.00

             - RowsReturned: 0

             - RowsReturnedRate: 0

          HDFS_SCAN_NODE (id=5):(Active: 12.789us, % non-child: 23.38%)

             - BytesRead: 0.00

             - MemoryUsed: 0.00

             - NumDisksAccessed: 0

             - NumScannerThreadsStarted: 0

             - PerReadThreadRawHdfsThroughput: 0.00 /sec

             - RowsRead: 0

             - RowsReturned: 0

             - RowsReturnedRate: 0

             - ScanRangesComplete: 0

             - ScannerThreadsInvoluntaryContextSwitches: 0

             - ScannerThreadsTotalWallClockTime: 0ns

               - MaterializeTupleTime(*): 0ns

               - ScannerThreadsSysTime: 0ns

               - ScannerThreadsUserTime: 0ns

             - ScannerThreadsVoluntaryContextSwitches: 0

             - TotalRawHdfsReadTime(*): 0ns

             - TotalReadThroughput: 0.00 /sec

        Averaged Fragment 3:(Active: 39.911ms, % non-child: 0.00%)

          split sizes: min: 14.14 KB, max: 14.14 KB, avg: 14.14 KB, stddev: 0.00

          completion times: min:130.415ms max:130.415ms mean: 130.415ms stddev:0ns

          execution rates: min:108.46 KB/sec max:108.46 KB/sec mean:108.46 KB/sec stddev:0.00 /sec

          num instances: 1

           - AverageThreadTokens: 0.00

           - PeakMemoryUsage: 113.07 MB

           - RowsProduced: 0

          CodeGen:(Active: 34.35ms, % non-child: 85.28%)

             - CodegenTime: 116.541us

             - CompileTime: 27.510ms

             - LoadTime: 6.522ms

             - ModuleFileSize: 75.62 KB

          DataStreamSender (dst_id=9):(Active: 7.558us, % non-child: 0.02%)

             - BytesSent: 0.00

             - NetworkThroughput(*): 0.00 /sec

             - OverallThroughput: 0.00 /sec

             - SerializeBatchTime: 0ns

             - ThriftTransmitTime(*): 0ns

             - UncompressedRowBatchSize: 0.00

          MERGE_NODE (id=13):(Active: 39.908ms, % non-child: 0.67%)

             - MemoryUsed: 0.00

             - RowsReturned: 0

             - RowsReturnedRate: 0

          HDFS_SCAN_NODE (id=4):(Active: 39.640ms, % non-child: 99.32%)

             - AverageHdfsReadThreadConcurrency: 0.00

             - AverageScannerThreadConcurrency: 0.00

             - BytesRead: 21.64 KB

             - BytesReadLocal: 21.64 KB

             - BytesReadShortCircuit: 21.64 KB

             - DecompressionTime: 97.810us

             - MemoryUsed: 0.00

             - NumColumns: 0

             - NumDisksAccessed: 1

             - NumScannerThreadsStarted: 5

             - PerReadThreadRawHdfsThroughput: 44.32 MB/sec

             - RowsRead: 157

             - RowsReturned: 0

             - RowsReturnedRate: 0

             - ScanRangesComplete: 5

             - ScannerThreadsInvoluntaryContextSwitches: 1

             - ScannerThreadsTotalWallClockTime: 136.228ms

               - MaterializeTupleTime(*): 214.662us

               - ScannerThreadsSysTime: 0ns

               - ScannerThreadsUserTime: 0ns

             - ScannerThreadsVoluntaryContextSwitches: 117

             - TotalRawHdfsReadTime(*): 476.877us

             - TotalReadThroughput: 0.00 /sec

        Averaged Fragment 4:(Active: 42.123ms, % non-child: 0.00%)

          split sizes: min: 24.39 KB, max: 24.39 KB, avg: 24.39 KB, stddev: 0.00

          completion times: min:90.167ms max:90.167ms mean: 90.167ms stddev:0ns

          execution rates: min:270.52 KB/sec max:270.52 KB/sec mean:270.52 KB/sec stddev:0.00 /sec

          num instances: 1

           - AverageThreadTokens: 0.00

           - PeakMemoryUsage: 113.08 MB

           - RowsProduced: 35

          CodeGen:(Active: 35.980ms, % non-child: 85.42%)

             - CodegenTime: 122.65us

             - CompileTime: 30.517ms

             - LoadTime: 5.462ms

             - ModuleFileSize: 75.62 KB

          DataStreamSender (dst_id=9):(Active: 228.512us, % non-child: 0.54%)

             - BytesSent: 3.29 KB

             - NetworkThroughput(*): 15.55 MB/sec

             - OverallThroughput: 14.04 MB/sec

             - SerializeBatchTime: 133.848us

             - ThriftTransmitTime(*): 206.253us

             - UncompressedRowBatchSize: 10.65 KB

          MERGE_NODE (id=12):(Active: 41.898ms, % non-child: 1.34%)

             - MemoryUsed: 0.00

             - RowsReturned: 35

             - RowsReturnedRate: 835.00 /sec

          HDFS_SCAN_NODE (id=3):(Active: 41.335ms, % non-child: 98.13%)

             - AverageHdfsReadThreadConcurrency: 0.00

             - AverageScannerThreadConcurrency: 0.00

             - BytesRead: 40.73 KB

             - BytesReadLocal: 40.73 KB

             - BytesReadShortCircuit: 40.73 KB

             - DecompressionTime: 108.994us

             - MemoryUsed: 0.00

             - NumColumns: 0

             - NumDisksAccessed: 1

             - NumScannerThreadsStarted: 4

             - PerReadThreadRawHdfsThroughput: 62.43 MB/sec

             - RowsRead: 622

             - RowsReturned: 40

             - RowsReturnedRate: 967.00 /sec

             - ScanRangesComplete: 6

             - ScannerThreadsInvoluntaryContextSwitches: 16

             - ScannerThreadsTotalWallClockTime: 113.341ms

               - MaterializeTupleTime(*): 550.780us

               - ScannerThreadsSysTime: 0ns

               - ScannerThreadsUserTime: 0ns

             - ScannerThreadsVoluntaryContextSwitches: 147

             - TotalRawHdfsReadTime(*): 637.102us

             - TotalReadThroughput: 0.00 /sec

        Averaged Fragment 5:(Active: 52.161ms, % non-child: 0.00%)

          split sizes: min: 23.92 KB, max: 23.92 KB, avg: 23.92 KB, stddev: 0.00

          completion times: min:53.777ms max:53.777ms mean: 53.777ms stddev:0ns

          execution rates: min:444.78 KB/sec max:444.78 KB/sec mean:444.78 KB/sec stddev:0.00 /sec

          num instances: 1

           - AverageThreadTokens: 0.00

           - PeakMemoryUsage: 113.35 MB

           - RowsProduced: 142

          CodeGen:(Active: 36.274ms, % non-child: 69.54%)

             - CodegenTime: 98.900us

             - CompileTime: 32.325ms

             - LoadTime: 3.948ms

             - ModuleFileSize: 75.62 KB

          DataStreamSender (dst_id=9):(Active: 176.280us, % non-child: 0.34%)

             - BytesSent: 12.90 KB

             - NetworkThroughput(*): 66.00 MB/sec

             - OverallThroughput: 71.45 MB/sec

             - SerializeBatchTime: 147.618us

             - ThriftTransmitTime(*): 190.844us

             - UncompressedRowBatchSize: 42.57 KB

          MERGE_NODE (id=11):(Active: 51.988ms, % non-child: 0.60%)

             - MemoryUsed: 0.00

             - RowsReturned: 142

             - RowsReturnedRate: 2.73 K/sec

          HDFS_SCAN_NODE (id=2):(Active: 51.674ms, % non-child: 99.07%)

             - AverageHdfsReadThreadConcurrency: 0.00

             - AverageScannerThreadConcurrency: 0.00

             - BytesRead: 39.79 KB

             - BytesReadLocal: 39.79 KB

             - BytesReadShortCircuit: 39.79 KB

             - DecompressionTime: 127.855us

             - MemoryUsed: 0.00

             - NumColumns: 0

             - NumDisksAccessed: 1

             - NumScannerThreadsStarted: 3

             - PerReadThreadRawHdfsThroughput: 45.25 MB/sec

             - RowsRead: 611

             - RowsReturned: 175

             - RowsReturnedRate: 3.39 K/sec

             - ScanRangesComplete: 6

             - ScannerThreadsInvoluntaryContextSwitches: 5

             - ScannerThreadsTotalWallClockTime: 133.578ms

               - MaterializeTupleTime(*): 310.822us

               - ScannerThreadsSysTime: 0ns

               - ScannerThreadsUserTime: 0ns

             - ScannerThreadsVoluntaryContextSwitches: 148

             - TotalRawHdfsReadTime(*): 858.757us

             - TotalReadThroughput: 0.00 /sec

        Averaged Fragment 6:(Active: 148.220ms, % non-child: 0.00%)

          split sizes: min: 73.38 KB, max: 73.38 KB, avg: 73.38 KB, stddev: 0.00

          completion times: min:149.679ms max:149.679ms mean: 149.679ms stddev:0ns

          execution rates: min:490.26 KB/sec max:490.26 KB/sec mean:490.26 KB/sec stddev:0.00 /sec

          num instances: 1

           - AverageThreadTokens: 11.00

           - PeakMemoryUsage: 121.09 MB

           - RowsProduced: 144

          CodeGen:(Active: 37.28ms, % non-child: 24.98%)

             - CodegenTime: 239.688us

             - CompileTime: 32.109ms

             - LoadTime: 4.917ms

             - ModuleFileSize: 75.62 KB

          DataStreamSender (dst_id=9):(Active: 392.899us, % non-child: 0.27%)

             - BytesSent: 12.53 KB

             - NetworkThroughput(*): 39.31 MB/sec

             - OverallThroughput: 31.15 MB/sec

             - SerializeBatchTime: 325.694us

             - ThriftTransmitTime(*): 311.371us

             - UncompressedRowBatchSize: 42.73 KB

          MERGE_NODE (id=10):(Active: 147.829ms, % non-child: 0.68%)

             - MemoryUsed: 0.00

             - RowsReturned: 144

             - RowsReturnedRate: 974.00 /sec

          HDFS_SCAN_NODE (id=1):(Active: 146.823ms, % non-child: 99.06%)

             - AverageHdfsReadThreadConcurrency: 0.00

             - AverageScannerThreadConcurrency: 10.00

             - BytesRead: 133.28 KB

             - BytesReadLocal: 133.28 KB

             - BytesReadShortCircuit: 133.28 KB

             - DecompressionTime: 365.599us

             - MemoryUsed: 0.00

             - NumColumns: 0

             - NumDisksAccessed: 1

             - NumScannerThreadsStarted: 10

             - PerReadThreadRawHdfsThroughput: 66.98 MB/sec

             - RowsRead: 3.23K (3228)

             - RowsReturned: 166

             - RowsReturnedRate: 1.13 K/sec

             - ScanRangesComplete: 10

             - ScannerThreadsInvoluntaryContextSwitches: 1

             - ScannerThreadsTotalWallClockTime: 786.975ms

               - MaterializeTupleTime(*): 2.414ms

               - ScannerThreadsSysTime: 0ns

               - ScannerThreadsUserTime: 4.0ms

             - ScannerThreadsVoluntaryContextSwitches: 320

             - TotalRawHdfsReadTime(*): 1.943ms

             - TotalReadThroughput: 8.39 KB/sec

        Fragment 1:

          Instance 624f4f947711d4be:1dd2e9d377914092 (host=clouderavm.vm:22000):(Active: 43.110us, % non-child: 0.00%)

             - AverageThreadTokens: 0.00

             - PeakMemoryUsage: 32.00 KB

             - RowsProduced: 0

            CodeGen:(Active: 32.27ms, % non-child: 100.00%)

               - CodegenTime: 123.159us

               - CompileTime: 28.400ms

               - LoadTime: 3.625ms

               - ModuleFileSize: 75.62 KB

            DataStreamSender (dst_id=9):(Active: 16.424us, % non-child: 38.10%)

               - BytesSent: 0.00

               - NetworkThroughput(*): 0.00 /sec

               - OverallThroughput: 0.00 /sec

               - SerializeBatchTime: 0ns

               - ThriftTransmitTime(*): 0ns

               - UncompressedRowBatchSize: 0.00

            MERGE_NODE (id=15):(Active: 41.449us, % non-child: 65.98%)

               - MemoryUsed: 0.00

               - RowsReturned: 0

               - RowsReturnedRate: 0

            HDFS_SCAN_NODE (id=6):(Active: 13.6us, % non-child: 30.17%)

              Hdfs split stats (<volume id>:<# splits>/<split lengths>):

              Hdfs Read Thread Concurrency Bucket:

              ExecOption: Codegen enabled: 0 out of 0

               - BytesRead: 0.00

               - MemoryUsed: 0.00

               - NumDisksAccessed: 0

               - NumScannerThreadsStarted: 0

               - PerReadThreadRawHdfsThroughput: 0.00 /sec

               - RowsRead: 0

               - RowsReturned: 0

               - RowsReturnedRate: 0

               - ScanRangesComplete: 0

               - ScannerThreadsInvoluntaryContextSwitches: 0

               - ScannerThreadsTotalWallClockTime: 0ns

                 - MaterializeTupleTime(*): 0ns

                 - ScannerThreadsSysTime: 0ns

                 - ScannerThreadsUserTime: 0ns

               - ScannerThreadsVoluntaryContextSwitches: 0

               - TotalRawHdfsReadTime(*): 0ns

               - TotalReadThroughput: 0.00 /sec

        Fragment 2:

          Instance 624f4f947711d4be:1dd2e9d377914093 (host=clouderavm.vm:22000):(Active: 54.689us, % non-child: 0.00%)

             - AverageThreadTokens: 0.00

             - PeakMemoryUsage: 32.00 KB

             - RowsProduced: 0

            CodeGen:(Active: 36.313ms, % non-child: 100.00%)

               - CodegenTime: 204.575us

               - CompileTime: 28.583ms

               - LoadTime: 7.728ms

               - ModuleFileSize: 75.62 KB

            DataStreamSender (dst_id=9):(Active: 28.423us, % non-child: 51.97%)

               - BytesSent: 0.00

               - NetworkThroughput(*): 0.00 /sec

               - OverallThroughput: 0.00 /sec

               - SerializeBatchTime: 0ns

               - ThriftTransmitTime(*): 0ns

               - UncompressedRowBatchSize: 0.00

            MERGE_NODE (id=14):(Active: 52.838us, % non-child: 73.23%)

               - MemoryUsed: 0.00

               - RowsReturned: 0

               - RowsReturnedRate: 0

            HDFS_SCAN_NODE (id=5):(Active: 12.789us, % non-child: 23.38%)

              Hdfs split stats (<volume id>:<# splits>/<split lengths>):

              Hdfs Read Thread Concurrency Bucket:

              ExecOption: Codegen enabled: 0 out of 0

               - BytesRead: 0.00

               - MemoryUsed: 0.00

               - NumDisksAccessed: 0

               - NumScannerThreadsStarted: 0

               - PerReadThreadRawHdfsThroughput: 0.00 /sec

               - RowsRead: 0

               - RowsReturned: 0

               - RowsReturnedRate: 0

               - ScanRangesComplete: 0

               - ScannerThreadsInvoluntaryContextSwitches: 0

               - ScannerThreadsTotalWallClockTime: 0ns

                 - MaterializeTupleTime(*): 0ns

                 - ScannerThreadsSysTime: 0ns

                 - ScannerThreadsUserTime: 0ns

               - ScannerThreadsVoluntaryContextSwitches: 0

               - TotalRawHdfsReadTime(*): 0ns

               - TotalReadThroughput: 0.00 /sec

        Fragment 3:

          Instance 624f4f947711d4be:1dd2e9d377914094 (host=clouderavm.vm:22000):(Active: 39.911ms, % non-child: 0.00%)

            Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:5/14.14 KB

             - AverageThreadTokens: 0.00

             - PeakMemoryUsage: 113.07 MB

             - RowsProduced: 0

            CodeGen:(Active: 34.35ms, % non-child: 85.28%)

               - CodegenTime: 116.541us

               - CompileTime: 27.510ms

               - LoadTime: 6.522ms

               - ModuleFileSize: 75.62 KB

            DataStreamSender (dst_id=9):(Active: 7.558us, % non-child: 0.02%)

               - BytesSent: 0.00

               - NetworkThroughput(*): 0.00 /sec

               - OverallThroughput: 0.00 /sec

               - SerializeBatchTime: 0ns

               - ThriftTransmitTime(*): 0ns

               - UncompressedRowBatchSize: 0.00

            MERGE_NODE (id=13):(Active: 39.908ms, % non-child: 0.67%)

               - MemoryUsed: 0.00

               - RowsReturned: 0

               - RowsReturnedRate: 0

            HDFS_SCAN_NODE (id=4):(Active: 39.640ms, % non-child: 99.32%)

              Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:5/14.14 KB

              Hdfs Read Thread Concurrency Bucket: 0:0% 1:0% 2:0% 3:0%

              File Formats: PARQUET/SNAPPY:65

              ExecOption: Codegen enabled: 0 out of 5

               - AverageHdfsReadThreadConcurrency: 0.00

               - AverageScannerThreadConcurrency: 0.00

               - BytesRead: 21.64 KB

               - BytesReadLocal: 21.64 KB

               - BytesReadShortCircuit: 21.64 KB

               - DecompressionTime: 97.810us

               - MemoryUsed: 0.00

               - NumColumns: 0

               - NumDisksAccessed: 1

               - NumScannerThreadsStarted: 5

               - PerReadThreadRawHdfsThroughput: 44.32 MB/sec

               - RowsRead: 157

               - RowsReturned: 0

               - RowsReturnedRate: 0

               - ScanRangesComplete: 5

               - ScannerThreadsInvoluntaryContextSwitches: 1

               - ScannerThreadsTotalWallClockTime: 136.228ms

                 - MaterializeTupleTime(*): 214.662us

                 - ScannerThreadsSysTime: 0ns

                 - ScannerThreadsUserTime: 0ns

               - ScannerThreadsVoluntaryContextSwitches: 117

               - TotalRawHdfsReadTime(*): 476.877us

               - TotalReadThroughput: 0.00 /sec

        Fragment 4:

          Instance 624f4f947711d4be:1dd2e9d377914095 (host=clouderavm.vm:22000):(Active: 42.123ms, % non-child: 0.00%)

            Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:6/24.39 KB

             - AverageThreadTokens: 0.00

             - PeakMemoryUsage: 113.08 MB

             - RowsProduced: 35

            CodeGen:(Active: 35.980ms, % non-child: 85.42%)

               - CodegenTime: 122.65us

               - CompileTime: 30.517ms

               - LoadTime: 5.462ms

               - ModuleFileSize: 75.62 KB

            DataStreamSender (dst_id=9):(Active: 228.512us, % non-child: 0.54%)

               - BytesSent: 3.29 KB

               - NetworkThroughput(*): 15.55 MB/sec

               - OverallThroughput: 14.04 MB/sec

               - SerializeBatchTime: 133.848us

               - ThriftTransmitTime(*): 206.253us

               - UncompressedRowBatchSize: 10.65 KB

            MERGE_NODE (id=12):(Active: 41.898ms, % non-child: 1.34%)

               - MemoryUsed: 0.00

               - RowsReturned: 35

               - RowsReturnedRate: 835.00 /sec

            HDFS_SCAN_NODE (id=3):(Active: 41.335ms, % non-child: 98.13%)

              Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:6/24.39 KB

              Hdfs Read Thread Concurrency Bucket: 0:0% 1:0% 2:0% 3:0%

              File Formats: PARQUET/SNAPPY:78

              ExecOption: Codegen enabled: 0 out of 6

               - AverageHdfsReadThreadConcurrency: 0.00

               - AverageScannerThreadConcurrency: 0.00

               - BytesRead: 40.73 KB

               - BytesReadLocal: 40.73 KB

               - BytesReadShortCircuit: 40.73 KB

               - DecompressionTime: 108.994us

               - MemoryUsed: 0.00

               - NumColumns: 0

               - NumDisksAccessed: 1

               - NumScannerThreadsStarted: 4

               - PerReadThreadRawHdfsThroughput: 62.43 MB/sec

               - RowsRead: 622

               - RowsReturned: 40

               - RowsReturnedRate: 967.00 /sec

               - ScanRangesComplete: 6

               - ScannerThreadsInvoluntaryContextSwitches: 16

               - ScannerThreadsTotalWallClockTime: 113.341ms

                 - MaterializeTupleTime(*): 550.780us

                 - ScannerThreadsSysTime: 0ns

                 - ScannerThreadsUserTime: 0ns

               - ScannerThreadsVoluntaryContextSwitches: 147

               - TotalRawHdfsReadTime(*): 637.102us

               - TotalReadThroughput: 0.00 /sec

        Fragment 5:

          Instance 624f4f947711d4be:1dd2e9d377914096 (host=clouderavm.vm:22000):(Active: 52.161ms, % non-child: 0.00%)

            Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:6/23.92 KB

             - AverageThreadTokens: 0.00

             - PeakMemoryUsage: 113.35 MB

             - RowsProduced: 142

            CodeGen:(Active: 36.274ms, % non-child: 69.54%)

               - CodegenTime: 98.900us

               - CompileTime: 32.325ms

               - LoadTime: 3.948ms

               - ModuleFileSize: 75.62 KB

            DataStreamSender (dst_id=9):(Active: 176.280us, % non-child: 0.34%)

               - BytesSent: 12.90 KB

               - NetworkThroughput(*): 66.00 MB/sec

               - OverallThroughput: 71.45 MB/sec

               - SerializeBatchTime: 147.618us

               - ThriftTransmitTime(*): 190.844us

               - UncompressedRowBatchSize: 42.57 KB

            MERGE_NODE (id=11):(Active: 51.988ms, % non-child: 0.60%)

               - MemoryUsed: 0.00

               - RowsReturned: 142

               - RowsReturnedRate: 2.73 K/sec

            HDFS_SCAN_NODE (id=2):(Active: 51.674ms, % non-child: 99.07%)

              Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:6/23.92 KB

              Hdfs Read Thread Concurrency Bucket: 0:0% 1:0% 2:0% 3:0%

              File Formats: PARQUET/SNAPPY:78

              ExecOption: Codegen enabled: 0 out of 6

               - AverageHdfsReadThreadConcurrency: 0.00

               - AverageScannerThreadConcurrency: 0.00

               - BytesRead: 39.79 KB

               - BytesReadLocal: 39.79 KB

               - BytesReadShortCircuit: 39.79 KB

               - DecompressionTime: 127.855us

               - MemoryUsed: 0.00

               - NumColumns: 0

               - NumDisksAccessed: 1

               - NumScannerThreadsStarted: 3

               - PerReadThreadRawHdfsThroughput: 45.25 MB/sec

               - RowsRead: 611

               - RowsReturned: 175

               - RowsReturnedRate: 3.39 K/sec

               - ScanRangesComplete: 6

               - ScannerThreadsInvoluntaryContextSwitches: 5

               - ScannerThreadsTotalWallClockTime: 133.578ms

                 - MaterializeTupleTime(*): 310.822us

                 - ScannerThreadsSysTime: 0ns

                 - ScannerThreadsUserTime: 0ns

               - ScannerThreadsVoluntaryContextSwitches: 148

               - TotalRawHdfsReadTime(*): 858.757us

               - TotalReadThroughput: 0.00 /sec

        Fragment 6:

          Instance 624f4f947711d4be:1dd2e9d377914097 (host=clouderavm.vm:22000):(Active: 148.220ms, % non-child: 0.00%)

            Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:10/73.38 KB

             - AverageThreadTokens: 11.00

             - PeakMemoryUsage: 121.09 MB

             - RowsProduced: 144

            CodeGen:(Active: 37.28ms, % non-child: 24.98%)

               - CodegenTime: 239.688us

               - CompileTime: 32.109ms

               - LoadTime: 4.917ms

               - ModuleFileSize: 75.62 KB

            DataStreamSender (dst_id=9):(Active: 392.899us, % non-child: 0.27%)

               - BytesSent: 12.53 KB

               - NetworkThroughput(*): 39.31 MB/sec

               - OverallThroughput: 31.15 MB/sec

               - SerializeBatchTime: 325.694us

               - ThriftTransmitTime(*): 311.371us

               - UncompressedRowBatchSize: 42.73 KB

            MERGE_NODE (id=10):(Active: 147.829ms, % non-child: 0.68%)

               - MemoryUsed: 0.00

               - RowsReturned: 144

               - RowsReturnedRate: 974.00 /sec

            HDFS_SCAN_NODE (id=1):(Active: 146.823ms, % non-child: 99.06%)

              Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:10/73.38 KB

              Hdfs Read Thread Concurrency Bucket: 0:100% 1:0% 2:0% 3:0%

              File Formats: PARQUET/SNAPPY:130

              ExecOption: Codegen enabled: 0 out of 10

               - AverageHdfsReadThreadConcurrency: 0.00

               - AverageScannerThreadConcurrency: 10.00

               - BytesRead: 133.28 KB

               - BytesReadLocal: 133.28 KB

               - BytesReadShortCircuit: 133.28 KB

               - DecompressionTime: 365.599us

               - MemoryUsed: 0.00

               - NumColumns: 0

               - NumDisksAccessed: 1

               - NumScannerThreadsStarted: 10

               - PerReadThreadRawHdfsThroughput: 66.98 MB/sec

               - RowsRead: 3.23K (3228)

               - RowsReturned: 166

               - RowsReturnedRate: 1.13 K/sec

               - ScanRangesComplete: 10

               - ScannerThreadsInvoluntaryContextSwitches: 1

               - ScannerThreadsTotalWallClockTime: 786.975ms

                 - MaterializeTupleTime(*): 2.414ms

                 - ScannerThreadsSysTime: 0ns

                 - ScannerThreadsUserTime: 4.0ms

               - ScannerThreadsVoluntaryContextSwitches: 320

               - TotalRawHdfsReadTime(*): 1.943ms

               - TotalReadThroughput: 8.39 KB/sec


    On Thursday, October 24, 2013 9:08:00 PM UTC-7, Greg Rahn wrote:

    This is not what I've seen in multi-user workloads, both in-house and at
    customers, so we'll have to dig deeper into your issue. Probably the
    easiest way to do that is to capture the query execution profile from a
    single-user execution and compare it with one from a 10-user run. If you're
    running your test via the impala-shell, use the --show_profiles flag and
    redirect stdout to a file. Attach each file to your response.
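For comparing the two captured profiles, a small script can pull out the counters of interest instead of reading both dumps by eye. The following is a minimal sketch, assuming the profile text uses the `- CounterName: value` line layout seen in the dump above; the function names `parse_counters` and `compare_profiles` are made up for this illustration and are not part of Impala.

```python
import re

# Matches counter lines such as "   - BytesRead: 39.79 KB" or
# "   - MaterializeTupleTime(*): 2.414ms".
COUNTER_RE = re.compile(r"^\s*-\s*([\w()*]+):\s*(.+?)\s*$")


def parse_counters(profile_text):
    """Collect every value seen for each counter name, in file order."""
    counters = {}
    for line in profile_text.splitlines():
        match = COUNTER_RE.match(line)
        if match:
            name, value = match.groups()
            counters.setdefault(name, []).append(value)
    return counters


def compare_profiles(single_text, loaded_text, names):
    """Return (name, single_values, loaded_values) rows for the chosen counters."""
    single = parse_counters(single_text)
    loaded = parse_counters(loaded_text)
    return [(name, single.get(name, []), loaded.get(name, [])) for name in names]


# Sample lines copied from the HDFS_SCAN_NODE (id=1) section above.
sample = """
               - AverageScannerThreadConcurrency: 10.00
               - NumScannerThreadsStarted: 10
               - ScannerThreadsTotalWallClockTime: 786.975ms
"""
print(parse_counters(sample)["AverageScannerThreadConcurrency"])  # ['10.00']
```

In practice the two inputs would be the files written by redirecting impala-shell output, for example something like `impala-shell -q "<query>" --show_profiles > single_user.txt`, read with `open(...).read()` and passed to `compare_profiles` along with counter names such as `AverageScannerThreadConcurrency`.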


  • Greg Rahn at Oct 25, 2013 at 8:23 pm
    If you only have a single rotational disk, then you only have a single
    scanner thread (AverageScannerThreadConcurrency = 1 from the profile). If
    the data is all in the fs page cache, the bottleneck is CPU, and it is
    confined to a single core because only that one thread is scanning.

    You can adjust the number of threads as mentioned in this email thread:
    https://groups.google.com/a/cloudera.org/d/msg/impala-user/CXRQon_CPR0/1fU09CqfNvEJ
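The single-thread explanation can be sanity-checked with some rough arithmetic. The sketch below assumes the simplest possible model, in which queries are fully serialized behind one scanner thread; that is a simplification for illustration, not a description of how Impala actually schedules work, and `serialized_estimate` is a made-up helper name.

```python
def serialized_estimate(num_clients, queries_per_client, solo_latency_s):
    """Estimate total time and per-query latency if queries run one at a time."""
    total_queries = num_clients * queries_per_client
    # Every query occupies the single scanner for its full solo latency.
    total_time_s = total_queries * solo_latency_s
    # With num_clients requests always in flight, each request waits for
    # roughly num_clients query slots before it completes.
    per_query_latency_s = num_clients * solo_latency_s
    return total_time_s, per_query_latency_s


# Single-machine numbers from this thread: 10 clients, 10 requests each,
# about 600 ms for one request with no load.
total, latency = serialized_estimate(10, 10, 0.6)
print(f"total ~{total:.0f}s, per-query ~{latency:.0f}s")
```

Under this model the 100 requests take about 60 seconds in total, which lines up with the roughly 60 seconds reported for the single-machine load test, so the measurements are at least consistent with the queries running essentially one after another.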

    On Fri, Oct 25, 2013 at 11:26 AM, Keith wrote:

    Gotcha. Sounds like it's worth digging into then. Here are the two
    profiles (from a single machine, not the cluster I was using before). The
    first profile is taken during my load test, which takes about 60 seconds to
    complete with 10 clients sending 10 requests each. The second request is a
    single request with no load, which takes about 600ms.

    From my untrained eye, under load, it seems that all the time is spent
    waiting on data in the hdfs scan node. It also looks like impala is
    artificially restricting the io resources dedicated to the query by
    limiting the number of scan threads assigned to 1. From my brief reading
    of the comments in your disk io manager, this makes sense. I wonder,
    however, if the constraint is too restrictive, and if there's a way for me
    to tweak it through some setting. This is only running on a single disk,
    but it's such a small amount of data, everything should be in the disk
    cache, which explains why iostat is reporting such a low utilization number.
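For anyone wanting to reproduce the load test described above (10 clients sending 10 requests each), a harness along the following lines could be used. This is a sketch under stated assumptions: `run_query` is a hypothetical callable that submits one query and blocks until it finishes (for instance by shelling out to impala-shell), and it uses threads rather than the separate processes used in the original test.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(run_query, num_clients=10, queries_per_client=10):
    """Call run_query() from num_clients threads, queries_per_client times each.

    Returns the overall wall-clock time and a flat list of per-query latencies,
    both in seconds.
    """
    def client(client_id):
        latencies = []
        for _ in range(queries_per_client):
            start = time.perf_counter()
            run_query()  # hypothetical: issue one query and wait for the result
            latencies.append(time.perf_counter() - start)
        return latencies

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_clients) as pool:
        per_client = list(pool.map(client, range(num_clients)))
    wall_s = time.perf_counter() - start
    latencies = [t for client_latencies in per_client for t in client_latencies]
    return wall_s, latencies


# Stand-in for a real query so the sketch runs on its own.
wall_s, latencies = run_load_test(lambda: time.sleep(0.01), num_clients=3, queries_per_client=4)
print(f"{len(latencies)} queries in {wall_s:.2f}s, slowest {max(latencies):.3f}s")
```

Comparing `wall_s` against the number of queries times the solo latency shows directly whether adding clients buys any throughput.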

Discussion Overview
group: impala-user
categories: hadoop
posted: Oct 24, '13 at 10:30p
active: Oct 25, '13 at 8:23p
posts: 3
users: 2
website: cloudera.com
irc: #hadoop

2 users in discussion

Keith: 2 posts Greg Rahn: 1 post
