Hi Iago,
Can you run these queries using Hive and see if you get the expected
results?

It would help to debug this more if you could provide:

1) The query/queries that are affected
2) The "CREATE TABLE" statement used for the target table
3) The impalad log after running these queries

Thanks,
Lenni
Software Engineer - Cloudera
On Sun, Jul 28, 2013 at 4:54 AM, Iago Tomas wrote:

I'm trying Impala, and after loading a dataset using the default storage
format (I also tried Parquet), I cannot query it with anything other than a
plain SELECT, which returns the whole dataset. Whenever the query has a
clause such as WHERE or GROUP BY, it returns an empty set, even for obvious
queries that should certainly return something.
Any clue?

My Impala version:
Impala Shell v1.1 (5e15fca) built on Sun Jul 21 15:51:04 PDT 2013


  • Iago Tomas at Jul 29, 2013 at 12:14 pm
    Could it be something related to the data?

    1) An affected query, followed by one that does return a row:

    [impala-shell] > select * from wmdata4 where act>0 limit 10;
    Query: select * from wmdata4 where act>0 limit 10
    Query finished, fetching results ...

    Returned 0 row(s) in 4.01s
    [impala-shell] > select * from wmdata4 limit 1;
    Query: select * from wmdata4 limit 1
    Query finished, fetching results ...
    Prettytable cannot resolve string columns values that have embedded tabs.
    Reverting to tab delimited text output
    2013-05-03 09:23:58 d63086930bb9725275122e71188ded47e45b20c1 1
    20130503092316 192.168.2.109 2013-05-03 09:25:00 jmartinez;CN=Jaime
    Martínez,OU=Marketing,DC=silonbcn,DC=com
    SILONLPT05;SILONLPT05.silonbcn.com;52;1 1 20130503092316 20130503 09:18:28
    250 100 NULL NULL
    Returned 1 row(s) in 0.76s


    2) I couldn't get the CREATE TABLE statement; I hope the DESCRIBE output
    is enough. I think it was a simple CREATE TABLE statement, after which I
    imported the data using INSERT ... SELECT from a Hive table.

    [impala-shell] > describe wmdata4;
    Query: describe wmdata4
    Query finished, fetching results ...
    +--------------+--------+---------+
    | name         | type   | comment |
    +--------------+--------+---------+
    | inc_date     | string |         |
    | inc_uniqueid | string |         |
    | account      | string |         |
    | postid       | string |         |
    | ip           | string |         |
    | inc_parsed   | string |         |
    | usr          | string |         |
    | device       | string |         |
    | account2     | string |         |
    | postid2      | string |         |
    | ts           | string |         |
    | duration     | int    |         |
    | act          | int    |         |
    +--------------+--------+---------+
    Returned 13 row(s) in 0.12s


    3) impalad log

    I0729 12:02:34.834764 6767 impala-beeswax-server.cc:133] query():
    query=select * from wmdata4 where act>0 limit 10
    I0729 12:02:34.834934 6767 impala-beeswax-server.cc:447] query: Query {
       01: query (string) = "select * from wmdata4 where act>0 limit 10",
       03: configuration (list) = list<string>[0] {
       },
       04: hadoop_user (string) = "ubuntu",
    }
    I0729 12:02:34.835175 6767 impala-beeswax-server.cc:460]
    TClientRequest.queryOptions: TQueryOptions {
       01: abort_on_error (bool) = false,
       02: max_errors (i32) = 0,
       03: disable_codegen (bool) = false,
       04: batch_size (i32) = 0,
       05: num_nodes (i32) = 0,
       06: max_scan_range_length (i64) = 0,
       07: num_scanner_threads (i32) = 0,
       08: max_io_buffers (i32) = 0,
       09: allow_unsupported_formats (bool) = false,
       10: default_order_by_limit (i64) = -1,
       11: debug_action (string) = "",
       12: mem_limit (i64) = 0,
       13: abort_on_default_limit_exceeded (bool) = false,
       14: parquet_compression_codec (i32) = 5,
       15: hbase_caching (i32) = 0,
       16: hbase_cache_blocks (bool) = false,
    }
    INFO0729 12:02:34.847000 Thread-10 com.cloudera.impala.service.Frontend]
    analyze query select * from wmdata4 where act>0 limit 10
    INFO0729 12:02:34.850000 Thread-10
    com.cloudera.impala.analysis.BinaryPredicate] act > 0 selectivity: 0.1
    INFO0729 12:02:34.850000 Thread-10 com.cloudera.impala.service.Frontend]
    create plan
    INFO0729 12:02:34.851000 Thread-10 com.cloudera.impala.planner.Planner]
    desctbl: tuples:
    TupleDescriptor{id=0, tbl=default.wmdata4, byte_size=0,
    is_materialized=true, slots=[SlotDescriptor{id=0, col=inc_date,
    type=STRING, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=1,
    col=inc_uniqueid, type=STRING, materialized=false, byteSize=0,
    byteOffset=-1, nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0},
    SlotDescriptor{id=2, col=account, type=STRING, materialized=false,
    byteSize=0, byteOffset=-1, nullIndicatorByte=0, nullIndicatorBit=0,
    slotIdx=0}, SlotDescriptor{id=3, col=postid, type=STRING,
    materialized=false, byteSize=0, byteOffset=-1, nullIndicatorByte=0,
    nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=4, col=ip, type=STRING,
    materialized=false, byteSize=0, byteOffset=-1, nullIndicatorByte=0,
    nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=5, col=inc_parsed,
    type=STRING, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=6,
    col=usr, type=STRING, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=7,
    col=device, type=STRING, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=8,
    col=account2, type=STRING, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=9,
    col=postid2, type=STRING, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=10,
    col=ts, type=STRING, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=11,
    col=duration, type=INT, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=12,
    col=act, type=INT, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}]}

    INFO0729 12:02:34.851000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    valuetransfer: #slots=13
    INFO0729 12:02:34.851000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=0 members=(0)
    INFO0729 12:02:34.851000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=1 members=(1)
    INFO0729 12:02:34.851000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=2 members=(2)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=3 members=(3)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=4 members=(4)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=5 members=(5)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=6 members=(6)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=7 members=(7)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=8 members=(8)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=9 members=(9)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=10 members=(10)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=11 members=(11)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=12 members=(12)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.planner.Planner]
    create single-node plan
    INFO0729 12:02:34.852000 Thread-10
    com.cloudera.impala.planner.HdfsScanNode] collecting partitions for table
    wmdata4
    INFO0729 12:02:34.852000 Thread-10
    com.cloudera.impala.planner.HdfsScanNode] finalize HdfsScan: cardinality=-1
    INFO0729 12:02:34.853000 Thread-10
    com.cloudera.impala.planner.HdfsScanNode] finalize HdfsScan: #nodes=2
    INFO0729 12:02:34.853000 Thread-10 com.cloudera.impala.planner.Planner]
    create plan fragments
    INFO0729 12:02:34.853000 Thread-10 com.cloudera.impala.planner.Planner]
    memlimit=0
    INFO0729 12:02:34.853000 Thread-10 com.cloudera.impala.planner.Planner]
    finalize plan fragments
    INFO0729 12:02:34.853000 Thread-10 com.cloudera.impala.service.Frontend]
    get scan range locations
    INFO0729 12:02:34.854000 Thread-10 com.cloudera.impala.service.Frontend]
    create result set metadata
    INFO0729 12:02:34.854000 Thread-10 com.cloudera.impala.service.JniFrontend]
    PLAN FRAGMENT 0
       PARTITION: UNPARTITIONED

       1:EXCHANGE
          limit: 10
          tuple ids: 0

    PLAN FRAGMENT 1
       PARTITION: RANDOM

       STREAM DATA SINK
         EXCHANGE ID: 1
         UNPARTITIONED

       0:SCAN HDFS
          table=default.wmdata4 #partitions=1 size=306.68MB
          predicates: act > 0
          limit: 10
          tuple ids: 0

    I0729 12:02:34.856362 6767 coordinator.cc:295] Exec()
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:34.856575 6767 plan-fragment-executor.cc:76] Prepare():
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    instance_id=e1449e468f546d85:78eb3c77dd198fbe
    I0729 12:02:34.865314 6767 plan-fragment-executor.cc:124] descriptor table
    for fragment=e1449e468f546d85:78eb3c77dd198fbe
    tuples:
    Tuple(id=0 size=192 slots=[Slot(id=0 type=STRING col=0 offset=16
    null=(offset=0 mask=4)), Slot(id=1 type=STRING col=1 offset=32
    null=(offset=0 mask=8)), Slot(id=2 type=STRING col=2 offset=48
    null=(offset=0 mask=10)), Slot(id=3 type=STRING col=3 offset=64
    null=(offset=0 mask=20)), Slot(id=4 type=STRING col=4 offset=80
    null=(offset=0 mask=40)), Slot(id=5 type=STRING col=5 offset=96
    null=(offset=0 mask=80)), Slot(id=6 type=STRING col=6 offset=112
    null=(offset=1 mask=1)), Slot(id=7 type=STRING col=7 offset=128
    null=(offset=1 mask=2)), Slot(id=8 type=STRING col=8 offset=144
    null=(offset=1 mask=4)), Slot(id=9 type=STRING col=9 offset=160
    null=(offset=1 mask=8)), Slot(id=10 type=STRING col=10 offset=176
    null=(offset=1 mask=10)), Slot(id=11 type=INT col=11 offset=4
    null=(offset=0 mask=1)), Slot(id=12 type=INT col=12 offset=8 null=(offset=0
    mask=2))])
    I0729 12:02:34.865497 6767 exchange-node.cc:50] Exch id=1
    input_desc=Tuple(id=0 size=192 slots=[Slot(id=0 type=STRING col=0 offset=16
    null=(offset=0 mask=4)), Slot(id=1 type=STRING col=1 offset=32
    null=(offset=0 mask=8)), Slot(id=2 type=STRING col=2 offset=48
    null=(offset=0 mask=10)), Slot(id=3 type=STRING col=3 offset=64
    null=(offset=0 mask=20)), Slot(id=4 type=STRING col=4 offset=80
    null=(offset=0 mask=40)), Slot(id=5 type=STRING col=5 offset=96
    null=(offset=0 mask=80)), Slot(id=6 type=STRING col=6 offset=112
    null=(offset=1 mask=1)), Slot(id=7 type=STRING col=7 offset=128
    null=(offset=1 mask=2)), Slot(id=8 type=STRING col=8 offset=144
    null=(offset=1 mask=4)), Slot(id=9 type=STRING col=9 offset=160
    null=(offset=1 mask=8)), Slot(id=10 type=STRING col=10 offset=176
    null=(offset=1 mask=10)), Slot(id=11 type=INT col=11 offset=4
    null=(offset=0 mask=1)), Slot(id=12 type=INT col=12 offset=8 null=(offset=0
    mask=2))])

    output_desc=Tuple(id=0 size=192 slots=[Slot(id=0 type=STRING col=0
    offset=16 null=(offset=0 mask=4)), Slot(id=1 type=STRING col=1 offset=32
    null=(offset=0 mask=8)), Slot(id=2 type=STRING col=2 offset=48
    null=(offset=0 mask=10)), Slot(id=3 type=STRING col=3 offset=64
    null=(offset=0 mask=20)), Slot(id=4 type=STRING col=4 offset=80
    null=(offset=0 mask=40)), Slot(id=5 type=STRING col=5 offset=96
    null=(offset=0 mask=80)), Slot(id=6 type=STRING col=6 offset=112
    null=(offset=1 mask=1)), Slot(id=7 type=STRING col=7 offset=128
    null=(offset=1 mask=2)), Slot(id=8 type=STRING col=8 offset=144
    null=(offset=1 mask=4)), Slot(id=9 type=STRING col=9 offset=160
    null=(offset=1 mask=8)), Slot(id=10 type=STRING col=10 offset=176
    null=(offset=1 mask=10)), Slot(id=11 type=INT col=11 offset=4
    null=(offset=0 mask=1)), Slot(id=12 type=INT col=12 offset=8 null=(offset=0
    mask=2))])
    I0729 12:02:34.964975 6767 coordinator.cc:398] starting 2 backends for
    query e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:34.965533 15878 impala-server.cc:1207] ExecPlanFragment()
    instance_id=e1449e468f546d85:78eb3c77dd198fc0
    coord= xxxxxxxx.compute.internal:22000 backend#=1
    I0729 12:02:34.965617 15878 plan-fragment-executor.cc:76] Prepare():
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    instance_id=e1449e468f546d85:78eb3c77dd198fc0
    I0729 12:02:34.974009 15878 plan-fragment-executor.cc:124] descriptor table
    for fragment=e1449e468f546d85:78eb3c77dd198fc0
    tuples:
    Tuple(id=0 size=192 slots=[Slot(id=0 type=STRING col=0 offset=16
    null=(offset=0 mask=4)), Slot(id=1 type=STRING col=1 offset=32
    null=(offset=0 mask=8)), Slot(id=2 type=STRING col=2 offset=48
    null=(offset=0 mask=10)), Slot(id=3 type=STRING col=3 offset=64
    null=(offset=0 mask=20)), Slot(id=4 type=STRING col=4 offset=80
    null=(offset=0 mask=40)), Slot(id=5 type=STRING col=5 offset=96
    null=(offset=0 mask=80)), Slot(id=6 type=STRING col=6 offset=112
    null=(offset=1 mask=1)), Slot(id=7 type=STRING col=7 offset=128
    null=(offset=1 mask=2)), Slot(id=8 type=STRING col=8 offset=144
    null=(offset=1 mask=4)), Slot(id=9 type=STRING col=9 offset=160
    null=(offset=1 mask=8)), Slot(id=10 type=STRING col=10 offset=176
    null=(offset=1 mask=10)), Slot(id=11 type=INT col=11 offset=4
    null=(offset=0 mask=1)), Slot(id=12 type=INT col=12 offset=8 null=(offset=0
    mask=2))])
    I0729 12:02:35.136384 1449 plan-fragment-executor.cc:221] Open():
    instance_id=e1449e468f546d85:78eb3c77dd198fc0
    I0729 12:02:35.373730 15878 coordinator.cc:1044] Backend 1 completed, 1
    remaining: query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:35.374150 15878 coordinator.cc:1053]
    query_id=e1449e468f546d85:78eb3c77dd198fbd: first in-progress backend:
    xxxxxxxx.compute.internal:22000
    I0729 12:02:35.373913 1456 plan-fragment-executor.cc:221] Open():
    instance_id=e1449e468f546d85:78eb3c77dd198fbe
    I0729 12:02:35.475869 6767 impala-beeswax-server.cc:266]
    get_results_metadata(): query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:36.206039 6767 plan-fragment-executor.cc:376] Finished
    executing fragment query_id=e1449e468f546d85:78eb3c77dd198fbd
    instance_id=e1449e468f546d85:78eb3c77dd198fbe
    I0729 12:02:36.206151 6767 coordinator.cc:592] Coordinator waiting for
    backends to finish, 1 remaining
    I0729 12:02:36.207219 15895 progress-updater.cc:56] Query
    e1449e468f546d85:78eb3c77dd198fbd: 50% Complete (9 out of 18)
    I0729 12:02:36.207298 15895 coordinator.cc:1044] Backend 0 completed, 0
    remaining: query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:36.207487 6767 coordinator.cc:597] All backends finished or
    error.
    I0729 12:02:36.208255 6767 coordinator.cc:1209] Final profile for
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    Execution Profile e1449e468f546d85:78eb3c77dd198fbd:(Active: 518.529ms, %
    non-child: 0.00%)
       Per Node Peak Memory Usage: xxxxxxxxxxxx.compute.internal:22000(57.32
    MB) xxxxxxxxxxxx.compute.internal:22000(57.13 MB)
        - FinalizationTimer: 0ns
       Coordinator Fragment:(Active: 728.842ms, % non-child: 0.00%)
          - AverageThreadTokens: 1.00
          - PeakMemoryUsage: 57.13 MB
          - RowsProduced: 0
         CodeGen:(Active: 107.144ms, % non-child: 14.70%)
            - CodegenTime: 587.524us
            - CompileTime: 98.674ms
            - LoadTime: 8.463ms
            - ModuleFileSize: 83.11 KB
         EXCHANGE_NODE (id=1):(Active: 728.835ms, % non-child: 100.00%)
            - BytesReceived: 0.00
            - ConvertRowBatchTime: 1.396us
            - DataArrivalWaitTime: 728.827ms
            - DeserializeRowBatchTimer: 0ns
            - FirstBatchArrivalWaitTime: 0ns
            - MemoryUsed: 0.00
            - RowsReturned: 0
            - RowsReturnedRate: 0
            - SendersBlockedTimer: 0ns
            - SendersBlockedTotalTimer(*): 0ns
       Averaged Fragment 1:(Active: 533.699ms, % non-child: 0.00%)
         split sizes: min: 150.45 MB, max: 156.23 MB, avg: 153.34 MB, stddev:
    2.89 MB
         completion times: min:237.302ms max:833.680ms mean: 535.491ms
      stddev:298.189ms
         execution rates: min:180.46 MB/sec max:658.34 MB/sec mean:419.40
    MB/sec stddev:238.94 MB/sec
         num instances: 2
          - AverageThreadTokens: 2.67
          - PeakMemoryUsage: 57.23 MB
          - RowsProduced: 0
         CodeGen:(Active: 282.15ms, % non-child: 48.03%)
            - CodegenTime: 7.184ms
            - CompileTime: 273.67ms
            - LoadTime: 8.941ms
            - ModuleFileSize: 83.11 KB
         DataStreamSender (dst_id=1):(Active: 12.633us, % non-child: 0.00%)
            - BytesSent: 0.00
            - NetworkThroughput(*): 0.00 /sec
            - OverallThroughput: 0.00 /sec
            - SerializeBatchTime: 0ns
            - ThriftTransmitTime(*): 0ns
            - UncompressedRowBatchSize: 0.00
         HDFS_SCAN_NODE (id=0):(Active: 533.502ms, % non-child: 99.98%)
            - AverageHdfsReadThreadConcurrency: 0.25
            - AverageScannerThreadConcurrency: 2.00
            - BytesRead: 153.34 MB
            - BytesReadLocal: 153.34 MB
            - BytesReadShortCircuit: 153.34 MB
            - MemoryUsed: 4.05 KB
            - NumDisksAccessed: 1
            - NumScannerThreadsStarted: 2
            - PerReadThreadRawHdfsThroughput: 904.37 MB/sec
            - RowsRead: 794.58K (794577)
            - RowsReturned: 0
            - RowsReturnedRate: 0
            - ScanRangesComplete: 9
            - ScannerThreadsInvoluntaryContextSwitches: 149
            - ScannerThreadsTotalWallClockTime: 1s039ms
              - DelimiterParseTime: 440.182ms
              - MaterializeTupleTime(*): 17.861ms
              - ScannerThreadsSysTime: 4.0ms
              - ScannerThreadsUserTime: 298.18ms
            - ScannerThreadsVoluntaryContextSwitches: 53
            - TotalRawHdfsReadTime(*): 205.304ms
            - TotalReadThroughput: 134.23 MB/sec
       Fragment 1:
         Instance e1449e468f546d85:78eb3c77dd198fbf
    (host= xxxxxxxx.compute.internal:22000):(Active: 831.941ms, % non-child:
    0.00%)
           Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:9/150.45
    MB
            - AverageThreadTokens: 2.33
            - PeakMemoryUsage: 57.32 MB
            - RowsProduced: 0
           CodeGen:(Active: 399.581ms, % non-child: 48.03%)
              - CodegenTime: 7.433ms
              - CompileTime: 389.808ms
              - LoadTime: 9.764ms
              - ModuleFileSize: 83.11 KB
           DataStreamSender (dst_id=1):(Active: 14.239us, % non-child: 0.00%)
              - BytesSent: 0.00
              - NetworkThroughput(*): 0.00 /sec
              - OverallThroughput: 0.00 /sec
              - SerializeBatchTime: 0ns
              - ThriftTransmitTime(*): 0ns
              - UncompressedRowBatchSize: 0.00
           HDFS_SCAN_NODE (id=0):(Active: 831.806ms, % non-child: 99.98%)
             Hdfs split stats (<volume id>:<# splits>/<split lengths>):
    0:9/150.45 MB
             Hdfs Read Thread Concurrency Bucket: 0:50% 1:50% 2:0%
             File Formats: TEXT/NONE:9
             ExecOption: Codegen enabled: 9 out of 9
              - AverageHdfsReadThreadConcurrency: 0.50
              - AverageScannerThreadConcurrency: 2.00
              - BytesRead: 150.45 MB
              - BytesReadLocal: 150.45 MB
              - BytesReadShortCircuit: 150.45 MB
              - MemoryUsed: 4.38 KB
              - NumDisksAccessed: 1
              - NumScannerThreadsStarted: 2
              - PerReadThreadRawHdfsThroughput: 519.87 MB/sec
              - RowsRead: 760.99K (760992)
              - RowsReturned: 0
              - RowsReturnedRate: 0
              - ScanRangesComplete: 9
              - ScannerThreadsInvoluntaryContextSwitches: 76
              - ScannerThreadsTotalWallClockTime: 1s629ms
                - DelimiterParseTime: 694.611ms
                - MaterializeTupleTime(*): 19.272ms
                - ScannerThreadsSysTime: 8.0ms
                - ScannerThreadsUserTime: 416.25ms
              - ScannerThreadsVoluntaryContextSwitches: 70
              - TotalRawHdfsReadTime(*): 289.398ms
              - TotalReadThroughput: 97.96 MB/sec
         Instance e1449e468f546d85:78eb3c77dd198fc0
    (host=xxxxxxxxxxxx.compute.internal:22000):(Active: 235.457ms, % non-child:
    0.00%)
           Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:9/156.23
    MB
            - AverageThreadTokens: 3.00
            - PeakMemoryUsage: 57.13 MB
            - RowsProduced: 0
           CodeGen:(Active: 164.450ms, % non-child: 69.84%)
              - CodegenTime: 6.936ms
              - CompileTime: 156.326ms
              - LoadTime: 8.117ms
              - ModuleFileSize: 83.11 KB
           DataStreamSender (dst_id=1):(Active: 11.28us, % non-child: 0.00%)
              - BytesSent: 0.00
              - NetworkThroughput(*): 0.00 /sec
              - OverallThroughput: 0.00 /sec
              - SerializeBatchTime: 0ns
              - ThriftTransmitTime(*): 0ns
              - UncompressedRowBatchSize: 0.00
           HDFS_SCAN_NODE (id=0):(Active: 235.197ms, % non-child: 99.89%)
             Hdfs split stats (<volume id>:<# splits>/<split lengths>):
    0:9/156.23 MB
             Hdfs Read Thread Concurrency Bucket: 0:100% 1:0% 2:0%
             File Formats: TEXT/NONE:9
             ExecOption: Codegen enabled: 9 out of 9
              - AverageHdfsReadThreadConcurrency: 0.00
              - AverageScannerThreadConcurrency: 2.00
              - BytesRead: 156.23 MB
              - BytesReadLocal: 156.23 MB
              - BytesReadShortCircuit: 156.23 MB
              - MemoryUsed: 3.73 KB
              - NumDisksAccessed: 1
              - NumScannerThreadsStarted: 2
              - PerReadThreadRawHdfsThroughput: 1.26 GB/sec
              - RowsRead: 828.16K (828163)
              - RowsReturned: 0
              - RowsReturnedRate: 0
              - ScanRangesComplete: 9
              - ScannerThreadsInvoluntaryContextSwitches: 223
              - ScannerThreadsTotalWallClockTime: 448.822ms
                - DelimiterParseTime: 185.753ms
                - MaterializeTupleTime(*): 16.450ms
                - ScannerThreadsSysTime: 0ns
                - ScannerThreadsUserTime: 180.11ms
              - ScannerThreadsVoluntaryContextSwitches: 36
              - TotalRawHdfsReadTime(*): 121.211ms
              - TotalReadThroughput: 170.50 MB/sec
    I0729 12:02:36.209754 6767 impala-beeswax-server.cc:301] close():
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:36.209816 6767 impala-server.cc:951] UnregisterQuery():
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:36.218952 6767 impala-server.cc:1033] Cancel():
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:36.222965 6767 data-stream-mgr.cc:274] DeregisterRecvr():
    fragment_instance_id=e1449e468f546d85:78eb3c77dd198fbe, node=1
    I0729 12:02:36.223062 6767 data-stream-mgr.cc:170] cancelled stream:
    fragment_instance_id_=e1449e468f546d85:78eb3c77dd198fbe node_id=1



  • Lenni Kuff at Jul 29, 2013 at 5:49 pm
    Hi,
    Your table specification needs to match how your data is formatted. When
    you issue a CREATE TABLE statement, you can add an optional ROW FORMAT
    clause. For example, if your data is separated by a pipe character | you
    would use something like:

    CREATE TABLE ... ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
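
Why a delimiter mismatch gives zero rows rather than an error can be sketched in a few lines of Python. This is only an illustration of the idea, not Impala code. It assumes the table was created without a ROW FORMAT clause, so the default Ctrl-A (\x01) field terminator applies, and that the data files are tab-separated, as the shell's "embedded tabs" message suggests. The four-column layout, the sample line, and the helper names parse_row and act_gt_zero are hypothetical.

```python
# Minimal sketch (not Impala code): how one text line is split into columns
# when the table's field delimiter does or does not match the data.

DEFAULT_DELIM = "\x01"  # Ctrl-A, the default field terminator for text tables
NUM_COLS = 4            # hypothetical table: (usr STRING, device STRING, duration INT, act INT)


def parse_row(line, delim, num_cols=NUM_COLS):
    """Split one line on `delim`; pad missing columns with None (i.e. NULL)."""
    fields = line.split(delim)[:num_cols]
    return fields + [None] * (num_cols - len(fields))


def act_gt_zero(row):
    """Mimic `WHERE act > 0`: a NULL value never satisfies the comparison."""
    act = row[3]
    return act is not None and int(act) > 0


# Hypothetical tab-separated sample line.
line = "alice\thost01\t250\t100"

mismatched = parse_row(line, DEFAULT_DELIM)  # table expects Ctrl-A, data uses tabs
matched = parse_row(line, "\t")              # table declared with the right delimiter

print(mismatched)  # the whole line lands in the first column, the rest are None
print(matched)     # four separate fields
print(act_gt_zero(mismatched), act_gt_zero(matched))
```

Under these assumptions the mismatched parse leaves act as NULL for every row, so a predicate such as act > 0 filters everything out, while a plain SELECT still shows the raw line crammed into the first column followed by NULLs.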


    You can also optionally specify an escape character if some of your data
    contains the field terminator.

    The full usage is:

    CREATE TABLE ... ROW FORMAT DELIMITED FIELDS TERMINATED BY 'char'
    [ESCAPED BY 'char']
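
The effect of an escape character can be illustrated with another small Python sketch. It is not how Impala is implemented. The function name split_escaped, the pipe delimiter, the backslash escape, and the sample row are all assumptions chosen for the example, and dropping the escape character from the resulting value is a simplification of this sketch.

```python
# Minimal sketch: split a line on a delimiter while honoring an escape character.

def split_escaped(line, delim="|", escape="\\"):
    """Split `line` on `delim`; `escape` followed by any char keeps that char literally."""
    fields, current, i = [], [], 0
    while i < len(line):
        ch = line[i]
        if ch == escape and i + 1 < len(line):
            current.append(line[i + 1])  # keep the escaped character verbatim
            i += 2
        elif ch == delim:
            fields.append("".join(current))  # delimiter closes the current field
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))  # last field has no trailing delimiter
    return fields


row = "a\\|b|42"  # raw text on disk: a\|b|42

print(split_escaped(row))  # escape-aware split: two fields
print(row.split("|"))      # naive split: three fields, columns shift right
```

Without escape handling, the pipe embedded in the first value is treated as a field boundary, so every later column shifts by one position and typed columns can end up NULL.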

    See:
    http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_langref_sql.html#create_table_unique_1

    Hopefully this helps!

    Thanks,
    Lenni

    query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:34.856575 6767 plan-fragment-executor.cc:76] Prepare():
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    instance_id=e1449e468f546d85:78eb3c77dd198fbe
    I0729 12:02:34.865314 6767 plan-fragment-executor.cc:124] descriptor
    table for fragment=e1449e468f546d85:78eb3c77dd198fbe
    tuples:
    Tuple(id=0 size=192 slots=[Slot(id=0 type=STRING col=0 offset=16
    null=(offset=0 mask=4)), Slot(id=1 type=STRING col=1 offset=32
    null=(offset=0 mask=8)), Slot(id=2 type=STRING col=2 offset=48
    null=(offset=0 mask=10)), Slot(id=3 type=STRING col=3 offset=64
    null=(offset=0 mask=20)), Slot(id=4 type=STRING col=4 offset=80
    null=(offset=0 mask=40)), Slot(id=5 type=STRING col=5 offset=96
    null=(offset=0 mask=80)), Slot(id=6 type=STRING col=6 offset=112
    null=(offset=1 mask=1)), Slot(id=7 type=STRING col=7 offset=128
    null=(offset=1 mask=2)), Slot(id=8 type=STRING col=8 offset=144
    null=(offset=1 mask=4)), Slot(id=9 type=STRING col=9 offset=160
    null=(offset=1 mask=8)), Slot(id=10 type=STRING col=10 offset=176
    null=(offset=1 mask=10)), Slot(id=11 type=INT col=11 offset=4
    null=(offset=0 mask=1)), Slot(id=12 type=INT col=12 offset=8 null=(offset=0
    mask=2))])
    I0729 12:02:34.865497 6767 exchange-node.cc:50] Exch id=1
    input_desc=Tuple(id=0 size=192 slots=[Slot(id=0 type=STRING col=0
    offset=16 null=(offset=0 mask=4)), Slot(id=1 type=STRING col=1 offset=32
    null=(offset=0 mask=8)), Slot(id=2 type=STRING col=2 offset=48
    null=(offset=0 mask=10)), Slot(id=3 type=STRING col=3 offset=64
    null=(offset=0 mask=20)), Slot(id=4 type=STRING col=4 offset=80
    null=(offset=0 mask=40)), Slot(id=5 type=STRING col=5 offset=96
    null=(offset=0 mask=80)), Slot(id=6 type=STRING col=6 offset=112
    null=(offset=1 mask=1)), Slot(id=7 type=STRING col=7 offset=128
    null=(offset=1 mask=2)), Slot(id=8 type=STRING col=8 offset=144
    null=(offset=1 mask=4)), Slot(id=9 type=STRING col=9 offset=160
    null=(offset=1 mask=8)), Slot(id=10 type=STRING col=10 offset=176
    null=(offset=1 mask=10)), Slot(id=11 type=INT col=11 offset=4
    null=(offset=0 mask=1)), Slot(id=12 type=INT col=12 offset=8 null=(offset=0
    mask=2))])

    output_desc=Tuple(id=0 size=192 slots=[Slot(id=0 type=STRING col=0
    offset=16 null=(offset=0 mask=4)), Slot(id=1 type=STRING col=1 offset=32
    null=(offset=0 mask=8)), Slot(id=2 type=STRING col=2 offset=48
    null=(offset=0 mask=10)), Slot(id=3 type=STRING col=3 offset=64
    null=(offset=0 mask=20)), Slot(id=4 type=STRING col=4 offset=80
    null=(offset=0 mask=40)), Slot(id=5 type=STRING col=5 offset=96
    null=(offset=0 mask=80)), Slot(id=6 type=STRING col=6 offset=112
    null=(offset=1 mask=1)), Slot(id=7 type=STRING col=7 offset=128
    null=(offset=1 mask=2)), Slot(id=8 type=STRING col=8 offset=144
    null=(offset=1 mask=4)), Slot(id=9 type=STRING col=9 offset=160
    null=(offset=1 mask=8)), Slot(id=10 type=STRING col=10 offset=176
    null=(offset=1 mask=10)), Slot(id=11 type=INT col=11 offset=4
    null=(offset=0 mask=1)), Slot(id=12 type=INT col=12 offset=8 null=(offset=0
    mask=2))])
    I0729 12:02:34.964975 6767 coordinator.cc:398] starting 2 backends for
    query e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:34.965533 15878 impala-server.cc:1207] ExecPlanFragment()
    instance_id=e1449e468f546d85:78eb3c77dd198fc0
    coord= xxxxxxxx.compute.internal:22000 backend#=1
    I0729 12:02:34.965617 15878 plan-fragment-executor.cc:76] Prepare():
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    instance_id=e1449e468f546d85:78eb3c77dd198fc0
    I0729 12:02:34.974009 15878 plan-fragment-executor.cc:124] descriptor
    table for fragment=e1449e468f546d85:78eb3c77dd198fc0
    tuples:
    Tuple(id=0 size=192 slots=[Slot(id=0 type=STRING col=0 offset=16
    null=(offset=0 mask=4)), Slot(id=1 type=STRING col=1 offset=32
    null=(offset=0 mask=8)), Slot(id=2 type=STRING col=2 offset=48
    null=(offset=0 mask=10)), Slot(id=3 type=STRING col=3 offset=64
    null=(offset=0 mask=20)), Slot(id=4 type=STRING col=4 offset=80
    null=(offset=0 mask=40)), Slot(id=5 type=STRING col=5 offset=96
    null=(offset=0 mask=80)), Slot(id=6 type=STRING col=6 offset=112
    null=(offset=1 mask=1)), Slot(id=7 type=STRING col=7 offset=128
    null=(offset=1 mask=2)), Slot(id=8 type=STRING col=8 offset=144
    null=(offset=1 mask=4)), Slot(id=9 type=STRING col=9 offset=160
    null=(offset=1 mask=8)), Slot(id=10 type=STRING col=10 offset=176
    null=(offset=1 mask=10)), Slot(id=11 type=INT col=11 offset=4
    null=(offset=0 mask=1)), Slot(id=12 type=INT col=12 offset=8 null=(offset=0
    mask=2))])
    I0729 12:02:35.136384 1449 plan-fragment-executor.cc:221] Open():
    instance_id=e1449e468f546d85:78eb3c77dd198fc0
    I0729 12:02:35.373730 15878 coordinator.cc:1044] Backend 1 completed, 1
    remaining: query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:35.374150 15878 coordinator.cc:1053]
    query_id=e1449e468f546d85:78eb3c77dd198fbd: first in-progress backend:
    xxxxxxxx.compute.internal:22000
    I0729 12:02:35.373913 1456 plan-fragment-executor.cc:221] Open():
    instance_id=e1449e468f546d85:78eb3c77dd198fbe
    I0729 12:02:35.475869 6767 impala-beeswax-server.cc:266]
    get_results_metadata(): query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:36.206039 6767 plan-fragment-executor.cc:376] Finished
    executing fragment query_id=e1449e468f546d85:78eb3c77dd198fbd
    instance_id=e1449e468f546d85:78eb3c77dd198fbe
    I0729 12:02:36.206151 6767 coordinator.cc:592] Coordinator waiting for
    backends to finish, 1 remaining
    I0729 12:02:36.207219 15895 progress-updater.cc:56] Query
    e1449e468f546d85:78eb3c77dd198fbd: 50% Complete (9 out of 18)
    I0729 12:02:36.207298 15895 coordinator.cc:1044] Backend 0 completed, 0
    remaining: query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:36.207487 6767 coordinator.cc:597] All backends finished or
    error.
    I0729 12:02:36.208255 6767 coordinator.cc:1209] Final profile for
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    Execution Profile e1449e468f546d85:78eb3c77dd198fbd:(Active: 518.529ms, %
    non-child: 0.00%)
    Per Node Peak Memory Usage: xxxxxxxxxxxx.compute.internal:22000(57.32
    MB) xxxxxxxxxxxx.compute.internal:22000(57.13 MB)
    - FinalizationTimer: 0ns
    Coordinator Fragment:(Active: 728.842ms, % non-child: 0.00%)
    - AverageThreadTokens: 1.00
    - PeakMemoryUsage: 57.13 MB
    - RowsProduced: 0
    CodeGen:(Active: 107.144ms, % non-child: 14.70%)
    - CodegenTime: 587.524us
    - CompileTime: 98.674ms
    - LoadTime: 8.463ms
    - ModuleFileSize: 83.11 KB
    EXCHANGE_NODE (id=1):(Active: 728.835ms, % non-child: 100.00%)
    - BytesReceived: 0.00
    - ConvertRowBatchTime: 1.396us
    - DataArrivalWaitTime: 728.827ms
    - DeserializeRowBatchTimer: 0ns
    - FirstBatchArrivalWaitTime: 0ns
    - MemoryUsed: 0.00
    - RowsReturned: 0
    - RowsReturnedRate: 0
    - SendersBlockedTimer: 0ns
    - SendersBlockedTotalTimer(*): 0ns
    Averaged Fragment 1:(Active: 533.699ms, % non-child: 0.00%)
    split sizes: min: 150.45 MB, max: 156.23 MB, avg: 153.34 MB, stddev:
    2.89 MB
    completion times: min:237.302ms max:833.680ms mean: 535.491ms
    stddev:298.189ms
    execution rates: min:180.46 MB/sec max:658.34 MB/sec mean:419.40
    MB/sec stddev:238.94 MB/sec
    num instances: 2
    - AverageThreadTokens: 2.67
    - PeakMemoryUsage: 57.23 MB
    - RowsProduced: 0
    CodeGen:(Active: 282.15ms, % non-child: 48.03%)
    - CodegenTime: 7.184ms
    - CompileTime: 273.67ms
    - LoadTime: 8.941ms
    - ModuleFileSize: 83.11 KB
    DataStreamSender (dst_id=1):(Active: 12.633us, % non-child: 0.00%)
    - BytesSent: 0.00
    - NetworkThroughput(*): 0.00 /sec
    - OverallThroughput: 0.00 /sec
    - SerializeBatchTime: 0ns
    - ThriftTransmitTime(*): 0ns
    - UncompressedRowBatchSize: 0.00
    HDFS_SCAN_NODE (id=0):(Active: 533.502ms, % non-child: 99.98%)
    - AverageHdfsReadThreadConcurrency: 0.25
    - AverageScannerThreadConcurrency: 2.00
    - BytesRead: 153.34 MB
    - BytesReadLocal: 153.34 MB
    - BytesReadShortCircuit: 153.34 MB
    - MemoryUsed: 4.05 KB
    - NumDisksAccessed: 1
    - NumScannerThreadsStarted: 2
    - PerReadThreadRawHdfsThroughput: 904.37 MB/sec
    - RowsRead: 794.58K (794577)
    - RowsReturned: 0
    - RowsReturnedRate: 0
    - ScanRangesComplete: 9
    - ScannerThreadsInvoluntaryContextSwitches: 149
    - ScannerThreadsTotalWallClockTime: 1s039ms
    - DelimiterParseTime: 440.182ms
    - MaterializeTupleTime(*): 17.861ms
    - ScannerThreadsSysTime: 4.0ms
    - ScannerThreadsUserTime: 298.18ms
    - ScannerThreadsVoluntaryContextSwitches: 53
    - TotalRawHdfsReadTime(*): 205.304ms
    - TotalReadThroughput: 134.23 MB/sec
    Fragment 1:
    Instance e1449e468f546d85:78eb3c77dd198fbf
    (host= xxxxxxxx.compute.internal:22000):(Active: 831.941ms, % non-child:
    0.00%)
    Hdfs split stats (<volume id>:<# splits>/<split lengths>):
    0:9/150.45 MB
    - AverageThreadTokens: 2.33
    - PeakMemoryUsage: 57.32 MB
    - RowsProduced: 0
    CodeGen:(Active: 399.581ms, % non-child: 48.03%)
    - CodegenTime: 7.433ms
    - CompileTime: 389.808ms
    - LoadTime: 9.764ms
    - ModuleFileSize: 83.11 KB
    DataStreamSender (dst_id=1):(Active: 14.239us, % non-child: 0.00%)
    - BytesSent: 0.00
    - NetworkThroughput(*): 0.00 /sec
    - OverallThroughput: 0.00 /sec
    - SerializeBatchTime: 0ns
    - ThriftTransmitTime(*): 0ns
    - UncompressedRowBatchSize: 0.00
    HDFS_SCAN_NODE (id=0):(Active: 831.806ms, % non-child: 99.98%)
    Hdfs split stats (<volume id>:<# splits>/<split lengths>):
    0:9/150.45 MB
    Hdfs Read Thread Concurrency Bucket: 0:50% 1:50% 2:0%
    File Formats: TEXT/NONE:9
    ExecOption: Codegen enabled: 9 out of 9
    - AverageHdfsReadThreadConcurrency: 0.50
    - AverageScannerThreadConcurrency: 2.00
    - BytesRead: 150.45 MB
    - BytesReadLocal: 150.45 MB
    - BytesReadShortCircuit: 150.45 MB
    - MemoryUsed: 4.38 KB
    - NumDisksAccessed: 1
    - NumScannerThreadsStarted: 2
    - PerReadThreadRawHdfsThroughput: 519.87 MB/sec
    - RowsRead: 760.99K (760992)
    - RowsReturned: 0
    - RowsReturnedRate: 0
    - ScanRangesComplete: 9
    - ScannerThreadsInvoluntaryContextSwitches: 76
    - ScannerThreadsTotalWallClockTime: 1s629ms
    - DelimiterParseTime: 694.611ms
    - MaterializeTupleTime(*): 19.272ms
    - ScannerThreadsSysTime: 8.0ms
    - ScannerThreadsUserTime: 416.25ms
    - ScannerThreadsVoluntaryContextSwitches: 70
    - TotalRawHdfsReadTime(*): 289.398ms
    - TotalReadThroughput: 97.96 MB/sec
    Instance e1449e468f546d85:78eb3c77dd198fc0
    (host=xxxxxxxxxxxx.compute.internal:22000):(Active: 235.457ms, % non-child:
    0.00%)
    Hdfs split stats (<volume id>:<# splits>/<split lengths>):
    0:9/156.23 MB
    - AverageThreadTokens: 3.00
    - PeakMemoryUsage: 57.13 MB
    - RowsProduced: 0
    CodeGen:(Active: 164.450ms, % non-child: 69.84%)
    - CodegenTime: 6.936ms
    - CompileTime: 156.326ms
    - LoadTime: 8.117ms
    - ModuleFileSize: 83.11 KB
    DataStreamSender (dst_id=1):(Active: 11.28us, % non-child: 0.00%)
    - BytesSent: 0.00
    - NetworkThroughput(*): 0.00 /sec
    - OverallThroughput: 0.00 /sec
    - SerializeBatchTime: 0ns
    - ThriftTransmitTime(*): 0ns
    - UncompressedRowBatchSize: 0.00
    HDFS_SCAN_NODE (id=0):(Active: 235.197ms, % non-child: 99.89%)
    Hdfs split stats (<volume id>:<# splits>/<split lengths>):
    0:9/156.23 MB
    Hdfs Read Thread Concurrency Bucket: 0:100% 1:0% 2:0%
    File Formats: TEXT/NONE:9
    ExecOption: Codegen enabled: 9 out of 9
    - AverageHdfsReadThreadConcurrency: 0.00
    - AverageScannerThreadConcurrency: 2.00
    - BytesRead: 156.23 MB
    - BytesReadLocal: 156.23 MB
    - BytesReadShortCircuit: 156.23 MB
    - MemoryUsed: 3.73 KB
    - NumDisksAccessed: 1
    - NumScannerThreadsStarted: 2
    - PerReadThreadRawHdfsThroughput: 1.26 GB/sec
    - RowsRead: 828.16K (828163)
    - RowsReturned: 0
    - RowsReturnedRate: 0
    - ScanRangesComplete: 9
    - ScannerThreadsInvoluntaryContextSwitches: 223
    - ScannerThreadsTotalWallClockTime: 448.822ms
    - DelimiterParseTime: 185.753ms
    - MaterializeTupleTime(*): 16.450ms
    - ScannerThreadsSysTime: 0ns
    - ScannerThreadsUserTime: 180.11ms
    - ScannerThreadsVoluntaryContextSwitches: 36
    - TotalRawHdfsReadTime(*): 121.211ms
    - TotalReadThroughput: 170.50 MB/sec
    I0729 12:02:36.209754 6767 impala-beeswax-server.cc:301] close():
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:36.209816 6767 impala-server.cc:951] UnregisterQuery():
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:36.218952 6767 impala-server.cc:1033] Cancel():
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:36.222965 6767 data-stream-mgr.cc:274] DeregisterRecvr():
    fragment_instance_id=e1449e468f546d85:78eb3c77dd198fbe, node=1
    I0729 12:02:36.223062 6767 data-stream-mgr.cc:170] cancelled stream:
    fragment_instance_id_=e1449e468f546d85:78eb3c77dd198fbe node_id=1



    On Sunday, July 28, 2013 at 23:05:01 UTC+2, lskuff wrote:
    Hi Iago,
    Can you run these queries using Hive and see if you get the expected
    results?

    It would help to debug this more if you could provide:

    1) The query/queries that are affected
    2) The "CREATE TABLE" statement used for the target table
    3) The impalad log after running these queries

    Thanks,
    Lenni
    Software Engineer - Cloudera

    On Sun, Jul 28, 2013 at 4:54 AM, Iago Tomas wrote:

    I'm trying impala and after loading a dataset using default storage (i
    also tried parquetfile) i cannot query the dataset other than raw select
    which outputs the current dataset, but whenever the query has a clause
    'WHERE','GROUP BY' .... this returns an empty set, even obvious queries
    which should be returning something for sure.
    Any clue?

    My impala version
    Impala Shell v1.1 (5e15fca) built on Sun Jul 21 15:51:04 PDT 2013
  • Iago Tomas at Jul 30, 2013 at 2:52 pm
    The table was initially empty, and it's not an external table. Since I
    loaded the data using INSERT ... SELECT, I would expect it to be formatted
    accordingly. Am I right?

    On Monday, July 29, 2013 at 19:49:56 UTC+2, lskuff wrote:
    Hi,
    Your table specification needs to match how your data is formatted. When
    you issue a CREATE TABLE statement, you can add an optional ROW FORMAT
    clause. For example, if your data is separated by a pipe character | you
    would use something like:

    CREATE TABLE ... ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'


    You can also optionally specify an escape character if some of your data
    contains the field terminator.

    The full usage is:

    CREATE TABLE ... ROW FORMAT DELIMITED FIELDS TERMINATED BY 'char' [ESCAPED BY 'char']

    See:

    http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_langref_sql.html#create_table_unique_1
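
    As a rough sketch of how those clauses fit together, assuming a
    hypothetical two-column table (the table and column names below are made
    up for illustration and do not come from this thread), a pipe-delimited
    text table with a backslash escape character could be declared like this:

    ```sql
    -- Hypothetical example table; names are placeholders, not from this thread.
    CREATE TABLE pipe_delimited_example (
      id   INT,
      name STRING
    )
    ROW FORMAT DELIMITED
      FIELDS TERMINATED BY '|'   -- must match the separator used in the data files
      ESCAPED BY '\\'            -- lets a field contain a literal '|' as '\|'
    STORED AS TEXTFILE;
    ```

    Whatever delimiter is declared here has to be the one actually present in
    the data files. If no ROW FORMAT clause is given, text tables use the
    Ctrl-A character (ASCII 0x01) as the field separator.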

    Hopefully this helps!

    Thanks,
    Lenni


    On Mon, Jul 29, 2013 at 5:14 AM, Iago Tomas <iago...@gmail.com> wrote:
    Could it be something related to the data?

    1) The queries affected and one returning one row

    [impala-shell] > select * from wmdata4 where act>0 limit 10;
    Query: select * from wmdata4 where act>0 limit 10
    Query finished, fetching results ...

    Returned 0 row(s) in 4.01s
    [impala-shell] > select * from wmdata4 limit 1;
    Query: select * from wmdata4 limit 1
    Query finished, fetching results ...
    Prettytable cannot resolve string columns values that have embedded
    tabs. Reverting to tab delimited text output
    2013-05-03 09:23:58 d63086930bb9725275122e71188ded47e45b20c1 1
    20130503092316 192.168.2.109 2013-05-03 09:25:00 jmartinez;CN=Jaime
    Martínez,OU=Marketing,DC=silonbcn,DC=com SILONLPT05;
    SILONLPT05.silonbcn.com;52;1 1 20130503092316 20130503 09:18:28 250 100
    NULL NULL
    Returned 1 row(s) in 0.76s


    2) I couldn't get the create table statement, hope the 'describe table'
    result it's enough, I think this was a simple create table statement, after
    that i imported the data using INSERT ... SELECT from a hive table.

    [impala-shell] > describe wmdata4;
    Query: describe wmdata4
    Query finished, fetching results ...
    +--------------+--------+---------+
    | name         | type   | comment |
    +--------------+--------+---------+
    | inc_date     | string |         |
    | inc_uniqueid | string |         |
    | account      | string |         |
    | postid       | string |         |
    | ip           | string |         |
    | inc_parsed   | string |         |
    | usr          | string |         |
    | device       | string |         |
    | account2     | string |         |
    | postid2      | string |         |
    | ts           | string |         |
    | duration     | int    |         |
    | act          | int    |         |
    +--------------+--------+---------+
    Returned 13 row(s) in 0.12s


    3) impalad log

    I0729 12:02:34.834764 6767 impala-beeswax-server.cc:133] query():
    query=select * from wmdata4 where act>0 limit 10
    I0729 12:02:34.834934 6767 impala-beeswax-server.cc:447] query: Query {
    01: query (string) = "select * from wmdata4 where act>0 limit 10",
    03: configuration (list) = list<string>[0] {
    },
    04: hadoop_user (string) = "ubuntu",
    }
    I0729 12:02:34.835175 6767 impala-beeswax-server.cc:460]
    TClientRequest.queryOptions: TQueryOptions {
    01: abort_on_error (bool) = false,
    02: max_errors (i32) = 0,
    03: disable_codegen (bool) = false,
    04: batch_size (i32) = 0,
    05: num_nodes (i32) = 0,
    06: max_scan_range_length (i64) = 0,
    07: num_scanner_threads (i32) = 0,
    08: max_io_buffers (i32) = 0,
    09: allow_unsupported_formats (bool) = false,
    10: default_order_by_limit (i64) = -1,
    11: debug_action (string) = "",
    12: mem_limit (i64) = 0,
    13: abort_on_default_limit_exceeded (bool) = false,
    14: parquet_compression_codec (i32) = 5,
    15: hbase_caching (i32) = 0,
    16: hbase_cache_blocks (bool) = false,
    }
    INFO0729 12:02:34.847000 Thread-10 com.cloudera.impala.service.Frontend]
    analyze query select * from wmdata4 where act>0 limit 10
    INFO0729 12:02:34.850000 Thread-10
    com.cloudera.impala.analysis.BinaryPredicate] act > 0 selectivity: 0.1
    INFO0729 12:02:34.850000 Thread-10 com.cloudera.impala.service.Frontend]
    create plan
    INFO0729 12:02:34.851000 Thread-10 com.cloudera.impala.planner.Planner]
    desctbl: tuples:
    TupleDescriptor{id=0, tbl=default.wmdata4, byte_size=0,
    is_materialized=true, slots=[SlotDescriptor{id=0, col=inc_date,
    type=STRING, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=1,
    col=inc_uniqueid, type=STRING, materialized=false, byteSize=0,
    byteOffset=-1, nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0},
    SlotDescriptor{id=2, col=account, type=STRING, materialized=false,
    byteSize=0, byteOffset=-1, nullIndicatorByte=0, nullIndicatorBit=0,
    slotIdx=0}, SlotDescriptor{id=3, col=postid, type=STRING,
    materialized=false, byteSize=0, byteOffset=-1, nullIndicatorByte=0,
    nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=4, col=ip, type=STRING,
    materialized=false, byteSize=0, byteOffset=-1, nullIndicatorByte=0,
    nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=5, col=inc_parsed,
    type=STRING, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=6,
    col=usr, type=STRING, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=7,
    col=device, type=STRING, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=8,
    col=account2, type=STRING, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=9,
    col=postid2, type=STRING, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=10,
    col=ts, type=STRING, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=11,
    col=duration, type=INT, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}, SlotDescriptor{id=12,
    col=act, type=INT, materialized=false, byteSize=0, byteOffset=-1,
    nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}]}

    INFO0729 12:02:34.851000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    valuetransfer: #slots=13
    INFO0729 12:02:34.851000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=0 members=(0)
    INFO0729 12:02:34.851000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=1 members=(1)
    INFO0729 12:02:34.851000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=2 members=(2)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=3 members=(3)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=4 members=(4)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=5 members=(5)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=6 members=(6)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=7 members=(7)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=8 members=(8)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=9 members=(9)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=10 members=(10)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=11 members=(11)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.analysis.Analyzer]
    equiv class: id=12 members=(12)
    INFO0729 12:02:34.852000 Thread-10 com.cloudera.impala.planner.Planner]
    create single-node plan
    INFO0729 12:02:34.852000 Thread-10
    com.cloudera.impala.planner.HdfsScanNode] collecting partitions for table
    wmdata4
    INFO0729 12:02:34.852000 Thread-10
    com.cloudera.impala.planner.HdfsScanNode] finalize HdfsScan: cardinality=-1
    INFO0729 12:02:34.853000 Thread-10
    com.cloudera.impala.planner.HdfsScanNode] finalize HdfsScan: #nodes=2
    INFO0729 12:02:34.853000 Thread-10 com.cloudera.impala.planner.Planner]
    create plan fragments
    INFO0729 12:02:34.853000 Thread-10 com.cloudera.impala.planner.Planner]
    memlimit=0
    INFO0729 12:02:34.853000 Thread-10 com.cloudera.impala.planner.Planner]
    finalize plan fragments
    INFO0729 12:02:34.853000 Thread-10 com.cloudera.impala.service.Frontend]
    get scan range locations
    INFO0729 12:02:34.854000 Thread-10 com.cloudera.impala.service.Frontend]
    create result set metadata
    INFO0729 12:02:34.854000 Thread-10
    com.cloudera.impala.service.JniFrontend] PLAN FRAGMENT 0
    PARTITION: UNPARTITIONED

    1:EXCHANGE
    limit: 10
    tuple ids: 0

    PLAN FRAGMENT 1
    PARTITION: RANDOM

    STREAM DATA SINK
    EXCHANGE ID: 1
    UNPARTITIONED

    0:SCAN HDFS
    table=default.wmdata4 #partitions=1 size=306.68MB
    predicates: act > 0
    limit: 10
    tuple ids: 0

    I0729 12:02:34.856362 6767 coordinator.cc:295] Exec()
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:34.856575 6767 plan-fragment-executor.cc:76] Prepare():
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    instance_id=e1449e468f546d85:78eb3c77dd198fbe
    I0729 12:02:34.865314 6767 plan-fragment-executor.cc:124] descriptor
    table for fragment=e1449e468f546d85:78eb3c77dd198fbe
    tuples:
    Tuple(id=0 size=192 slots=[Slot(id=0 type=STRING col=0 offset=16
    null=(offset=0 mask=4)), Slot(id=1 type=STRING col=1 offset=32
    null=(offset=0 mask=8)), Slot(id=2 type=STRING col=2 offset=48
    null=(offset=0 mask=10)), Slot(id=3 type=STRING col=3 offset=64
    null=(offset=0 mask=20)), Slot(id=4 type=STRING col=4 offset=80
    null=(offset=0 mask=40)), Slot(id=5 type=STRING col=5 offset=96
    null=(offset=0 mask=80)), Slot(id=6 type=STRING col=6 offset=112
    null=(offset=1 mask=1)), Slot(id=7 type=STRING col=7 offset=128
    null=(offset=1 mask=2)), Slot(id=8 type=STRING col=8 offset=144
    null=(offset=1 mask=4)), Slot(id=9 type=STRING col=9 offset=160
    null=(offset=1 mask=8)), Slot(id=10 type=STRING col=10 offset=176
    null=(offset=1 mask=10)), Slot(id=11 type=INT col=11 offset=4
    null=(offset=0 mask=1)), Slot(id=12 type=INT col=12 offset=8 null=(offset=0
    mask=2))])
    I0729 12:02:34.865497 6767 exchange-node.cc:50] Exch id=1
    input_desc=Tuple(id=0 size=192 slots=[Slot(id=0 type=STRING col=0
    offset=16 null=(offset=0 mask=4)), Slot(id=1 type=STRING col=1 offset=32
    null=(offset=0 mask=8)), Slot(id=2 type=STRING col=2 offset=48
    null=(offset=0 mask=10)), Slot(id=3 type=STRING col=3 offset=64
    null=(offset=0 mask=20)), Slot(id=4 type=STRING col=4 offset=80
    null=(offset=0 mask=40)), Slot(id=5 type=STRING col=5 offset=96
    null=(offset=0 mask=80)), Slot(id=6 type=STRING col=6 offset=112
    null=(offset=1 mask=1)), Slot(id=7 type=STRING col=7 offset=128
    null=(offset=1 mask=2)), Slot(id=8 type=STRING col=8 offset=144
    null=(offset=1 mask=4)), Slot(id=9 type=STRING col=9 offset=160
    null=(offset=1 mask=8)), Slot(id=10 type=STRING col=10 offset=176
    null=(offset=1 mask=10)), Slot(id=11 type=INT col=11 offset=4
    null=(offset=0 mask=1)), Slot(id=12 type=INT col=12 offset=8 null=(offset=0
    mask=2))])

    output_desc=Tuple(id=0 size=192 slots=[Slot(id=0 type=STRING col=0
    offset=16 null=(offset=0 mask=4)), Slot(id=1 type=STRING col=1 offset=32
    null=(offset=0 mask=8)), Slot(id=2 type=STRING col=2 offset=48
    null=(offset=0 mask=10)), Slot(id=3 type=STRING col=3 offset=64
    null=(offset=0 mask=20)), Slot(id=4 type=STRING col=4 offset=80
    null=(offset=0 mask=40)), Slot(id=5 type=STRING col=5 offset=96
    null=(offset=0 mask=80)), Slot(id=6 type=STRING col=6 offset=112
    null=(offset=1 mask=1)), Slot(id=7 type=STRING col=7 offset=128
    null=(offset=1 mask=2)), Slot(id=8 type=STRING col=8 offset=144
    null=(offset=1 mask=4)), Slot(id=9 type=STRING col=9 offset=160
    null=(offset=1 mask=8)), Slot(id=10 type=STRING col=10 offset=176
    null=(offset=1 mask=10)), Slot(id=11 type=INT col=11 offset=4
    null=(offset=0 mask=1)), Slot(id=12 type=INT col=12 offset=8 null=(offset=0
    mask=2))])
    I0729 12:02:34.964975 6767 coordinator.cc:398] starting 2 backends for
    query e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:34.965533 15878 impala-server.cc:1207] ExecPlanFragment()
    instance_id=e1449e468f546d85:78eb3c77dd198fc0
    coord= xxxxxxxx.compute.internal:22000 backend#=1
    I0729 12:02:34.965617 15878 plan-fragment-executor.cc:76] Prepare():
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    instance_id=e1449e468f546d85:78eb3c77dd198fc0
    I0729 12:02:34.974009 15878 plan-fragment-executor.cc:124] descriptor
    table for fragment=e1449e468f546d85:78eb3c77dd198fc0
    tuples:
    Tuple(id=0 size=192 slots=[Slot(id=0 type=STRING col=0 offset=16
    null=(offset=0 mask=4)), Slot(id=1 type=STRING col=1 offset=32
    null=(offset=0 mask=8)), Slot(id=2 type=STRING col=2 offset=48
    null=(offset=0 mask=10)), Slot(id=3 type=STRING col=3 offset=64
    null=(offset=0 mask=20)), Slot(id=4 type=STRING col=4 offset=80
    null=(offset=0 mask=40)), Slot(id=5 type=STRING col=5 offset=96
    null=(offset=0 mask=80)), Slot(id=6 type=STRING col=6 offset=112
    null=(offset=1 mask=1)), Slot(id=7 type=STRING col=7 offset=128
    null=(offset=1 mask=2)), Slot(id=8 type=STRING col=8 offset=144
    null=(offset=1 mask=4)), Slot(id=9 type=STRING col=9 offset=160
    null=(offset=1 mask=8)), Slot(id=10 type=STRING col=10 offset=176
    null=(offset=1 mask=10)), Slot(id=11 type=INT col=11 offset=4
    null=(offset=0 mask=1)), Slot(id=12 type=INT col=12 offset=8 null=(offset=0
    mask=2))])
    I0729 12:02:35.136384 1449 plan-fragment-executor.cc:221] Open():
    instance_id=e1449e468f546d85:78eb3c77dd198fc0
    I0729 12:02:35.373730 15878 coordinator.cc:1044] Backend 1 completed, 1
    remaining: query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:35.374150 15878 coordinator.cc:1053]
    query_id=e1449e468f546d85:78eb3c77dd198fbd: first in-progress backend:
    xxxxxxxx.compute.internal:22000
    I0729 12:02:35.373913 1456 plan-fragment-executor.cc:221] Open():
    instance_id=e1449e468f546d85:78eb3c77dd198fbe
    I0729 12:02:35.475869 6767 impala-beeswax-server.cc:266]
    get_results_metadata(): query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:36.206039 6767 plan-fragment-executor.cc:376] Finished
    executing fragment query_id=e1449e468f546d85:78eb3c77dd198fbd
    instance_id=e1449e468f546d85:78eb3c77dd198fbe
    I0729 12:02:36.206151 6767 coordinator.cc:592] Coordinator waiting for
    backends to finish, 1 remaining
    I0729 12:02:36.207219 15895 progress-updater.cc:56] Query
    e1449e468f546d85:78eb3c77dd198fbd: 50% Complete (9 out of 18)
    I0729 12:02:36.207298 15895 coordinator.cc:1044] Backend 0 completed, 0
    remaining: query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:36.207487 6767 coordinator.cc:597] All backends finished or
    error.
    I0729 12:02:36.208255 6767 coordinator.cc:1209] Final profile for
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    Execution Profile e1449e468f546d85:78eb3c77dd198fbd:(Active: 518.529ms, %
    non-child: 0.00%)
    Per Node Peak Memory Usage: xxxxxxxxxxxx.compute.internal:22000(57.32
    MB) xxxxxxxxxxxx.compute.internal:22000(57.13 MB)
    - FinalizationTimer: 0ns
    Coordinator Fragment:(Active: 728.842ms, % non-child: 0.00%)
    - AverageThreadTokens: 1.00
    - PeakMemoryUsage: 57.13 MB
    - RowsProduced: 0
    CodeGen:(Active: 107.144ms, % non-child: 14.70%)
    - CodegenTime: 587.524us
    - CompileTime: 98.674ms
    - LoadTime: 8.463ms
    - ModuleFileSize: 83.11 KB
    EXCHANGE_NODE (id=1):(Active: 728.835ms, % non-child: 100.00%)
    - BytesReceived: 0.00
    - ConvertRowBatchTime: 1.396us
    - DataArrivalWaitTime: 728.827ms
    - DeserializeRowBatchTimer: 0ns
    - FirstBatchArrivalWaitTime: 0ns
    - MemoryUsed: 0.00
    - RowsReturned: 0
    - RowsReturnedRate: 0
    - SendersBlockedTimer: 0ns
    - SendersBlockedTotalTimer(*): 0ns
    Averaged Fragment 1:(Active: 533.699ms, % non-child: 0.00%)
    split sizes: min: 150.45 MB, max: 156.23 MB, avg: 153.34 MB, stddev:
    2.89 MB
    completion times: min:237.302ms max:833.680ms mean: 535.491ms
    stddev:298.189ms
    execution rates: min:180.46 MB/sec max:658.34 MB/sec mean:419.40
    MB/sec stddev:238.94 MB/sec
    num instances: 2
    - AverageThreadTokens: 2.67
    - PeakMemoryUsage: 57.23 MB
    - RowsProduced: 0
    CodeGen:(Active: 282.15ms, % non-child: 48.03%)
    - CodegenTime: 7.184ms
    - CompileTime: 273.67ms
    - LoadTime: 8.941ms
    - ModuleFileSize: 83.11 KB
    DataStreamSender (dst_id=1):(Active: 12.633us, % non-child: 0.00%)
    - BytesSent: 0.00
    - NetworkThroughput(*): 0.00 /sec
    - OverallThroughput: 0.00 /sec
    - SerializeBatchTime: 0ns
    - ThriftTransmitTime(*): 0ns
    - UncompressedRowBatchSize: 0.00
    HDFS_SCAN_NODE (id=0):(Active: 533.502ms, % non-child: 99.98%)
    - AverageHdfsReadThreadConcurrency: 0.25
    - AverageScannerThreadConcurrency: 2.00
    - BytesRead: 153.34 MB
    - BytesReadLocal: 153.34 MB
    - BytesReadShortCircuit: 153.34 MB
    - MemoryUsed: 4.05 KB
    - NumDisksAccessed: 1
    - NumScannerThreadsStarted: 2
    - PerReadThreadRawHdfsThroughput: 904.37 MB/sec
    - RowsRead: 794.58K (794577)
    - RowsReturned: 0
    - RowsReturnedRate: 0
    - ScanRangesComplete: 9
    - ScannerThreadsInvoluntaryContextSwitches: 149
    - ScannerThreadsTotalWallClockTime: 1s039ms
    - DelimiterParseTime: 440.182ms
    - MaterializeTupleTime(*): 17.861ms
    - ScannerThreadsSysTime: 4.0ms
    - ScannerThreadsUserTime: 298.18ms
    - ScannerThreadsVoluntaryContextSwitches: 53
    - TotalRawHdfsReadTime(*): 205.304ms
    - TotalReadThroughput: 134.23 MB/sec
    Fragment 1:
    Instance e1449e468f546d85:78eb3c77dd198fbf
    (host= xxxxxxxx.compute.internal:22000):(Active: 831.941ms, % non-child:
    0.00%)
    Hdfs split stats (<volume id>:<# splits>/<split lengths>):
    0:9/150.45 MB
    - AverageThreadTokens: 2.33
    - PeakMemoryUsage: 57.32 MB
    - RowsProduced: 0
    CodeGen:(Active: 399.581ms, % non-child: 48.03%)
    - CodegenTime: 7.433ms
    - CompileTime: 389.808ms
    - LoadTime: 9.764ms
    - ModuleFileSize: 83.11 KB
    DataStreamSender (dst_id=1):(Active: 14.239us, % non-child: 0.00%)
    - BytesSent: 0.00
    - NetworkThroughput(*): 0.00 /sec
    - OverallThroughput: 0.00 /sec
    - SerializeBatchTime: 0ns
    - ThriftTransmitTime(*): 0ns
    - UncompressedRowBatchSize: 0.00
    HDFS_SCAN_NODE (id=0):(Active: 831.806ms, % non-child: 99.98%)
    Hdfs split stats (<volume id>:<# splits>/<split lengths>):
    0:9/150.45 MB
    Hdfs Read Thread Concurrency Bucket: 0:50% 1:50% 2:0%
    File Formats: TEXT/NONE:9
    ExecOption: Codegen enabled: 9 out of 9
    - AverageHdfsReadThreadConcurrency: 0.50
    - AverageScannerThreadConcurrency: 2.00
    - BytesRead: 150.45 MB
    - BytesReadLocal: 150.45 MB
    - BytesReadShortCircuit: 150.45 MB
    - MemoryUsed: 4.38 KB
    - NumDisksAccessed: 1
    - NumScannerThreadsStarted: 2
    - PerReadThreadRawHdfsThroughput: 519.87 MB/sec
    - RowsRead: 760.99K (760992)
    - RowsReturned: 0
    - RowsReturnedRate: 0
    - ScanRangesComplete: 9
    - ScannerThreadsInvoluntaryContextSwitches: 76
    - ScannerThreadsTotalWallClockTime: 1s629ms
    - DelimiterParseTime: 694.611ms
    - MaterializeTupleTime(*): 19.272ms
    - ScannerThreadsSysTime: 8.0ms
    - ScannerThreadsUserTime: 416.25ms
    - ScannerThreadsVoluntaryContextSwitches: 70
    - TotalRawHdfsReadTime(*): 289.398ms
    - TotalReadThroughput: 97.96 MB/sec
    Instance e1449e468f546d85:78eb3c77dd198fc0
    (host=xxxxxxxxxxxx.compute.internal:22000):(Active: 235.457ms, % non-child:
    0.00%)
    Hdfs split stats (<volume id>:<# splits>/<split lengths>):
    0:9/156.23 MB
    - AverageThreadTokens: 3.00
    - PeakMemoryUsage: 57.13 MB
    - RowsProduced: 0
    CodeGen:(Active: 164.450ms, % non-child: 69.84%)
    - CodegenTime: 6.936ms
    - CompileTime: 156.326ms
    - LoadTime: 8.117ms
    - ModuleFileSize: 83.11 KB
    DataStreamSender (dst_id=1):(Active: 11.28us, % non-child: 0.00%)
    - BytesSent: 0.00
    - NetworkThroughput(*): 0.00 /sec
    - OverallThroughput: 0.00 /sec
    - SerializeBatchTime: 0ns
    - ThriftTransmitTime(*): 0ns
    - UncompressedRowBatchSize: 0.00
    HDFS_SCAN_NODE (id=0):(Active: 235.197ms, % non-child: 99.89%)
    Hdfs split stats (<volume id>:<# splits>/<split lengths>):
    0:9/156.23 MB
    Hdfs Read Thread Concurrency Bucket: 0:100% 1:0% 2:0%
    File Formats: TEXT/NONE:9
    ExecOption: Codegen enabled: 9 out of 9
    - AverageHdfsReadThreadConcurrency: 0.00
    - AverageScannerThreadConcurrency: 2.00
    - BytesRead: 156.23 MB
    - BytesReadLocal: 156.23 MB
    - BytesReadShortCircuit: 156.23 MB
    - MemoryUsed: 3.73 KB
    - NumDisksAccessed: 1
    - NumScannerThreadsStarted: 2
    - PerReadThreadRawHdfsThroughput: 1.26 GB/sec
    - RowsRead: 828.16K (828163)
    - RowsReturned: 0
    - RowsReturnedRate: 0
    - ScanRangesComplete: 9
    - ScannerThreadsInvoluntaryContextSwitches: 223
    - ScannerThreadsTotalWallClockTime: 448.822ms
    - DelimiterParseTime: 185.753ms
    - MaterializeTupleTime(*): 16.450ms
    - ScannerThreadsSysTime: 0ns
    - ScannerThreadsUserTime: 180.11ms
    - ScannerThreadsVoluntaryContextSwitches: 36
    - TotalRawHdfsReadTime(*): 121.211ms
    - TotalReadThroughput: 170.50 MB/sec
    I0729 12:02:36.209754 6767 impala-beeswax-server.cc:301] close():
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:36.209816 6767 impala-server.cc:951] UnregisterQuery():
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:36.218952 6767 impala-server.cc:1033] Cancel():
    query_id=e1449e468f546d85:78eb3c77dd198fbd
    I0729 12:02:36.222965 6767 data-stream-mgr.cc:274] DeregisterRecvr():
    fragment_instance_id=e1449e468f546d85:78eb3c77dd198fbe, node=1
    I0729 12:02:36.223062 6767 data-stream-mgr.cc:170] cancelled stream:
    fragment_instance_id_=e1449e468f546d85:78eb3c77dd198fbe node_id=1
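
A quick way to read the profile above is to compare the scan node's RowsRead and RowsReturned counters. The sketch below is only an illustration and is not part of Impala or the original post: the `sum_counter` helper and the `PROFILE_EXCERPT` string are hypothetical, and the excerpt is copied by hand from the two per-instance HDFS_SCAN_NODE sections so that the averaged fragment is not counted twice.

```python
import re

# Counters copied from the two per-instance HDFS_SCAN_NODE sections of the
# profile above (hand-picked excerpt, not the full log).
PROFILE_EXCERPT = """\
- RowsRead: 760.99K (760992)
- RowsReturned: 0
- RowsRead: 828.16K (828163)
- RowsReturned: 0
"""


def sum_counter(profile_text, name):
    """Sum every '- <name>: <value>' counter line in profile_text.

    Uses the exact count in parentheses when present, for example
    '760.99K (760992)', and otherwise expects a plain integer such as '0'.
    """
    total = 0
    pattern = re.compile(
        r"^\s*-\s*" + re.escape(name) + r":\s*(.+)$", re.MULTILINE
    )
    for match in pattern.finditer(profile_text):
        value = match.group(1).strip()
        exact = re.search(r"\((\d+)\)", value)
        total += int(exact.group(1)) if exact else int(value)
    return total


rows_read = sum_counter(PROFILE_EXCERPT, "RowsRead")
rows_returned = sum_counter(PROFILE_EXCERPT, "RowsReturned")
print(f"rows read: {rows_read}, rows returned: {rows_returned}")
if rows_read > 0 and rows_returned == 0:
    print("the scan parsed rows, but none were passed on by the scan node")
```

Both scan instances read rows from the text files yet returned none, so the data is reachable and parsed. That would be consistent with the WHERE predicate rejecting every row, although the log alone does not say why.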



    On Sunday, July 28, 2013 at 23:05:01 UTC+2, lskuff wrote:
    Hi Iago,
    Can you run these queries using Hive and see if you get the expected
    results?

    It would help to debug this more if you could provide:

    1) The query/queries that are affected
    2) The "CREATE TABLE" statement used for the target table
    3) The impalad log after running these queries

    Thanks,
    Lenni
    Software Engineer - Cloudera

    On Sun, Jul 28, 2013 at 4:54 AM, Iago Tomas wrote:

    I'm trying impala and after loading a dataset using default storage (i
    also tried parquetfile) i cannot query the dataset other than raw select
    which outputs the current dataset, but whenever the query has a clause
    'WHERE','GROUP BY' .... this returns an empty set, even obvious queries
    which should be returning something for sure.
    Any clue?

    My impala version
    Impala Shell v1.1 (5e15fca) built on Sun Jul 21 15:51:04 PDT 2013

Discussion Overview
group: impala-user
categories: hadoop
posted: Jul 28, '13 at 9:05p
active: Jul 30, '13 at 2:52p
posts: 4
users: 2
website: cloudera.com
irc: #hadoop

2 users in discussion

Lenni Kuff: 2 posts Iago Tomas: 2 posts
