Hello,

I've been trying to grok the query profiles of some of the queries we're
running while benchmarking Impala, to better understand what's really
causing bottlenecks. There's a lot of good info in the query profiles, and
I'd like to understand what the individual "nodes" under each plan fragment
mean, viz. DataStreamSender, AGGREGATION_NODE, HDFS_SCAN_NODE, etc. For most
join queries we see that HDFS_SCAN_NODE (essentially disk reads) isn't the
bottleneck at all (when run after clearing the kernel cache); instead, most
time is spent between DataStreamSender, AggregationNode and ExchangeNode,
depending on the query's complexity and nature. Some more light on these
components would be valuable for us.

Cheers,
-Gautam.


  • Anty Rao at Jul 24, 2013 at 5:11 am
    What version are you using? In Impala 1.0.1, the hash-table-based
    aggregation and join are inefficient due to defects in the hash function
    they use. Go check out these JIRAs:
    IMPALA-219 <https://issues.cloudera.org/browse/IMPALA-219> and
    IMPALA-453 <https://issues.cloudera.org/browse/IMPALA-453>.
    I have a feeling that you're running into the same problem as me; you can
    search the previous thread on aggregation performance.
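
    A quick way to confirm the exact server version, assuming the version()
    built-in is available in your release, is to run it from impala-shell:

    -- hypothetical check: version() reports the impalad build string
    select version();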


    --
    Anty Rao
  • Gautam at Jul 24, 2013 at 6:59 am
    The query below was run both with the data in the kernel cache and
    without; please find attached query profiles for both. Caveat:
    DISABLE_CODEGEN is set to true due to IMPALA-477.

    Still, the difference between on-disk and in-memory query performance
    ought to be similar to this even with code generation.

    query ===========

    SET DISABLE_CODEGEN=true;

    insert overwrite table lineitem_tmp
    select
       l_partkey as t_partkey, 0.2 * avg(l_quantity) as t_avg_quantity
    from
       lineitem
    group by l_partkey;

    insert overwrite table q17_small_quantity_order_revenue
    select
       sum(l_extendedprice) / 7.0 as avg_yearly
    from
       (select l_quantity, l_extendedprice, t_avg_quantity from
        lineitem_tmp t join
          (select
             l_quantity, l_partkey, l_extendedprice
           from
             part p join lineitem l
             on
               p.p_partkey = l.l_partkey
               and p.p_brand = 'Brand#23'
               and p.p_container = 'MED BOX'
           ) l1 on l1.l_partkey = t.t_partkey
        ) a
    where l_quantity < t_avg_quantity;





    On Tue, Jul 23, 2013 at 11:40 PM, Greg Rahn wrote:

    IMPALA-219 was resolved in 0.7, so it must be something else. I'd also
    recommend keeping speculation to a minimum until some data (a query
    profile) is provided to analyze.

    WRT understanding profiles, it's easiest to learn by example, so attach a
    query execution profile (don't paste it into the email body), or upload it
    to gist.github.com, and we can take a look and offer some insight.
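
    A minimal sketch of how to capture such a profile, assuming the
    impala-shell PROFILE command is available in your release:

    -- in impala-shell: run the statement under test, then dump its profile
    SET DISABLE_CODEGEN=true;          -- only if reproducing the IMPALA-477 workaround
    select count(*) from lineitem;     -- placeholder statement; substitute the real query
    profile;                           -- prints the runtime profile of the last query

    The same profile is also typically available from the impalad debug web
    UI (by default on port 25000), under the queries page.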


    --
    "If you really want something in this life, you have to work for it. Now,
    quiet! They're about to announce the lottery numbers..."
  • Greg Rahn at Jul 24, 2013 at 3:26 pm
    One of the main problems with this query/plan is an inverted join order.
    For the innermost block you have:
    from part p join lineitem l on p.p_partkey = l.l_partkey and p.p_brand =
    'Brand#23' and p.p_container = 'MED BOX'

    The corresponding plan fragment is:

    3:HASH JOIN
       join op: INNER JOIN (BROADCAST)
       hash predicates: p.p_partkey = l.l_partkey
       tuple ids: 1 2
    ----6:EXCHANGE
           tuple ids: 2
    1:SCAN HDFS
       table=default.pq_part #partitions=1 size=891.57MB compact
       predicates: p.p_brand = 'Brand#23', p.p_container = 'MED BOX'
       tuple ids: 1

    and the execution details of that node are here:

    HASH_JOIN_NODE (id=3):(Active: 2m41s, % non-child: 75.81%)
        - BuildBuckets: 16.78M (16777216)
        - BuildRows: 600.04M (600037902)
        - BuildTime: 1m56s
        - LoadFactor: 0.60
        - MemoryUsed: 0.00
        - ProbeRows: 4.01K (4006)
        - ProbeTime: 92.515ms
        - RowsReturned: 120.20K (120196)
        - RowsReturnedRate: 744.00 /sec

    We see from this that the right side of the join, the build side
    (lineitem), returns 600M rows, and the left (probe) side (part) returns
    about 4,000 rows. We can tell this join is inverted because the BuildRows
    count is much bigger than the ProbeRows count; we want the most selective
    relation to be the build side of the hash join.

    It's also noteworthy that since the join is inverted, not only are we
    building the hash table from the "big" relation, but because it is a
    broadcast join we are also broadcasting that larger relation, which
    further contributes to the poor execution.
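
    A sketch of the rewrite this suggests, assuming Impala 1.x behavior where
    the right-hand table of a join becomes the build side (and, for a
    broadcast join, the relation shipped to every node): list lineitem first
    and the filtered part second, so the few thousand qualifying part rows
    are built and broadcast instead of ~600M lineitem rows.

    -- hypothetical reordering of the innermost block only:
    -- part (filtered to ~4K rows) is now the right-hand/build side
    select
       l_quantity, l_partkey, l_extendedprice
    from
       lineitem l join part p
       on
         p.p_partkey = l.l_partkey
         and p.p_brand = 'Brand#23'
         and p.p_container = 'MED BOX'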






  • Anty Rao at Jul 24, 2013 at 10:36 am
    Hi Greg,

    Though IMPALA-219 was indeed resolved in 0.7, it didn't fully resolve the
    problem; bad key-collision conditions still exist. This will be addressed
    in the next release, as the Cloudera guys said.

    Sorry about the typo in my last mail; there are two related JIRAs:
    IMPALA-219 and IMPALA-453.

    I have done a bit of performance evaluation of the hash-table-based
    aggregation and join and ran into this myself. The symptoms described by
    Gautam are very similar to mine, which is why I said he may have
    encountered the same problem.


    --
    Anty Rao
  • Greg Rahn at Jul 24, 2013 at 3:31 pm
    The 1.1 bits are now available for download and on GitHub, and the
    release email will be out shortly, so you can attempt your experiment
    with both fixes now.

  • Anty Rao at Jul 24, 2013 at 3:56 pm
    Thanks, Greg. Glad to hear this.


    --
    Anty Rao
