FAQ
Due to the syntax of left joins, you can't change the order:

alias = JOIN left-alias BY left-alias-column [LEFT|RIGHT|FULL] [OUTER],
right-alias BY right-alias-column [USING 'replicated' | 'skewed' | 'merge']
[PARTITION BY partitioner] [PARALLEL n];

…. so I assume the regular join optimizations don't apply here? I imagine
this being defined as a 'regular' join …


Optimization for regular joins ensures that the last table in the join is
not brought into memory but streamed through instead. Optimization reduces
the amount of memory used which means you can avoid spilling the data and
also should be able to scale your query to larger data volumes.
To take advantage of this optimization, make sure that the table with the
largest number of tuples per key is the last table in your query. In some of
our tests we saw 10x performance improvement as the result of this
optimization.


--

Founder/CEO Spinn3r.com

Location: *San Francisco, CA*
Skype: *burtonator*

Skype-in: *(415) 871-0687*

Search Discussions

  • Dmitriy Ryaboy at Sep 1, 2011 at 1:17 am
    They do. You can just flip between left and right outers as needed, right?

    D
    On Wed, Aug 31, 2011 at 6:13 PM, Kevin Burton wrote:

    Due to the syntax of left joins, you can't change the order:

    alias = JOIN left-alias BY left-alias-column [LEFT|RIGHT|FULL] [OUTER],
    right-alias BY right-alias-column [USING 'replicated' | 'skewed' | 'merge']
    [PARTITION BY partitioner] [PARALLEL n];

    …. so I assume the regular join optimizations don't apply here? I imagine
    this being defined as a 'regular' join …


    Optimization for regular joins ensures that the last table in the join is
    not brought into memory but streamed through instead. Optimization reduces
    the amount of memory used which means you can avoid spilling the data and
    also should be able to scale your query to larger data volumes.
    To take advantage of this optimization, make sure that the table with the
    largest number of tuples per key is the last table in your query. In some of
    our tests we saw 10x performance improvement as the result of this
    optimization.


    --

    Founder/CEO Spinn3r.com

    Location: *San Francisco, CA*
    Skype: *burtonator*

    Skype-in: *(415) 871-0687*
  • Kevin Burton at Sep 1, 2011 at 1:23 am
    AHA!!!!! haha
    On Wed, Aug 31, 2011 at 6:17 PM, Dmitriy Ryaboy wrote:

    They do. You can just flip between left and right outers as needed, right?

    D
    On Wed, Aug 31, 2011 at 6:13 PM, Kevin Burton wrote:

    Due to the syntax of left joins, you can't change the order:

    alias = JOIN left-alias BY left-alias-column [LEFT|RIGHT|FULL] [OUTER],
    right-alias BY right-alias-column [USING 'replicated' | 'skewed' | 'merge']
    [PARTITION BY partitioner] [PARALLEL n];

    …. so I assume the regular join optimizations don't apply here? I imagine
    this being defined as a 'regular' join …


    Optimization for regular joins ensures that the last table in the join
    is
    not brought into memory but streamed through instead. Optimization reduces
    the amount of memory used which means you can avoid spilling the data
    and
    also should be able to scale your query to larger data volumes.
    To take advantage of this optimization, make sure that the table with
    the
    largest number of tuples per key is the last table in your query. In
    some
    of
    our tests we saw 10x performance improvement as the result of this
    optimization.


    --

    Founder/CEO Spinn3r.com

    Location: *San Francisco, CA*
    Skype: *burtonator*

    Skype-in: *(415) 871-0687*


    --

    Founder/CEO Spinn3r.com

    Location: *San Francisco, CA*
    Skype: *burtonator*

    Skype-in: *(415) 871-0687*

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupuser @
categoriespig, hadoop
postedSep 1, '11 at 1:14a
activeSep 1, '11 at 1:23a
posts3
users2
websitepig.apache.org

2 users in discussion

Kevin Burton: 2 posts Dmitriy Ryaboy: 1 post

People

Translate

site design / logo © 2021 Grokbase