Hello all,

What would be the best way to write a function that would perform aggregation computations on records in a table and return multiple rows (and possibly columns)? For example, imagine a function called DECILES that computes all the deciles for a given measure and returns them as 10 rows with 2 columns, decile and value. It seems like what I want is some sort of combination of a UDAF and a UDTF. Does such an animal exist in the Hive world?

Jason

  • Zheng Shao on Jan 29, 2010 at 4:37 am
    The easiest way to go is to write a UDAF that returns the answer as an
    array<struct<decile:int, value:double>>.

    Then you can do the following (note that explode is a predefined UDTF):

    SELECT
      tmp.key, tmp2.d.decile, tmp2.d.value
    FROM
      (SELECT key, Decile(value) AS deciles
       FROM src  -- src is a placeholder for your source table
       GROUP BY key) tmp
    LATERAL VIEW explode(tmp.deciles) tmp2 AS d


    Zheng
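
    The aggregate-then-explode pattern in the reply above can be sketched in plain Java. This is only an illustration of the two steps and does not use Hive's UDAF or UDTF APIs. The names DecileSketch, DecileRow, deciles and explode are hypothetical, and the nearest-rank definition of a decile is an assumption, since the thread does not say how deciles should be computed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DecileSketch {

    /** One output row: the (decile, value) pair asked for in the question. */
    static final class DecileRow {
        final int decile;
        final double value;

        DecileRow(int decile, double value) {
            this.decile = decile;
            this.value = value;
        }
    }

    /**
     * Aggregation step (the role the UDAF would play): collapse all values
     * of one group into a single list of ten (decile, value) structs.
     * Assumption: nearest-rank deciles; other definitions interpolate.
     */
    static List<DecileRow> deciles(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("need at least one value");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        List<DecileRow> out = new ArrayList<>();
        for (int k = 1; k <= 10; k++) {
            // Smallest sorted position covering at least k * 10% of the data.
            int rank = (int) Math.ceil(k * sorted.length / 10.0);
            out.add(new DecileRow(k, sorted[rank - 1]));
        }
        return out;
    }

    /**
     * Explode step (the role of LATERAL VIEW explode): turn the single
     * aggregated list into one tab-separated row per element.
     */
    static List<String> explode(String key, List<DecileRow> deciles) {
        List<String> rows = new ArrayList<>();
        for (DecileRow d : deciles) {
            rows.add(key + "\t" + d.decile + "\t" + d.value);
        }
        return rows;
    }

    public static void main(String[] args) {
        double[] values = new double[20];
        for (int i = 0; i < values.length; i++) {
            values[i] = i + 1; // 1.0 through 20.0
        }
        for (String row : explode("k1", deciles(values))) {
            System.out.println(row);
        }
    }
}
```

    In Hive itself the first method would live inside a UDAF keyed by the GROUP BY column, and the second is what the built-in explode UDTF already does, so only the aggregation half needs custom code.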

Discussion Overview
group: user @
categories: hive, hadoop
posted: Jan 28, '10 at 10:08p
active: Jan 29, '10 at 4:37a
posts: 2
users: 2
website: hive.apache.org

2 users in discussion

Zheng Shao: 1 post Jason Michael: 1 post