I need to port some SQL queries to HiveQL.

Let's say I have a Hive table t with the columns mobile_no, cookie, ip, and
access_id.

I want to count unique users. My definition of unique users = all unique
mobile numbers + all unique cookies (for rows where the mobile number is not
present) + all unique IPs (for rows where both the mobile number and the
cookie are not present).

For example:

mobile_no, cookie, ip, access_id
'9741112345', '', '1.2.3.4', 1 // may be from SMS, so cookie is not present
'9741112346', '', '1.2.3.4', 2
'', 'aa', '1.2.3.4', 3
'', 'bb', '1.2.3.4', 4
'', '', '1.2.3.5', 5
'', '', '1.2.3.4', 6

There are 6 unique users.

In MySQL we can handle this like:

select count(distinct if(mobile_no != '', mobile_no, if(cookie != '',
cookie, ip))) from t;

Is it possible to do the same thing in Hive in a single query?
To be more specific, can I use IF (control flow functions) in Hive?


Amlan


  • Viral Bajaria at Feb 4, 2011 at 7:58 am
    http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF

    Check the conditional functions in the link above; it has the IF and CASE
    definitions. I am guessing some of them might not work with older versions
    of Hive, but I am not too sure.
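
For what it's worth, here is a minimal sketch of how the MySQL expression from the question might carry over to Hive. It assumes the Hive version in use provides the IF and CASE conditional functions listed on the wiki page above, that the table is named t as in the question, and that a missing value is stored as an empty string rather than NULL. It has not been checked against any particular Hive release.

```sql
-- Assumes a Hive table t(mobile_no, cookie, ip, access_id) where a
-- missing value is stored as an empty string, as in the sample rows.

-- Variant 1: nested IF, mirroring the MySQL expression.
SELECT COUNT(DISTINCT
         IF(mobile_no != '', mobile_no,
            IF(cookie != '', cookie, ip))) AS unique_users
FROM t;

-- Variant 2: the same fallback order written with CASE.
SELECT COUNT(DISTINCT
         CASE
           WHEN mobile_no != '' THEN mobile_no
           WHEN cookie != '' THEN cookie
           ELSE ip
         END) AS unique_users
FROM t;
```

On the six sample rows, each variant picks one key per row (two mobile numbers, two cookies, two IPs), so both should return 6. Note that a cookie value that happens to equal a mobile number or an IP string would be counted as the same user; prefixing each key with its type would avoid that if it matters.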

Discussion Overview
group: user @
categories: hive, hadoop
posted: Feb 4, '11 at 7:39a
active: Feb 4, '11 at 7:58a
posts: 2
users: 2
website: hive.apache.org

2 users in discussion

Viral Bajaria: 1 post Amlan Mandal: 1 post
