Grokbase Groups Hive user April 2010
hive> describe ut;
OK
time bigint
day string
Time taken: 0.128 seconds
hive> select * from ut;
OK
1270145333155 tuesday
Time taken: 0.085 seconds

When I run this simple query, I'm getting a NULL for the time column with
data type bigint.

hive> select unix_timestamp(time),day from ut;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201003031204_0080, Tracking URL =
http://master:50030/jobdetails.jsp?jobid=job_201003031204_0080
Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
-Dmapred.job.tracker=master:54311 -kill job_201003031204_0080
2010-04-01 16:03:54,024 Stage-1 map = 0%, reduce = 0%
2010-04-01 16:04:06,128 Stage-1 map = 100%, reduce = 0%
2010-04-01 16:04:09,150 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201003031204_0080
OK
NULL tuesday
Time taken: 16.981 seconds

Any ideas?

Thanks!

/tom


  • Paul Yang at Apr 1, 2010 at 8:36 pm
    unix_timestamp() returns the unix time given a date string:

    e.g. unix_timestamp('2009-03-20 11:30:01') = 1237573801

    please see:
    http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Date_Functions
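Outside Hive, the same seconds-since-the-epoch arithmetic can be checked with a short Python sketch (standard library only; Hive's unix_timestamp uses the system time zone, so UTC is pinned here for a stable result):

```python
import calendar
import time

# Parse a date string and convert it to seconds since 1970-01-01 00:00:00 UTC,
# the same idea as Hive's unix_timestamp('2009-03-20 11:30:01').
ts = calendar.timegm(time.strptime('2009-03-20 11:30:01', '%Y-%m-%d %H:%M:%S'))
print(ts)  # 1237548601 (UTC; a system in PDT would report 1237573801)
```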

    Cheers,
    Paul


    From: tom kersnick
    Sent: Thursday, April 01, 2010 1:12 PM
    To: hive-user@hadoop.apache.org
    Subject: unix_timestamp function

    hive> describe ut;
    OK
    time bigint
    day string
    Time taken: 0.128 seconds
    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.085 seconds

    When I run this simple query, I'm getting a NULL for the time column with data type bigint.

    hive> select unix_timestamp(time),day from ut;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201003031204_0080, Tracking URL = http://master:50030/jobdetails.jsp?jobid=job_201003031204_0080
    Kill Command = /usr/local/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=master:54311 -kill job_201003031204_0080
    2010-04-01 16:03:54,024 Stage-1 map = 0%, reduce = 0%
    2010-04-01 16:04:06,128 Stage-1 map = 100%, reduce = 0%
    2010-04-01 16:04:09,150 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201003031204_0080
    OK
    NULL tuesday
    Time taken: 16.981 seconds

    Any ideas?

    Thanks!

    /tom
  • Carl Steinbach at Apr 1, 2010 at 8:37 pm
    Hi Tom,

    I think you want to use the from_unixtime UDF:

    hive> describe function extended from_unixtime;
    describe function extended from_unixtime;
    OK
    from_unixtime(unix_time, format) - returns unix_time in the specified format
    Example:
    SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss') FROM src LIMIT 1;
    '1970-01-01 00:00:00'
    Time taken: 0.647 seconds
    hive>
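The equivalent of that from_unixtime example, sketched in Python (UTC pinned for reproducibility, since Hive formats in the JVM's default time zone):

```python
from datetime import datetime, timezone

# from_unixtime(0, 'yyyy-MM-dd HH:mm:ss') formats epoch second 0 as a string.
s = datetime.fromtimestamp(0, tz=timezone.utc).strftime('%Y-%m-%d %H:%M:%S')
print(s)  # 1970-01-01 00:00:00
```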

    Thanks.

    Carl
    On Thu, Apr 1, 2010 at 1:11 PM, tom kersnick wrote:

    hive> describe ut;
    OK
    time bigint
    day string
    Time taken: 0.128 seconds
    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.085 seconds

    When I run this simple query, I'm getting a NULL for the time column with
    data type bigint.

    hive> select unix_timestamp(time),day from ut;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201003031204_0080, Tracking URL =
    http://master:50030/jobdetails.jsp?jobid=job_201003031204_0080
    Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
    -Dmapred.job.tracker=master:54311 -kill job_201003031204_0080
    2010-04-01 16:03:54,024 Stage-1 map = 0%, reduce = 0%
    2010-04-01 16:04:06,128 Stage-1 map = 100%, reduce = 0%
    2010-04-01 16:04:09,150 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201003031204_0080
    OK
    NULL tuesday
    Time taken: 16.981 seconds

    Any ideas?

    Thanks!

    /tom
  • Tom kersnick at Apr 1, 2010 at 9:02 pm
    Thanks, but there is something fishy going on.

I'm using Hive 0.5.0 with Hadoop 0.20.1.

I tried the column as both a bigint and a string. According to the Hive DDL:

    string

    from_unixtime(int unixtime)

    Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to
    a string representing the timestamp of that moment in the current system
    time zone in the format of "1970-01-01 00:00:00"

It looks like the input is an int, which would be too small for my
1270145333155 timestamp.

    Any ideas?

    Example below:

    /tom


    hive> describe ut;
    OK
    cola bigint
    colb string
    Time taken: 0.101 seconds

    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.065 seconds

    hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201003031204_0083, Tracking URL =
    http://master:50030/jobdetails.jsp?jobid=job_201003031204_0083
    Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
    -Dmapred.job.tracker=master:54311 -kill job_201003031204_0083
    2010-04-01 16:57:32,407 Stage-1 map = 0%, reduce = 0%
    2010-04-01 16:57:45,577 Stage-1 map = 100%, reduce = 0%
    2010-04-01 16:57:48,605 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201003031204_0083
    OK
    42219-04-22 00:05:55 tuesday
    Time taken: 18.066 seconds
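The 42219-04-22 result is the clue: the millisecond value was interpreted as seconds. A rough Python check (illustrative arithmetic, mean Gregorian year assumed) shows which year that lands in:

```python
SECONDS_PER_YEAR = 365.2425 * 86400  # mean Gregorian year, ~31,556,952 s

# Treating the millisecond timestamp 1270145333155 as *seconds* since 1970
# puts the date roughly 40,000 years in the future, matching 42219-04-22.
year = 1970 + int(1270145333155 / SECONDS_PER_YEAR)
print(year)  # 42219
```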


    hive> describe ut;
    OK
    cola string
    colb string
    Time taken: 0.077 seconds

    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.065 seconds


    hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;
    FAILED: Error in semantic analysis: line 1:7 Function Argument Type Mismatch
    from_unixtime: Looking for UDF "from_unixtime" with parameters [class
    org.apache.hadoop.io.Text, class org.apache.hadoop.io.Text]



    On Thu, Apr 1, 2010 at 1:37 PM, Carl Steinbach wrote:

    Hi Tom,

    I think you want to use the from_unixtime UDF:

    hive> describe function extended from_unixtime;
    describe function extended from_unixtime;
    OK
    from_unixtime(unix_time, format) - returns unix_time in the specified
    format
    Example:
    SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss') FROM src LIMIT 1;
    '1970-01-01 00:00:00'
    Time taken: 0.647 seconds
    hive>

    Thanks.

    Carl
    On Thu, Apr 1, 2010 at 1:11 PM, tom kersnick wrote:

    hive> describe ut;
    OK
    time bigint
    day string
    Time taken: 0.128 seconds
    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.085 seconds

    When I run this simple query, I'm getting a NULL for the time column with
    data type bigint.

    hive> select unix_timestamp(time),day from ut;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201003031204_0080, Tracking URL =
    http://master:50030/jobdetails.jsp?jobid=job_201003031204_0080
    Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
    -Dmapred.job.tracker=master:54311 -kill job_201003031204_0080
    2010-04-01 16:03:54,024 Stage-1 map = 0%, reduce = 0%
    2010-04-01 16:04:06,128 Stage-1 map = 100%, reduce = 0%
    2010-04-01 16:04:09,150 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201003031204_0080
    OK
    NULL tuesday
    Time taken: 16.981 seconds

    Any ideas?

    Thanks!

    /tom
  • Carl Steinbach at Apr 1, 2010 at 9:14 pm
    Hi Tom,

    "Unix Time" is defined as the number of *seconds* since January 1, 1970. It
    looks like the data you have in cola is in milliseconds. You need to divide
    this value by 1000 before calling from_unixtime() on the result.
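Concretely, dividing by 1000 first gives a sane date. A minimal Python sketch (UTC pinned; in HiveQL the analogous call would be something like from_unixtime(cast(cola/1000 as bigint), 'yyyy-MM-dd HH:mm:ss')):

```python
from datetime import datetime, timezone

millis = 1270145333155           # the value stored in cola
secs = millis // 1000            # Unix time counts *seconds* since the epoch
s = datetime.fromtimestamp(secs, tz=timezone.utc).strftime('%Y-%m-%d %H:%M:%S')
print(s)  # 2010-04-01 18:08:53 (UTC; an EDT system would show 14:08:53)
```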

    Thanks.

    Carl

    On Thu, Apr 1, 2010 at 2:02 PM, tom kersnick wrote:

    Thanks, but there is something fishy going on.

I'm using Hive 0.5.0 with Hadoop 0.20.1.

I tried the column as both a bigint and a string. According to the Hive DDL:

    string

    from_unixtime(int unixtime)

    Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to
    a string representing the timestamp of that moment in the current system
    time zone in the format of "1970-01-01 00:00:00"

It looks like the input is an int, which would be too small for my
1270145333155 timestamp.

    Any ideas?

    Example below:

    /tom


    hive> describe ut;
    OK
    cola bigint
    colb string
    Time taken: 0.101 seconds


    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.065 seconds

    hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;

    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201003031204_0083, Tracking URL =
    http://master:50030/jobdetails.jsp?jobid=job_201003031204_0083
    Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
    -Dmapred.job.tracker=master:54311 -kill job_201003031204_0083
    2010-04-01 16:57:32,407 Stage-1 map = 0%, reduce = 0%
    2010-04-01 16:57:45,577 Stage-1 map = 100%, reduce = 0%
    2010-04-01 16:57:48,605 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201003031204_0083
    OK
    42219-04-22 00:05:55 tuesday
    Time taken: 18.066 seconds


    hive> describe ut;
    OK
    cola string
    colb string
    Time taken: 0.077 seconds


    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.065 seconds


    hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;
    FAILED: Error in semantic analysis: line 1:7 Function Argument Type
    Mismatch from_unixtime: Looking for UDF "from_unixtime" with parameters
    [class org.apache.hadoop.io.Text, class org.apache.hadoop.io.Text]




    On Thu, Apr 1, 2010 at 1:37 PM, Carl Steinbach wrote:

    Hi Tom,

    I think you want to use the from_unixtime UDF:

    hive> describe function extended from_unixtime;
    describe function extended from_unixtime;
    OK
    from_unixtime(unix_time, format) - returns unix_time in the specified
    format
    Example:
    SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss') FROM src LIMIT 1;
    '1970-01-01 00:00:00'
    Time taken: 0.647 seconds
    hive>

    Thanks.

    Carl
    On Thu, Apr 1, 2010 at 1:11 PM, tom kersnick wrote:

    hive> describe ut;
    OK
    time bigint
    day string
    Time taken: 0.128 seconds
    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.085 seconds

    When I run this simple query, I'm getting a NULL for the time column with
    data type bigint.

    hive> select unix_timestamp(time),day from ut;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201003031204_0080, Tracking URL =
    http://master:50030/jobdetails.jsp?jobid=job_201003031204_0080
    Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
    -Dmapred.job.tracker=master:54311 -kill job_201003031204_0080
    2010-04-01 16:03:54,024 Stage-1 map = 0%, reduce = 0%
    2010-04-01 16:04:06,128 Stage-1 map = 100%, reduce = 0%
    2010-04-01 16:04:09,150 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201003031204_0080
    OK
    NULL tuesday
    Time taken: 16.981 seconds

    Any ideas?

    Thanks!

    /tom
  • Tom kersnick at Apr 1, 2010 at 9:17 pm
    ok thanks....

    I should have caught that.

    /tom


    On Thu, Apr 1, 2010 at 2:13 PM, Carl Steinbach wrote:

    Hi Tom,

    "Unix Time" is defined as the number of *seconds* since January 1, 1970. It
    looks like the data you have in cola is in milliseconds. You need to divide
    this value by 1000 before calling from_unixtime() on the result.

    Thanks.

    Carl

    On Thu, Apr 1, 2010 at 2:02 PM, tom kersnick wrote:

    Thanks, but there is something fishy going on.

I'm using Hive 0.5.0 with Hadoop 0.20.1.

I tried the column as both a bigint and a string. According to the Hive DDL:

    string

    from_unixtime(int unixtime)

    Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC)
    to a string representing the timestamp of that moment in the current system
    time zone in the format of "1970-01-01 00:00:00"

It looks like the input is an int, which would be too small for my
1270145333155 timestamp.

    Any ideas?

    Example below:

    /tom


    hive> describe ut;
    OK
    cola bigint
    colb string
    Time taken: 0.101 seconds


    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.065 seconds

    hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;

    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201003031204_0083, Tracking URL =
    http://master:50030/jobdetails.jsp?jobid=job_201003031204_0083
    Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
    -Dmapred.job.tracker=master:54311 -kill job_201003031204_0083
    2010-04-01 16:57:32,407 Stage-1 map = 0%, reduce = 0%
    2010-04-01 16:57:45,577 Stage-1 map = 100%, reduce = 0%
    2010-04-01 16:57:48,605 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201003031204_0083
    OK
    42219-04-22 00:05:55 tuesday
    Time taken: 18.066 seconds


    hive> describe ut;
    OK
    cola string
    colb string
    Time taken: 0.077 seconds


    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.065 seconds


    hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;
    FAILED: Error in semantic analysis: line 1:7 Function Argument Type
    Mismatch from_unixtime: Looking for UDF "from_unixtime" with parameters
    [class org.apache.hadoop.io.Text, class org.apache.hadoop.io.Text]




    On Thu, Apr 1, 2010 at 1:37 PM, Carl Steinbach wrote:

    Hi Tom,

    I think you want to use the from_unixtime UDF:

    hive> describe function extended from_unixtime;
    describe function extended from_unixtime;
    OK
    from_unixtime(unix_time, format) - returns unix_time in the specified
    format
    Example:
    SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss') FROM src LIMIT 1;
    '1970-01-01 00:00:00'
    Time taken: 0.647 seconds
    hive>

    Thanks.

    Carl
    On Thu, Apr 1, 2010 at 1:11 PM, tom kersnick wrote:

    hive> describe ut;
    OK
    time bigint
    day string
    Time taken: 0.128 seconds
    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.085 seconds

    When I run this simple query, I'm getting a NULL for the time column
    with data type bigint.

    hive> select unix_timestamp(time),day from ut;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201003031204_0080, Tracking URL =
    http://master:50030/jobdetails.jsp?jobid=job_201003031204_0080
    Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
    -Dmapred.job.tracker=master:54311 -kill job_201003031204_0080
    2010-04-01 16:03:54,024 Stage-1 map = 0%, reduce = 0%
    2010-04-01 16:04:06,128 Stage-1 map = 100%, reduce = 0%
    2010-04-01 16:04:09,150 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201003031204_0080
    OK
    NULL tuesday
    Time taken: 16.981 seconds

    Any ideas?

    Thanks!

    /tom
  • Tom kersnick at Apr 1, 2010 at 10:32 pm
So it's working, but I'm having a time zone issue.

My servers are located in EST, but I need this data in PST.

    So when it converts this:

    hive> select from_unixtime(1270145333,'yyyy-MM-dd HH:mm:ss') from
    ut2;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201003031204_0102, Tracking URL =
    http://master:50030/jobdetails.jsp?jobid=job_201003031204_0102
    Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
    -Dmapred.job.tracker=master:54311 -kill job_201003031204_0102
    2010-04-01 18:28:23,041 Stage-1 map = 0%, reduce = 0%
    2010-04-01 18:28:37,315 Stage-1 map = 67%, reduce = 0%
    2010-04-01 18:28:43,386 Stage-1 map = 100%, reduce = 0%
    2010-04-01 18:28:46,412 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201003031204_0102
    OK
    2010-04-01 14:08:53
    Time taken: 30.191 seconds


    I need it to be :
    2010-04-01 11:08:53


I tried setting the TZ variable in my .bash_profile (TZ=/ /Americas/): no go.

Nothing in the Hive DDL link you sent is leading me in the right direction. Is
there something you can recommend? I can write a script outside of Hive, but
it would be great if users could handle this within their queries.

    Thanks in advance!

    /tom


    On Thu, Apr 1, 2010 at 2:17 PM, tom kersnick wrote:

    ok thanks....

    I should have caught that.

    /tom



    On Thu, Apr 1, 2010 at 2:13 PM, Carl Steinbach wrote:

    Hi Tom,

    "Unix Time" is defined as the number of *seconds* since January 1, 1970.
    It looks like the data you have in cola is in milliseconds. You need to
    divide this value by 1000 before calling from_unixtime() on the result.

    Thanks.

    Carl

    On Thu, Apr 1, 2010 at 2:02 PM, tom kersnick wrote:

    Thanks, but there is something fishy going on.

I'm using Hive 0.5.0 with Hadoop 0.20.1.

I tried the column as both a bigint and a string. According to the Hive
DDL:

    string

    from_unixtime(int unixtime)

    Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC)
    to a string representing the timestamp of that moment in the current system
    time zone in the format of "1970-01-01 00:00:00"

It looks like the input is an int, which would be too small for my
1270145333155 timestamp.

    Any ideas?

    Example below:

    /tom


    hive> describe ut;
    OK
    cola bigint
    colb string
    Time taken: 0.101 seconds


    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.065 seconds

    hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;

    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201003031204_0083, Tracking URL =
    http://master:50030/jobdetails.jsp?jobid=job_201003031204_0083
    Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
    -Dmapred.job.tracker=master:54311 -kill job_201003031204_0083
    2010-04-01 16:57:32,407 Stage-1 map = 0%, reduce = 0%
    2010-04-01 16:57:45,577 Stage-1 map = 100%, reduce = 0%
    2010-04-01 16:57:48,605 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201003031204_0083
    OK
    42219-04-22 00:05:55 tuesday
    Time taken: 18.066 seconds


    hive> describe ut;
    OK
    cola string
    colb string
    Time taken: 0.077 seconds


    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.065 seconds


    hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;
    FAILED: Error in semantic analysis: line 1:7 Function Argument Type
    Mismatch from_unixtime: Looking for UDF "from_unixtime" with parameters
    [class org.apache.hadoop.io.Text, class org.apache.hadoop.io.Text]




    On Thu, Apr 1, 2010 at 1:37 PM, Carl Steinbach wrote:

    Hi Tom,

    I think you want to use the from_unixtime UDF:

    hive> describe function extended from_unixtime;
    describe function extended from_unixtime;
    OK
    from_unixtime(unix_time, format) - returns unix_time in the specified
    format
    Example:
    SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss') FROM src LIMIT 1;
    '1970-01-01 00:00:00'
    Time taken: 0.647 seconds
    hive>

    Thanks.

    Carl
    On Thu, Apr 1, 2010 at 1:11 PM, tom kersnick wrote:

    hive> describe ut;
    OK
    time bigint
    day string
    Time taken: 0.128 seconds
    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.085 seconds

    When I run this simple query, I'm getting a NULL for the time column
    with data type bigint.

    hive> select unix_timestamp(time),day from ut;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201003031204_0080, Tracking URL =
    http://master:50030/jobdetails.jsp?jobid=job_201003031204_0080
    Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
    -Dmapred.job.tracker=master:54311 -kill job_201003031204_0080
    2010-04-01 16:03:54,024 Stage-1 map = 0%, reduce = 0%
    2010-04-01 16:04:06,128 Stage-1 map = 100%, reduce = 0%
    2010-04-01 16:04:09,150 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201003031204_0080
    OK
    NULL tuesday
    Time taken: 16.981 seconds

    Any ideas?

    Thanks!

    /tom
  • Zheng Shao at Apr 1, 2010 at 11:32 pm
Setting TZ in your .bash_profile won't work because the map/reduce tasks run
on the Hadoop cluster, not on your machine.
If you start your Hadoop TaskTrackers with that TZ setting, it will probably
work.
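Alternatively, the conversion can be made explicit rather than relying on the daemons' environment; for example with Python's standard zoneinfo module (3.9+), assuming America/Los_Angeles is the intended Pacific zone:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Render epoch second 1270145333 in the Pacific zone, independent of the
# machine's TZ environment variable.
pacific = datetime.fromtimestamp(1270145333, tz=ZoneInfo('America/Los_Angeles'))
print(pacific.strftime('%Y-%m-%d %H:%M:%S'))  # 2010-04-01 11:08:53
```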

    Zheng
    On Thu, Apr 1, 2010 at 3:32 PM, tom kersnick wrote:

So it's working, but I'm having a time zone issue.

My servers are located in EST, but I need this data in PST.

    So when it converts this:

    hive> select from_unixtime(1270145333,'yyyy-MM-dd HH:mm:ss') from
    ut2;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201003031204_0102, Tracking URL =
    http://master:50030/jobdetails.jsp?jobid=job_201003031204_0102
    Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
    -Dmapred.job.tracker=master:54311 -kill job_201003031204_0102
    2010-04-01 18:28:23,041 Stage-1 map = 0%, reduce = 0%
    2010-04-01 18:28:37,315 Stage-1 map = 67%, reduce = 0%
    2010-04-01 18:28:43,386 Stage-1 map = 100%, reduce = 0%
    2010-04-01 18:28:46,412 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201003031204_0102
    OK
    2010-04-01 14:08:53
    Time taken: 30.191 seconds


    I need it to be :
    2010-04-01 11:08:53


I tried setting the TZ variable in my .bash_profile (TZ=/ /Americas/): no go.

Nothing in the Hive DDL link you sent is leading me in the right direction. Is
there something you can recommend? I can write a script outside of Hive, but
it would be great if users could handle this within their queries.

    Thanks in advance!

    /tom



    On Thu, Apr 1, 2010 at 2:17 PM, tom kersnick wrote:

    ok thanks....

    I should have caught that.

    /tom



    On Thu, Apr 1, 2010 at 2:13 PM, Carl Steinbach wrote:

    Hi Tom,

    "Unix Time" is defined as the number of *seconds* since January 1, 1970.
    It looks like the data you have in cola is in milliseconds. You need to
    divide this value by 1000 before calling from_unixtime() on the result.

    Thanks.

    Carl

    On Thu, Apr 1, 2010 at 2:02 PM, tom kersnick wrote:

    Thanks, but there is something fishy going on.

I'm using Hive 0.5.0 with Hadoop 0.20.1.

I tried the column as both a bigint and a string. According to the Hive
DDL:

    string

    from_unixtime(int unixtime)

    Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC)
    to a string representing the timestamp of that moment in the current system
    time zone in the format of "1970-01-01 00:00:00"

It looks like the input is an int, which would be too small for my
1270145333155 timestamp.

    Any ideas?

    Example below:

    /tom


    hive> describe ut;
    OK
    cola bigint
    colb string
    Time taken: 0.101 seconds


    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.065 seconds

    hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;

    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201003031204_0083, Tracking URL =
    http://master:50030/jobdetails.jsp?jobid=job_201003031204_0083
    Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
    -Dmapred.job.tracker=master:54311 -kill job_201003031204_0083
    2010-04-01 16:57:32,407 Stage-1 map = 0%, reduce = 0%
    2010-04-01 16:57:45,577 Stage-1 map = 100%, reduce = 0%
    2010-04-01 16:57:48,605 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201003031204_0083
    OK
    42219-04-22 00:05:55 tuesday
    Time taken: 18.066 seconds


    hive> describe ut;
    OK
    cola string
    colb string
    Time taken: 0.077 seconds


    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.065 seconds


    hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;
    FAILED: Error in semantic analysis: line 1:7 Function Argument Type
    Mismatch from_unixtime: Looking for UDF "from_unixtime" with parameters
    [class org.apache.hadoop.io.Text, class org.apache.hadoop.io.Text]




    On Thu, Apr 1, 2010 at 1:37 PM, Carl Steinbach wrote:

    Hi Tom,

    I think you want to use the from_unixtime UDF:

    hive> describe function extended from_unixtime;
    describe function extended from_unixtime;
    OK
    from_unixtime(unix_time, format) - returns unix_time in the specified
    format
    Example:
    SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss') FROM src LIMIT 1;
    '1970-01-01 00:00:00'
    Time taken: 0.647 seconds
    hive>

    Thanks.

    Carl
    On Thu, Apr 1, 2010 at 1:11 PM, tom kersnick wrote:

    hive> describe ut;
    OK
    time bigint
    day string
    Time taken: 0.128 seconds
    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.085 seconds

    When I run this simple query, I'm getting a NULL for the time column
    with data type bigint.

    hive> select unix_timestamp(time),day from ut;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201003031204_0080, Tracking URL =
    http://master:50030/jobdetails.jsp?jobid=job_201003031204_0080
    Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
    -Dmapred.job.tracker=master:54311 -kill job_201003031204_0080
    2010-04-01 16:03:54,024 Stage-1 map = 0%, reduce = 0%
    2010-04-01 16:04:06,128 Stage-1 map = 100%, reduce = 0%
    2010-04-01 16:04:09,150 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201003031204_0080
    OK
    NULL tuesday
    Time taken: 16.981 seconds

    Any ideas?

    Thanks!

    /tom

    --
    Yours,
    Zheng
  • Tom kersnick at Apr 1, 2010 at 11:37 pm
    Ok thanks!

    I will try it out.....

    /tom


    On Thu, Apr 1, 2010 at 4:31 PM, Zheng Shao wrote:

Setting TZ in your .bash_profile won't work because the map/reduce tasks run
on the Hadoop cluster, not on your machine.
If you start your Hadoop TaskTrackers with that TZ setting, it will probably
work.

    Zheng
    On Thu, Apr 1, 2010 at 3:32 PM, tom kersnick wrote:

So it's working, but I'm having a time zone issue.

My servers are located in EST, but I need this data in PST.

    So when it converts this:

    hive> select from_unixtime(1270145333,'yyyy-MM-dd HH:mm:ss') from
    ut2;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201003031204_0102, Tracking URL =
    http://master:50030/jobdetails.jsp?jobid=job_201003031204_0102
    Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
    -Dmapred.job.tracker=master:54311 -kill job_201003031204_0102
    2010-04-01 18:28:23,041 Stage-1 map = 0%, reduce = 0%
    2010-04-01 18:28:37,315 Stage-1 map = 67%, reduce = 0%
    2010-04-01 18:28:43,386 Stage-1 map = 100%, reduce = 0%
    2010-04-01 18:28:46,412 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201003031204_0102
    OK
    2010-04-01 14:08:53
    Time taken: 30.191 seconds


    I need it to be :
    2010-04-01 11:08:53


I tried setting the TZ variable in my .bash_profile (TZ=/ /Americas/): no go.

Nothing in the Hive DDL link you sent is leading me in the right direction. Is
there something you can recommend? I can write a script outside of Hive, but
it would be great if users could handle this within their queries.

    Thanks in advance!

    /tom



    On Thu, Apr 1, 2010 at 2:17 PM, tom kersnick wrote:

    ok thanks....

    I should have caught that.

    /tom



    On Thu, Apr 1, 2010 at 2:13 PM, Carl Steinbach wrote:

    Hi Tom,

    "Unix Time" is defined as the number of *seconds* since January 1, 1970.
    It looks like the data you have in cola is in milliseconds. You need to
    divide this value by 1000 before calling from_unixtime() on the result.

    Thanks.

    Carl

    On Thu, Apr 1, 2010 at 2:02 PM, tom kersnick wrote:

    Thanks, but there is something fishy going on.

I'm using Hive 0.5.0 with Hadoop 0.20.1.

I tried the column as both a bigint and a string. According to the Hive
DDL:

    string

    from_unixtime(int unixtime)

    Converts the number of seconds from unix epoch (1970-01-01 00:00:00
    UTC) to a string representing the timestamp of that moment in the current
    system time zone in the format of "1970-01-01 00:00:00"

It looks like the input is an int, which would be too small for my
1270145333155 timestamp.

    Any ideas?

    Example below:

    /tom


    hive> describe ut;
    OK
    cola bigint
    colb string
    Time taken: 0.101 seconds


    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.065 seconds

    hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;

    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201003031204_0083, Tracking URL =
    http://master:50030/jobdetails.jsp?jobid=job_201003031204_0083
    Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
    -Dmapred.job.tracker=master:54311 -kill job_201003031204_0083
    2010-04-01 16:57:32,407 Stage-1 map = 0%, reduce = 0%
    2010-04-01 16:57:45,577 Stage-1 map = 100%, reduce = 0%
    2010-04-01 16:57:48,605 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201003031204_0083
    OK
    42219-04-22 00:05:55 tuesday
    Time taken: 18.066 seconds


    hive> describe ut;
    OK
    cola string
    colb string
    Time taken: 0.077 seconds


    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.065 seconds


    hive> select from_unixtime(cola,'yyyy-MM-dd HH:mm:ss'),colb from ut;
    FAILED: Error in semantic analysis: line 1:7 Function Argument Type
    Mismatch from_unixtime: Looking for UDF "from_unixtime" with parameters
    [class org.apache.hadoop.io.Text, class org.apache.hadoop.io.Text]




    On Thu, Apr 1, 2010 at 1:37 PM, Carl Steinbach wrote:

    Hi Tom,

    I think you want to use the from_unixtime UDF:

    hive> describe function extended from_unixtime;
    describe function extended from_unixtime;
    OK
    from_unixtime(unix_time, format) - returns unix_time in the specified
    format
    Example:
    SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss') FROM src LIMIT 1;
    '1970-01-01 00:00:00'
    Time taken: 0.647 seconds
    hive>

    Thanks.

    Carl
    On Thu, Apr 1, 2010 at 1:11 PM, tom kersnick wrote:

    hive> describe ut;
    OK
    time bigint
    day string
    Time taken: 0.128 seconds
    hive> select * from ut;
    OK
    1270145333155 tuesday
    Time taken: 0.085 seconds

    When I run this simple query, I'm getting a NULL for the time column
    with data type bigint.

    hive> select unix_timestamp(time),day from ut;

    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201003031204_0080, Tracking URL =
    http://master:50030/jobdetails.jsp?jobid=job_201003031204_0080
    Kill Command = /usr/local/hadoop/bin/../bin/hadoop job
    -Dmapred.job.tracker=master:54311 -kill job_201003031204_0080
    2010-04-01 16:03:54,024 Stage-1 map = 0%, reduce = 0%
    2010-04-01 16:04:06,128 Stage-1 map = 100%, reduce = 0%
    2010-04-01 16:04:09,150 Stage-1 map = 100%, reduce = 100%
    Ended Job = job_201003031204_0080
    OK
    NULL tuesday
    Time taken: 16.981 seconds

    Any ideas?

    Thanks!

    /tom

    --
    Yours,
    Zheng
