Grokbase Groups Hive user July 2009

63 discussions - 322 posts

  • I loaded 5 files of bzip2-compressed data into a table in Hive. Three are small test files containing 10,000 records. Two were large (~8 GB compressed). When I run a query against the table I see three ...
    Bill Craig
    Jul 21, 2009 at 3:15 pm
    Jul 29, 2009 at 5:02 am
  • Hi, I'm trying to create a partitioned table and the partition is not appearing for some reason. Am I doing something wrong, or is this a bug? Below are the commands I'm executing with their output. ...
    Bill Graham
    Jul 29, 2009 at 12:54 am
    Jul 31, 2009 at 11:10 pm
  • Hi, I have a query like select distinct username from (select user as username from page_views pv union all select name as username from users u) ur; But I see that result is not actually distinct. ...
    Rakesh Setty
    Jul 2, 2009 at 7:39 pm
    Jul 8, 2009 at 5:34 pm
  • I'm trying to register a UDF to parse my log file format. Where can I find documentation for creating and registering a UDF? My attempts failed with this error: hive create temporary function ...
    Saurabh Nanda
    Jul 14, 2009 at 8:09 am
    Jul 18, 2009 at 1:09 pm
  • Hi, The DDL page in the Hive Language Manual refers to SerDe, but the page is non-existent. I'm ...
    Saurabh Nanda
    Jul 13, 2009 at 8:26 am
    Jul 14, 2009 at 9:15 am
  • Hi, How do I import log files that are not delimited properly (and don't have timestamps in standard formats) into a tab-separated Hive table? What's the simplest approach for me? Saurabh. -- ...
    Saurabh Nanda
    Jul 16, 2009 at 5:03 am
    Jul 25, 2009 at 7:56 pm
  • Hi all, We've set up an environment for multiple users to use Hive; a separate MySQL database for storing metadata was created for each user, so those users' executions can be completely isolated. But ...
    Min Zhou
    Jul 14, 2009 at 4:45 am
    Jul 14, 2009 at 4:22 pm
  • Is there an UPDATE statement in Hive? If not, are there any plans for adding support for it in the future? This is why I ask: I want to maintain a table which, against each user ID, stores the first ...
    Saurabh Nanda
    Jul 28, 2009 at 10:47 am
    Jul 29, 2009 at 4:21 am
  • Hey all, I am working on a UDF that can be used with prepared statements as a technique to go from Hive- SQL. The usage would be something like from src SELECT ...
    Edward Capriolo
    Jul 14, 2009 at 2:58 pm
    Jul 19, 2009 at 12:03 am
  • I'm trying to load data into a table using the command below. However, I only got a bunch of NULLs in the fields. The data fields are separated by tabs. CREATE TABLE IF NOT EXISTS userweight(source INT, ...
    Chen keven
    Jul 20, 2009 at 10:52 pm
    Jul 27, 2009 at 2:57 am
  • I tried to use the "insert into" command to insert data into a table. However, Hive doesn't recognize it. It gives me an error like: mismatched input 'into' expecting overwrite in insert clause. "insert ...
    Chen keven
    Jul 21, 2009 at 9:33 pm
    Jul 22, 2009 at 5:55 am
  • Hi, I have the following table in Hive Posts(Id, UserId, PostDate, ...) CLUSTERED BY (UserId) SORTED BY (PostDate) INTO 256 BUCKETS; Since the data is hash partitioned based on the 'UserId' column, ...
    Deepak A
    Jul 16, 2009 at 10:25 am
    Jul 17, 2009 at 7:01 am
  • Hi, I am a beginner at Hive's SQL so I am sorry if this question is answered somewhere else. I tried to find the answer in Wiki, but no luck. I have a dataset in which one of the columns is text. I ...
    Andraz Tori
    Jul 27, 2009 at 6:34 pm
    Jul 28, 2009 at 9:57 pm
  • Hive version: r786648 w/ HIVE-487 2nd patch. However, it is working on Hive 0.3. Thanks, Eva. Running the script in this email gives the following errors: Hive history ...
    Eva Tse
    Jul 17, 2009 at 6:06 pm
    Jul 22, 2009 at 1:43 am
  • Hi, I am writing a UDF which can 'select' required rows/records from a Hive table and dump them into my MySQL database over JDBC. However, I am not able to get the mysql-connector jar into the ...
    Abhishek Tiwari
    Jul 16, 2009 at 3:25 pm
    Jul 16, 2009 at 5:38 pm
  • In our hive instance, we have one large fact-type table that joins to several dimension tables on integer keys. I know from reading the Language Manual that in ordering joins it is best to join the ...
    Jason Michael
    Jul 14, 2009 at 4:44 am
    Jul 14, 2009 at 4:02 pm
  • I found that everything related to jpox*.jar has been removed from the trunk (HIVE-445, HIVE-610). I configured my Hive the old way and got an exception: hive show tables; FAILED: Error in metadata: ...
    Min Zhou
    Jul 13, 2009 at 1:36 am
    Jul 13, 2009 at 4:27 am
  • Attempting to join three tables is consistently failing with a ClassCastException using Hive trunk (r792966) and Hadoop 0.18.3. The three tables are defined as follows: create table foo (foo_id int, ...
    David Lerman
    Jul 10, 2009 at 6:28 pm
    Jul 10, 2009 at 11:14 pm
  • I've gotten new equipment to do an upgrade, but I need to keep my Hadoop cluster pushing data. :-) I am getting the following: Job Submission failed with exception 'Input path doesnt exist : ...
    Jul 8, 2009 at 11:25 pm
    Jul 9, 2009 at 5:40 am
  • I have been following some threads on the hadoop mailing list about speeding up MR jobs. I have a few questions I am sure I can find the answer to if I dig into the source code but I thought I could ...
    Edward Capriolo
    Jul 24, 2009 at 4:41 pm
    Jul 31, 2009 at 5:31 am
  • Hi Hive Users. I'm a newbie to hive so this might be a dumb configuration issue. I am trying to run Hive on Amazon EC2 and getting an error when attempting to follow the sample script from the Hive ...
    Ray Duong
    Jul 20, 2009 at 2:25 pm
    Jul 22, 2009 at 11:15 pm
  • When a test fails, we only see a message like the one below: [junit] junit.framework.AssertionFailedError: Client execution results failed with error code = 1 [junit] at ...
    Min Zhou
    Jul 15, 2009 at 6:24 am
    Jul 15, 2009 at 5:47 pm
  • Hi all, How do you export hive tables into oracle/mysql? through oci(Oracle Call Interface), jdbc(oci/thin/MysqlDriver) or odbc? Written such tools in c/c++ through HDFS native library or in ...
    Min Zhou
    Jul 14, 2009 at 10:18 am
    Jul 14, 2009 at 4:53 pm
  • Hi all, The situation: create table srcpart(key string, value string) partitioned by (ds string, hr int); load data local inpath \"/Users/char/Documents/workspace/Hive-Clean/data/files/kv1.txt\" ...
    He Yongqiang
    Jul 9, 2009 at 1:14 pm
    Jul 9, 2009 at 4:57 pm
  • Hi HIVErs, I'm trying to perform the following aggregation query in HIVE, which finds the largest purchase for all combinations of customer and store: SELECT customer, store, max(purchasePrice) FROM ...
    Michael E. Driscoll
    Jul 3, 2009 at 9:35 am
    Jul 7, 2009 at 2:51 am
  • I am currently pulling our 5 minute logs into a Hive table. This results in a partition with ~4,000 tiny files in text format about 4MB per file, per day. I have created a table with an identical ...
    Edward Capriolo
    Jul 6, 2009 at 4:47 pm
    Jul 6, 2009 at 9:32 pm
  • Hi All, So hive was a standalone project, then in 0.19.0 I saw it was in the hadoop package but I never used it and now I see in it is a sub project ...
    Tim robertson
    Jul 6, 2009 at 3:21 pm
    Jul 6, 2009 at 7:18 pm
  • Below is the output of Hive for an INSERT-SELECT from one 'EXTERNAL' table to another. This is running in EC2 and the external tables have partitions registered as path-keys in S3. The final upload ...
    Neal Richter
    Jul 31, 2009 at 1:12 am
    Aug 8, 2009 at 5:39 am
  • Hi, I am writing a SerDe class to be able to query some proprietary format we have from Hive. The format is basically a sequence of records that are maps coded in binary for which we have access ...
    Roberto Congiu
    Jul 8, 2009 at 11:26 pm
    Jul 24, 2009 at 8:38 am
  • I had created a table using Hive and also uploaded some data using 'data load infile'. I also queried that table and everything looked fine at that point. I exited Hive and logged in after some time ...
    Vijay Kumar Adhikari
    Jul 22, 2009 at 1:20 am
    Jul 22, 2009 at 1:45 am
  • I'm trying to create tables programmatically using JDBC. However, I can't see the table I created from the Hive shell. What's worse, when I access the Hive shell from different directories, I see ...
    Chen keven
    Jul 17, 2009 at 5:48 pm
    Jul 20, 2009 at 10:05 pm
  • We set the env variable HIVE_AUX_JARS_PATH to a local path on the master node, which is on EC2 LS , which has a few jar files. When we start hive, it fails to start Hadoop/Hive version : Hadoop 0.20 ...
    Eva Tse
    Jul 8, 2009 at 10:45 pm
    Jul 9, 2009 at 5:34 pm
  • Haritha, Please post your questions on Thanks, -namit From: Haritha Javvadi Sent: Tuesday, July 07, 2009 4:29 PM To: Namit Jain Subject: Regarding Hive Hello, This is ...
    Namit Jain
    Jul 7, 2009 at 11:32 pm
    Jul 8, 2009 at 1:35 am
  • Hi all, I am a newbie to Hive. I'd like to know whether Hive has a Java API like Pig. I did not find any tutorial about the Java API in the wiki. I want to use Hive to generate reports, so using Java API ...
    Zhang jianfeng
    Jul 30, 2009 at 12:33 pm
    Jul 30, 2009 at 3:05 pm
  • I'd like to duplicate a very large, partitioned table in Hive, preserving all data and partitions. What's the most efficient way to do this?
    Jason Michael
    Jul 28, 2009 at 7:48 pm
    Jul 28, 2009 at 9:36 pm
  • Apparently, hive can't access the temporary file or something. My SQLs fail with an IPException, +++++++++ hive select 1 from netflow; Total MapReduce jobs = 1 Number of reduce tasks is set to 0 ...
    Vijay Kumar Adhikari
    Jul 28, 2009 at 2:40 pm
    Jul 28, 2009 at 9:26 pm
  • I wrote a simple program to run a query to pull some information from the database. Only one query is called, with different parameters. It was going fine at the beginning. However, it gives ...
    Keven Chen
    Jul 25, 2009 at 12:56 am
    Jul 25, 2009 at 10:12 pm
  • Hi, I am using EC2 to start a hadoop cluster (cloudera's distribution) and setup hive on it (specifically, the hive client is on the master/jobtracker). I am using the latest version of hive (with ...
    Gaurav Chandalia
    Jul 18, 2009 at 12:48 am
    Jul 21, 2009 at 2:32 am
  • Hi, Does Hive have any function equivalent to Oracle's dense_rank function? Here's what I'm trying to do: For each ...
    Saurabh Nanda
    Jul 20, 2009 at 5:23 am
    Jul 20, 2009 at 10:37 pm
  • Hi, I'm a complete Hive newbie, so please excuse me if this question sounds too dumb. I didn't see a native date/time data type at Is this an ...
    Saurabh Nanda
    Jul 13, 2009 at 8:33 am
    Jul 13, 2009 at 12:37 pm
  • The concatenation is hanging for you. How much data is there after the filter? The number of reducers is a function of the above size - by default 1G/reducer. To increase the number of reducers, set ...
    Namit Jain
    Jul 6, 2009 at 11:09 pm
    Jul 8, 2009 at 5:40 pm
  • Hi all, It seems that Hive goes wrong when storing Unicode strings. Hive uses byte comparison for delimiting fields of a record (see, a parse method). If we use gbk or utf-8 ...
    Min Zhou
    Jul 8, 2009 at 7:00 am
    Jul 8, 2009 at 7:04 am
  • Hi, The issue of nested types addressed recently through JIRA HIVE-603 is very useful. But I have an issue with the schema specification. I have a table page_views with two columns - page_info is a ...
    Rakesh Setty
    Jul 7, 2009 at 6:38 pm
    Jul 7, 2009 at 8:24 pm
  • Hi all, I have several MapReduce jobs that are basically doing counts with group by on tab delimited files. Getting tired of writing the same thing over again for each report I am thinking of trying ...
    Tim robertson
    Jul 3, 2009 at 9:12 pm
    Jul 4, 2009 at 7:17 am
  • Hi, I need to store a list of maps as one of the fields of a record. The documentation suggests that arrays are supported only for primitive data types. Is there any way to do this in Hive? One ...
    Rakesh Setty
    Jul 1, 2009 at 8:21 pm
    Jul 2, 2009 at 3:30 pm
  • Two questions: 1. How would one group by week (instead of date or time)? My first idea was the following: INSERT OVERWRITE TABLE platforms_weekly select platform, count(distinct apikey), count(1), ...
    Andraz Tori
    Jul 31, 2009 at 11:19 am
    Jul 31, 2009 at 1:52 pm
  • Hello, I noticed the following behavior when trying to create an external table over top of an existing, partitioned table: <snip hive create table foo (id1 int, id2 int) partitioned by(p1 int) ROW ...
    Jason Michael
    Jul 29, 2009 at 11:08 pm
    Jul 29, 2009 at 11:14 pm
  • Is there a way to tell Hive to take multiple input files as input for a single map task? Task setup time is so high in Hive/Hadoop that it really degrades performance when there are many smaller ...
    Andraz Tori
    Jul 29, 2009 at 8:42 am
    Jul 29, 2009 at 9:41 am
  • Hi, I'm pretty new to hadoop/hive. I have everything running pretty good on a single server. I have a simple table defined with hive for access logs and was trying to import log files with the LOAD ...
    Jul 27, 2009 at 8:23 pm
    Jul 27, 2009 at 8:39 pm
  • Hello, Below is a Hive SQL example of what I'd say is a logic bug of the unnecessary-extra-processing variety. CREATE TABLE tmp_pageviews( user_id INT, count INT) PARTITIONED BY (timeslot STRING) ROW ...
    Neal Richter
    Jul 23, 2009 at 3:53 pm
    Jul 23, 2009 at 5:43 pm
Group Overview
group: user @
categories: hive, hadoop

43 users for July 2009

  • Saurabh Nanda: 46 posts
  • Zheng Shao: 45 posts
  • Namit Jain: 24 posts
  • Edward Capriolo: 21 posts
  • Prasad Chakka: 21 posts
  • Ashish Thusoo: 18 posts
  • Min Zhou: 14 posts
  • Bill Graham: 12 posts
  • He Yongqiang: 10 posts
  • Rakesh Setty: 10 posts
  • Keven Chen: 9 posts
  • Frederick Oko: 7 posts
  • David Lerman: 6 posts
  • Eva Tse: 6 posts
  • Tim robertson: 6 posts
  • Amr Awadallah: 5 posts
  • Raghu Murthy: 5 posts
  • Abhishek Tiwari: 4 posts
  • Andraz Tori: 4 posts
  • Deepak A: 4 posts