70 discussions - 366 posts

  • What's the best way to compute median and other percentiles using Hive 0.40? I've run across http://issues.apache.org/jira/browse/HIVE-259 but there doesn't seem to be any planned implementation yet. ...
    Bryan Talbot
    Feb 5, 2010 at 5:09 am
    Mar 20, 2014 at 2:49 pm
  • Hi, Installation is giving an error as master/hadoop/hadoop-0.20.1/build.xml:895: 'java5.home' is not defined. Forrest requires Java 5. Please pass -Djava5.home=<base of Java 5 distribution> to Ant on ...
    Vidyasagar Venkata Nallapati
    Feb 11, 2010 at 7:20 am
    Feb 15, 2010 at 6:55 pm
  • I started HiveServer for the first time using instructions from the following page: http://wiki.apache.org/hadoop/Hive/HiveServer 1) bin/hive --service hiveserver 2) ant test ...
    Something Something
    Feb 21, 2010 at 12:37 am
    Feb 23, 2010 at 6:22 pm
  • Anyone have a working UDF jar for GeoIP lookups using MaxMind's data? I saw one being discussed a few months ago, but haven't seen it in any contrib branches. Also, I tried building a simple one ...
    Adam O'Donnell
    Feb 15, 2010 at 3:43 am
    Feb 18, 2010 at 4:45 pm
  • Here is my std-error: hive> insert overwrite local directory '/tmp/mystuff' select transform(*) using 'my.py' FROM myhivetable; Total MapReduce jobs = 1 Number of reduce tasks is set to 0 since ...
    Prasenjit mukherjee
    Feb 17, 2010 at 10:49 am
    Apr 14, 2010 at 1:32 am
  • I have Hive 4.1-rc2. My query runs in Time taken: 312.956 seconds using the map/reduce join. I was interested in using mapjoin, I get an OOM error. hive java.lang.OutOfMemoryError: GC overhead limit ...
    Edward Capriolo
    Feb 18, 2010 at 10:45 pm
    Feb 26, 2010 at 4:58 am
  • Hello, I've seen issues similar to this one come up once or twice before, but I haven't ever seen a solution to the problem that I'm having. I was following the Compressed Storage page on the Hive ...
    Brent Miller
    Feb 16, 2010 at 9:44 pm
    Feb 18, 2010 at 7:36 am
  • Hi all, I read the Hive tutorial, and it says that "no two aggregations can have different DISTINCT columns". Could anyone tell me what the reason is? Will the following DISTINCT be translated ...
    Jeff Zhang
    Feb 25, 2010 at 9:01 am
    Mar 30, 2010 at 9:23 am
  • I'm trying to run Hive, but I'm getting the message Missing Hive Execution Jar: hive/lib/hive-exec-*.jar. I followed all the download and build instructions on the Wiki page.
    Aryeh Berkowitz
    Feb 23, 2010 at 2:29 pm
    Feb 24, 2010 at 6:33 pm
  • Hi all, We are going to hold the second Hive User Group Meeting at 7PM on 3/18/2010 Thursday. The agenda will be: * Hive Tutorial: 20 min * Hive User Case Study: 20 min * New Features and API: 25 min ...
    Zheng Shao
    Feb 26, 2010 at 9:56 pm
    Mar 15, 2010 at 9:00 pm
  • Hi folks, We have released Hive 0.5.0. You can find it from the download page in 24 hours (still waiting to be mirrored) http://hadoop.apache.org/hive/releases.html#Download -- Yours, Zheng
    Zheng Shao
    Feb 24, 2010 at 8:34 am
    Feb 24, 2010 at 6:35 pm
  • We just upgraded to hadoop 0.20 (from hadoop 0.18), impressively our same hive package kept working against the new hadoop setup. Since the upgrade every hive starts with only 1 map task though. Even ...
    Tim Sell
    Feb 23, 2010 at 7:00 pm
    Feb 24, 2010 at 1:01 pm
  • Hi, The size of my Gzipped weblog files is about 35MB. However, upon enabling block compression, and inserting the logs into another Hive table (sequencefile), the file size bloats up to about 233MB. ...
    Saurabh Nanda
    Feb 1, 2010 at 5:03 am
    Feb 19, 2010 at 6:09 pm
  • All, Could anyone tell me on how to generate a row id for a new record in Hive? Many thanks. weiwei
    Weiwei Hsieh
    Feb 25, 2010 at 2:48 am
    Mar 1, 2010 at 8:16 pm
  • Hi Hive users, I've got a somewhat convoluted query which I'm wondering how I would translate it to hive... It is similar to the first FAQ example here: http://www.techonthenet.com/sql/max.php So ...
    Tom Nichols
    Feb 23, 2010 at 8:47 pm
    Feb 24, 2010 at 2:19 pm
  • Wondering if there is a pre-built working AMI containing Hive+Hadoop. Cloudera's AMI installs the packages, and hence need some additional startup scripts, which I am trying to avoid.
    Prasenjit mukherjee
    Feb 15, 2010 at 6:18 pm
    Feb 16, 2010 at 4:40 am
  • Hi, I am writing a UDAF which takes in 4 parameters. I have 2 cases - one where all the parameters are ints, and a second where the last parameter is a double. I wrote two evaluators for this, with ...
    Sonal Goyal
    Feb 3, 2010 at 11:24 am
    Feb 4, 2010 at 7:44 am
  • We created a table without the 'EXTERNAL' qualifier but did specify a location for the warehouse. We would like to modify this to be an external table. We tried to drop the table, but it does delete ...
    Eva Tse
    Feb 18, 2010 at 6:23 pm
    Feb 19, 2010 at 9:55 pm
  • Hi, I've used Hive map reduce to process some log files. I found that Hive outputs a message like "num1 rows loaded to table_name" on every run. But "num1" is not equal to "select count(1) from ...
    Wd
    Feb 9, 2010 at 3:06 am
    Feb 9, 2010 at 9:08 am
  • Hey, I am unable to locate any information on the Hive wiki about the various join strategies and optimizations available in Hive (similar to Pig's http://wiki.apache.org/pig/JoinFramework). Would ...
    Jeff Hammerbacher
    Feb 18, 2010 at 9:20 pm
    Sep 28, 2010 at 5:26 am
  • When I try to do a SELECT DISTINCT, I get "No such file" errors. hive> SELECT DISTINCT URL FROM URLS; Total MapReduce jobs = 1 Launching Job 1 out of 1 java.io.IOException: No such file or directory ...
    Aryeh Berkowitz
    Feb 26, 2010 at 1:34 pm
    Mar 25, 2010 at 5:40 pm
  • I am trying to get started with Hive by following instructions on the 'Getting Started' page: http://wiki.apache.org/hadoop/Hive/GettingStarted Here's the problem I am facing: 1) Ran the following ...
    Something Something
    Feb 20, 2010 at 8:00 pm
    Feb 21, 2010 at 1:07 am
  • Hi, When starting Hive I am getting an error even after including it in the class path; attached is the hadoop-env I am using. Exception in thread "main" java.lang.NoClassDefFoundError: ...
    Vidyasagar Venkata Nallapati
    Feb 17, 2010 at 10:22 am
    Feb 19, 2010 at 5:34 am
  • The Hive wiki states the following for the CREATE TABLE ... DDL: "AS select_statement] (Note: this feature is only available on the latest trunk or versions higher than 0.4.0.)" I'm testing this ...
    E. Sammer
    Feb 13, 2010 at 12:41 am
    Feb 13, 2010 at 1:00 am
  • Hey guys, I have another Hadoop cluster that has Hive installed with its own metastore and all. I would like to move/copy/export data from a bunch of Hive tables from a different Hadoop cluster into ...
    Ryan LeCompte
    Feb 11, 2010 at 5:01 pm
    Feb 12, 2010 at 4:15 pm
  • Hi, While building the Hive I am getting an error, please help me with the changes I need to do to build it. I was getting the same with hadoop 20.1. ivy-retrieve-hadoop-source: [ivy:retrieve] :: Ivy ...
    Vidyasagar Venkata Nallapati
    Feb 3, 2010 at 5:39 am
    Feb 4, 2010 at 6:00 am
  • Hi, I am writing a UDAF which returns the top x results per key. Lets say my input is key attribute count 1 1 6 1 2 5 1 3 4 2 1 8 2 2 4 2 3 1 I want the top 2 results per key. Which will be: key ...
    Sonal Goyal
    Feb 1, 2010 at 5:39 am
    Feb 4, 2010 at 5:53 am
  • If I have a line like the following: <2010-02-09 18:00:16.123 UTC :[48394803]:<MDS-CS_MDS1 :<DEBUG :<LAYER = EP2P, EVENT = Receiving, DEVICEPIN = 2032acb14, GMETAG = -1966209606, TYPE = 22, METHOD = ...
    Daniel Joanes
    Feb 25, 2010 at 4:45 pm
    Feb 25, 2010 at 7:15 pm
  • Hey guys, It looks like Hive 0.5.0 is not working with our existing table definitions that have a column named "first". When we try to reference this column or create a new table with this column, ...
    Ryan LeCompte
    Feb 25, 2010 at 12:22 pm
    Feb 25, 2010 at 3:14 pm
  • Guys: How do you go about distributing additional files that may be needed by your reduce scripts? For example, I need to distribute a GeoIP database with my reduce script to do some lookups. Thanks! ...
    Adam O'Donnell
    Feb 12, 2010 at 6:24 am
    Feb 13, 2010 at 12:58 am
  • I have a tab separated files I have loaded it with "load data inpath" then I do a SET hive.exec.compress.output=true; SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec; SET ...
    Bennie Schut
    Feb 5, 2010 at 11:23 am
    Feb 5, 2010 at 10:38 pm
  • Hi, Can anyone give me an example of the optimization "Converting multiple joins into a single multi-way join", i.e., reducing the number of map-reduce jobs? I read this from ...
    Bharath v
    Feb 3, 2010 at 10:07 am
    Feb 4, 2010 at 4:30 am
  • Is there a way to turn this and other map reduce output off? Hive history file=/tmp/hive/hive_job_log_hive_201002262012_705729056.txt hive Instead just this: hive select count(1) from test; OK 232249 ...
    Tom kersnick
    Feb 27, 2010 at 1:15 am
    Feb 27, 2010 at 4:01 am
  • Hi, While starting Hive I am still getting an error; attached are the hadoop-env and hive-site I am using. phoenix@ph1:/master/hadoop/hive/build/dist$ bin/hive Exception in thread "main" ...
    Vidyasagar Venkata Nallapati
    Feb 22, 2010 at 7:13 am
    Feb 22, 2010 at 7:22 am
  • Hi, I just made a release candidate at https://svn.apache.org/repos/asf/hadoop/hive/tags/release-0.5.0-rc0 The tarballs are at: http://people.apache.org/~zshao/hive-0.4.1-candidate-3/ Please vote. -- ...
    Zheng Shao
    Feb 20, 2010 at 2:49 am
    Feb 20, 2010 at 8:18 am
  • I couldn't find anything on the wiki so thought I would try here. Does Hive have an IN() operator similar to in MySQL? If not then is there an alternative way of testing for inclusion? Thanks, Andy.
    Andy Kent
    Feb 19, 2010 at 4:59 pm
    Feb 20, 2010 at 4:27 am
  • I'm currently running a hive build from trunk, revision number 911889. I've built a UDTF called map_explode which just emits the key and value of each entry in a map as a row in the result table. The ...
    Jason Michael
    Feb 19, 2010 at 10:23 pm
    Feb 19, 2010 at 10:58 pm
  • Hello, I am a newbie to Hive. My question is, can I run the HiveQL scripts from within a Java or Python program? I believe at this time I cannot, correct? Please let me know. Thanks.
    Something Something
    Feb 19, 2010 at 7:33 am
    Feb 19, 2010 at 7:48 am
  • Hey All, I want to use a JSON serde to read some data in Hive, and was wondering if there is an open source one available somewhere? I know AWS has one available at: ...
    Peter Sankauskas
    Feb 12, 2010 at 10:05 pm
    Feb 15, 2010 at 10:53 pm
  • Hey guys, I wrote a SerDe to support lwes (http://lwes.org) using BinarySortableSerDe as a model. The code is very similar, and I serialize an lwes event to a BytesWritable, and deserialize from it. ...
    Roberto Congiu
    Feb 12, 2010 at 9:05 pm
    Feb 12, 2010 at 10:53 pm
  • I have a bit of an edge case on using lzo which I think might be related to HIVE-524. When running a query like this: select distinct login_cldr_id as cldr_id from chatsessions_load; I get a ...
    Bennie Schut
    Feb 9, 2010 at 2:04 pm
    Feb 10, 2010 at 7:46 am
  • I would like to run a Hive release; unfortunately I have an older 18.3 hadoop that was built without version information. I am going to check out the 4.1 release and hack at the shell scripts and the ...
    Edward Capriolo
    Feb 9, 2010 at 7:13 pm
    Feb 9, 2010 at 10:51 pm
  • Since the most active HIVE users are (should be) in this list, I wanted to ask your opinion about using the bare hadoop/hive distribution vs. Cloudera's distribution. What are the pros and cons of ...
    Massoud Mazar
    Feb 9, 2010 at 7:41 pm
    Feb 9, 2010 at 8:41 pm
  • Guys: I have a series of queries that looks for the most recent value associated with a given key. For example, consider the following query: select max(timestamp), hash, most_recent(value) group by ...
    Adam J. O'Donnell
    Feb 5, 2010 at 1:04 am
    Feb 5, 2010 at 1:30 am
  • OK 55504011 Time taken: 290.216 seconds hive select count(1) from pageviews; select count(1) from files f; Ended Job = job_200909171715_20347 OK 10164516 Time taken: 29.946 seconds select count(1) ...
    Edward Capriolo
    Feb 4, 2010 at 4:41 pm
    Feb 4, 2010 at 6:24 pm
  • Hey guys, Is it possible to concurrently load data into Hive tables (same table, different partition)? I'd like to concurrently execute the LOAD DATA command by two separate processes. Is Hive ...
    Ryan LeCompte
    Feb 4, 2010 at 2:52 pm
    Feb 4, 2010 at 5:44 pm
  • I've been using the hive jdbc driver more and more and was missing some functionality which I added HiveDatabaseMetaData.getTables Using "show tables" to get the info from hive. ...
    Bennie Schut
    Feb 3, 2010 at 8:31 am
    Feb 3, 2010 at 5:54 pm
  • Hi all, We're working on the problem in HIVE-984 which a number of people have been hitting, and with luck we'll have a resolution within the next week or so. To address it, we're setting up ...
    John Sichi
    Feb 18, 2010 at 7:27 pm
    Feb 23, 2010 at 7:54 pm
  • Hello, I'm trying to get the hive HWI service up and running. I'm on r911664 and when I try to start the HWI service I get: hive@hadoop-master:~$ hive --service hwi ls: cannot access ...
    Brent Miller
    Feb 19, 2010 at 11:11 pm
    Feb 20, 2010 at 4:57 pm
  • Hi, Is there any page/document that describes the methods/techniques used by Hive to arrive at the optimum number of map tasks & optimum number of reduce tasks? I'm running a 3-node Amazon EMR ...
    Saurabh Nanda
    Feb 19, 2010 at 8:52 pm
    Feb 19, 2010 at 9:04 pm
Group Overview
group: user @
categories: hive, hadoop
discussions: 70
posts: 366
users: 61
website: hive.apache.org

61 users for February 2010

  • Zheng Shao: 67 posts
  • Carl Steinbach: 32 posts
  • Edward Capriolo: 28 posts
  • Vidyasagar Venkata Nallapati: 16 posts
  • Bennie Schut: 13 posts
  • Adam O'Donnell: 12 posts
  • Sonal Goyal: 12 posts
  • Ning Zhang: 11 posts
  • Something Something: 11 posts
  • Ryan LeCompte: 10 posts
  • Prasenjit mukherjee: 9 posts
  • Aryeh Berkowitz: 8 posts
  • Mafish Liu: 7 posts
  • Namit Jain: 7 posts
  • Andy Kent: 6 posts
  • baburaj.S: 6 posts
  • Prasenjit mukherjee: 6 posts
  • Saurabh Nanda: 6 posts
  • Tim Sell: 6 posts
  • Yongqiang He: 5 posts