
69 discussions - 284 posts

  • I do think it is a great idea that hive/pig/ and map reduce share a meta store. However I am not sure I agree with the approach. IMHO Howl should be a hive sub project. "The initial release of Howl ...
    Edward Capriolo
    Feb 2, 2011 at 11:12 pm
    Feb 4, 2011 at 2:10 am
  • Hi, I was wondering if hive supports Sequence File format. If yes, could someone point me to some documentation about how to use Seq files in hive. Thanks, -JJ
    Mapred Learn
    Feb 17, 2011 at 8:00 pm
    Jun 28, 2011 at 5:30 pm
  • Hi Experts I'm using hive for a few projects and i found it a great tool in hadoop to process end to end structured data. Unfortunately I'm facing a few challenges out here as follows Availability of ...
    Bejoy Ks
    Feb 21, 2011 at 2:54 pm
    Feb 24, 2011 at 2:43 pm
  • Hi! I'm having some trouble running queries from a java client against a remote Thrift Hive server. It's all set up and quicker queries do run through fine. But queries which run longer than about 10 ...
    Ayush Gupta
    Feb 25, 2011 at 2:53 am
    Feb 25, 2011 at 6:33 pm
  • Earlier I had hive running on single node hadoop which was working fine. Now I made it 2 node hadoop cluster. When I run hive from cli I am getting following error java.lang.RuntimeException: Error ...
    Amlan Mandal
    Feb 21, 2011 at 6:55 am
    Feb 22, 2011 at 6:26 am
  • Hey, I started using Toad for querying hive, looks promising http://nosql.mypopescu.com/post/2913202510/hive-and-hbase-in-toad-for-cloud-demo http://toadforcloud.com/index.jspa enjoy Thanks, Guy
    Guy Doulberg
    Feb 15, 2011 at 4:51 pm
    Feb 28, 2011 at 4:59 am
  • Hello, So I have table of item views with item_sid, ip_number, session_id I know it will not be that exact, but I want to get unique views per item, and i will accept ip_number, session_id tuple as ...
    Cam Bazz
    Feb 21, 2011 at 11:08 am
    Feb 23, 2011 at 9:05 am
  • Hello, What kind of strategy must i follow, in order to periodically run certain things. For example, each hour, i want to look up log files from certain dir, and for new files, i need to run: load ...
    Cam Bazz
    Feb 9, 2011 at 12:29 am
    Feb 10, 2011 at 9:38 am
  • When we try to join two large tables some of the reducers stop with an OutOfMemory exception. Error: java.lang.OutOfMemoryError: Java heap space at ...
    Bennie Schut
    Feb 18, 2011 at 9:54 am
    Feb 24, 2011 at 6:13 am
  • Hello, I set up my one node pseudo distributed system, left with a cronjob, copying data from a remote server and loading them to hadoop, and doing some calculations per hour. It stopped working ...
    Cam Bazz
    Feb 12, 2011 at 12:38 am
    Feb 12, 2011 at 9:51 pm
  • Hi All, I am a hive newbie. LOAD DATA *LOCAL* INPATH 'filepath' [OVERWRITE] INTO TABLE tablename When I use LOCAL keyword does hive create a hdfs file for it? I used above statement to put data into ...
    Amlan Mandal
    Feb 1, 2011 at 11:15 am
    Feb 2, 2011 at 5:29 pm
  • Hi, I am trying to perform union of two tables which are having identical schemas and distinct data.There are two tables 'oldtable' and 'newtable'. The old table contains the information of old users ...
    Sangeetha s
    Feb 18, 2011 at 7:12 am
    Feb 20, 2011 at 4:00 pm
  • Hi all, I am loading data into hive tables by connecting to hiveserver through thrift api using "load data local inpath ... " query . Hive server is running as a background process for days . After ...
    Vaibhav negi
    Feb 11, 2011 at 8:05 am
    Feb 15, 2011 at 1:06 pm
  • Hello, I have the following table definition (simplified to help in debugging): create external table pvs ( time INT, server STRING, thread_id STRING ) partitioned by ( dt string ) row format ...
    Feb 13, 2011 at 7:09 am
    Feb 13, 2011 at 5:12 pm
  • Hello, In my Hive cluster, I have setup the mapred.reduce.tasks to be -1 i.e. I am allowing HIVE to figure out the # of reducers that it would need from the data. When I run a query, it determines ...
    Viral Bajaria
    Feb 10, 2011 at 9:58 pm
    Feb 11, 2011 at 1:08 am
  • Dear All I need your opinions about the problem I encountered during the data migration process. The file, which includes "|" pipe, is recognized as a Delimiter, and then an error occurs. What could ...
    Feb 14, 2011 at 11:41 am
    Feb 17, 2011 at 1:53 pm
  • Hello, Is it possible to delete rows belonging to a partition? or is it undeletable like a table's data? best regards, -c.b.
    Cam Bazz
    Feb 12, 2011 at 10:17 pm
    Feb 12, 2011 at 10:47 pm
  • Hello, How can I do some process for each partition in some other table. for example lets say table A has partitions 1,2,3 I want to be able to say for each partition in A do { select * from A where ...
    Cam Bazz
    Feb 9, 2011 at 4:57 am
    Feb 10, 2011 at 1:09 am
  • Hi Is there any function in hive with which we can add hours/minutes to a given stamp. Say I have a timestamp oriented column 'Arrival_Time', to do some database oriented calculations i have to add 4 ...
    Bejoy Ks
    Feb 23, 2011 at 11:49 am
    Feb 24, 2011 at 3:22 pm
  • Hello, I am running hive with a hadoop mini cluster and want to switch to mysql as a metastore because I need multiple connections at same time (not sure if this will work at all). I tried the local ...
    Feb 23, 2011 at 12:29 pm
    Feb 24, 2011 at 9:03 am
  • Hello, I have three tables, one that counts hits, the other unique visits, and the other clicks on that page: The query below will fail to produce correct results: (number of uniques is wrong, always ...
    Cam Bazz
    Feb 23, 2011 at 11:17 am
    Feb 24, 2011 at 7:38 am
  • I would like to implement the moving average as a UDF (instead of a streaming reducer). Here is what I am thinking. Please let me know if I am missing something here: SELECT product, date, ...
    Igor Tatarinov
    Feb 22, 2011 at 6:45 am
    Feb 22, 2011 at 10:59 pm
  • Hello, I was wondering if anyone managed to unit test Hive scripts and share his/her experience? My first thought was to prepare sample data, run hive scripts in order to generate output and then ...
    Radek Maciaszek
    Feb 18, 2011 at 11:59 am
    Feb 18, 2011 at 9:24 pm
  • Hello, I sometimes need to delete everything in hdfs and recreate the tables. The question is: how do I clear everything in the hdfs and hive? I delete everything in /tmp, hadoop/logs and any ...
    Cam Bazz
    Feb 11, 2011 at 10:52 pm
    Feb 12, 2011 at 12:57 am
  • I have been using the Bulk Load example here: http://wiki.apache.org/hadoop/Hive/HBaseBulkLoad I am having an issue with a bulk load of 1 million records into HBase on a cluster of 6 using Hive. Hive ...
    Brian Salazar
    Feb 4, 2011 at 7:36 pm
    Feb 4, 2011 at 8:42 pm
  • Hi, I have a Hive query that has a statement like this "(sum(itemcount) / count(item))". I want to specify only two digits of precision (i.e. 53.55). The result is stored inside of a string, not its ...
    Aurora Skarra-Gallagher
    Feb 24, 2011 at 7:32 pm
    Mar 15, 2011 at 11:00 am
  • Hi, I wrote a simple UDAF for Hive 0.6 and I had to include null checks in terminatePartial even though the object should never be null if init is always called before terminatePartial. For instance, ...
    Aurora Skarra-Gallagher
    Feb 15, 2011 at 4:55 pm
    Mar 15, 2011 at 9:35 am
  • Hi, Reading the wiki on dynamic partition, there is best practice example to solve the issue of creating too many dynamic partitions on a specific node. However, the query does not work. ...
    Wil -
    Feb 28, 2011 at 11:59 pm
    Mar 1, 2011 at 11:28 pm
  • Vivek Krishna
    Feb 28, 2011 at 1:42 pm
    Feb 28, 2011 at 8:57 pm
  • **************************************************************************** *********** This e-mail and attachments contain confidential information from HUAWEI, which is intended only for the ...
    Feb 24, 2011 at 11:07 am
    Feb 25, 2011 at 4:45 am
  • Hi all, I am using UDFRowSequence as follows: CREATE TEMPORARY FUNCTION rowSequence AS 'org.apache.hadoop.hive.contrib.udf.UDFRowSequence'; mapred.reduce.tasks=1; CREATE TABLE temp_tc1_test as SELECT ...
    Tim Robertson
    Feb 21, 2011 at 4:48 am
    Feb 21, 2011 at 8:49 pm
  • Thanks for the reply.. (I'm new to Hive). I can't find the driver class. Do you know which files I should be looking for? Regards Stuart by the sound of the error ... it sounds like you don't have ...
    Stuart Scott
    Feb 17, 2011 at 6:36 am
    Feb 17, 2011 at 6:56 am
  • Does Hive have any UDF function to calculate median for a given column? -anurag
    Anurag Phadke
    Feb 10, 2011 at 6:16 pm
    Feb 10, 2011 at 6:21 pm
  • I have been using the Bulk Load example here: http://wiki.apache.org/hadoop/Hive/HBaseBulkLoad I am having an issue with a bulk load of 1 million records into HBase on a cluster of 6 using Hive. Hive ...
    Brian Salazar
    Feb 4, 2011 at 10:25 pm
    Feb 5, 2011 at 12:09 am
  • (Please pardon my ignorance as I am Hive newbie) When I do Hive QL select * from app_log where partner='abc'; and select * from app_log where partner='ABC' I get different result. That means by ...
    Amlan Mandal
    Feb 4, 2011 at 10:39 am
    Feb 4, 2011 at 11:16 am
  • Hi, The simplest of hive queries seem to be consuming 100% cpu. This is with a small 4-node cluster. The machines are pretty beefy (16 cores per machine, tons of RAM, 16 M+R maximum tasks configured, ...
    Feb 3, 2011 at 8:49 pm
    Feb 3, 2011 at 11:50 pm
  • Hi, I'm trying to get an idea of how many people plan on running Hive 0.7.0 on top of Hadoop 0.20.0 (as opposed to 0.20.1 or 0.20.2), and are in a position where they can't upgrade to one of more ...
    Carl Steinbach
    Feb 1, 2011 at 1:14 am
    Feb 3, 2011 at 6:23 pm
  • Hi, What does setting the "serialization.last.column.takes.rest" SERDEPROPERTIES do for the LazySimpleSerDe? ...
    Aurora Skarra-Gallagher
    Feb 16, 2011 at 7:29 pm
    Mar 15, 2011 at 10:59 am
  • I am trying to query against a partitioned Hive table where the input format of different partitions may be different. I'd like to change the partition file format, and reading the language manual at ...
    Charlie w
    Feb 24, 2011 at 3:27 pm
    Feb 24, 2011 at 5:13 pm
  • Hi, Im new to this forum as well as to Hive. In RDBMS we talk of multiple index types such as Btree, BitMap and so. When an index is created in HIVE using CREATE INDEX, what sort of index is created? ...
    Tony martin
    Feb 15, 2011 at 9:24 pm
    Feb 20, 2011 at 11:06 pm
  • Hi, I have a question regarding the existing date functions in Hive ( http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Date_Functions) The unix_timestamp() functions return a bigint while the ...
    Viral Bajaria
    Feb 18, 2011 at 8:48 pm
    Feb 18, 2011 at 9:09 pm
  • Hello, When we do a left outer join, and the right table does not have a row, it will return NULLs for those values. Is there any way to turn those NULLs into 0's? Since it is a counting operation, if ...
    Cam Bazz
    Feb 18, 2011 at 8:02 am
    Feb 18, 2011 at 8:03 pm
  • Is there a way to use Hibernate to work with Hive ONLY for select queries? Amlan
    Amlan Mandal
    Feb 17, 2011 at 6:18 am
    Feb 17, 2011 at 7:52 am
  • An update on this. I've finished doing changes in Oozie Hive-action to work with Hive 0.7. As mentioned before the problem is that not all needed Hive & dependent JARs are available in public Maven ...
    Alejandro Abdelnur
    Feb 17, 2011 at 1:12 am
    Feb 17, 2011 at 1:14 am
  • Hi, Does anyone know how to get a Windows client to Connect to Hive successfully? I've tried the code below: Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver"); Connection con = ...
    Stuart Scott
    Feb 16, 2011 at 10:04 pm
    Feb 17, 2011 at 12:23 am
  • Hi, I'm trying this use case: do a simple select from an existing table and pass the results through a reduce script to do some analysis. The table has web logs so the select uses a pseudo user ID as ...
    Feb 16, 2011 at 10:07 pm
    Feb 16, 2011 at 10:27 pm
  • Hello, We have thousands of tables in a Hive database. Many tables have billions of records and multiple TB of data in them. We are looking for an efficient mechanism to achieve row level updates on ...
    Sheetal Dolas
    Feb 16, 2011 at 3:17 am
    Feb 16, 2011 at 7:27 pm
  • Hello, So all my statistics are finally being calculated, results being processed etc. I have a 1 node cluster, mainly taking 3 aggregate logs from my apache logs. How far will this setup go? I have ...
    Cam Bazz
    Feb 14, 2011 at 12:58 am
    Feb 14, 2011 at 4:33 pm
  • Hi all, Sorry if I am missing something obvious but is there an inverse of an explode? E.g. given t1 ID Name 1 Tim 2 Tim 3 Tom 4 Frank 5 Tim Can you create t2: Name ID Tim 1,2,5 Tom 3 Frank 4 In ...
    Tim Robertson
    Feb 11, 2011 at 4:14 am
    Feb 11, 2011 at 5:03 am
  • Hello, I am making a query such that: insert overwrite table selection_hourly_clicks partition (date_hour = PARTNAME) select sel_sid, count(*) cc from (select split(parse_url(iv.referrer_url,'PATH'), ...
    Cam Bazz
    Feb 10, 2011 at 6:04 am
    Feb 10, 2011 at 1:06 pm
Group Overview
group: user @
categories: hive, hadoop

84 users for February 2011

  • Cam Bazz: 27 posts
  • Ajo Fod: 20 posts
  • Edward Capriolo: 17 posts
  • Amlan Mandal: 16 posts
  • John Sichi: 14 posts
  • Viral Bajaria: 13 posts
  • Carl Steinbach: 7 posts
  • Bejoy Ks: 6 posts
  • Christopher, Pat: 6 posts
  • Jay Ramadorai: 6 posts
  • Ayush Gupta: 5 posts
  • Brian Salazar: 5 posts
  • Sangeetha s: 5 posts
  • Alan Gates: 4 posts
  • Bennie Schut: 4 posts
  • Bharath vissapragada: 4 posts
  • Guy Doulberg: 4 posts
  • Igor Tatarinov: 4 posts
  • Mapred Learn: 4 posts
  • Thiruvel Thirumoolan: 4 posts