
96 discussions - 446 posts

  • Hi all, A lot of the ideas I have for incorporating Hadoop into internal projects revolve around distributing long-running tasks over multiple machines. I've been able to get a quick prototype up in ...
    Kirk True
    Dec 21, 2007 at 2:47 am
    Dec 26, 2007 at 10:40 pm
  • If we decide to implement our own file-backed HashMap, I'd be willing to contribute it back to the project. We are rolling up unique counts per event ID. So we use event ID as a key, and want to count ...
    C G
    Dec 20, 2007 at 2:31 pm
    Dec 26, 2007 at 8:23 pm
  • Is there a way to interact with HBase via TCP/socket connection directly instead of just using the REST api? Thanks
    Thiago Jackiw
    Dec 7, 2007 at 4:25 am
    Dec 13, 2007 at 3:42 am
  • I'm not sure of this, but why does the master server use up so much memory? I've been running a script that has been inserting data into a table for a little over 24 hours, and the master crashed because of ...
    Dec 20, 2007 at 9:20 pm
    Dec 30, 2007 at 6:57 pm
  • I'm interested in driving HBase from PHP or Ruby. I have seen at least two approaches mentioned and wonder if there are others out there. * HBase shell. * REST interface. Can anyone let me know if ...
    Pat Ferrel
    Dec 19, 2007 at 11:11 pm
    Dec 20, 2007 at 4:54 am
  • Hi, I looked at different file types, input and output formats, but got quite confused, and am not sure how to connect the pipe from one format to another. Here is what I would like to do: 1. Pass in ...
    Jim the Standing Bear
    Dec 18, 2007 at 1:41 am
    Dec 18, 2007 at 5:57 am
  • I use nutch-0.9 and hadoop-0.12.2, and when I run the command "bin/nutch crawl urls -dir crawled -depth 3" I get this error: - crawl started in: crawled - rootUrlDir = input - threads = 10 - depth = 3 - Injector: ...
    Dec 14, 2007 at 12:44 am
    Jan 7, 2008 at 2:13 am
  • Hey guys, triggered by a post on the mailing list I also checked our 0.14 cluster, and although we really thought we did the finalize after the upgrade, we also have a big "previous" dir there. A couple ...
    Torsten Curdt
    Dec 12, 2007 at 10:42 am
    Dec 16, 2007 at 1:58 pm
  • Hi, I run Hadoop on a BSD4 cluster, and each map task's input is a gzip file (about 10MB). Some tasks finished, but many of them failed due to heap out of memory. I got the following syslogs: 2007-12-06 ...
    Rui Shi
    Dec 6, 2007 at 8:31 pm
    Dec 11, 2007 at 5:07 am
  • I am trying to install/configure hadoop on a cluster with several computers. I followed exactly the instructions in the hadoop website for configuring multiple slaves, and when I run start-all.sh I ...
    Dec 5, 2007 at 5:00 pm
    Nov 13, 2008 at 8:29 pm
  • Hi all, I'm trying to implement a clustering algorithm on Hadoop. Among other things, there are a lot of matrix multiplications. LINA (http://wiki.apache.org/lucene-hadoop/Lina) is probably going to ...
    Milan Simonovic
    Dec 28, 2007 at 8:51 pm
    Dec 31, 2007 at 6:14 am
  • We have two flavors of jobs we run through Hadoop. The first flavor is a simple merge sort, where there is very little happening in the mapper or the reducer. The second flavor is very compute ...
    Jason Venner
    Dec 25, 2007 at 9:53 pm
    Dec 29, 2007 at 3:00 am
  • Hello! We would like to use Hadoop to index a lot of documents, and we would like to have this index in Lucene and utilize Lucene's search engine power for searching. At this point I am confused ...
    Eugeny N Dzhurinsky
    Dec 13, 2007 at 9:18 am
    Dec 14, 2007 at 10:03 am
  • Hi All, Kindly clarify our doubts below: 1. Are separate machines/nodes needed for the NameNode, JobTracker, and slave nodes? 2. If Hadoop is configured in a multi-node cluster (with one machine as ...
    Dec 20, 2007 at 7:17 am
    Dec 21, 2007 at 12:05 am
  • On one of my tables, I get this when trying to get results from a scanner. I call for the scanner location and get it fine, but on the first call I get the error below: Error 500 The character 0x1b is ...
    Dec 30, 2007 at 11:57 pm
    Jan 1, 2008 at 8:17 am
  • Is there a way to query the meta to tell which server has which region? I tried selecting from .META. and -ROOT- in the shell but had no luck. Thanks Billy
    Dec 3, 2007 at 5:18 am
    Dec 17, 2007 at 9:15 pm
  • Hi, Did anyone port and run Hadoop on FreeBSD clusters? Thanks, Rui ...
    Rui Shi
    Dec 1, 2007 at 6:39 am
    Dec 4, 2007 at 4:09 am
  • HBase splits at the row-key level, so what happens if I have a row that's larger than the max region size set in the conf? I have one row that has been split into many smaller regions; I just ...
    Dec 19, 2007 at 9:35 am
    Dec 27, 2007 at 7:25 pm
  • We have relatively heavyweight objects that we pass around the cluster for our map/reduce tasks. We have noticed that when we are using the multi-threaded mapper, we don't get very high utilization ...
    Jason Venner
    Dec 12, 2007 at 9:20 pm
    Dec 13, 2007 at 2:43 am
  • We have a small Apple cluster running Hadoop. But another option we have, built into the Apple Server OS, is to use their Xgrid, which they promote for "supercomputer" scientific applications. Any ...
    Bob Futrelle
    Dec 5, 2007 at 3:13 am
    Dec 5, 2007 at 6:01 pm
  • Hi, I am running a map/reduce task on a large cluster (70+ machines). I use a single input file, and sufficient number of map/reduce tasks so that each map process gets 250k records. That is, if my ...
    Dec 26, 2007 at 10:51 pm
    Dec 28, 2007 at 6:22 am
  • grid-leads@yahoo-inc.com grid-team@yahoo-inc.com grid-pr@yahoo-inc.com kryptonite-user@yahoo-inc.com hadoop-user@lucene.apache.org
    Eric Baldeschwieler
    Dec 5, 2007 at 6:57 pm
    Dec 5, 2007 at 8:10 pm
  • We have a small cluster of 9 machines on a shared Gig Switch (with a lot of other machines). The other day, running a job, the reduce stalled when the map was 99.99x% done. 7 of the 9 machines were ...
    Jason Venner
    Dec 4, 2007 at 6:07 pm
    Dec 4, 2007 at 10:12 pm
  • Hi All, I'm interested in the architecture of HBase, in particular how it is implemented on top of Hadoop DFS. I understand that HDFS files are write once: after they are initially created they are ...
    James D Sadler
    Dec 30, 2007 at 5:17 am
    Jan 2, 2008 at 4:29 pm
  • Hi everybody, I want to create an ArrayList that collects some objects from the input in the mapper class, so that I can use this collection to filter my input. The problem is my ArrayList can't ...
    Dec 28, 2007 at 2:35 pm
    Dec 31, 2007 at 6:21 pm
  • Hi everybody, please explain the steps to pass user parameters to the mapper class. Thanks. ...
    Dec 25, 2007 at 4:44 pm
    Dec 26, 2007 at 2:23 pm
  • Hi, everyone! I'm new and not familiar with Hadoop, so I have some questions; please help me. Thanks. 1. Are there 3rd-party technologies included in Hadoop, or dependencies? 2. How is ...
    Dec 12, 2007 at 1:41 am
    Dec 12, 2007 at 10:36 pm
  • Hi, The current public images only work on the smaller instances. It would be very helpful (and save me some time) if someone would be so kind as to create or publish their Hadoop image. Thibaut ...
    Thibaut Britz
    Dec 11, 2007 at 3:52 pm
    Dec 12, 2007 at 3:48 pm
  • Hello there! We would like to start several map/reduce jobs on the same host from several threads using different input and output dirs. However we are not able to do this for some reason, and we are ...
    Eugeny N Dzhurinsky
    Dec 3, 2007 at 5:00 pm
    Dec 4, 2007 at 1:36 pm
  • I have been experimenting with that, and when I do, the master saturates well before the slave nodes and the jobs start experiencing timeouts. The map task in question is the IdentityMapper; this job ...
    Jason Venner
    Dec 26, 2007 at 7:31 pm
    Dec 26, 2007 at 8:39 pm
  • Hi All: The aggregation classes in Hadoop use a HashMap to hold unique values in memory when computing unique counts, etc. I ran into a situation on a 32-node grid (4G memory/node) where a single node ...
    C G
    Dec 19, 2007 at 7:59 pm
    Dec 20, 2007 at 4:21 am
  • I've been looking around JIRA and cannot find an issue on snapshots. Is there a snapshot-for-backup option in the works? Say I want to do a backup of my data: I would run a snapshot and it would be ...
    Dec 19, 2007 at 12:29 am
    Dec 19, 2007 at 5:02 pm
  • I have a small setup I am running this on: 3 servers, one master and namenode, and the other 2 are datanodes and region servers. When I restart the services, the regions from one of my tables always ...
    Dec 11, 2007 at 9:48 pm
    Dec 19, 2007 at 9:50 am
  • Hi All: I am migrating from a small grid to a larger one. The small grid runs fine with no issues. On the larger grid, with nearly identical configuration files (just changing host names and file ...
    C G
    Dec 17, 2007 at 7:53 pm
    Dec 18, 2007 at 9:22 pm
  • Hi, How can we specify that the reducers should be invoked lazily? For instance, I know there are no partitions in the range of 200-300. How can I let Hadoop know that there is no need to invoke reduce ...
    Rui Shi
    Dec 15, 2007 at 2:04 am
    Dec 17, 2007 at 6:03 am
  • http://www.amazon.com/gp/browse.html?node=342335011 Sorry for the OT post, but I thought this might be interesting to the HBase users. Best, Garth
    Garth Patil
    Dec 14, 2007 at 5:31 pm
    Dec 14, 2007 at 6:23 pm
  • Hi, My input is a bunch of gz files on the local file system. I don't want Hadoop to split them for mappers. How should I specify that? Thanks, Rui ...
    Rui Shi
    Dec 13, 2007 at 10:53 pm
    Dec 13, 2007 at 11:27 pm
  • Hi All: Is there a tool available that will provide information about how a file is replicated within HDFS? I'm looking for something that will "prove" that a file is replicated across multiple ...
    C G
    Dec 10, 2007 at 7:59 pm
    Dec 10, 2007 at 9:17 pm
  • We have jobs that require different resources and as such saturate our machines at different levels of parallelization. What we want to do in the driver is set the number of simultaneous jobs per ...
    Jason Venner
    Dec 3, 2007 at 3:34 am
    Dec 3, 2007 at 4:49 pm
  • Hi, everyone! I'm new to Hadoop, and when installing Hadoop I have met a lot of problems. Now my question is: which group below is more reliable? (1) hadoop 1.14.1 with jdk 1.5 (2) hadoop 1.15 ...
    Dec 1, 2007 at 9:03 am
    Dec 3, 2007 at 12:33 am
  • Hi, I am trying to define a custom key type in Hadoop (version 0.15.0). This is what my class looks like: public class ClassAttributeValueKey implements WritableComparable { public int classification; ...
    Camilo Arango
    Dec 2, 2007 at 1:38 am
    Dec 2, 2007 at 8:40 pm
  • I tried to enter some test tables in HBase. The example was taken from http://wiki.apache.org/lucene-hadoop/Hbase/HbaseShell but it fails no matter which bloomfilter option I choose. Is there a better ...
    Peter Thygesen
    Dec 18, 2007 at 12:38 pm
    Jun 15, 2010 at 8:26 pm
  • Hello, I am just looking into Hadoop for a possible application and was hoping to get some feedback about whether it is a good fit and how to structure it. Basically my application works like this: ...
    Kevin Corby
    Dec 21, 2007 at 3:50 pm
    Dec 21, 2007 at 9:15 pm
  • Hi all, I would like to know whether anyone has tried to insert a timestamp into a table created in HBase. I tried to insert a timestamp with the insert command, but every time I tried, it displayed an error ...
    晨佳 王
    Dec 20, 2007 at 5:23 pm
    Dec 21, 2007 at 8:42 pm
  • I am doing R&D on Hadoop. My requirement is to store huge data and retrieve it by search; I have read on the web that Hadoop is the best solution. If anyone has this kind of document (some ...
    Dec 21, 2007 at 4:50 am
    Dec 21, 2007 at 11:03 am
  • Hi colleagues, After reading the API docs about HBase, I don't know how to manipulate HBase using the Java API. Would you please send me some examples? Thank you! Ma Qiang Department of Computer ...
    Ma qiang
    Dec 19, 2007 at 3:52 am
    Dec 19, 2007 at 4:38 pm
  • I have tried to load HBase several times and it always keeps failing: 2007-12-18 14:21:45,062 FATAL org.apache.hadoop.hbase.HRegionServer: Replay of hlog required. Forcing server restart ...
    Dec 18, 2007 at 8:50 pm
    Dec 18, 2007 at 9:48 pm
  • Hello, everyone! Our company is preparing to study Hadoop to deal with massive data, so the company sent me to survey the Hadoop community. I have the following questions; please give me some suggestions. ...
    Dec 17, 2007 at 6:51 am
    Dec 18, 2007 at 6:27 am
  • Hi colleagues, After reading the API docs about HBase, I don't know how to manipulate HBase using the Java API. Would you please send me some examples? Thank you! Ma Qiang Department of Computer ...
    Ma qiang
    Dec 14, 2007 at 1:36 pm
    Dec 14, 2007 at 5:16 pm
  • All of the answers in this thread were critically helpful for management and for those trying to understand Hadoop and the opportunities, and what kind of hardware we should be looking at. Does this ...
    Chris Fellows
    Dec 11, 2007 at 8:43 pm
    Dec 14, 2007 at 4:05 pm
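The file-backed HashMap and unique-count threads above describe running out of heap while holding all unique keys in a single in-memory HashMap. A common way to bound memory is to spill partial counts to disk and merge them afterwards. The sketch below is plain Java rather than Hadoop code; the names (`SpillingCounter`, `maxInMemory`) are illustrative, it assumes keys contain no tab characters, and it assumes the final set of unique keys fits in memory for the merge.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: count occurrences of many keys with bounded memory by spilling
 *  the in-memory map to disk whenever it grows too large, then merging the
 *  per-spill partial counts in a final pass. Illustrative only, not a
 *  Hadoop API. */
class SpillingCounter {
    private final int maxInMemory;
    private final List<Path> spills = new ArrayList<>();
    private Map<String, Long> counts = new HashMap<>();

    SpillingCounter(int maxInMemory) { this.maxInMemory = maxInMemory; }

    void add(String key) throws IOException {
        counts.merge(key, 1L, Long::sum);
        if (counts.size() >= maxInMemory) spill();
    }

    // Write the current partial counts as tab-separated lines, then reset.
    private void spill() throws IOException {
        Path p = Files.createTempFile("spill", ".txt");
        try (BufferedWriter w = Files.newBufferedWriter(p)) {
            for (Map.Entry<String, Long> e : counts.entrySet())
                w.write(e.getKey() + "\t" + e.getValue() + "\n");
        }
        spills.add(p);
        counts = new HashMap<>();
    }

    /** Merge the in-memory map with all spill files. For truly huge key
     *  sets this merge would itself need an external sort; the sketch
     *  assumes the unique-key set fits in memory here. */
    Map<String, Long> result() throws IOException {
        Map<String, Long> merged = new HashMap<>(counts);
        for (Path p : spills)
            for (String line : Files.readAllLines(p)) {
                String[] f = line.split("\t");
                merged.merge(f[0], Long.parseLong(f[1]), Long::sum);
            }
        return merged;
    }
}
```

The same idea underlies Hadoop's own shuffle: bounded in-memory buffers, spill files, and a merge phase, rather than one unbounded map.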
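The custom-key-type thread above shows the start of a class implementing WritableComparable. A sketch of the contract such a key must satisfy follows. To keep it runnable without a Hadoop installation it uses only the `java.io` types the Hadoop interface builds on; in real code the class would declare `implements org.apache.hadoop.io.WritableComparable`. The fields beyond `classification` are guesses from the class name, not taken from the original post.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

/** Sketch of a Hadoop-style key: serializable via write/readFields and
 *  ordered via compareTo. Field names beyond `classification` are
 *  hypothetical. */
class ClassAttributeValueKey implements Comparable<ClassAttributeValueKey> {
    public int classification;
    public int attribute;   // hypothetical field
    public int value;       // hypothetical field

    // Hadoop instantiates keys reflectively, so a no-arg constructor is required.
    public ClassAttributeValueKey() {}

    public void write(DataOutput out) throws IOException {
        out.writeInt(classification);
        out.writeInt(attribute);
        out.writeInt(value);
    }

    // readFields must consume exactly the bytes write() produced, in order.
    public void readFields(DataInput in) throws IOException {
        classification = in.readInt();
        attribute = in.readInt();
        value = in.readInt();
    }

    // Total ordering over all fields; used for sorting keys in the shuffle.
    public int compareTo(ClassAttributeValueKey o) {
        if (classification != o.classification)
            return Integer.compare(classification, o.classification);
        if (attribute != o.attribute)
            return Integer.compare(attribute, o.attribute);
        return Integer.compare(value, o.value);
    }

    // equals/hashCode should be consistent with compareTo (partitioning
    // hashes keys, sorting compares them).
    @Override public boolean equals(Object o) {
        if (!(o instanceof ClassAttributeValueKey)) return false;
        ClassAttributeValueKey k = (ClassAttributeValueKey) o;
        return classification == k.classification
                && attribute == k.attribute && value == k.value;
    }

    @Override public int hashCode() {
        return 31 * (31 * classification + attribute) + value;
    }
}
```

A quick sanity check for any such key is a serialization round trip: write it to a byte stream, read it back into a fresh instance, and confirm the two compare equal.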
Group Overview
group: common-user @

100 users for December 2007

Ted Dunning: 57 posts, Billy: 38 posts, Jason Venner: 23 posts, Edward yoon: 20 posts, Rui Shi: 18 posts, Stack: 14 posts, Bryan Duxbury: 12 posts, Joydeep Sen Sarma: 10 posts, Owen O'Malley: 10 posts, Arun C Murthy: 9 posts, C G: 8 posts, Chad Walters: 8 posts, Eugeny N Dzhurinsky: 8 posts, Doug Cutting: 7 posts, Jim Kellerman: 7 posts, Jim the Standing Bear: 7 posts, Peter Thygesen: 7 posts, Eric Baldeschwieler: 6 posts, Jibjoice: 6 posts, Thiago Jackiw: 6 posts