150 discussions - 542 posts

  • Hello, we were considering using hadoop to process some data, we have it set up on 8 nodes ( 1 master + 7 slaves) we filled the cluster up with files that contain tab delimited data. string \tab ...
    Elia Mazzawi
    Jun 10, 2008 at 10:57 pm
    Jun 11, 2008 at 8:51 pm
  • Hi! Trying to run a streaming.jar reduce only (mapper == cat) job, I'm getting the following error, any ideas what to do? TIA, Andreas Hadoop map task list for job_200805291303_0088 on ...
    Andreas Kostyrka
    Jun 2, 2008 at 5:12 pm
    Jun 14, 2008 at 5:15 am
  • How does Hadoop decide when to update the "percent complete" for map/reduce tasks? I've been running a small job (~150 MB) on a pseudo-distributed cluster. "bin/hadoop jar" prints: 08/06/04 17:02:16 ...
    Stuart Sierra
    Jun 4, 2008 at 9:19 pm
    Jun 5, 2008 at 11:38 pm
  • Hi, I am currently trying to get Hadoop Pipes working. I am following the instructions at the hadoop wiki, where it provides code for a C++ implementation of Word Count (located here: ...
    Sandy
    Jun 25, 2008 at 5:44 pm
    Jul 12, 2008 at 6:14 am
  • Hi! I am considering using Hadoop for (almost) real-time data processing. I have data coming in every second and I would like to use a Hadoop cluster to process it as fast as possible. I need to be able to ...
    Vadim Zaliva
    Jun 23, 2008 at 9:31 pm
    Jun 25, 2008 at 5:35 pm
  • Hi, I have a question about Hadoop. I am a beginner and just testing Hadoop. Would like to know how a php application would benefit from this, say an application that needs to work on a large number ...
    Chanchal James
    Jun 12, 2008 at 4:43 pm
    Jun 17, 2008 at 3:00 pm
  • Concerning real-time Map Reduce within (and not only between) machines (multi-core & GPU), e.g. the Phoenix and Mars frameworks: I'm really interested in very fast Map Reduce tasks, i.e. without much ...
    Martin Jaggi
    Jun 1, 2008 at 2:52 am
    Jun 8, 2008 at 8:06 pm
  • Hello folks: I am running several hadoop applications on hdfs. To save the efforts in issuing the set of commands every time, I am trying to use bash script to run the several applications ...
    Richard Zhang
    Jun 10, 2008 at 9:45 pm
    Jun 12, 2008 at 1:10 am
  • Hello, I have been getting Too many fetch failures (in the map operation) and shuffle error (in the reduce operation) and am unable to complete any job on the cluster. I have 5 slaves in the cluster. ...
    Sayali Kulkarni
    Jun 19, 2008 at 2:07 pm
    Jul 19, 2008 at 10:05 am
  • Hi all, I have been battling EC2 all day and getting nowhere (see other message) Does anyone use the hadoop-ec2-images/hadoop-0.17.0 AMI for small instances successfully? Following ...
    Tim robertson
    Jun 27, 2008 at 6:27 pm
    Jun 28, 2008 at 8:50 am
  • Apologies if this is an RTM response, but I looked and wasn't able to find anything concrete. Is it possible to connect to HDFS via the HDFS client under a different username than I am currently ...
    Bob Remeika
    Jun 11, 2008 at 7:56 pm
    Jun 12, 2008 at 5:23 pm
  • Hi, I have a question regarding InputSplit boundaries. Does an InputSplit necessarily fall within a single file system block boundaries ? Or can it span across blocks ? In particular, what about a ...
    Naama Kraus
    Jun 26, 2008 at 11:44 am
    Jul 29, 2008 at 5:52 pm
  • For a Nagios script I'm writing, I'd like a command-line method that checks if HDFS is up and running. Is there a better way than to attempt a hadoop dfs command and check the error code?
    Meng Mao
    Jun 27, 2008 at 2:49 pm
    Jul 2, 2008 at 2:56 pm
  • Hello Everyone, I've been brainstorming recently and it's always been in the back of my mind: Hadoop offers the functionality of clustering commodity systems together, but how would one go about ...
    Brad C
    Jun 6, 2008 at 2:31 pm
    Jun 9, 2008 at 10:40 am
  • First of all, thanks to whoever maintains the hadoop-ec2 scripts. They've saved us untold time and frustration getting started with a small testing cluster (5 instances). A question: when we log into ...
    Chris Anderson
    Jun 7, 2008 at 6:32 pm
    Mar 5, 2009 at 10:51 am
  • Hi! I'm running streaming tasks on hadoop 0.17.0, and wondered, if anyone has an approach to debugging the following situation: -) map have all finished (100% in http display), -) some reducers are ...
    Andreas Kostyrka
    Jun 30, 2008 at 3:30 pm
    Jul 1, 2008 at 8:46 am
  • Dear all, I use hadoop-0.16.4 to do some work and found an error whose cause I can't determine. The scenario is like this: in the reduce step, instead of using OutputCollector to write the result, I use ...
    晋光峰
    Jun 18, 2008 at 11:38 am
    Jun 26, 2008 at 3:58 pm
  • I recently installed hadoop 0.17.0 in pseudo-distributed mode on a MacOSX 10.5.3 system with software managed by Fink installed in /sw. I configured hadoop to use the stock Java 1.5.0_13 installation ...
    Lev Givon
    Jun 23, 2008 at 8:30 pm
    Jun 26, 2008 at 3:20 pm
  • hi, i'm new in hadoop and im just testing it at the moment. i set up a cluster with 2 nodes and it seems like they are running normally, the log files of the namenode and the datanodes dont show ...
    H3llRid0r
    Jun 12, 2008 at 1:25 pm
    Jun 20, 2008 at 9:24 am
  • Richard Zhang
    Jun 16, 2008 at 10:11 pm
    Jun 18, 2008 at 11:15 am
  • Hi, I'm a new Hadoop user, so if this question is blatantly obvious, I apologize. I'm trying to load a native shared library using the DistributedCache as outlined in ...
    Montag
    Jun 12, 2008 at 1:48 pm
    Jun 13, 2008 at 4:14 pm
  • -- View this message in context: http://www.nabble.com/Stack-Overflow-When-Running-Job-tp17593524p17593524.html Sent from the Hadoop core-user mailing list archive at Nabble.com.
    Jkupferman
    Jun 2, 2008 at 3:41 am
    Jun 10, 2008 at 10:33 pm
  • Dear all, I am a newbie who started using Hadoop yesterday. I am running Windows XP, and the following is the output of the grep program, which I executed exactly following the instructions in QuickStart ...
    Ravi Shankar (Google)
    Jun 8, 2008 at 9:13 pm
    Jun 10, 2008 at 10:05 pm
  • Hi, I want to set up a Hadoop cluster and make it rack-aware, but I can't find any documentation about the format of topology.script.file.name. Could anybody give me an example ...
    田超
    Jun 6, 2008 at 2:43 am
    Mar 18, 2009 at 9:12 pm
  • The ipc.Client object is designed to be shared across threads, and each thread can only make synchronous RPC calls, which means each thread calls and waits for a result or error. This is implemented by a ...
    Heyongqiang
    Jun 20, 2008 at 9:02 am
    Jul 9, 2008 at 3:17 am
  • Hello, I am happy to announce the first German Hadoop Meetup in Berlin. We will meet at 5 p.m. MESZ next Tuesday (24th of June) at the newthinking store in Berlin Mitte: newthinking store GmbH ...
    Isabel Drost
    Jun 17, 2008 at 5:43 am
    Jun 30, 2008 at 8:40 pm
  • Hello list, we will be getting access to a cluster soon, and I was wondering whether I should use Hadoop. Or am I better off with the usual batch schedulers such as ProActive etc.? I am not a ...
    Igor Nikolic
    Jun 25, 2008 at 11:34 am
    Jun 28, 2008 at 4:04 am
  • Hi there, I'm running some streaming jobs on ec2 (ruby parsing scripts) and in my most recent test I managed to spike the load on my large instances to 25 or so. As a result, I lost communication ...
    Chris Anderson
    Jun 26, 2008 at 2:00 am
    Jun 27, 2008 at 4:16 pm
  • Hi, I'm a Hadoop newbie. My question is as follows: The level of parallelism of a job, with respect to mappers, is largely the number of map tasks spawned, which is equal to the number of ...
    Xuan Dzung Doan
    Jun 25, 2008 at 3:04 am
    Jun 26, 2008 at 4:59 pm
  • Hi Is there a way to grab a hadoop job's status/progress outside of the job and outside of hadoop? I.e if I have another application running and this application needs to know that a job has ended or ...
    Kayla Jay
    Jun 17, 2008 at 5:21 pm
    Jun 17, 2008 at 6:55 pm
  • Hi, if a map task is unexpectedly "silent" for a long time (e.g. waiting for another application's response), what happens? Is there any limit on how long it can stay silent? Thanks -- Best regards, Edward J. Yoon, ...
    Edward J. Yoon
    Jun 13, 2008 at 2:13 am
    Jun 13, 2008 at 10:13 am
  • Hi all, I'm using hadoop-streaming to execute Python jobs in an EC2 cluster. The output directory in HDFS has part-00000.deflate files - how can I deflate them back into regular text? In my ...
    Jim R. Wilson
    Jun 4, 2008 at 11:13 pm
    Feb 9, 2009 at 7:57 am
  • hello, i am developing a tool that will do some analysis tasks using hadoop map/reduce on a cluster the tool user interfaces will be run on the client windows system and should run the analysis tasks ...
    Deyaa Adranale
    Jun 25, 2008 at 12:46 pm
    Jul 9, 2008 at 9:51 am
  • Hi, I'm using Hadoop 0.17.1 with HBase trunk, and notice lots of exception in hadoop's log (it's a 3-node hdfs): 2008-06-30 19:27:45,760 ERROR org.apache.hadoop.dfs.DataNode: 192.168.23.1:500 ...
    Rong-en Fan
    Jun 30, 2008 at 11:31 am
    Jul 1, 2008 at 8:52 pm
  • Hello all, I deployed hadoop to a small cluster. The HDFS is running as user A. Now user B comes in and wants to run a simple Map-Reduce task. The Map-Reduce client creates all shared files in ...
    YongChul Kwon
    Jun 26, 2008 at 7:10 pm
    Jun 27, 2008 at 12:21 am
  • Hello all, I am having a bit of a problem with a seemingly simple problem. I would like to have some global variable which is a byte array that all of my map tasks have access to. The best way that I ...
    Javaxtreme
    Jun 25, 2008 at 3:46 pm
    Jun 25, 2008 at 7:25 pm
  • I have a question someone may have answered here before but I can not find the answer. Assuming I have a cluster of servers hosting a large amount of data I want to run a large job that the maps take ...
    Billy Pearson
    Jun 14, 2008 at 8:32 pm
    Jun 16, 2008 at 6:29 pm
  • hi, Can I use MapWritable as an output value of a Reducer ? If yes, how will the (key, value) pairs in the MapWritable object will be written to the file ? What output format should I use in this ...
    Tarandeep Singh
    Jun 5, 2008 at 11:37 pm
    Jun 6, 2008 at 4:08 am
  • Hi Every one, I am running a simple map-red application similar to k-means. But, when I ran it in on single machine, it went fine with out any issues. But, when I ran the same on a hadoop cluster of ...
    Novice user
    Jun 19, 2008 at 11:32 am
    Jul 11, 2008 at 3:19 pm
  • Hello, I recall asking this question but this is in addition to what I've asked. Firstly, to recap my question and Arun's specific response: -- On May 20, 2008, at 9:03 AM, Saptarshi Guha wrote: ...
    Saptarshi Guha
    Jun 30, 2008 at 1:42 pm
    Jul 1, 2008 at 5:08 am
  • Hello all, I am having a problem writing my own RecordReader. The basic setup I have is a large byte array that needs to be diced up into key value pairs such that the key is the index into the array ...
    Sean Arietta
    Jun 30, 2008 at 4:23 pm
    Jun 30, 2008 at 5:48 pm
  • hello, I have a lucene index storing documents which holds src and dst words. word pairs may repeat. (it is a multigraph). I want to use hadoop to count how many of the same word pairs there are. I ...
    Cam Bazz
    Jun 26, 2008 at 10:03 pm
    Jun 27, 2008 at 1:23 pm
  • Hi, how long is the full Hadoop unit test suite expected to run? How do you go about running Hadoop tests? I found that it can take hours for the [ant test] target to run, which does not seem to be very ...
    Lukas Vlcek
    Jun 17, 2008 at 7:03 am
    Jun 17, 2008 at 3:35 pm
  • i wanna try hadoop, but i can't run sshd when i use macbook(leopard) -- regards j.L
    j.L
    Jun 17, 2008 at 6:28 am
    Jun 17, 2008 at 2:44 pm
  • Why not just combine them? How do I do that? Rationale is that our tasks are very balanced in load, but unbalanced in timing. I've found that limiting the number of total threads to be the most safe ...
    Daniel Leffel
    Jun 17, 2008 at 4:19 am
    Jun 17, 2008 at 5:09 am
  • Hi, I am running a simple code and I am getting error as " No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String)". I am not able to figure what would have ...
    Novice user
    Jun 9, 2008 at 10:24 am
    Jun 12, 2008 at 1:57 pm
  • Is there support for counters in streaming? In particular, it would be nice to be able to access these after a job has run. Thanks! Miles -- The University of Edinburgh is a charitable body, ...
    Miles Osborne
    Jun 10, 2008 at 10:17 pm
    Jun 11, 2008 at 7:35 am
  • I downloaded the Matrix Multiplication code from: http://code.google.com/p/hama/source/browse/trunk/src/java/org/apache/hama/ but I do not know how can I run it in the right way. Could you please ...
    Hadoop
    Jun 2, 2008 at 12:12 pm
    Jun 9, 2008 at 1:41 pm
  • The next user group meeting is scheduled for June 18th from 6-7:30 pm at the Yahoo! Mission College campus (2821 Mission College, Santa Clara). Registration, driving directions etc are at ...
    Ajay Anand
    Jun 4, 2008 at 10:02 pm
    Jun 5, 2008 at 4:13 pm
  • Hi, I'm about to add a new disk (under a new partition) to some existing DataNodes that are nearly full. I see FAQ #15: 15. HDFS. How do I set up a hadoop node to use multiple volumes? Data-nodes can ...
    Otis Gospodnetic
    Jun 3, 2008 at 3:37 pm
    Jun 3, 2008 at 6:54 pm
period: Jun 2008
Group Overview
group: common-user @
categories: hadoop
discussions: 150
posts: 542
users: 170
website: hadoop.apache.org...
irc: #hadoop

170 users for June 2008

  • Ted Dunning: 22 posts
  • Andreas Kostyrka: 18 posts
  • Haijun Cao: 15 posts
  • Steve Loughran: 14 posts
  • Miles Osborne: 13 posts
  • Lohit: 11 posts
  • Chris K Wensel: 10 posts
  • Elia Mazzawi: 10 posts
  • Tim robertson: 10 posts
  • Konstantin Shvachko: 9 posts
  • Richard Zhang: 9 posts
  • Chris Anderson: 8 posts
  • Chris Collins: 8 posts
  • Montag: 8 posts
  • Runping Qi: 8 posts
  • Amar Kamat: 7 posts
  • Arun C Murthy: 7 posts
  • Billy Pearson: 7 posts
  • Chris Douglas: 7 posts
  • Colin Freas: 7 posts