156 discussions - 691 posts

  • Thanks Rekha, I was missing the new library (hadoop-0.20.1-hdfs-core.jar) in my client. It seems to run a little further but I'm now getting a ClassCastException returned by the mapper. Note, this ...
    Arv Mistry
    Nov 27, 2009 at 3:49 pm
    Nov 28, 2009 at 2:22 am
  • Hi all, I just installed Hadoop (single-node cluster) and tried to start and stop the nodes, and it said no jobtracker to stop, no namenode to stop; however, the tutorial I used suggests that ...
    Prabhu Hari Dhanapal
    Nov 16, 2009 at 2:46 pm
    Jan 21, 2013 at 10:06 am
  • Hi all, Today a problem about imbalanced data came to mind. I'd like to know how Hadoop handles this kind of data, e.g. one key dominates the map output, say 99%. So 99% of the data set will go ...
    Jeff Zhang
    Nov 15, 2009 at 4:03 am
    Nov 24, 2009 at 6:35 pm
  • Hi All (& sorry for possible double posting), Does anybody know whether the hadoop eclipse plugin is still supported? I've tried using the 0.18.0 and 0.18.3 plugins to talk to the hadoop 0.18.0 ...
    Le Zhao
    Nov 2, 2009 at 7:47 pm
    Nov 6, 2009 at 2:30 pm
  • I am starting to learn Hadoop, using the Yahoo virtual machine with version 0.18. My question is rather simple. I would like to execute a map/reduce job. In addition to getting the results from the ...
    Gordon Linoff
    Nov 23, 2009 at 2:04 am
    Nov 25, 2009 at 7:09 am
  • I read the code and found that the call DFSInputStream.read(buf, off, len) causes the DataNode to read len bytes (or fewer if it encounters the end of the block). Why doesn't HDFS read ahead to improve ...
    Martin Mituzas
    Nov 24, 2009 at 7:23 am
    Nov 25, 2009 at 10:36 am
  • Hi, so far I've been using Amazon MapReduce. However, my app uses a growing number of Linux packages. I have been installing them on the fly, in the Mapper.configure(), but with OpenOffice this is ...
    Mark Kerzner
    Nov 5, 2009 at 5:09 pm
    Nov 6, 2009 at 9:00 pm
  • Hi, I am in the process of setting up a Hadoop cluster, starting small at first but rapidly growing. I plan on running the following, Hadoop, HDFS, HBase, Nutch and Mahout. I am starting with 2 ...
    John Martyniak
    Nov 3, 2009 at 1:26 pm
    Nov 6, 2009 at 5:43 pm
  • I'm trying to figure out if I should use IP addresses or DNS names in my rack awareness script. It's easier for me to use DNS names because we have the row and rack number in the name which means I ...
    David J. O'Dell
    Nov 18, 2009 at 4:21 pm
    Nov 19, 2009 at 4:50 pm
  • Hey, quick question: I'm writing a program that parses data from 2 different files and puts the data into a table. Currently I have 2 different map functions and so I submit 2 separate jobs to the ...
    Mark Vigeant
    Nov 2, 2009 at 3:24 pm
    Nov 9, 2009 at 4:40 am
  • I have 2 clusters: 30 nodes running 0.18.3 and 36 nodes running 0.20.1. I've intermittently seen the following errors on both of my clusters; it happens when writing files. I was hoping this would go ...
    David J. O'Dell
    Nov 25, 2009 at 7:28 pm
    Jan 9, 2010 at 4:02 am
  • I want to use hadoop to discover if there is any token that appears in every line of a file. I thought that this should be pretty straightforward, but I'm having a heck of a time with it. (I'm pretty ...
    James R. Leek
    Nov 30, 2009 at 1:01 am
    Nov 30, 2009 at 5:47 pm
  • Has anybody else had any trouble running Hadoop 0.19.2 and Ganglia 3.1.x? I was surfing through the Jira/Google and it seems that there were some issues, but they have been resolved. Any thoughts would be ...
    John Martyniak
    Nov 18, 2009 at 1:31 am
    Nov 18, 2009 at 11:50 pm
  • Hi, I was under the impression that Cloudera's 18.3 can split bz2 input logs during the map phase, is that not so? As of now I see each bz2 file being processed in one entire map task in my running ...
    Usman Waheed
    Nov 14, 2009 at 8:35 am
    Nov 17, 2009 at 4:40 pm
  • Hi, I am currently evaluating whether Hadoop might be an alternative to our current system. We are providing a web analytics solution for very large websites and run every analysis on all collected ...
    Benjamin Dageroth
    Nov 3, 2009 at 1:28 pm
    Nov 5, 2009 at 5:33 am
  • Hi all, I have been running Hadoop on Ubuntu for a while now in distributed mode (4 node cluster). Just playing around with it. Going forward, I am planning to have more nodes added to the cluster. ...
    Praveen Yarlagadda
    Nov 2, 2009 at 8:17 pm
    Nov 4, 2009 at 12:21 pm
  • Dear hadoop fellows, We have been using Hadoop-0.20.1 MapReduce to crawl some web data. In this case, we only have mappers to crawl data and save data into HDFS in a distributed way. No reducer is ...
    Zhang Bingjun (Eddy)
    Nov 2, 2009 at 8:33 am
    Nov 2, 2009 at 12:18 pm
  • so i'm working on a cluster with one other guy and we decided to try the fair scheduler but found that it caused all of our jobs to fail. has anyone else had this issue? is there something more to ...
    Mike Kendall
    Nov 30, 2009 at 10:59 pm
    Dec 2, 2009 at 3:46 am
  • Hello all, I am not sure if the question is framed right! Let's say user1 launches an instance of hadoop on a *single node*, and hence he has permission to create and delete files on hdfs or launch M/R ...
    Nov 20, 2009 at 7:38 am
    Nov 24, 2009 at 3:43 pm
  • I am running Hadoop on a single server. The issue I am running into is that the start-all.sh script is not starting up the NameNode. The only way I can start the NameNode is by formatting it and I end up losing data ...
    Kaushal Amin
    Nov 10, 2009 at 2:47 pm
    Nov 14, 2009 at 5:51 am
  • Is there a good solution for Hadoop node monitoring? I know that Cacti and Ganglia are probably the two big ones, but are they the best ones to use? Easiest to set up? Most thorough reporting, etc. I ...
    John Martyniak
    Nov 12, 2009 at 5:46 am
    Nov 12, 2009 at 7:00 pm
  • I just set up a hadoop cluster. When I try to write to it from my java code, I get the error below. When using the core-site.xml, do I need to specify a user? ...
    Ananth T. Sarathy
    Nov 19, 2009 at 6:29 pm
    Oct 7, 2010 at 12:08 am
  • Hi, I am new to hadoop. I am planning to do matrix multiplication (of order millions) using hadoop. I have a few queries regarding the above. i) Will using hadoop be a fix for this or should I try ...
    Nov 13, 2009 at 7:22 am
    Nov 25, 2009 at 7:20 pm
  • Hi, After porting my code from Hadoop 0.17 to 0.20, I am starting to have problems setting my jar file. I used to be able to set jar file by using JobConf.setJar(). But now I am using ...
    Zhengguo 'Mike' SUN
    Nov 24, 2009 at 4:29 am
    Nov 24, 2009 at 5:29 am
  • hi guys i want to take a look at hdfs api doc but i can not find it in the dist docs dir. there is no org.apache.hadoop.hdfs api doc. i think maybe i missed something. -- Sent from my mobile device ----- Be happy every day, stay healthy
    Y G
    Nov 18, 2009 at 9:56 am
    Nov 19, 2009 at 1:48 pm
  • Hi, when I am building my jar for a MapReduce job, I include the version of Hadoop I am running on my workstation. When I run Hadoop on a cluster, there is a version that runs on a cluster. Do they ...
    Mark Kerzner
    Nov 15, 2009 at 9:37 pm
    Nov 16, 2009 at 9:50 pm
  • Hi, I have been having a lot of problems installing hadoop on my system and I finally gave up. I was just wondering if I could SSH to a remote hadoop cluster that is open for developers to practice on? ...
    Prabhu Hari Dhanapal
    Nov 16, 2009 at 6:29 am
    Nov 16, 2009 at 1:48 pm
  • Hi, I installed hadoop on one server which has the following configuration: 24 CPU, 72 GB RAM, 17 Disk (2 TB). All configuration belonging to Hadoop and Pig is at default settings. In order to run ...
    Nov 12, 2009 at 5:17 pm
    Nov 14, 2009 at 5:21 pm
  • Hi all, I could not find the Eclipse plug-in for hadoop 0.20.1. I can only find the source code of the Eclipse plug-in, but do not know how to build it. Could anyone give some help? Thank you. Jeff Zhang
    Jeff Zhang
    Nov 9, 2009 at 6:09 am
    Nov 10, 2009 at 6:21 pm
  • Hi, In one of my hadoop jobs I had set mapred.job.reuse.jvm.num.tasks = -1. The job was such that each map/reduce task occupied around 2GB RAM. So by setting mapred.job.reuse.jvm.num.tasks = -1, even ...
    Chandraprakash Bhagtani
    Nov 30, 2009 at 12:14 pm
    Dec 2, 2009 at 4:52 pm
  • Hi, I get this part-00000.deflate instead of part-00000. How do I get rid of the deflate option? Thank you, Mark
    Mark Kerzner
    Nov 26, 2009 at 5:34 am
    Nov 27, 2009 at 7:26 pm
  • Hi, I am starting a cluster of Apache Hadoop distributions, like .18 and also .19. This all works fine, then I log in. I see that the Hadoop daemons are already working. However, when I try # which ...
    Mark Kerzner
    Nov 24, 2009 at 9:02 pm
    Nov 25, 2009 at 5:52 am
  • everything online says that replication will be taken care of automatically, but i've had a file (that i uploaded through the put command on one node) sitting with a replication of 1 for three days.
    Mike Kendall
    Nov 19, 2009 at 5:59 pm
    Nov 19, 2009 at 6:46 pm
  • I'm running a 2-node cluster using hadoop and katta, with the master on qa-hadoop004 and the slave on qa-hadoop005. When I'm running a search as root from a remote machine, all works well. When I'm running the ...
    Yair Even-Zohar
    Nov 16, 2009 at 2:22 pm
    Nov 17, 2009 at 2:10 am
  • Hi, a) I have a Mapper ONLY job, the job reads in records, then parses them apart. No reduce phase b) I would like this mapper job to save the record into a shared mysql database on the network. c) I ...
    Nov 16, 2009 at 12:52 am
    Nov 16, 2009 at 5:09 pm
  • Dear Hadoop users, One of our hadoop clusters is being converted to SGE to run a very specific application and we're thinking about how to utilize the huge hard drives that are there. Since there ...
    Dmitry Pushkarev
    Nov 13, 2009 at 9:56 pm
    Nov 14, 2009 at 6:00 am
  • Namenode won't start with these messages: hadoop-0.20.1/logs/hadoop-hadoop-namenode-hadoop_master.log: http://pastebin.com/m359b9e24 Thank you.
    Pavel kolodin
    Nov 30, 2009 at 5:29 pm
    Nov 30, 2009 at 10:23 pm
  • Hi all, I want to build hadoop from source rather than downloading the already built tarball. Can someone please give me the steps or a link to any pointers? Thanks in advance -- Regards, ~Sid~ ...
    Nov 12, 2009 at 6:14 pm
    Nov 28, 2009 at 10:55 am
  • Hello, I have a set of input files part-r-* which I will pass through another map (no reduce). The part-r-* files consist of keys and values, the keys being small and the values fairly large (MBs). I would like to ...
    Saptarshi Guha
    Nov 26, 2009 at 7:06 am
    Nov 26, 2009 at 4:14 pm
  • Hi all, As far as I know, the Combiner is used in the mapper task, and most of the time, the combiner is the same as the reducer. So if a combiner is used, the output type of the mapper task must be the same as the reducer task, is ...
    Jeff Zhang
    Nov 22, 2009 at 8:28 am
    Nov 22, 2009 at 7:06 pm
  • Hi. I have a strange case of missing files, which were most probably randomly deleted by my application. Does HDFS provide any auditing tools for tracking who deleted what and when? Thanks in advance.
    Stas Oskin
    Nov 19, 2009 at 5:03 pm
    Nov 22, 2009 at 12:49 pm
  • I wonder if there is anything like this in the south Germany area. Bob
    Bob Schulze
    Nov 19, 2009 at 9:59 am
    Nov 19, 2009 at 1:09 pm
  • title says it all.. this isn't the first job i've written either. very confused.
    Mike Kendall
    Nov 13, 2009 at 10:03 pm
    Nov 13, 2009 at 11:19 pm
  • Can the NameNode/DataNode & JobTracker/TaskTracker run on a server that isn't part of the "cluster" meaning I would like to run it on a machine that wouldn't participate in the processing of data, ...
    John Martyniak
    Nov 9, 2009 at 3:21 pm
    Nov 11, 2009 at 4:03 pm
  • Hi, I have installed *hadoop-0.20* on my system. I am facing a problem while starting hadoop using the *start-all.sh* command. I am getting the following error: *localhost: starting secondarynamenode, ...
    Mohan Agarwal
    Nov 5, 2009 at 7:50 am
    Nov 7, 2009 at 5:01 am
  • Hello, While trying to start the task tracker I get the following error in the logs (see below). I'm guessing it's trying to clean up an aborted job (a badly coded one) and there are too many files to clean up. ...
    Saptarshi Guha
    Nov 30, 2009 at 4:54 pm
    Nov 30, 2009 at 9:33 pm
  • Hello Everybody, I have a question about object serialization in Hadoop. I have an object A which I want to pass to every map function. Currently the code I am using for this is as under. The problem ...
    Nov 29, 2009 at 10:10 pm
    Nov 30, 2009 at 12:44 pm
  • Hi, it is probably described somewhere in the manuals, but 1. Where are the log files, especially those that show my System.out.println() and errors; and 2. Do I need to log in to every machine on ...
    Mark Kerzner
    Nov 27, 2009 at 12:58 am
    Nov 29, 2009 at 5:57 pm
  • Hi, I started my first experimental Hadoop project with Hadoop 0.20.1 and ran into the following problem: Job job = new Job(new Configuration(),"Myjob"); ...
    Matthias Scherer
    Nov 26, 2009 at 3:10 pm
    Nov 27, 2009 at 12:18 pm
  • Dear All, I am implementing an algorithm that reads a data file (.txt file, approximately 90MB) and compares each line of the data file with each line of a specific samples file (.txt file, approximately ...
    Boyu Zhang
    Nov 22, 2009 at 8:21 pm
    Nov 22, 2009 at 9:02 pm
Group Overview
group: common-user @

175 users for November 2009

  • Mark Kerzner: 36 posts
  • Todd Lipcon: 33 posts
  • Zjffdu: 28 posts
  • John Martyniak: 26 posts
  • Edmund Kohlwey: 19 posts
  • Edward Capriolo: 19 posts
  • Steve Loughran: 19 posts
  • Y G: 18 posts
  • Mike Kendall: 17 posts
  • Raymond Jennings III: 17 posts
  • Siddu: 17 posts
  • Amogh Vasekar: 16 posts
  • Aa225: 14 posts
  • Gang Luo: 14 posts
  • Jason Venner: 14 posts
  • Stephen Watt: 13 posts
  • Allen Wittenauer: 12 posts
  • Amandeep Khurana: 11 posts
  • Brian Bockelman: 11 posts
  • CubicDesign: 10 posts