108 discussions - 420 posts

  • Hi, I'm porting a legacy application to hadoop and it uses a bunch of small files. I'm aware that having such small files ain't a good idea but I'm not doing the technical decisions and the port has ...
    Pierre ANCELOT
    May 18, 2010 at 12:06 pm
    May 20, 2010 at 3:53 pm
  • I am new to Hadoop. I have successfully run java programs from Hadoop and I would like to call C programs from Hadoop. Thank you for your help Michael -- View this message in context: ...
    Michael Robinson
    May 29, 2010 at 6:31 pm
    May 31, 2010 at 4:48 pm
  • I'm working on bringing up a second test cluster and am getting these intermittent errors on the DataNodes: 2010-05-12 17:17:15,094 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: ...
    Andrew Nguyen
    May 13, 2010 at 12:19 am
    May 17, 2010 at 8:01 pm
  • Hello, I'm using Hadoop 0.20.1, I submitted a job using the org.apache.hadoop.mapreduce.Job approach e.g. org.apache.hadoop.mapreduce.Job _job job_.submit(); However, I would like to,from another ...
    Saptarshi Guha
    May 26, 2010 at 6:28 pm
    Jun 1, 2010 at 3:25 pm
  • Hello All, I've been unable to resolve this problem on my own so I've decided to ask for help. I've pasted the logs I have for the DataNode on one of the slave nodes. The logs for TaskTracker are ...
    Erik Test
    May 25, 2010 at 8:06 pm
    May 27, 2010 at 10:37 am
  • Hi, I want to load data from HDFS to Hive, the data is in compressed files. The data is stored in flat files, the delimiter is ^A (ctrl-A). As long as I use de-compressed files everything is working ...
    Susanne Lehmann
    May 2, 2010 at 6:23 pm
    May 3, 2010 at 9:12 pm
  • My team and I were working with sequence files and were using the LuceneDocumentWrapper. But when I try to get the valcall, i get a no such method exception from the ReflectionUtils, which is caused ...
    Ananth Sarathy
    May 11, 2010 at 1:30 am
    May 11, 2010 at 3:27 pm
  • Hi, Is there a way to preserve previous job information (Completed Jobs, Failed Jobs) when the hadoop cluster is restarted? Everytime I start up my cluster (start-dfs.sh,start-mapred.sh) the ...
    Alan Miller
    May 18, 2010 at 5:32 pm
    Jun 18, 2010 at 10:37 pm
  • Hi hadoop folks, I'm encountering the following problem on a 4 node cluster running hadoop-0.20.2. It's a map-only job reading about 9 GB data from outside of hadoop. 31 map tasks at all while 12 map ...
    Johannes Zillmann
    May 7, 2010 at 2:26 pm
    Jun 1, 2010 at 11:32 am
  • Hey guys, I know it's 5PM on a Friday, but we just saw one of our big cluster's namenode's deadlock. This is 0.19.1; does this ring a bell for anyone? I haven't had any time to start going through ...
    Brian Bockelman
    May 15, 2010 at 12:22 am
    May 17, 2010 at 4:36 pm
  • Hi folks :) I have one big file... I read it with FileInputFormat, this generates only one task and of course, this doesn't get distributed across the cluster nodes. Should I use another Input class ...
    Pierre ANCELOT
    May 10, 2010 at 12:22 pm
    May 11, 2010 at 5:27 am
  • Hadoop is working fine in Linux. In Windows (using cygwin) I can't get mapred to work, though hdfs is ok. This is the stacktrace: java.io.FileNotFoundException: File ...
    Carlos Eduardo Moreira dos Santos
    May 2, 2010 at 4:42 am
    May 10, 2010 at 12:20 am
  • Hi Guys, I am trying to use compression to reduce the IO workload when trying to run a job but failed. I have several questions which needs your help. For lzo compression, I found a guide ...
    Stan lee
    May 18, 2010 at 3:45 pm
    May 19, 2010 at 11:01 am
  • Hi, I want a Hadoop job that will simply take each line of the input text file and store it (after parsing) in a database, like SimpleDB. Can I put this code into Mapper, make no call to "collect" in ...
    Mark Kerzner
    May 12, 2010 at 2:02 am
    May 12, 2010 at 2:22 am
  • Hi, I am trying to set up a small hadoop cluster with 6 machines. The problem I have now is that if I set the memory allocated to a task low (e.g. -Xmx512m) the application does not run, if I set it ...
    Jamborta
    May 4, 2010 at 10:56 pm
    May 5, 2010 at 3:26 am
  • Hi, I'm a novice at hadoop, and I want to install it on 3 nodes, I try to configure it by editing core-site.xml, hdfs-site.xml and mapred-site.xml that the first node is the namenode, the second is ...
    Khaled BEN BAHRI
    May 31, 2010 at 8:34 am
    May 31, 2010 at 3:28 pm
  • Hi All, I continually get this error when trying to run start-all.sh for hadoop 0.20.2 on ubuntu. What confuses me is I DO have JAVA_HOME set in hadoop-env.sh to /usr/lib/jvm/jdk1.6.0_17. I've double ...
    Erik Test
    May 18, 2010 at 5:30 pm
    May 22, 2010 at 1:46 am
  • Hi all, I wonder whether it is enough, for recovering a hadoop cluster, to just copy the meta data from the SecondaryNameNode to the new master node? Do I need to do anything else? Thanks for any help. -- ...
    Jeff Zhang
    May 13, 2010 at 9:01 am
    May 14, 2010 at 1:40 am
  • Hi, We are running our cluster on Amazon EC2. we are using cloudera scripts to setup hadoop. On the master node, we start below services. 609 $AS_HADOOP '"$HADOOP_HOME"/bin/hadoop-daemon.sh start ...
    Balanagireddy Mudiam
    May 7, 2010 at 10:17 pm
    Jun 12, 2010 at 1:14 am
  • Hi, Actually I have an indexing and search service that receives documents to be indexed and search requests through XML-RPC. The server uses the Lucene Search Engine and store it's index on the ...
    Aécio
    May 13, 2010 at 6:41 pm
    Jun 3, 2010 at 2:59 am
  • Hi, I am trying to run a JNI application on StandAlone mode and I have been getting this result ever since. I've looked up every possible webs and sites but never could get the solution to it. Please ...
    Edward choi
    May 31, 2010 at 4:49 pm
    Jun 2, 2010 at 1:12 am
  • Hi guys, i have following situation. A network setup with 2 network interfaces, eth0 (external) and eth1(internal). Now in order to use the internal ips for hadoop i set dfs.datanode.dns.interface ...
    Johannes Zillmann
    May 28, 2010 at 10:24 am
    Jun 1, 2010 at 11:49 am
  • Hi all, I'm inside a OpenSolaris zone, or more precisely a Joyent Accelerator. I can't seem to get a datanode started. I can start a namenode fine. I can "bin/hadoop datanode -format' fine. JAVA_HOME ...
    Alex Li
    May 14, 2010 at 10:28 am
    May 14, 2010 at 5:26 pm
  • Hello all, I am trying to install HBase and while going through the requirements (link below), it asked me to apply HDFS-630 patch. The latest 2 patches are for Hadoop 0.21. I am using version 0.20. ...
    Raghava Mutharaju
    May 14, 2010 at 2:59 am
    May 14, 2010 at 4:20 am
  • Hi, I am given an account on a cluster which uses OpenPBS as the cluster management software. The only way I can run a job is by submitting it to OpenPBS. How to run mapreduce programs on it? Is ...
    Udaya Lakshmi
    May 4, 2010 at 2:46 pm
    May 4, 2010 at 7:51 pm
  • Hi, all. I have encountered a problem that cannot be solved with simple computation, I don't know whether hadoop is applicable to it, I am completely new to hadoop and MapReduce. I have the raw data ...
    Kevin Tse
    May 28, 2010 at 10:35 am
    May 31, 2010 at 9:13 am
  • I followed the steps mentioned here: http://developer.yahoo.com/hadoop/tutorial/module2.html#decommission to decommission a data node. What I see from the namenode is the hostname of the machine that ...
    Scott White
    May 18, 2010 at 4:32 am
    May 18, 2010 at 6:12 pm
  • hello, I want to deploy Hadoop on a cluster. In this cluster, different nodes share same file system. If I make changes to files on node1. then other nodes will have the same changes. (The file ...
    Dechao bu
    May 12, 2010 at 3:36 pm
    May 12, 2010 at 6:01 pm
  • Hi, I am launching a hadoop cluster on amazon ec2. For a period of 30-40 minutes, hdfs is in safemode and recovering the blocks. Till then the job tracker doesn't accept connections from the task tracker. I ...
    Balanagireddy Mudiam
    May 12, 2010 at 2:39 pm
    May 12, 2010 at 3:46 pm
  • Not sure if this is the right list for this question, but. Is it possible to determine which host actually processed my MR job? Regards, Alan
    Alan Miller
    May 6, 2010 at 4:09 pm
    May 7, 2010 at 3:33 pm
  • I am new to hadoop. I have a file Wordcount.java which refers to hadoop.jar and stanford-parser.jar. I am running the following command: javac -classpath .:hadoop-0.20.1-core.jar:stanford-parser.jar -d ep ...
    Harshira
    May 7, 2010 at 3:55 am
    May 7, 2010 at 1:21 pm
  • I am currently testing out a rollout of HBase 0.20.3 on top of Hadoop 0.20.2. The HBase doc recommends HDFS-630 patch be applied. I realize this is a newbieish question, but has anyone done this to ...
    Joseph Chiu
    May 4, 2010 at 4:39 pm
    May 4, 2010 at 6:27 pm
  • Hi, I am writing a hadoop application that should process a Java Project. A Java Project directory can have many subfolders (packages) and files (Java files) mixed in with them. As well as unrelated ...
    Tiago Veloso
    May 30, 2010 at 8:44 am
    May 31, 2010 at 12:56 am
  • I'm getting the following errors: WARN org.apache.hadoop.mapred.JobTracker: Serious problem, cannot find record of 'previous' heartbeat for 'tracker_m351.ra.wink.com:localhost/127.0.0.1:41885'; ...
    PeterAtReunion
    May 28, 2010 at 1:49 am
    May 29, 2010 at 2:23 am
  • Hi, I was using the beta version of Cloudera scripts from a while back, and I think there is a stable version, but I can't find it. It tells me to go download a Hadoop distribution, and there I can't ...
    Mark Kerzner
    May 28, 2010 at 4:32 am
    May 28, 2010 at 2:05 pm
  • Hi, I am running Benchmarks for Testing performance of Hadoop cluster, I need to know if there are any tests that are crucial or most widely used? regards Sonali Gavhane "Legal Disclaimer: This ...
    Sonali
    May 27, 2010 at 6:35 pm
    May 28, 2010 at 12:30 pm
  • My Java mapper hands its processing off to C++ through JNI. On the C++ side I need to access a file. I have already implemented a version of this interface in which the file is read entirely into RAM ...
    Keith Wiley
    May 21, 2010 at 7:10 pm
    May 22, 2010 at 10:38 am
  • Hi All, I'm new to hadoop and have successfully run MapRed tasks many times on my small cluster (6 machines). Now I realize that by default only 1 reducer is assigned to the job, and with only 1 reducer ...
    Ferdinand Neman
    May 19, 2010 at 6:07 am
    May 20, 2010 at 2:41 am
  • I am considering using a machine to save a redundant copy of HDFS metadata by setting dfs.name.dir in hdfs-site.xml like this (as in YDN): <property> <name>dfs.name.dir</name> <value> ...
    Jiang licht
    May 18, 2010 at 12:10 am
    May 18, 2010 at 6:23 pm
  • When I run the sort job, I found that when there are 70 reduce tasks running and none completed, the progress bar shows that it has finished about 80%. So how does the mapreduce mechanism calculate this? ...
    Stan lee
    May 17, 2010 at 1:45 am
    May 18, 2010 at 3:33 pm
  • Hi, I am running a map-side join. My input looks like this. file1.txt ----------- a|deer b|dog file2.txt ----------- a|veg b|nveg I am getting output like a|[deer,veg] b|[dog,nveg] I don't want those ...
    Carbon Rock
    May 13, 2010 at 1:58 pm
    May 18, 2010 at 7:24 am
  • Hello, I am running org.apache.hadoop.fs.TestDFSIO to benchmark our HDFS installation and had a couple of questions regarding the same. a) If I run the benchmark back to back in the same directory, I ...
    Lavanya Ramakrishnan
    May 14, 2010 at 5:51 pm
    May 15, 2010 at 1:16 am
  • Hi, I am very new to the MapReduce paradigm so this could be a dumb question. What do you do if your mapper functions need to know more than just the data being processed in order to do their job? ...
    DNMILNE
    May 12, 2010 at 5:34 am
    May 13, 2010 at 2:05 am
  • Hi, I saw a lot of warnings like the following in namenode log: 2010-05-11 06:45:07,186 WARN /: /listPaths/xxxxxxxxxxxxs: java.lang.NullPointerException at ...
    Runping Qi
    May 11, 2010 at 4:53 pm
    May 12, 2010 at 2:13 am
  • Please check out this PNG image from attachment or from Google docs: http://docs.google.com/drawings/pub?id=1P3jdSddseG1oSYrtjREWcajizxmxoRIhUHCEw4sDi3k&w=771&h=624 So, what I want to do is something ...
    Dennis
    May 7, 2010 at 2:02 am
    May 8, 2010 at 7:37 am
  • Hi, Came across something "ugly". I'm using the latest Hadoop version in Cloudera's CH2 :Hadoop 0.20.1+169.68 (At least I think its the latest version in CH2) Noticed that when I instantiate a ...
    Michael Segel
    May 4, 2010 at 2:51 pm
    May 4, 2010 at 3:42 pm
  • Does anyone know of any existing work integrating HDF5 (http://www.hdfgroup.org/HDF5/whatishdf5.html) with Hadoop? I don't know much about HDF5 but it was recently brought to my attention as a way to ...
    Andrew Nguyen
    May 3, 2010 at 5:37 pm
    May 4, 2010 at 4:18 am
  • Hadoop Fans, just a quick note about training options at the Hadoop Summit. There are discounts expiring soon, so if you planned to attend, or didn't know, we want to make sure you stay in the loop. ...
    Christophe Bisciglia
    May 11, 2010 at 11:13 pm
    Jun 16, 2010 at 9:49 pm
  • Greetings, I'm having an odd problem with 20.2. I'm using a cluster of 7 nodes on EC2, and it's "forgetting" jars that I deploy. For example, when I do hadoop -jar filename.jar, it doesn't seem to ...
    Bradford Stephens
    May 31, 2010 at 6:31 am
    May 31, 2010 at 5:35 pm
  • Hello: I got this error when putting files into hdfs,it seems a old issue,and I followed the solution of this link: ...
    Alex Luya
    May 26, 2010 at 10:54 am
    May 27, 2010 at 6:26 am
Group Overview
group: common-user @
categories: hadoop
discussions: 108
posts: 420
users: 150
website: hadoop.apache.org...
irc: #hadoop

150 users for May 2010

Ted Yu: 19 posts
Pierre ANCELOT: 14 posts
Brian Bockelman: 12 posts
Todd Lipcon: 12 posts
Allen Wittenauer: 10 posts
Mark Kerzner: 10 posts
Michael Robinson: 10 posts
Andrew Nguyen: 9 posts
Eric Sammer: 9 posts
Stan lee: 9 posts
Steve Loughran: 9 posts
Erik Test: 8 posts
Jeff Zhang: 8 posts
Michael Segel: 8 posts
Edward Capriolo: 7 posts
Raghava Mutharaju: 7 posts
Ananth Sarathy: 5 posts
Andrew Nguyen: 5 posts
Dhadoop: 5 posts
Hemanth Yamijala: 5 posts