
143 discussions - 590 posts

  • Hello, Please can someone point out some of the risks we may incur if we decide to implement Hadoop? BR, Isaac.
    Kobina Kwarko
    Sep 16, 2011 at 7:11 pm
    Sep 22, 2011 at 5:04 am
  • Hi ! I have set up hadoop on my machine as per http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/ I am able to run application with capacity scheduler by submit ...
    Sep 18, 2011 at 6:30 am
    Sep 20, 2011 at 7:47 am
  • Hello, I am trying to automate formatting an HDFS volume. Is there any way to do this without the interaction (and using expect)? Cheers, Ivan
    Ivan Novick
    Sep 22, 2011 at 9:42 pm
    Sep 23, 2011 at 5:25 pm
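A common way to automate the format step without expect, assuming the 0.20-era prompt that expects a capital Y on stdin, is simply to pipe the answer in. The prompt function below is a stand-in for the real command, which is shown in a comment:

```shell
# Stand-in for the interactive prompt that 'hadoop namenode -format' shows
# when the name directory already exists (illustrative, not the real binary):
confirm_format() {
    printf 'Re-format filesystem? (Y or N) ' >&2
    read -r answer
    [ "$answer" = "Y" ] && echo "formatted"
}

# Pipe the answer in so no human interaction (and no expect script) is needed:
echo Y | confirm_format
# The real invocation would be along the lines of:
#   echo Y | hadoop namenode -format
```

Note that the prompt is case-sensitive in the 0.20 line, so `echo y` would abort the format.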
  • We use the DistributedCache class to distribute a few lookup files for our jobs. We have been aggressively deleting failed task attempts' leftover data , and our script accidentally deleted the path ...
    Meng Mao
    Sep 23, 2011 at 6:58 am
    Sep 27, 2011 at 8:05 pm
  • Hi ! I have set up mumak and am able to run it in the terminal and in Eclipse. I have modified the mapred-site.xml and capacity-scheduler.xml as necessary. I tried to apply patch ...
    Sep 21, 2011 at 6:03 am
    Sep 23, 2011 at 6:30 am
  • Hi I use hadoop for a MapReduce job in my system. I would like to have the job run every 5th minute. Are there any "distributed" timer job stuff in hadoop? Of course I could setup a timer in an ...
    Per Steffensen
    Sep 1, 2011 at 9:31 am
    Sep 2, 2011 at 9:34 am
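Hadoop of this era ships no distributed timer; the usual answer is an external scheduler such as cron on a single submission host (or a workflow tool like Oozie). A sketch of the crontab entry, with a hypothetical jar path and main class:

```shell
# A crontab entry on one submission host; '*/5' in the minute field fires at
# every 5th minute. The jar path and class name are hypothetical examples.
CRON_LINE='*/5 * * * * hadoop jar /opt/jobs/myjob.jar com.example.MyJob'

# The schedule is the first five fields of the entry:
echo "$CRON_LINE" | awk '{print $1, $2, $3, $4, $5}'
```

The single-host approach does mean that the submit host is a scheduling single point of failure, even though the job itself runs distributed.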
  • Hi, all. Recently, I was hit by a question: "how is a hadoop job divided into 2 phases?" In textbooks, we are told that the mapreduce jobs are divided into 2 phases, map and reduce, and for reduce, ...
    Nan Zhu
    Sep 19, 2011 at 2:24 am
    Sep 19, 2011 at 7:20 pm
  • Hi all, Can we replace our namenode machine later with some other machine? Actually I got a new server machine in my cluster and now I want to make this machine my new namenode and jobtracker ...
    Praveenesh kumar
    Sep 22, 2011 at 4:43 am
    Sep 22, 2011 at 4:23 pm
  • We are exploring the possibility of using HBase for real-time transactions. Is that possible? -Jignesh
    Jignesh Patel
    Sep 20, 2011 at 7:57 pm
    Sep 21, 2011 at 6:15 pm
  • Is it possible to submit a series of MR Jobs to the JobTracker to run in sequence (one finishes, take the output of that if successful and feed it into the next, etc), or does it need to run client ...
    Aaron Baff
    Sep 28, 2011 at 11:57 pm
    Sep 29, 2011 at 6:02 pm
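The JobTracker of this era accepts independent jobs only; sequencing with "feed the output into the next step" is done client-side, either from Java (JobControl, or sequential `waitForCompletion` calls) or from a driver script that checks each step's exit status. A shell sketch of the pattern, where the functions are stand-ins for `hadoop jar` invocations:

```shell
# Client-side chaining: each step runs only if the previous one succeeded,
# and its output path becomes the next step's input. These functions are
# stand-ins; the real commands are shown in comments.
step1() { echo "/tmp/step1-out"; }    # would be: hadoop jar jobs.jar Step1 /in /tmp/step1-out
step2() { echo "step2 reads $1"; }    # would be: hadoop jar jobs.jar Step2 "$1" /out

# '&&'-style sequencing on exit status:
if out1=$(step1); then
    step2 "$out1"
fi
```

If the driver host going away mid-sequence is a concern, a workflow engine that persists the chain state is the more robust choice.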
  • Dear Friends, I'm new to Hadoop, working on an important data mining university research project. I saw these sentences in different hadoop related docs: { Win32 is supported as a *development platform* not as a ...
    Hamedani, Masoud
    Sep 28, 2011 at 2:32 am
    Sep 29, 2011 at 4:02 am
  • Guys, As far as I know Hadoop, I think that to copy files to HDFS, they first need to be copied to the NameNode's local filesystem. Is that right? So does it mean that even if I have a hadoop cluster ...
    Praveenesh kumar
    Sep 21, 2011 at 8:44 am
    Sep 21, 2011 at 12:38 pm
  • Hi all, I am running a 10 node cluster (1NN + 9DN, ubuntu server 10.04, 2GB RAM each). I am facing a strange problem. My datanodes go down randomly and nothing shows up in the logs. They lose their ...
    John smith
    Sep 15, 2011 at 10:07 pm
    Sep 16, 2011 at 5:19 pm
  • I want to use hadoop/contrib/index to create a distributed Lucene index on Hadoop. Who can help me by giving me the source code of hadoop/contrib/index (hadoop 0.20.2)? Thank you very much! (PS: My ...
    Sep 13, 2011 at 5:29 am
    Oct 10, 2011 at 3:09 am
  • Hi, I am relatively new to Hadoop and was wondering how to do incremental loads into HDFS. I have a continuous stream of data flowing into a service which is writing to an OLTP store. Due to the high ...
    Sam Seigal
    Sep 30, 2011 at 11:02 pm
    Oct 3, 2011 at 5:45 pm
  • I was not able to restart my name server because the name server ran out of space. Then I adjusted dfs.datanode.du.reserved to 0, and used tune2fs -m to get some space, but I still could not ...
    Peng, Wei
    Sep 21, 2011 at 3:31 am
    Sep 21, 2011 at 6:28 am
  • Good evening, I have a certain value output from a mapper and I want to concatenate the string outputs using a Reducer (one reducer). But the concatenated string values are not in order. ...
    Daniel Yehdego
    Sep 20, 2011 at 5:44 am
    Sep 20, 2011 at 5:51 pm
  • Hi, I have a hadoop job running on over 50k files, each of which is about 500M. I need to extract some tiny information from each file and no reducer is needed. However, the output from the mappers ...
    Peng, Wei
    Sep 20, 2011 at 7:56 am
    Sep 20, 2011 at 4:39 pm
  • Hey guys, I am running hive and I am trying to join two tables (2.2GB and 136MB) on a cluster of 9 nodes (replication = 3) Hadoop version - 0.20.2 Each data node memory - 2GB HADOOP_HEAPSIZE - 1000MB ...
    John smith
    Sep 19, 2011 at 12:43 pm
    Sep 19, 2011 at 2:40 pm
  • Hi, Some of the MR jobs I run don't need sorting of the map output in each partition. Is there some way I can disable it? Any help? Thanks jS
    John smith
    Sep 10, 2011 at 9:06 am
    Sep 11, 2011 at 9:58 am
  • Hello, I have found an HDFSClient which shows me how to access my HDFS from inside the cluster (i.e. running on a Node). My idea is that different processes may write 64M chunks to HDFS from ...
    Ralf Heyde
    Sep 5, 2011 at 2:29 pm
    Sep 6, 2011 at 1:41 pm
  • Hello all, I have Hadoop up and running and an embarrassingly parallel problem but can't figure out how to arrange the problem. My apologies in advance if this is obvious and I'm not getting it. My ...
    Parker Jones
    Sep 12, 2011 at 10:11 am
    Oct 14, 2011 at 5:14 pm
  • Hi, This is my first post here. I'm new to Hadoop. I've already installed Hadoop on 2 Ubuntu boxes (one is both master and slave and the other is only slave). When I run a Wordcount example on 5 ...
    Abdelrahman Kamel
    Sep 26, 2011 at 3:17 pm
    Sep 28, 2011 at 8:57 am
  • Hi all, I have a 32-bit binary that uses libhdfs for accessing hdfs (on the cloudera VM) and am trying to run it on cluster with 64-bit machines. But unfortunately it crashes with "error while ...
    Vivek K
    Sep 27, 2011 at 2:10 pm
    Sep 27, 2011 at 2:39 pm
  • I happened to notice this today and being fairly new to administering hadoop, I'm not exactly sure how to pull out of this situation without data loss. The checkpoint hasn't happened since Sept 2nd. ...
    Jeremy Hansen
    Sep 7, 2011 at 12:26 am
    Sep 7, 2011 at 5:22 pm
  • Hi Folks, I am working on a 3 node cluster (1 NN + 2 DNs). I loaded some test data with replication factor 3 (around 400MB data). However when I run the wordcount example, it hangs at map 0%. ...
    John smith
    Sep 6, 2011 at 7:49 am
    Sep 6, 2011 at 10:17 am
  • Should I be able to go to the root of a clean download, type "ant test", and have everything work? I tried this and saw a lot of test failures. Is there some other configuration setup I ...
    W.P. McNeill
    Sep 8, 2011 at 9:48 pm
    Jan 16, 2012 at 2:43 am
    Jignesh Patel
    Sep 30, 2011 at 7:59 pm
    Oct 4, 2011 at 12:48 am
  • Hi All: I have lots of small files stored in HDFS. My HDFS block size is 128M. Each file is significantly smaller than the HDFS block size. Then, I want to know whether a small file uses 128M in ...
    Sep 21, 2011 at 1:54 am
    Sep 29, 2011 at 6:05 pm
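A file smaller than the block size occupies only its actual bytes on datanode disk, not the full 128M; the real cost of many small files is that the namenode keeps an in-memory object per file and per block. A quick arithmetic sketch of the distinction, with assumed numbers:

```shell
# Assumption for illustration: 1000 files of 1 MB each, 128 MB block size.
files=1000
file_mb=1
disk_mb=$((files * file_mb))   # datanode disk actually used: 1000 MB, not 128000 MB
blocks=$files                  # but the namenode still tracks one block object per file
echo "${disk_mb} MB on disk, ${blocks} block objects in namenode memory"
```

So the pressure from small files is on namenode heap and on per-file map task overhead, not on raw disk; HAR files or SequenceFiles are the usual packing remedies.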
  • Hi all, We are trying to run a mahout job in a hadoop cluster, but we keep getting the same status. The job passes the initial mahout stages and when it comes to be executed as a MR job, it seems to ...
    George Kousiouris
    Sep 21, 2011 at 1:59 pm
    Sep 21, 2011 at 3:40 pm
  • Hello, I've written an HDFS client which works pretty well. But: on my Namenode I configured a replication level of 2; on my client, the config holds a value of 1. If I now write a file from my ...
    Ralf Heyde
    Sep 12, 2011 at 10:01 pm
    Sep 13, 2011 at 8:52 pm
  • Hi, I was trying to set up a hadoop/hbase cluster on ec2, which took me a few hours to set up from scratch on a bundled image from s3. I am curious to know: what is the best way to set up hadoop/hbase ...
    Shahnawaz Saifi
    Sep 7, 2011 at 11:19 am
    Sep 7, 2011 at 9:11 pm
  • We have a compression utility that tries to grab all subdirs of a directory on HDFS. It makes a call like this: FileStatus[] subdirs = fs.globStatus(new Path(inputdir, "*")); and handles files vs ...
    Meng Mao
    Sep 2, 2011 at 8:05 pm
    Sep 3, 2011 at 8:40 pm
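In Java the usual guard is to null-check the `globStatus` result (it can return null when nothing matches) and branch on `FileStatus.isDir()`. The same files-vs-directories split can be approximated from the shell by filtering `hadoop fs -ls` output on the leading character of the permission string; the sketch below runs against a canned sample listing, with the live command shown in a comment:

```shell
# Canned sample of 'hadoop fs -ls /inputdir' output (illustrative).
# On a live cluster the equivalent would be:
#   hadoop fs -ls /inputdir | awk '$1 ~ /^d/ {print $NF}'
listing='drwxr-xr-x   - user group          0 2011-09-02 12:00 /inputdir/subdir
-rw-r--r--   3 user group       1024 2011-09-02 12:00 /inputdir/file.txt'

# Keep only directories: their permission string starts with "d".
echo "$listing" | awk '$1 ~ /^d/ {print $NF}'
```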
  • Hi all, I am trying to install Hadoop (release 0.20.203) on a machine with CentOS. When I try to start HDFS, I get the following error. <machine-name : Unrecognized option: -jvm <machine-name : Could ...
    Abhishek sharma
    Sep 1, 2011 at 7:51 pm
    Sep 1, 2011 at 10:56 pm
  • I have a problem where certain Hadoop jobs take prohibitively long to run. My hypothesis is that I am generating more I/O than my cluster can handle and I need to substantiate this. I am looking ...
    W.P. McNeill
    Sep 29, 2011 at 11:15 pm
    Oct 5, 2011 at 1:11 pm
  • Hi, Is it possible to get the process id of each task in a MapReduce job? When I run a mapreduce job and do monitoring in Linux using ps, I just see the id of the mapreduce job process but not its ...
    Bikash sharma
    Sep 28, 2011 at 6:46 pm
    Sep 30, 2011 at 12:52 pm
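Each task attempt runs in its own child JVM on the tasktracker, and the child's command line carries the attempt id, so a full-format `ps` plus grep can map PIDs to attempts. The sketch below greps a canned `ps` sample (the live command is in a comment); the `[o]` bracket in the pattern keeps grep from matching its own command line:

```shell
# Canned 'ps axww -o pid,args' sample (illustrative). On a live tasktracker:
#   ps axww -o pid,args | grep '[o]rg.apache.hadoop.mapred.Child'
ps_sample='1234 java -Xmx200m org.apache.hadoop.mapred.Child 127.0.0.1 12345 attempt_201109280001_0001_m_000000_0
5678 java -Xmx1000m org.apache.hadoop.mapred.JobTracker'

# Only the task-attempt child JVM survives the filter:
echo "$ps_sample" | grep '[o]rg.apache.hadoop.mapred.Child'
```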
  • Hi hadoopers, I was looking for a way to dump the Hadoop configuration in order to check whether what I have just changed in mapred-site.xml has really kicked in. Found that HADOOP-6184 ...
    Patrick sang
    Sep 28, 2011 at 9:00 pm
    Sep 29, 2011 at 6:29 pm
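HADOOP-6184 added a /conf servlet to the daemon web UIs that dumps the configuration the daemon is actually running with, so a change in mapred-site.xml can be verified with curl plus a little parsing. The hostname below is an assumption and the XML fragment is canned; the live fetch is shown in a comment:

```shell
# On a live cluster (host is illustrative; the JobTracker web UI commonly
# listens on port 50030):
#   curl -s http://jobtracker.example.com:50030/conf
# Parsing a canned fragment of that XML to confirm a property took effect:
conf_xml='<property><name>mapred.reduce.tasks</name><value>4</value></property>'
echo "$conf_xml" | sed -n 's|.*<name>\(.*\)</name><value>\(.*\)</value>.*|\1=\2|p'
```

Checking the servlet output rather than the file on disk matters because the daemon only rereads its config on restart.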
  • Hi, Suppose I have 10 Windows machines and 10 individual VM instances running on these machines independently. Can I use these VM instances to communicate with each other so that I can ...
    Praveenesh kumar
    Sep 28, 2011 at 6:39 am
    Sep 28, 2011 at 9:22 am
  • Hi, in the first phase we are planning to establish a small cluster with a few commodity computers (each 1GB RAM, 200GB disk, ..). The cluster would run ubuntu server 10.10 and a hadoop build from the branch 0.20.204 ...
    Merto Mertek
    Sep 23, 2011 at 3:00 pm
    Sep 27, 2011 at 8:03 pm
  • Hi all, I am working around the code to understand where HDFS divides a file into blocks. Can anyone point me to this section of the code? Thanks, Kartheek
    Kartheek muthyala
    Sep 25, 2011 at 5:36 am
    Sep 26, 2011 at 6:41 am
  • Hi, I am receiving messages from two mailing lists ("common-dev", "common-user") and I would like to disable receiving messages from JIRA. I am not a member of the "common-issues-unsubscribe" list. Can I ...
    Merto Mertek
    Sep 23, 2011 at 1:57 pm
    Sep 23, 2011 at 2:27 pm
  • Is there any way that we can run a particular job in a Hadoop cluster on a subset of datanodes? My problem is I don't want to use all the nodes to run some job; I am trying to make Job completion Vs No. of ...
    Praveenesh kumar
    Sep 21, 2011 at 1:02 pm
    Sep 21, 2011 at 3:02 pm
  • Hi, I'm trying to create a table similar to apache_log but I'm trying to avoid to write my own map-reduce task because I don't want to have my HDFS files twice. So if you're working with log lines ...
    Raimon Bosch
    Sep 1, 2011 at 2:08 pm
    Sep 18, 2011 at 6:19 am
  • Hi ! I have set up single-node cluster using ...
    Arun k
    Sep 14, 2011 at 10:58 am
    Sep 16, 2011 at 1:29 pm
  • Hi, I am using the latest Cloudera distribution, and with that I am able to use the latest Hadoop API, which I believe is 0.21, for such things as import org.apache.hadoop.mapreduce.Reducer; So I am ...
    Mark Kerzner
    Sep 15, 2011 at 2:49 am
    Sep 15, 2011 at 4:49 am
  • I have some 0.20.2 Hadoop code that extends org.apache.hadoop.streaming.io.IdentifierResolver. That code found this in the hadoop-*-streaming.jar in the contrib/streaming directory. I am trying to ...
    W.P. McNeill
    Sep 6, 2011 at 11:28 pm
    Sep 8, 2011 at 6:52 pm
  • Hi, I am testing my Hadoop-based FreeEed <http://freeeed.org/>, an open source tool for eDiscovery, and I am using the Enron data ...
    Mark Kerzner
    Sep 7, 2011 at 1:42 am
    Sep 7, 2011 at 4:59 am
  • I've got a Hadoop job that uses StringUtils.getTrimmedStringCollection. It works when run on version 0.20.2, but now when I run it on a 0.20.203 cluster I get a No Such Method Exception for this ...
    W.P. McNeill
    Sep 6, 2011 at 5:47 pm
    Sep 6, 2011 at 6:30 pm
  • I have recently rebuilt a server with centos 6.0 and it seems that something caused hadoop-fuse to get confused and it is no longer able to find libjvm.so. The error i get is find: ...
    John Bond
    Sep 5, 2011 at 2:09 pm
    Nov 30, 2011 at 4:48 am
  • hi, every time after starting our hadoop cluster (using Cloudera's) this message appears: 2011-09-13 04:35:05,207 INFO org.apache.hadoop.hdfs.StateChange: STATE* Safe mode extension entered. The ...
    Sep 14, 2011 at 10:35 am
    Nov 22, 2011 at 8:10 am
Period: Sep 2011
Group Overview
Group: common-user

157 users for September 2011

Uma Maheswara Rao G 72686: 45 posts
Harsh J: 38 posts
ArunKumar: 22 posts
John smith: 18 posts
Steve Loughran: 16 posts
W.P. McNeill: 15 posts
Praveenesh kumar: 14 posts
GOEKE, MATTHEW (AG/1000): 13 posts
Robert Evans: 12 posts
Meng Mao: 11 posts
Arun C Murthy: 10 posts
Jignesh Patel: 10 posts
Joey Echeverria: 10 posts
Peng, Wei: 10 posts
Bikash sharma: 9 posts
Mark Kerzner: 9 posts
Michael Segel: 9 posts
Daniel Yehdego: 8 posts
Shanmuganathan.r: 8 posts
Vivek K: 8 posts