FAQ

124 discussions - 499 posts

  • I have always had this question but couldn't find a proper answer for it. For system-level applications, C/C++ is preferable. So why does this one use Java?
    Elton sky
    Oct 10, 2010 at 4:40 am
    Mar 16, 2011 at 8:58 pm
  • Hi, I want to load a serialized HashMap object in hadoop. The file of stored object is 200M. I could read that object efficiently in JAVA by setting -Xmx as 1000M. However, in hadoop I could never ...
    Shi Yu
    Oct 11, 2010 at 8:49 pm
    Oct 14, 2010 at 2:02 am
  • Hello, I have mappers that do not need much RAM, but combiners and reducers need a lot. Is it possible to set different VM parameters for mappers and reducers? PS Often I face an interesting problem, on ...
    Vitaliy Semochkin
    Oct 5, 2010 at 1:00 pm
    Oct 12, 2010 at 6:44 pm
  • Hello, In hdfs.org.apache.hadoop.hdfs.DFSClient ...
    Elton sky
    Oct 18, 2010 at 10:34 am
    Oct 22, 2010 at 5:06 am
  • Hi, I am new to Hadoop and new to the community. I am using Hadoop 0.20.2 and working through the LineIndexer example. I have defined the Reducer using the following: job.setReducerClass ...
    Cameron_
    Oct 30, 2010 at 6:08 pm
    Nov 2, 2010 at 8:43 pm
  • Hi, I am getting the error http://repo1.maven.org/maven2/junit/junit/3.8.1/junit-3.8.1.pom: invalid sha1: .. Is it downloading a corrupt file, or is there anything else I need to take care ...
    Bharath v
    Oct 30, 2010 at 1:14 pm
    Nov 1, 2010 at 10:55 am
  • Hi, When I run the following command in Mandriva Linux: hadoop namenode -format, I get the following error: 10/10/09 22:32:07 INFO namenode.NameNode: STARTUP_MSG: ...
    Siddharth raghuvanshi
    Oct 9, 2010 at 5:10 pm
    Oct 12, 2010 at 9:48 am
  • Hello, I am a beginner with the hadoop framework. I am trying to create a distributed crawling application. I have googled a lot, but the resources are too few. Can anyone please help me on the following ...
    Burhan Uddin
    Oct 22, 2010 at 8:43 pm
    Oct 27, 2010 at 4:41 pm
  • Hello, I have a simple map-reduce job that reads in zipped files and converts them to lzo compression. Some of the files are not properly zipped which results in Hadoop throwing an ...
    Ed
    Oct 19, 2010 at 10:45 pm
    Oct 21, 2010 at 10:00 pm
  • Job Title: Sr. Developer - Hadoop Hive. Location: Cupertino, CA. Relevant Experience (Yrs): 10+ Yrs. Technical/Functional Skills: 10+ years of strong technical and implementation experience in ...
    Ram Prakash
    Oct 29, 2010 at 9:39 pm
    Nov 1, 2010 at 8:26 pm
  • Hello everyone, I just recently started running the balancer to fix job errors where a particular task runs out of local disk; however, I've noticed that I usually end up with a significant amount of ...
    Jones, Nick
    Oct 27, 2010 at 1:41 pm
    Oct 27, 2010 at 6:46 pm
  • I was wondering if there were any projects out there doing a small file management layer on top of Hadoop? I know that HDFS is primarily for map/reduce but I think companies are going to start using ...
    Ananth Sarathy
    Oct 26, 2010 at 4:29 pm
    Oct 27, 2010 at 6:08 pm
  • Hi, I am doing a student Independent Study Project and Harvey Mudd has given me 13 Sun Netra X1s that I can use as a dedicated Hadoop cluster. Right now they are without an OS. If anyone with experience ...
    Bruce Williams
    Oct 16, 2010 at 8:09 pm
    Oct 19, 2010 at 12:04 pm
  • Hi all, I built an application using hadoop. I take 1GB of text data as input; the results are as follows: (1) the cluster of 3 PCs: the time consumed is 1020 seconds. (2) the cluster of 4 PCs: the time is about ...
    Jander
    Oct 5, 2010 at 7:35 am
    Oct 6, 2010 at 12:22 am
  • Wow. I could use help quickly... My name node is reporting a null BV. All the data nodes report the same Build Version. We were not upgrading the DFS, but did stop, restart, after adding a jar to ...
    Phil young
    Oct 25, 2010 at 11:13 pm
    Oct 26, 2010 at 3:05 am
  • I am using the following to set my number of reduce tasks, however when I run my job it's always using just 1 reducer. conf.setInt("mapred.reduce.tasks", 20); 1 reducer will never finish this job. ...
    Matt Tanquary
    Oct 21, 2010 at 8:40 pm
    Oct 21, 2010 at 9:39 pm
  • Hi, during the map phase I received the following exception: java.lang.NullPointerException at org.apache.hadoop.io.Text.encode(Text.java:388) at org.apache.hadoop.io.Text.encode(Text.java:369) at ...
    Vitaliy Semochkin
    Oct 14, 2010 at 10:42 am
    Oct 15, 2010 at 11:33 am
  • Hi, I'm trying to run Hadoop on two computers out of 12 network-connected computers. I only know the 2 computer names ('snoopy', 'booboo') as established by the department and the host name is ...
    Maha
    Oct 12, 2010 at 10:11 pm
    Oct 13, 2010 at 6:26 pm
  • Hi, I am trying to run a job on my hadoop cluster, where I consistently get a heap space error. I increased the heap space to 4 GB in hadoop-env.sh and rebooted the cluster. However, I still get the ...
    Pramy Bhats
    Oct 5, 2010 at 1:41 pm
    Oct 6, 2010 at 2:05 pm
  • Hi Folks, I'm sure this is easy for you guys, so please let me know. What's the solution when the NameNode doesn't start other than formatting it? I also tried stop-dfs.sh and starting it again over ...
    Maha
    Oct 5, 2010 at 5:25 am
    Oct 5, 2010 at 6:45 am
  • Hi, I am running some code on a cluster with several nodes (ranging from 1 to 30) using hadoop-0.19.2. In a test, I only put a single file under the input folder, however, each time I find the logged ...
    Shi Yu
    Oct 2, 2010 at 4:31 pm
    Oct 2, 2010 at 8:06 pm
  • Hi, I'm seeing in my experiments that Fuse-HDFS is significantly slower (around 3x slower) than using the Java hdfs API directly. Wanted to ask if this slowness is the norm? Or is there something wrong ...
    Aniket ray
    Oct 26, 2010 at 5:17 am
    Oct 26, 2010 at 8:43 pm
  • Hi all, I am running a program whose input is 1 million lines of data; among the 1 million, 5 or 6 lines of data are corrupted. The way they are corrupted is: in the position in which a float number is ...
    Boyu Zhang
    Oct 15, 2010 at 9:02 pm
    Oct 16, 2010 at 10:44 pm
  • Hi, I am practising some programs in Map-Reduce such as WordCount, Word Search, Grep etc. Now I want to know whether it is possible to write a Map-Reduce program on hadoop for finding the *Factorial of a Number*. In ...
    Adarsh Sharma
    Oct 13, 2010 at 6:55 am
    Oct 15, 2010 at 12:18 am
  • I'm working with hadoop 0.20.2 using the new API contained in the package org.apache.hadoop.mapreduce. I have noticed that MultipleInputs is under org.apache.hadoop.mapred, and when setting a path it ...
    Marc Sturlese
    Oct 6, 2010 at 4:22 pm
    Oct 7, 2010 at 8:15 am
  • Hi all, I have recently been working on a task where I need to take in two input (types) files, compare them and produce a result from it using a logic. But as I understand simple MapReduce ...
    Matthew John
    Oct 14, 2010 at 2:03 pm
    Jan 1, 2011 at 7:55 pm
  • Hi all, As all of us know, Hadoop considers the user who starts the hadoop cluster as superuser. It provides all access to HDFS to that user. But now I want to know how we can R/W access to ...
    Adarsh Sharma
    Oct 29, 2010 at 6:14 am
    Oct 29, 2010 at 2:00 pm
  • Once I run a map-reduce job I get output in the form of part-r-00000 part-r-00001 ... In many cases the output is significantly smaller than the original input - take the classic word count In most ...
    Steve Lewis
    Oct 23, 2010 at 10:08 pm
    Oct 24, 2010 at 12:44 am
  • Is there a way to control maximum number of mappers or reducers per node per job? i.e. say I have a cluster, now I want to run a job such that on each node, no more than 2 mappers are expected to run ...
    Jiang licht
    Oct 21, 2010 at 1:02 am
    Oct 22, 2010 at 3:50 pm
  • Hi all, I have started hadoop and it's running. I am getting the following error when I try to execute a simple map reduce program: root@cjb3f-69:/home/media/Desktop# hadoop-0.20 jar HadoopSample.jar ...
    Biju .B
    Oct 22, 2010 at 9:16 am
    Oct 22, 2010 at 2:10 pm
  • Hi all, We currently have Cloudera's Hadoop beta 3 installed on our cluster, we would like to upgrade to the latest stable release CDH3. Is there documentation or recommended steps on how to do this? ...
    Abhinay Mehta
    Oct 20, 2010 at 9:58 am
    Oct 21, 2010 at 2:56 pm
  • When I run hive, the derby.log will be created in the current directory. I was looking around and found that if I create derby.properties and add the line ...
    Bichonfrise74
    Oct 19, 2010 at 11:59 pm
    Oct 20, 2010 at 11:53 pm
  • Hi all, I had a small doubt regarding the reduce module. What I understand is that after the shuffle / sort phase, all the records with the same key value go into a reduce function. If that's the ...
    Matthew John
    Oct 18, 2010 at 10:20 am
    Oct 19, 2010 at 1:06 pm
  • Hadoop logs keep growing and are not removed automatically according to the logging property. I didn't change the default logging property, but it never works as it should. The default logging property ...
    Shangan
    Oct 18, 2010 at 5:06 am
    Oct 18, 2010 at 5:36 am
  • Saturday I would like to modify the simple word count program so that I can produce a text file from given html files ( by extracting text content only between <title and </title and <text and </text . When ...
    Tri Doan
    Oct 16, 2010 at 2:57 pm
    Oct 16, 2010 at 4:39 pm
  • Dear All, I am trying to run a memory hungry program in a cluster with 6 nodes, among the 6 nodes, 2 of them have 32 G memory, and the rest have 16 G memory. I am wondering is there a way of ...
    Boyu Zhang
    Oct 8, 2010 at 4:18 pm
    Oct 15, 2010 at 8:58 pm
  • Is there a command that will display which nodes the blocks of a file are replicated to? We're prototyping a hadoop cluster and want to perform some failure testing where we kill the correct ...
    Adamphelps
    Oct 11, 2010 at 11:27 pm
    Oct 13, 2010 at 4:30 pm
  • Hi again, I guess my questions are easy.. Since I'm installing hadoop on my school machine I have to view the namenode online via hdfs://host-name:50070 instead of the default link provided by Hadoop ...
    Maha A. Alabduljalil
    Oct 6, 2010 at 7:56 pm
    Oct 6, 2010 at 10:28 pm
  • Hi, I've recently downloaded Hadoop-0.21.0. After the installation, I've noticed that there is no "contrib" directory that used to exist in Hadoop-0.20.2. So I was wondering if there is no ...
    Edward choi
    Oct 5, 2010 at 3:04 am
    Oct 6, 2010 at 12:15 am
  • Hello, Using a map reduce and the image list from image-net I have created a set of sequence files which contain roughly 11 million images (split across roughly 100 part files). Holding the images in ...
    Samangooei Sina
    Oct 2, 2010 at 12:22 pm
    Oct 4, 2010 at 9:20 am
  • There is a suggestion to set the number of reducers to a prime number closest to the number of nodes and number of mappers a prime number closest to several times the number of nodes in the cluster. ...
    Shi Yu
    Oct 24, 2010 at 2:00 am
    Oct 27, 2010 at 6:07 pm
  • Is it possible to add a custom-site.xml resource (which is placed in hdfs) to a Configuration? Something like: Configuration cc = new Configuration(); Path p = new ...
    Marc Sturlese
    Oct 26, 2010 at 5:06 pm
    Oct 27, 2010 at 6:46 am
  • Hello everyone, I am having problems using MultipleOutputs with LZO compression (could be a bug or something wrong in my own code). In my driver I set MultipleOutputs.addNamedOutput(job, "test", ...
    Ed
    Oct 21, 2010 at 9:52 pm
    Oct 26, 2010 at 2:04 pm
  • Hi, I have a problem comparing two huge files (100G each) consisting of string sequences. It is more like the text file compare problem. I would like to find out how many strings are different within ...
    Shi Yu
    Oct 21, 2010 at 2:07 am
    Oct 25, 2010 at 5:31 pm
  • Hi! I'm running Cloudera CDH2 update 2 (hadoop-0.20 0.20.1+169.113), and after the upgrade I'm getting the following error in the reducers during the copy phase in one of my larger jobs: 2010-10-20 ...
    Erik Forsberg
    Oct 20, 2010 at 5:49 pm
    Oct 21, 2010 at 3:19 pm
  • Hi, I have attached the relevant part of the jobtracker log. The job1 had 3 splits, but it started 5 map tasks, m_00000 through m_00004. (I have speculative execution turned off.) The job somehow ...
    Murali Krishna. P
    Oct 15, 2010 at 11:41 am
    Oct 20, 2010 at 8:18 am
  • Dear All, I have a problem: when I start hadoop, I cannot find the slave nodes. My OS is CentOS. When I run the mapreduce program, it warns me that: 10/10/16 20:32:41 INFO ...
    冯超
    Oct 16, 2010 at 1:02 pm
    Oct 17, 2010 at 1:13 am
  • Hi, I use Hadoop 0.20.3-dev on Ubuntu. I use it in pseudo-distributed mode in a single node cluster. I have already run mapreduce programs for wordcount and building inverted index. I am trying to ...
    Bibek Paudel
    Oct 12, 2010 at 12:03 pm
    Oct 12, 2010 at 8:57 pm
  • Hi, I am still confused about the effect of using Combiner in Hadoop Map/Reduce. The performance tips (http://www.cloudera.com/blog/2009/12/7-tips-for-improving-mapreduce-performance/) suggest us to ...
    Shi Yu
    Oct 5, 2010 at 11:32 pm
    Oct 6, 2010 at 10:38 pm
  • Hi everyone. This is my first post on this mailing list and I hope I won't make any mistakes ;) I work for a French telecom operator and we have recently started a Hadoop project. We currently have two ...
    Arthur Caranta
    Oct 4, 2010 at 1:07 pm
    Oct 4, 2010 at 1:46 pm
Group Overview
group: common-user @
categories: hadoop
discussions: 124
posts: 499
users: 163
website: hadoop.apache.org...
irc: #hadoop

163 users for October 2010

  • Shi Yu: 30 posts
  • Harsh J: 28 posts
  • Allen Wittenauer: 18 posts
  • Ed: 17 posts
  • Maha A. Alabduljalil: 16 posts
  • Steve Loughran: 14 posts
  • Matt Tanquary: 10 posts
  • Vitaliy Semochkin: 10 posts
  • Matthew John: 9 posts
  • Elton sky: 8 posts
  • Siddharth raghuvanshi: 8 posts
  • Steve Lewis: 7 posts
  • Sudhir Vallamkondu: 7 posts
  • Brian Bockelman: 6 posts
  • Luke Lu: 6 posts
  • Owen O'Malley: 6 posts
  • Adarsh Sharma: 5 posts
  • Bharath v: 5 posts
  • Boyu Zhang: 5 posts
  • Hemanth Yamijala: 5 posts