168 discussions - 710 posts

  • Hi, I'm following the steps at http://wiki.apache.org/hadoop/HowToContribute for building hadoop in preparation for submitting a patch. I've checked out the trunk, and when I run "mvn test" from the ...
    Tony Burton
    Sep 5, 2012 at 1:50 pm
    Sep 11, 2012 at 3:43 pm
  • Hi Users, Please clarify the questions below. 1. To load one petabyte of data into HDFS/Hive within 10 minutes, how many slave machines (DataNodes) are required? 2. Within 10 minutes one petabyte of ...
    prabhu K
    Sep 5, 2012 at 12:22 pm
    Sep 10, 2012 at 7:54 pm
  • [HTML-formatted message; no plain-text preview available]
    Akiyoshi Takahashi
    Sep 5, 2012 at 6:17 pm
    Sep 9, 2012 at 9:30 am
  • Hello, How can one find out which nodes the reduce tasks will run on? One of my jobs is running and it's completing all the map tasks. My map tasks write lots of intermediate data. The intermediate ...
    Abhay Ratnaparkhi
    Sep 3, 2012 at 2:19 pm
    Sep 5, 2012 at 6:39 pm
  • Hi, After reviewing the class's (not very complicated) code, I have some questions I hope someone can answer: - (more general question) Are there many use-cases for using DBInputFormat? Do most ...
    Yaron Gonen
    Sep 11, 2012 at 12:42 pm
    Sep 26, 2012 at 1:48 pm
  • well, it's worked for me in the past outside Hadoop itself ...
    Steve Loughran
    Sep 1, 2012 at 8:08 am
    Sep 6, 2012 at 2:48 pm
  • Hello all, I am testing the Hadoop recovery as per http://wiki.apache.org/hadoop/NameNode document. But instead of using an NFS share, I am copying to another directory. Then when I shut down the ...
    Artem Ervits
    Sep 17, 2012 at 9:39 pm
    Sep 19, 2012 at 3:10 pm
  • Hi, What happens when an existing (not new) datanode rejoins a cluster in the following scenarios: 1. Some of the blocks it was managing are deleted/modified? 2. The size of the blocks is now modified ...
    Mehul Choube
    Sep 11, 2012 at 7:15 am
    Sep 11, 2012 at 9:12 am
  • Hi all, Now I'd like to deploy a simple MapReduce job, written in Java, to a remote cluster within Eclipse. For the moment I've found this solution: 1) Put the hadoop conf file in the classpath 1) ...
    Alberto Cordioli
    Sep 20, 2012 at 2:15 pm
    Sep 21, 2012 at 4:43 pm
  • Our hadoop version is hadoop-0.20-append+4. We have configured the rack awareness in the namenode. But when I add new datanode, and update the topology data file, and restart the datanode, I just see ...
    Jameson Li
    Sep 13, 2012 at 3:04 am
    Sep 14, 2012 at 4:29 am
  • Hi all, Has anybody had success compiling MapReduce jobs with the BigInsights distro on a Mac? It seems to require the IBM Java JDK, which might not be available on Mac. Is there a way to work around it? ...
    Serge Blazhiyevskyy
    Sep 19, 2012 at 9:15 pm
    Sep 25, 2012 at 6:31 pm
  • I have uploaded some images to the hdfs hadoop user/combo/ directory and now want to show those images in a JSP. I have configured Tomcat and Hadoop properly and I'm able to do uploads. Any ideas on how to build ...
    Visioner Sadak
    Sep 6, 2012 at 2:57 pm
    Sep 12, 2012 at 6:55 am
  • Hi all, I have written a simple MR program which partitions a file into multiple files based on the clustering result of the points in this file. Here is my code: --- private int run() throws ...
    Jason Yang
    Sep 17, 2012 at 1:51 pm
    Sep 18, 2012 at 6:08 am
  • Hi, I would like to perform a map-side join of two large datasets where dataset A consists of m*n elements and dataset B consists of n elements. For the join, every element in dataset B needs to be ...
    Sigurd Spieckermann
    Sep 10, 2012 at 9:58 am
    Sep 17, 2012 at 1:51 pm
  • Hi all, I ran a query on Hive on top of 90 million records that took 12 minutes to execute, and the same query on SQL Server took 8 minutes. My question is how can I make Hadoop's performance better. What ...
    Iwannaplay games
    Sep 5, 2012 at 6:19 am
    Sep 7, 2012 at 7:09 pm
  • Hi guys, Is there some 3rd-party monitoring tool that I can use to monitor the Hadoop cluster, especially one that can send me a notification/email when a job has failed? Thanks for any suggestion ...
    WangRamon
    Sep 6, 2012 at 8:33 am
    Sep 6, 2012 at 7:16 pm
  • I've been running up against the good old fashioned "replicated to 0 nodes" gremlin quite a bit recently. My system (a set of processes interacting with hadoop, and of course hadoop itself) runs for ...
    Keith Wiley
    Sep 4, 2012 at 4:42 pm
    Sep 5, 2012 at 3:42 am
  • Is it possible to write unit tests for the mapper's Map and the reducer's Reduce functions? -Ravi
    Ravi P
    Sep 26, 2012 at 8:19 pm
    Sep 26, 2012 at 9:31 pm
  • Hi guys, I'm experiencing a strange behavior when I use the Hadoop join-package. After running a job the result statistics show that my combiner has an input of 100 records and an output of 100 ...
    Sigurd Spieckermann
    Sep 25, 2012 at 8:33 am
    Sep 25, 2012 at 5:03 pm
  • Dear Team Members, I am working as a Linux administrator and I am interested in working with Hadoop. Please let me know where and how I can start learning. I would be very grateful for help with learning ...
    Munnavar Shaik
    Sep 13, 2012 at 5:38 pm
    Sep 13, 2012 at 7:20 pm
  • Hi, We have a requirement where we have to change our Hadoop cluster's replication factor without restarting the cluster. We are running our cluster on Amazon EMR. Can you please suggest the way to ...
    Uddipan Mukherjee
    Sep 5, 2012 at 6:03 pm
    Sep 5, 2012 at 8:41 pm
  • Hello, We have a 15-node cluster and right now we don't have Kerberos implemented, but we urgently want to secure the cluster. Right now anyone who knows the IP of the NameNode can just download the ...
    Shin Chan
    Sep 28, 2012 at 9:24 am
    Sep 28, 2012 at 4:18 pm
  • Hi all. We're using Hadoop 1.0.3. We need to pick up a set of large (4+GB) files when they've finished being written to HDFS by a different process. There doesn't appear to be an API specifically for ...
    Peter Sheridan
    Sep 25, 2012 at 4:29 pm
    Sep 27, 2012 at 10:04 pm
  • Hi all, I have to join two large datasets A and B. I preprocess both datasets by parsing the source text files and creating custom datatypes ADT and BDT out of them. Now I have to join these data ...
    Oliver B. Fischer
    Sep 26, 2012 at 1:19 pm
    Sep 26, 2012 at 2:49 pm
  • With speculative execution enabled, Hadoop can run a task attempt on more than 1 node. If the mapper is using MultipleOutputs, then the second attempt (or sometimes even all) fails to create the output file because ...
    Radim Kolar
    Sep 12, 2012 at 10:52 pm
    Sep 14, 2012 at 5:31 pm
  • Hi all, I have a question about how the pseudo-distributed Hadoop cluster works: as many map tasks are submitted to the pseudo-distributed Hadoop cluster, does Hadoop run each mapper in ...
    Jason Yang
    Sep 14, 2012 at 6:04 am
    Sep 14, 2012 at 8:21 am
  • Observe: ~/ $ hd fs -put test /test put: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create file/test. Name node is in safe mode. ~/ $ hadoop dfsadmin -safemode leave Safe mode ...
    Keith Wiley
    Sep 4, 2012 at 5:08 pm
    Sep 9, 2012 at 10:39 pm
  • SNN: Hi, when I start my cluster (with start-dfs.sh), secondary namenodes are created on all the machines in conf/slaves. I set conf/masters to a single different machine (along with dfs.http.address ...
    Surfer
    Sep 4, 2012 at 8:22 am
    Sep 5, 2012 at 7:47 am
  • I would like to use the CombineFileInputFormat in a sequence of mapreduce jobs that run on Hadoop 20.2. I noticed that the class was in a mapred package, rather than in the mapreduce package. When I ...
    Anna Lahoud
    Sep 27, 2012 at 8:02 pm
    Sep 28, 2012 at 12:17 pm
  • Hi all, I have been stuck on a weird problem for a long time, so I was wondering whether anyone could give me some advice. I have a MapReduce job in which: 1. the mapper would read a whole file as a split ...
    Jason Yang
    Sep 24, 2012 at 1:26 pm
    Sep 24, 2012 at 4:38 pm
  • Hello experts, could you judge whether webhdfs or hdfsproxy is faster? Is hdfsproxy slower because it uses HTTPS only, or can we also use HTTP in hdfsproxy? It's also mentioned in this below ...
    Visioner Sadak
    Sep 19, 2012 at 1:49 pm
    Sep 20, 2012 at 6:15 am
  • Hello, I am trying to set up Hadoop 1.0.3 on my MacBook Pro in pseudo-distributed mode. After download / install / setup of config files, I ran the following namenode format command as suggested in the ...
    Jason Huang
    Sep 14, 2012 at 3:02 pm
    Sep 15, 2012 at 5:58 pm
  • Hi! I'm using JobControl (v. 1.0.3) to chain two MapReduce applications. It works and creates output data, but it doesn't give me back information messages as number of mappers, number of records in ...
    Piter85 Piter85
    Sep 12, 2012 at 9:35 am
    Sep 12, 2012 at 1:06 pm
  • Hi, I want to make sure my understanding about task assignment in hadoop is correct or not. When scanning a file with multiple tasktrackers, I am wondering how a task is assigned to each ...
    Hiroyuki Yamada
    Sep 11, 2012 at 1:02 pm
    Sep 12, 2012 at 9:45 am
  • Hi all, I was wondering what the default number of reducers is if I don't set it in the configuration. Will it change dynamically according to the output volume of the mapper? -- YANG, Lin
    Jason Yang
    Sep 11, 2012 at 11:24 am
    Sep 11, 2012 at 12:38 pm
  • Sorry for admin-only content: can we remove this address from the list? I get the bounce message below whenever I post to ...
    Tony Burton
    Sep 10, 2012 at 9:46 am
    Sep 11, 2012 at 9:56 am
  • Hello hadoopers! In a reduce-only Hadoop job, input files are handled by the identity mapper and sent to the reducers without modification. In one of my jobs I was surprised to see the job failing in ...
    JOAQUIN GUANTER GONZALBEZ
    Sep 6, 2012 at 2:51 pm
    Sep 7, 2012 at 3:49 am
  • Using hadoop with mahout in a local filesystem/non-hdfs config for debugging purposes inside Intellij IDEA. When I run one particular part of the analysis I get the error below. I didn't write the ...
    Pat Ferrel
    Sep 3, 2012 at 3:59 pm
    Sep 6, 2012 at 11:53 pm
  • Hello again, i just wanted to keep you updated, in case anybody reads this and is interested in the mortbay-issue: I applied the wordcount example to a big input file and everything worked fine. I ...
    Björn-Elmar Macek
    Sep 5, 2012 at 11:56 am
    Sep 5, 2012 at 1:53 pm
  • Hi: I wish to use Hadoop streaming to run a program which requires specific PATH and CLASSPATH variables. I have set these two variables in both "/etc/profile" and "~/.bashrc" on all slaves (and ...
    Andy Xue
    Sep 4, 2012 at 7:42 am
    Sep 4, 2012 at 8:52 am
  • Generally in Hadoop the map function will be executed by all the data nodes on the input data set. Against this, how can I do the following? I have some filter programs, and what I want to do is each data ...
    Mallik arjun
    Sep 3, 2012 at 4:20 pm
    Sep 4, 2012 at 8:04 am
  • Hi, I am running Hadoop 1.03 in Pseudo distributed mode, on a quad core Xeon processor with hyper-threading enabled. When I submit a job to process a file of size about 1.6 GB, only two concurrent ...
    Shing Hing Man
    Sep 29, 2012 at 7:06 pm
    Sep 29, 2012 at 9:15 pm
  • Hi... Could someone help me with the following scenario? I want to implement a job which should get 2 mapper outputs and send them to 1 reducer. The attached image shows the flow I wanted.... Normal flow is ...
    Kumudu harshani
    Sep 23, 2012 at 3:40 am
    Sep 23, 2012 at 1:31 pm
  • Hi all, I'm greatly confused about the spill/sort/merge thing going on during the Map phase. Here are some stats: - io.sort.mb = 256 MB (80% spill threshold) - io.sort.factor = 64 - spills performed ...
    Martin Dobmeier
    Sep 13, 2012 at 2:05 pm
    Sep 22, 2012 at 8:43 am
  • Dear All, I am currently deploying Hadoop 1.0.3 on my Debian 32-bit Linux. I think I need a 32-bit taskcontroller binary. However, I found that the binary files provided in Hadoop 1.0.3 are 64-bit. I ...
    Yongzhi Wang
    Sep 18, 2012 at 3:05 am
    Sep 19, 2012 at 4:18 am
  • Hi there, Today we started deploying MapR M3 into production. However, we're having problems completing jobs. During a typical job, the job returns this: 12/09/11 16:33:20 INFO mapred.JobClient: Task Id ...
    Robin Verlangen
    Sep 13, 2012 at 12:39 pm
    Sep 13, 2012 at 5:40 pm
  • Hi, I have a sequence file written by SequenceFileOutputFormat with key/value type of <Text, BytesWritable>, like below: Text BytesWritable ...
    Jason Yang
    Sep 12, 2012 at 3:16 am
    Sep 12, 2012 at 5:58 am
  • Hi all, I am running hadoop-0.20.2 on a single-node cluster. When I run the command hadoop fsck / it shows the error: Exception in thread "main" java.net.UnknownHostException: http at ...
    Yogesh dhari
    Sep 11, 2012 at 3:55 pm
    Sep 11, 2012 at 4:47 pm
  • Hi, I'm new to Hadoop and I've just played around with MapReduce. I would like to check whether my understanding of Hadoop is correct, and I would appreciate it if anyone could correct me if I'm wrong. I ...
    Elaine Gan
    Sep 11, 2012 at 1:56 am
    Sep 11, 2012 at 6:42 am
  • Thanks for the deeper explanation. Now I understand what you meant. Either way, any clustering process requires calculating the distance of all points (not between all the points, but of all of them ...
    Dexter morgan
    Sep 2, 2012 at 4:26 pm
    Sep 10, 2012 at 1:31 pm
Group Overview
group: hdfs-user @
categories: hadoop
discussions: 168
posts: 710
users: 218
website: hadoop.apache.org...
irc: #hadoop

218 users for September 2012

  • Harsh J: 80 posts
  • Jason Yang: 27 posts
  • Bertrand Dechoux: 26 posts
  • Hemanth Yamijala: 25 posts
  • Michel Segel: 25 posts
  • Bejoy KS: 24 posts
  • Visioner Sadak: 24 posts
  • Tony Burton: 14 posts
  • Sigurd Spieckermann: 12 posts
  • Narasingu Ramesh: 10 posts
  • Steve Loughran: 10 posts
  • Joshi, Rekha: 8 posts
  • Keith Wiley: 8 posts
  • Vinod Kumar Vavilapalli: 8 posts
  • Artem Ervits: 7 posts
  • Björn-Elmar Macek: 7 posts
  • Yongzhi Wang: 7 posts
  • Jason Huang: 6 posts
  • Sathyavageeswaran: 6 posts
  • Serge Blazhiyevskyy: 6 posts