-
Hi, I'm following the steps at http://wiki.apache.org/hadoop/HowToContribute for building hadoop in preparation for submitting a patch. I've checked out the trunk, and when I run "mvn test" from the ...
Tony Burton
Sep 5, 2012 at 1:50 pm
Sep 11, 2012 at 3:43 pm -
Hi Users, please clarify the questions below. 1. To load one petabyte of data into HDFS/Hive within 10 minutes, how many slave (DataNode) machines are required? 2. Within 10 minutes one petabyte of ...
prabhu K
Sep 5, 2012 at 12:22 pm
Sep 10, 2012 at 7:54 pm -
Legal Matter
Akiyoshi Takahashi
Sep 5, 2012 at 6:17 pm
Sep 9, 2012 at 9:30 am -
Hello, How can one get to know the nodes on which reduce tasks will run? One of my job is running and it's completing all the map tasks. My map tasks write lots of intermediate data. The intermediate ...
Abhay Ratnaparkhi
Sep 3, 2012 at 2:19 pm
Sep 5, 2012 at 6:39 pm -
Hi, After reviewing the class's (not very complicated) code, I have some questions I hope someone can answer: - (more general question) Are there many use-cases for using DBInputFormat? Do most ...
Yaron Gonen
Sep 11, 2012 at 12:42 pm
Sep 26, 2012 at 1:48 pm -
well, it's worked for me in the past outside Hadoop itself ...
Steve Loughran
Sep 1, 2012 at 8:08 am
Sep 6, 2012 at 2:48 pm -
Hello all, I am testing the Hadoop recovery as per http://wiki.apache.org/hadoop/NameNode document. But instead of using an NFS share, I am copying to another directory. Then when I shut down the ...
Artem Ervits
Sep 17, 2012 at 9:39 pm
Sep 19, 2012 at 3:10 pm -
Hi, What happens when an existing (not new) datanode rejoins a cluster in the following scenarios: 1. Some of the blocks it was managing are deleted/modified? 2. The size of the blocks is now modified ...
Mehul Choube
Sep 11, 2012 at 7:15 am
Sep 11, 2012 at 9:12 am -
Hi all, Now I'd like to deploy a simple MapReduce job, written in Java, to a remote cluster within Eclipse. For the moment I've found this solution: 1) Put the hadoop conf file in the classpath 1) ...
Alberto Cordioli
Sep 20, 2012 at 2:15 pm
Sep 21, 2012 at 4:43 pm -
Our Hadoop version is hadoop-0.20-append+4. We have configured rack awareness in the namenode. But when I add a new datanode, update the topology data file, and restart the datanode, I just see ...
Jameson Li
Sep 13, 2012 at 3:04 am
Sep 14, 2012 at 4:29 am -
Hi all, Has anybody had success compiling MapReduce jobs with the BigInsights distro on a Mac? It seems to require the IBM Java JDK, which might not be available on Mac. Is there a way to work around it? ...
Serge Blazhiyevskyy
Sep 19, 2012 at 9:15 pm
Sep 25, 2012 at 6:31 pm -
I have uploaded some images to the HDFS hadoop user/combo/ directory and now want to show those images in a JSP. I have configured Tomcat and Hadoop properly and I am able to do uploads. Any ideas on how to build ...
Visioner Sadak
Sep 6, 2012 at 2:57 pm
Sep 12, 2012 at 6:55 am -
Hi all, I have written a simple MR program which partitions a file into multiple files based on the clustering result of the points in this file. Here is my code: --- private int run() throws ...
Jason Yang
Sep 17, 2012 at 1:51 pm
Sep 18, 2012 at 6:08 am -
Hi, I would like to perform a map-side join of two large datasets where dataset A consists of m*n elements and dataset B consists of n elements. For the join, every element in dataset B needs to be ...
Sigurd Spieckermann
Sep 10, 2012 at 9:58 am
Sep 17, 2012 at 1:51 pm -
Hi all, I ran a query on Hive on top of 90 million records that took 12 minutes to execute, and the same query on SQL Server took 8 minutes. My question is how can I make Hadoop's performance better. What ...
Iwannaplay games
Sep 5, 2012 at 6:19 am
Sep 7, 2012 at 7:09 pm -
Hi guys, Is there some third-party monitoring tool that I can use to monitor the Hadoop cluster, especially one that can send me a notification/email when a job has failed? Thanks for any suggestion ...
WangRamon
Sep 6, 2012 at 8:33 am
Sep 6, 2012 at 7:16 pm -
I've been running up against the good old fashioned "replicated to 0 nodes" gremlin quite a bit recently. My system (a set of processes interacting with hadoop, and of course hadoop itself) runs for ...
Keith Wiley
Sep 4, 2012 at 4:42 pm
Sep 5, 2012 at 3:42 am -
Is it possible to write unit tests for the mapper's map function and the reducer's reduce function? -Ravi
Ravi P
Sep 26, 2012 at 8:19 pm
Sep 26, 2012 at 9:31 pm -
Hi guys, I'm experiencing a strange behavior when I use the Hadoop join-package. After running a job the result statistics show that my combiner has an input of 100 records and an output of 100 ...
Sigurd Spieckermann
Sep 25, 2012 at 8:33 am
Sep 25, 2012 at 5:03 pm -
Dear Team Members, I am working as a Linux administrator and I am interested in working on Hadoop. Please let me know where and how I can start learning. I would be very grateful for help with learning ...
Munnavar Shaik
Sep 13, 2012 at 5:38 pm
Sep 13, 2012 at 7:20 pm -
Hi, We have a requirement where we have to change our Hadoop cluster's replication factor without restarting the cluster. We are running our cluster on Amazon EMR. Can you please suggest the way to ...
Uddipan Mukherjee
Sep 5, 2012 at 6:03 pm
Sep 5, 2012 at 8:41 pm -
Hello, We have a 15-node cluster and right now we don't have Kerberos implemented. But we urgently want to secure the cluster. Right now anyone who knows the IP of the Namenode can just download the ...
Shin Chan
Sep 28, 2012 at 9:24 am
Sep 28, 2012 at 4:18 pm -
Hi all. We're using Hadoop 1.0.3. We need to pick up a set of large (4+GB) files when they've finished being written to HDFS by a different process. There doesn't appear to be an API specifically for ...
Peter Sheridan
Sep 25, 2012 at 4:29 pm
Sep 27, 2012 at 10:04 pm -
Hi all, I have to join two large datasets A and B. I preprocess both datasets by parsing the source text files and creating custom datatypes ADT and BDT out of them. Now I have to join these data ...
Oliver B. Fischer
Sep 26, 2012 at 1:19 pm
Sep 26, 2012 at 2:49 pm -
With speculative execution enabled, Hadoop can run a task attempt on more than one node. If the mapper is using MultipleOutputs, then the second attempt (or sometimes even all) fails to create the output file because ...
Radim Kolar
Sep 12, 2012 at 10:52 pm
Sep 14, 2012 at 5:31 pm -
Hi all, I have a question about how the pseudo-distributed Hadoop cluster works: when many map tasks are submitted to the pseudo-distributed Hadoop cluster, does Hadoop run each mapper in ...
Jason Yang
Sep 14, 2012 at 6:04 am
Sep 14, 2012 at 8:21 am -
Observe: ~/ $ hd fs -put test /test put: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create file/test. Name node is in safe mode. ~/ $ hadoop dfsadmin -safemode leave Safe mode ...
Keith Wiley
Sep 4, 2012 at 5:08 pm
Sep 9, 2012 at 10:39 pm -
I would like to use the CombineFileInputFormat in a sequence of mapreduce jobs that run on Hadoop 20.2. I noticed that the class was in a mapred package, rather than in the mapreduce package. When I ...
Anna Lahoud
Sep 27, 2012 at 8:02 pm
Sep 28, 2012 at 12:17 pm -
Hi all, I have been stuck on a weird problem for a long time, so I was wondering whether anyone could give me some advice. I have a MapReduce job in which: 1. the mapper would read a whole file as a split ...
Jason Yang
Sep 24, 2012 at 1:26 pm
Sep 24, 2012 at 4:38 pm -
Hello experts, could you judge whether WebHDFS or HdfsProxy is faster? Is HdfsProxy slower because it uses HTTPS only, or can we also use HTTP in HdfsProxy? It's also mentioned in this below ...
Visioner Sadak
Sep 19, 2012 at 1:49 pm
Sep 20, 2012 at 6:15 am -
Hello, I am trying to set up Hadoop 1.0.3 on my MacBook Pro in pseudo-distributed mode. After downloading, installing, and setting up the config files, I ran the following namenode format command as suggested in the ...
Jason Huang
Sep 14, 2012 at 3:02 pm
Sep 15, 2012 at 5:58 pm -
Hi! I'm using JobControl (v. 1.0.3) to chain two MapReduce applications. It works and creates output data, but it doesn't give me back informational messages such as the number of mappers, number of records in ...
Piter85 Piter85
Sep 12, 2012 at 9:35 am
Sep 12, 2012 at 1:06 pm -
Hi, I want to check whether my understanding of task assignment in Hadoop is correct. When scanning a file with multiple tasktrackers, I am wondering how a task is assigned to each ...
Hiroyuki Yamada
Sep 11, 2012 at 1:02 pm
Sep 12, 2012 at 9:45 am -
Hi all, I was wondering what the default number of reducers is if I don't set it in the configuration. Will it change dynamically according to the output volume of the Mapper? -- YANG, Lin
Jason Yang
Sep 11, 2012 at 11:24 am
Sep 11, 2012 at 12:38 pm -
Sorry for admin-only content: can we remove this address from the list? I get the bounce message below whenever I post to ...
Tony Burton
Sep 10, 2012 at 9:46 am
Sep 11, 2012 at 9:56 am -
Hello hadoopers! In a reduce-only Hadoop job, input files are handled by the identity mapper and sent to the reducers without modification. In one of my jobs I was surprised to see the job failing in ...
JOAQUIN GUANTER GONZALBEZ
Sep 6, 2012 at 2:51 pm
Sep 7, 2012 at 3:49 am -
Using hadoop with mahout in a local filesystem/non-hdfs config for debugging purposes inside Intellij IDEA. When I run one particular part of the analysis I get the error below. I didn't write the ...
Pat Ferrel
Sep 3, 2012 at 3:59 pm
Sep 6, 2012 at 11:53 pm -
Hello again, i just wanted to keep you updated, in case anybody reads this and is interested in the mortbay-issue: I applied the wordcount example to a big input file and everything worked fine. I ...
Björn-Elmar Macek
Sep 5, 2012 at 11:56 am
Sep 5, 2012 at 1:53 pm -
Hi: I wish to use Hadoop streaming to run a program which requires specific PATH and CLASSPATH variables. I have set these two variables in both "/etc/profile" and "~/.bashrc" on all slaves (and ...
Andy Xue
Sep 4, 2012 at 7:42 am
Sep 4, 2012 at 8:52 am -
Generally in Hadoop the map function will be executed by all the data nodes on the input data set. Against this, how can I do the following? I have some filter programs, and what I want to do is each data ...
Mallik arjun
Sep 3, 2012 at 4:20 pm
Sep 4, 2012 at 8:04 am -
Hi, I am running Hadoop 1.03 in Pseudo distributed mode, on a quad core Xeon processor with hyper-threading enabled. When I submit a job to process a file of size about 1.6 GB, only two concurrent ...
Shing Hing Man
Sep 29, 2012 at 7:06 pm
Sep 29, 2012 at 9:15 pm -
Hi... Could someone help me with the following scenario? I want to implement a job which should get 2 mapper outputs and send them to 1 reducer. The attached image shows the flow I want.... Normal flow is ...
Kumudu harshani
Sep 23, 2012 at 3:40 am
Sep 23, 2012 at 1:31 pm -
Hi all, I'm greatly confused about the spill/sort/merge thing going on during the Map phase. Here are some stats: - io.sort.mb = 256 MB (80% spill threshold) - io.sort.factor = 64 - spills performed ...
Martin Dobmeier
Sep 13, 2012 at 2:05 pm
Sep 22, 2012 at 8:43 am -
Dear All, I am currently deploying Hadoop 1.0.3 on my 32-bit Debian Linux. I think I need a 32-bit taskcontroller binary file. However, I found that the binary files provided in Hadoop 1.0.3 are 64-bit. I ...
Yongzhi Wang
Sep 18, 2012 at 3:05 am
Sep 19, 2012 at 4:18 am -
Hi there, Today we started deploying MapR M3 into production. However, we're having problems completing jobs. During a typical job, the job returns this: 12/09/11 16:33:20 INFO mapred.JobClient: Task Id ...
Robin Verlangen
Sep 13, 2012 at 12:39 pm
Sep 13, 2012 at 5:40 pm -
Hi, I have a sequence file written by SequenceFileOutputFormat with key/value type of <Text, BytesWritable>, like below: Text BytesWritable ...
Jason Yang
Sep 12, 2012 at 3:16 am
Sep 12, 2012 at 5:58 am -
Hi all, I am running hadoop-0.20.2 on a single-node cluster. When I run the command hadoop fsck / it shows an error: Exception in thread "main" java.net.UnknownHostException: http at ...
Yogesh dhari
Sep 11, 2012 at 3:55 pm
Sep 11, 2012 at 4:47 pm -
Hi, I'm new to Hadoop and I've just played around with MapReduce. I would like to check if my understanding of Hadoop is correct, and I would appreciate it if anyone could correct me if I'm wrong. I ...
Elaine Gan
Sep 11, 2012 at 1:56 am
Sep 11, 2012 at 6:42 am -
Thanks for the deeper explanation. Now I understand what you meant. Either way, any clustering process requires calculating the distance of all points (not between all the points, but of all of them ...
Dexter morgan
Sep 2, 2012 at 4:26 pm
Sep 10, 2012 at 1:31 pm
Group Overview
group | hdfs-user |
categories | hadoop |
discussions | 168 |
posts | 710 |
users | 218 |
website | hadoop.apache.org... |
irc | #hadoop |
218 users for September 2012