172 discussions - 794 posts

  • Hi Users, Please clarify the below questions. 1. To load one petabyte of data into HDFS/Hive within 10 minutes, how many slave (DataNode) machines are required? 2. Within 10 minutes one petabyte of ...
    prabhu K
    Sep 5, 2012 at 12:22 pm
    Sep 10, 2012 at 7:54 pm
  • [HTML-formatted message; no plain-text preview]
    Akiyoshi Takahashi
    Sep 5, 2012 at 6:17 pm
    Sep 9, 2012 at 9:30 am
  • Hello, How can one find out which nodes the reduce tasks will run on? One of my jobs is running and it's completing all the map tasks. My map tasks write lots of intermediate data. The intermediate ...
    Abhay Ratnaparkhi
    Sep 3, 2012 at 2:19 pm
    Sep 5, 2012 at 6:39 pm
  • Hi, I am going to do video analytics using Hadoop. I am very interested in CPU+GPU architecture, especially using CUDA ( http://www.nvidia.com/object/cuda_home_new.html) and JCUDA ( ...
    Oleg Ruchovets
    Sep 24, 2012 at 3:30 pm
    Oct 3, 2012 at 1:44 pm
  • hi, i know that some algorithms cannot be parallelized and adapted to the mapreduce paradigm. however, i have noticed that in most cases where i find myself struggling to express an algorithm in ...
    Jane Wayne
    Sep 26, 2012 at 3:36 pm
    Sep 26, 2012 at 7:39 pm
  • Hello all, I am testing the Hadoop recovery as per http://wiki.apache.org/hadoop/NameNode document. But instead of using an NFS share, I am copying to another directory. Then when I shut down the ...
    Artem Ervits
    Sep 17, 2012 at 9:39 pm
    Sep 19, 2012 at 3:10 pm
  • Hi, What happens when an existing (not new) datanode rejoins a cluster for following scenarios: 1. Some of the blocks it was managing are deleted/modified? 2. The size of the blocks are now modified ...
    Mehul Choube
    Sep 11, 2012 at 7:15 am
    Sep 11, 2012 at 9:12 am
  • Hi all, Now I'd like to deploy a simple MapReduce job, written in Java, to a remote cluster within Eclipse. For the moment I've found this solution: 1) Put the hadoop conf file in the classpath 1) ...
    Alberto Cordioli
    Sep 20, 2012 at 2:15 pm
    Sep 21, 2012 at 4:43 pm
  • Our Hadoop version is hadoop-0.20-append+4. We have configured rack awareness in the namenode. But when I add a new datanode, update the topology data file, and restart the datanode, I just see ...
    Jameson Li
    Sep 13, 2012 at 3:04 am
    Sep 14, 2012 at 4:29 am
  • Hi all, Has anybody had success compiling MapReduce jobs with the BigInsights distro on Mac? It seems to require the IBM Java JDK, which might not be available on Mac. Is there a way to work around it? ...
    Serge Blazhiyevskyy
    Sep 19, 2012 at 9:15 pm
    Sep 25, 2012 at 6:31 pm
  • I have uploaded some images to HDFS, in the hadoop user/combo/ directory, and now want to show those images in a JSP. I have configured Tomcat and Hadoop properly and am able to do uploads. Any ideas on how to build ...
    Visioner Sadak
    Sep 6, 2012 at 2:57 pm
    Sep 12, 2012 at 6:55 am
  • Hi all, I have written a simple MR program which partitions a file into multiple files based on the clustering result of the points in this file. Here is my code: --- private int run() throws ...
    Jason Yang
    Sep 17, 2012 at 1:51 pm
    Sep 18, 2012 at 6:08 am
  • Hi, I would like to perform a map-side join of two large datasets where dataset A consists of m*n elements and dataset B consists of n elements. For the join, every element in dataset B needs to be ...
    Sigurd Spieckermann
    Sep 10, 2012 at 9:58 am
    Sep 17, 2012 at 1:51 pm
  • Hi all, I ran a query on Hive over 90 million records that took 12 minutes to execute, and the same query on SQL Server took 8 minutes. My question is how can I make Hadoop's performance better. What ...
    Iwannaplay games
    Sep 5, 2012 at 6:19 am
    Sep 7, 2012 at 7:09 pm
  • Hi Guys, Is there some 3rd-party monitoring tool that I can use to monitor the Hadoop cluster, especially one that can send me a notification/email when a job fails? Thanks for any suggestion ...
    Sep 6, 2012 at 8:33 am
    Sep 6, 2012 at 7:16 pm
  • I've been running up against the good old fashioned "replicated to 0 nodes" gremlin quite a bit recently. My system (a set of processes interacting with hadoop, and of course hadoop itself) runs for ...
    Keith Wiley
    Sep 4, 2012 at 4:42 pm
    Sep 5, 2012 at 3:42 am
  • Is it possible to write unit tests for the mapper Map and reducer Reduce functions? -Ravi
    Ravi P
    Sep 26, 2012 at 8:19 pm
    Sep 26, 2012 at 9:31 pm
  • Hi guys, I'm experiencing a strange behavior when I use the Hadoop join-package. After running a job the result statistics show that my combiner has an input of 100 records and an output of 100 ...
    Sigurd Spieckermann
    Sep 25, 2012 at 8:33 am
    Sep 25, 2012 at 5:03 pm
  • Dear Team Members, I am working as a Linux Administrator and I am interested in working on Hadoop. Please let me know where and how I can start learning. I would be very grateful for help with learning ...
    Munnavar Shaik
    Sep 13, 2012 at 5:38 pm
    Sep 13, 2012 at 7:20 pm
  • Hi, We have a requirement where we have to change our Hadoop cluster's replication factor without restarting the cluster. We are running our cluster on Amazon EMR. Can you please suggest the way to ...
    Uddipan Mukherjee
    Sep 5, 2012 at 6:03 pm
    Sep 5, 2012 at 8:41 pm
  • Hello, We have a 15-node cluster and right now we don't have Kerberos implemented. But we urgently want to secure the cluster. Right now anyone who knows the IP of the Namenode can just download the ...
    Shin Chan
    Sep 28, 2012 at 9:24 am
    Sep 28, 2012 at 4:18 pm
  • Hi all. We're using Hadoop 1.0.3. We need to pick up a set of large (4+GB) files when they've finished being written to HDFS by a different process. There doesn't appear to be an API specifically for ...
    Peter Sheridan
    Sep 25, 2012 at 4:29 pm
    Sep 27, 2012 at 10:04 pm
  • Hi, I want to run the PiEstimator example from using the following command $hadoop job -submit pieestimatorconf.xml which contains all the info required by hadoop to run the job. E.g. the input file ...
    Varad Meru
    Sep 23, 2012 at 7:55 am
    Sep 26, 2012 at 4:47 pm
  • Hi all, I have to join two large datasets A and B. I preprocess both datasets by parsing the source text files and creating custom datatypes ADT and BDT out of them. Now I have to join these data ...
    Oliver B. Fischer
    Sep 26, 2012 at 1:19 pm
    Sep 26, 2012 at 2:49 pm
  • Hi, I'd like to announce Project Panthera, our open source efforts that showcase better data analytics capabilities on Hadoop/HBase (through both SW and HW improvements), available at ...
    Dai, Jason
    Sep 17, 2012 at 1:56 pm
    Sep 18, 2012 at 7:45 am
  • Hi, I am setting up a secured hdfs using Kerberos. I got NN, 2NN working just fine. However, DN cannot talk to NN and throws the following exception. I disabled the AES256 from keytab, which in ...
    Shumin Wu
    Sep 12, 2012 at 4:48 pm
    Sep 17, 2012 at 5:41 am
  • With speculative execution enabled, Hadoop can run a task attempt on more than 1 node. If the mapper is using MultipleOutputs, then the second attempt (or sometimes even all) fails to create the output file because ...
    Radim Kolar
    Sep 12, 2012 at 10:52 pm
    Sep 14, 2012 at 5:31 pm
  • Hi, all I have a question about how does the pseudo-distributed Hadoop cluster work: As many map tasks are submitted to the pseudo-distributed Hadoop cluster, does the hadoop run each mapper in ...
    Jason Yang
    Sep 14, 2012 at 6:04 am
    Sep 14, 2012 at 8:21 am
  • Observe: ~/ $ hd fs -put test /test put: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create file/test. Name node is in safe mode. ~/ $ hadoop dfsadmin -safemode leave Safe mode ...
    Keith Wiley
    Sep 4, 2012 at 5:08 pm
    Sep 9, 2012 at 10:39 pm
  • Hi, When I start my cluster (with start-dfs.sh), secondary namenodes are created on all the machines in conf/slaves. I set conf/masters to a single different machine (along with dfs.http.address ...
    Sep 4, 2012 at 8:22 am
    Sep 5, 2012 at 7:47 am
  • In local read, the BlockReaderLocal class uses the "static Map<Integer, LocalDatanodeInfo> localDatanodeInfoMap" property to store the local block file path and local meta file path. When I stop HDFS cluster or I ...
    Jlei liu
    Sep 27, 2012 at 6:20 am
    Oct 1, 2012 at 9:03 am
  • I would like to use the CombineFileInputFormat in a sequence of mapreduce jobs that run on Hadoop 20.2. I noticed that the class was in a mapred package, rather than in the mapreduce package. When I ...
    Anna Lahoud
    Sep 27, 2012 at 8:02 pm
    Sep 28, 2012 at 12:17 pm
  • Does anybody know why libhdfs.so is not found by package managers on CentOS 64 and OpenSuse64? I have an rpm which declares Hadoop as a dependency, but the package managers (KPackageKit, zypper, etc) ...
    Pastrana, Rodrigo (RIS-BCT)
    Sep 25, 2012 at 3:21 am
    Sep 25, 2012 at 7:30 pm
  • Hi all, I have been stuck on a weird problem for a long time, so I was wondering whether anyone could give me some advice. I have a MapReduce job, in which: 1. the mapper would read a whole file as a split ...
    Jason Yang
    Sep 24, 2012 at 1:26 pm
    Sep 24, 2012 at 4:38 pm
  • Hello experts, could you judge whether webhdfs or hdfsproxy is faster? Is hdfsproxy slower because it uses HTTPS only, or can we also use HTTP in hdfsproxy? It's also mentioned in this below ...
    Visioner Sadak
    Sep 19, 2012 at 1:49 pm
    Sep 20, 2012 at 6:15 am
  • Hello, I am trying to set up Hadoop 1.0.3 in my Macbook Pro in a pseudo-distributed mode. After download / install / setup config files I ran the following namenode format command as suggested in the ...
    Jason Huang
    Sep 14, 2012 at 3:01 pm
    Sep 15, 2012 at 5:58 pm
  • Hi! I'm using JobControl (v. 1.0.3) to chain two MapReduce applications. It works and creates output data, but it doesn't give me back information messages such as number of mappers, number of records in ...
    Piter85 Piter85
    Sep 12, 2012 at 9:35 am
    Sep 12, 2012 at 1:06 pm
  • Hi, I want to make sure my understanding about task assignment in hadoop is correct or not. When scanning a file with multiple tasktrackers, I am wondering how a task is assigned to each ...
    Hiroyuki Yamada
    Sep 11, 2012 at 1:02 pm
    Sep 12, 2012 at 9:45 am
  • Hi all, I was wondering what the default number of reducers is if I don't set it in the configuration. Will it change dynamically according to the output volume of the Mapper? -- YANG, Lin
    Jason Yang
    Sep 11, 2012 at 11:24 am
    Sep 11, 2012 at 12:38 pm
  • Sorry for admin-only content: can we remove this address from the list? I get the bounce message below whenever I post to ...
    Tony Burton
    Sep 10, 2012 at 9:46 am
    Sep 11, 2012 at 9:56 am
  • Hello hadoopers! In a reduce-only Hadoop job, input files are handled by the identity mapper and sent to the reducers without modification. In one of my jobs I was surprised to see the job failing in ...
    Sep 6, 2012 at 2:51 pm
    Sep 7, 2012 at 3:49 am
  • Hi: I wish to use Hadoop streaming to run a program which requires specific PATH and CLASSPATH variables. I have set these two variables in both "/etc/profile" and "~/.bashrc" on all slaves (and ...
    Andy Xue
    Sep 4, 2012 at 7:42 am
    Sep 4, 2012 at 8:52 am
  • Generally in Hadoop the map function will be executed by all the data nodes on the input data set. Against this, how can I do the following? I have some filter programs, and what I want to do is each data ...
    Mallik arjun
    Sep 3, 2012 at 4:20 pm
    Sep 4, 2012 at 8:04 am
  • Hi, I am running Hadoop 1.03 in Pseudo distributed mode, on a quad core Xeon processor with hyper-threading enabled. When I submit a job to process a file of size about 1.6 GB, only two concurrent ...
    Shing Hing Man
    Sep 29, 2012 at 7:06 pm
    Sep 29, 2012 at 9:15 pm
  • Hi... Could someone help me with the following scenario? I want to implement a job which should get 2 mapper outputs and send them to 1 reducer. The attached image shows the flow I wanted.... Normal flow is ...
    Kumudu harshani
    Sep 23, 2012 at 3:40 am
    Sep 23, 2012 at 1:31 pm
  • Hi, I need to run some benchmarking tests for a given mapreduce job on a *subset* of a 10-node Hadoop cluster. Not that it matters, but the current cluster settings allow for ~20 map slots and 10 ...
    Safdar Kureishy
    Sep 10, 2012 at 9:07 am
    Sep 23, 2012 at 1:32 am
  • Hi all, I'm greatly confused about the spill/sort/merge thing going on during the Map phase. Here are some stats: - io.sort.mb = 256 MB (80% spill threshold) - io.sort.factor = 64 - spills performed ...
    Martin Dobmeier
    Sep 13, 2012 at 2:05 pm
    Sep 22, 2012 at 8:43 am
  • Dear All, I am currently deploying Hadoop 1.0.3 on my Debian 32-bit Linux. I think I need a 32-bit taskcontroller binary file. However, I found the binary files provided in Hadoop 1.0.3 are 64-bit. I ...
    Yongzhi Wang
    Sep 18, 2012 at 3:05 am
    Sep 19, 2012 at 4:18 am
  • Hi there, Today we started deploying MapR M3 into production. However, we're having problems completing jobs. During a typical job, the job returns this: 12/09/11 16:33:20 INFO mapred.JobClient: Task Id ...
    Robin Verlangen
    Sep 13, 2012 at 12:39 pm
    Sep 13, 2012 at 5:40 pm
  • Hi, I have a sequence file written by SequenceFileOutputFormat with key/value type of <Text, BytesWritable>, like below: Text BytesWritable ...
    Jason Yang
    Sep 12, 2012 at 3:16 am
    Sep 12, 2012 at 5:58 am
Group Overview
group: common-user @

236 users for September 2012

  • Harsh J: 92 posts
  • Bertrand Dechoux: 33 posts
  • Jason Yang: 27 posts
  • Michel Segel: 27 posts
  • Bejoy KS: 25 posts
  • Hemanth Yamijala: 25 posts
  • Visioner Sadak: 24 posts
  • Tony Burton: 14 posts
  • Sigurd Spieckermann: 12 posts
  • Narasingu Ramesh: 10 posts
  • Steve Loughran: 10 posts
  • Vinod Kumar Vavilapalli: 9 posts
  • Jay Vyas: 8 posts
  • Joshi, Rekha: 8 posts
  • Keith Wiley: 8 posts
  • Björn-Elmar Macek: 7 posts
  • Hemanth Yamijala: 7 posts
  • Sathyavageeswaran: 7 posts
  • Yongzhi Wang: 7 posts
  • Artem Ervits: 6 posts