130 discussions - 495 posts

  • Hi all, can anyone please tell me how to control the split size? I have one big file which will be split according to the number of maps. The input file is binary and contains some objects. I do not want ...
    Teodor Macicas
    Aug 23, 2010 at 10:38 am
    Sep 7, 2010 at 7:50 pm
  • Hey friends, I have a doubt. Suppose I want to pass a program-specific config parameter through the command line and, after reading it, assign it to the desired local variable. For example, suppose I am ...
    Deepak Diwakar
    Aug 9, 2010 at 6:09 pm
    Aug 11, 2010 at 2:44 pm
  • The namenode log for a Hadoop-0.20 installation contains this error message: "/var/lib/hadoop-0.20/cache/hadoop/dfs/name is in an inconsistent state". This directory does not exist and I would like ...
    Cliff palmer
    Aug 23, 2010 at 1:16 pm
    Aug 25, 2010 at 6:17 pm
  • As part of our experimentation, the plan is to pull 4 slave nodes out of an 8-slave/1-master cluster. With replication factor set to 3, I thought losing half of the cluster may be too much for hdfs to ...
    Steve Kuo
    Aug 6, 2010 at 5:43 am
    Aug 9, 2010 at 4:39 am
  • Read-only NFS? I recently looked into an NFS-related unit-test bug (MR-2041), but those failures were due to directory creation and/or permissions-setting (i.e., writing), apparently timing-related. ...
    Greg Roelofs
    Aug 31, 2010 at 7:50 pm
    Sep 1, 2010 at 5:21 pm
  • Dear All, I am working on an algorithm that is somewhat like "clustering" a data set. The algorithm is like this: the data set is broken into N (~40) chunks, and each chunk contains 2,000 lines. For each ...
    Boyu Zhang
    Aug 12, 2010 at 7:59 pm
    Aug 13, 2010 at 8:21 pm
  • I have a question regarding outputting Writable objects. I thought all Writables know how to serialize themselves to output. For example I have an ArrayWritable of strings (or Texts) but when I ...
    Aug 31, 2010 at 4:59 pm
    Sep 2, 2010 at 8:01 pm
  • Hi, do all nodes send their System.out.println() logs to the same place in Hadoop job console? I don't see the mixture I would expect. Thank you, Mark
    Mark Kerzner
    Aug 30, 2010 at 8:36 pm
    Sep 1, 2010 at 7:37 am
  • My cluster consists of 4 nodes: 1 namenode and 3 datanodes. It works well functioning as HDFS, but when I run mapreduce tasks, it takes quite a long time and there are quite a lot of too many ...
    Aug 18, 2010 at 7:02 am
    Aug 18, 2010 at 2:12 pm
  • Hello, I've been confused by this problem for a week already. Please share if you know what could be causing this. Thanks in advance! Hadoop version: 0.20.2 Machines: machine 1 - ...
    Kevin Chen
    Aug 13, 2010 at 10:17 am
    Aug 17, 2010 at 8:09 am
  • I came across the tutorial on creating a custom partitioner on Hadoop ( http://philippeadjiman.com/blog/2009/12/20/hadoop-tutorial-series-issue-2-getting-started-with-customized-partitioning/) I am ...
    Mithila Nagendra
    Aug 25, 2010 at 4:41 pm
    Aug 26, 2010 at 10:32 pm
  • Apologies for the newbie question but I think I'm a little lost. Hadoop 20.2 came out in Feb 2010 but the fix I'm looking for is in Hadoop 20.3, ...
    Pete Tyler
    Aug 1, 2010 at 11:11 pm
    Aug 15, 2010 at 12:39 am
  • Hi, I am trying to write a thesis proposal for my PhD on the usage of hadoop in cloud computing. I need to find some open problems in cloud computing which can be addressed by hadoop. I would ...
    Jackob Carlsson
    Aug 10, 2010 at 2:01 pm
    Aug 11, 2010 at 5:56 pm
  • I am currently stuck with hadoop namenode that won't start. When I type "start-all.sh", everything prints out fine. But when I type "jps", only JobTracker is activated. When this happens, I usually ...
    Edward choi
    Aug 5, 2010 at 4:10 am
    Aug 9, 2010 at 9:38 am
  • Hi, I'm trying to set a variable in my mapper class by reading an argument from the command line and then passing the entry to the mapper from main. Is this possible? public static void main(String[] ...
    Erik Test
    Aug 2, 2010 at 4:17 pm
    Aug 3, 2010 at 9:15 pm
  • So I am administering a 10+ node hadoop cluster and everything is going swimmingly. Unfortunately, some relatively critical data is now being stored on the cluster and I am being asked to create a ...
    Aug 3, 2010 at 1:55 pm
    Aug 3, 2010 at 4:08 pm
  • Hi all, I am using hadoop 0.20.2 and I want to sort a huge amount of data. I've read about Terasort [from examples], but right now it uses 10-byte char keys. Changing keys from char to integer wasn't ...
    Teodor Macicas
    Aug 1, 2010 at 9:24 pm
    Aug 2, 2010 at 10:51 pm
  • Hi, at the very beginning, I ran: hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+' successfully, but when I ran: nutch crawl url -dir crawl -depth 3, I got errors: ...
    Alex Luya
    Aug 15, 2010 at 1:04 pm
    Sep 21, 2010 at 10:20 pm
  • Is there a public ivy repo that has the latest hadoop? Thanks
    Aug 27, 2010 at 3:05 pm
    Sep 4, 2010 at 7:12 pm
  • How can I add jars to Hadoop's classpath when running MapReduce jobs for the following situations? 1) Assuming that the jars are local to the nodes that are running the job. 2) The jobs are only local to the ...
    Aug 29, 2010 at 5:29 am
    Sep 1, 2010 at 4:31 pm
  • How should I be creating a new Job instance in 0.21? It looks like Job(Configuration conf, String jobName) has been deprecated. It looks like Job(Cluster cluster) is the new way but I'm unsure of how ...
    Aug 29, 2010 at 11:39 pm
    Aug 31, 2010 at 2:18 am
  • Hi all, does anyone know how to efficiently concatenate 2 different files in HDFS, as well as how to split a file into 2 different ones? I did this by reading from one file and writing to another. Of course, ...
    Teodor Macicas
    Aug 19, 2010 at 10:59 am
    Aug 20, 2010 at 9:03 pm
  • Hi there, does anybody know of a good combination of CentOS version and JDK version that works stably? I am using centos version Linux 2.6.18-194.8.1.el5.centos.plus #1 SMP Wed Jul 7 11:45:38 EDT ...
    Jinsong Hu
    Aug 13, 2010 at 9:18 pm
    Aug 16, 2010 at 3:00 pm
  • Hi! Due to network reconfigurations, I need to change the hostnames of some of my worker nodes, i.e. the nodes running tasktracker and datanode. I need to do this to make my hostname naming schema ...
    Erik Forsberg
    Aug 10, 2010 at 10:52 am
    Aug 10, 2010 at 6:04 pm
  • We are looking to enable LZO compression of the map outputs on our Cloudera 0.20.1 cluster. It seems there are various sets of instructions available and I am curious what your thoughts are regarding ...
    Bobby Dennett
    Aug 5, 2010 at 10:53 pm
    Aug 9, 2010 at 2:35 am
  • Hi everyone, could anyone please help me to understand the function of a combiner? Thanks in advance, Jackob
    Jackob Carlsson
    Aug 2, 2010 at 3:39 pm
    Aug 2, 2010 at 9:28 pm
  • Hello, how can one determine the names of the files in a particular hadoop directory, programmatically?
    Denim Live
    Aug 25, 2010 at 7:36 am
    Oct 12, 2010 at 10:57 am
  • Will data stored in a compressed format affect mapreduce job speed? Will it increase or decrease, or is there a more complex relationship between the two? Can anybody give a detailed explanation? 2010-08-26 ...
    Aug 26, 2010 at 2:33 am
    Aug 26, 2010 at 7:58 pm
  • Requirement: I want to get rid of a data node machine. But it has useful data that is still in use. So, I want to move all its files/blocks to other live data nodes in the same cluster. Question: I ...
    Jiang licht
    Aug 20, 2010 at 8:32 pm
    Aug 21, 2010 at 12:45 am
  • When my Java based client creates a mapreduce Job instance I can set the job name, which is readable by the map and reduce classes. However, so that I can write some generalized map and reduce ...
    Pete Tyler
    Aug 13, 2010 at 7:56 pm
    Aug 14, 2010 at 12:55 am
  • I am building a monitoring tool for my Hadoop cluster. I have been able to collect most of the data I need from JobClient. How can I get JobConf for a job? My monitoring tool needs the outputPath and ...
    Patek tek
    Aug 12, 2010 at 3:21 am
    Aug 13, 2010 at 1:07 am
  • Hello, Error initializing attempt_201008101445_0212_r_000002_0: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for ...
    Rares Vernica
    Aug 12, 2010 at 5:38 pm
    Aug 12, 2010 at 8:42 pm
  • Hi all, scheduler, most installations are using Hadoop's Fair Scheduler. Based on features and our requirements, we're leaning towards using the Capacity Scheduler; however, there is some concern ...
    Bobby Dennett
    Aug 11, 2010 at 10:05 pm
    Aug 12, 2010 at 12:56 pm
  • Hello, I set "mapred.task.cache.levels" to 1 so that I have only data-local-map tasks. Still, by looking at the data-local-maps counter it seems not all map tasks are local. I checked each map task ...
    Rares Vernica
    Aug 10, 2010 at 9:58 pm
    Aug 11, 2010 at 5:32 pm
  • I am not able to find any command or parameter in core-default.xml to configure the secondary namenode on a separate machine. I have a 4-node cluster with jobtracker, master, secondary namenode on one ...
    Adarsh Sharma
    Aug 18, 2010 at 7:36 am
    Dec 8, 2010 at 8:56 pm
  • 2010-08-27 shangan From: shangan Sent: 2010-08-27 11:43:32 To: hadoop-user Cc: Subject: mapreduce attempts killed there's quite a lot failed/killed task attempts while sometimes there's none of them when ...
    Aug 27, 2010 at 4:00 am
    Aug 27, 2010 at 8:33 am
  • Hi, I am trying to convert Mahout xmlInputFormat to the new API but it is not working. The problem, I think, is that in the old API we have a next method which takes a key and value and we can set it in the ...
    Shuja Rehman
    Aug 25, 2010 at 9:23 pm
    Aug 26, 2010 at 4:07 am
  • What is the preferred way of managing multiple configurations, i.e. development, production, etc.? Is there some way I can tell hadoop to use a separate conf directory other than ${hadoop_home}/conf? I ...
    Aug 18, 2010 at 5:29 pm
    Aug 19, 2010 at 2:45 pm
  • Hello, I'm trying to determine how to split a file evenly so each map task has a similar work load. The input I will have is a list of coordinates like this: 2,8 3,9 4,10 5,7 6,2 7,3 8,1 9,0 10,4 ...
    Erik Test
    Aug 17, 2010 at 2:51 pm
    Aug 18, 2010 at 5:54 am
  • Hi, I am not sure if this is the right place to post this doubt. We tried implementing M/R on a different distributed file system developed in house. Every time I run more than 30 threads I get an error ...
    Aug 16, 2010 at 7:05 am
    Aug 16, 2010 at 3:59 pm
  • Hej, I am configuring a Hadoop cluster with 1 master (jobtracker+namenode+secondary namenode) and 2 slaves (tasktracker+datanode). The 2 datanodes logs do not show any information but the datanodes ...
    Aug 12, 2010 at 5:15 pm
    Aug 13, 2010 at 9:36 am
  • Someone sent this email to the commons-user list a while back, but it seems like it slipped through the cracks. We're starting to dig into some hard-core Hadoop development and just came upon this ...
    David Rosenstrauch
    Aug 4, 2010 at 3:38 pm
    Aug 4, 2010 at 9:10 pm
  • Hi all, I want to access through map-reduce Java code ... please guide me. -- View this message in context: http://old.nabble.com/how-to-access-HDFS-file-system.-tp29333807p29333807.html Sent from ...
    Aug 3, 2010 at 10:16 am
    Aug 3, 2010 at 11:31 am
  • Is the number of nodes to be decommissioned bounded by replication factor? E.g. with a replication factor of 2, is it safe to decommission 2 data nodes at a time? I guess probably it is fine if ...
    Jiang licht
    Aug 31, 2010 at 7:58 pm
    Sep 1, 2010 at 10:20 am
  • Hi, I am running some basic setup and test to know about hadoop. When I try to start nodes I get this error. I am already using java 1.6. Can someone please help? # echo $JAVA_HOME /root/jdk1.6.0_21/ ...
    Mohit Anchlia
    Aug 31, 2010 at 12:52 am
    Sep 1, 2010 at 2:02 am
  • When I configure my job to use a KeyValueTextInputFormat doesn't that imply that the key and value to my mapper will be both Text? I have it set up like this and I am using the default Mapper.class ...
    Aug 26, 2010 at 6:08 pm
    Aug 28, 2010 at 4:01 am
  • JIRA seems to be down FYI. Database errors are being returned: *Cause: * org.apache.commons.lang.exception.NestableRuntimeException: com.atlassian.jira.exception.DataAccessException: ...
    Bill Graham
    Aug 26, 2010 at 3:48 am
    Aug 26, 2010 at 7:03 am
  • I was trying to get Ganglia 3.1 to work with the stable hadoop-0.20.2 version from Apache. I patched this release from HADOOP-4675 using HADOOP-4675-v7.patch as suggested by CDH3 release notes [1] I ...
    Aug 24, 2010 at 1:28 pm
    Aug 25, 2010 at 12:59 pm
  • In the current hadoop documentation, it is "hadoop balancer [-threshold <threshold>]" to start a balancer, and to stop the balancer press ctrl-c. But in some other places (YDN and older hadoop version ...
    Jiang licht
    Aug 24, 2010 at 7:14 pm
    Aug 24, 2010 at 9:22 pm
  • From what I understand the InputSplit is a byte slice of a particular file which is then handed off to an individual mapper for processing. Is the size of the InputSplit equal to the hadoop block ie ...
    Aug 20, 2010 at 1:48 am
    Aug 20, 2010 at 1:27 pm
Group Overview
group: common-user @

136 users for August 2010

  • Jiang licht: 24 posts
  • Harsh J: 22 posts
  • Mark: 22 posts
  • Edward Capriolo: 17 posts
  • Hemanth Yamijala: 15 posts
  • Allen Wittenauer: 13 posts
  • Gang Luo: 13 posts
  • Teodor Macicas: 12 posts
  • Owen O'Malley: 11 posts
  • Shangan: 11 posts
  • David Rosenstrauch: 10 posts
  • Xiujin yang: 10 posts
  • Erik Test: 9 posts
  • Patek tek: 9 posts
  • Sudhir Vallamkondu: 9 posts
  • Amareshwari Sri Ramadasu: 8 posts
  • Michael Segel: 8 posts
  • Abhishek sharma: 7 posts
  • Brian Bockelman: 7 posts
  • Cliff palmer: 7 posts