
55 discussions - 207 posts

  • Hi, I was trying to run the Sort example in Hadoop-0.20.2 over 200GB of input data using a 20-node cluster. HDFS is configured to use a 128MB block size (so 1600 maps are created) and a ...
    Virajith Jalaparti
    Jul 12, 2011 at 12:45 pm
    Jul 13, 2011 at 12:20 pm
  • Over the past week or two, I've run into an issue where MapReduce jobs hang or fail near completion. The percent completion of both map and reduce tasks is often reported as 100%, but the actual ...
    Kai Ju Liu
    Jul 8, 2011 at 12:43 am
    Aug 3, 2011 at 7:44 pm
  • Hello list, as a newbie I got a tricky use-case in mind which I want to implement with Hadoop to train my skillz. There is no real scenario behind that, so I can extend or shrink the problem to the ...
    Jul 18, 2011 at 5:04 pm
    Jul 19, 2011 at 8:00 pm
  • Hi all, I am using hadoop 0.20.2. I am setting the property mapred.tasktracker.map.tasks.maximum = 4 (same for reduce also) on my job conf but I am still seeing a max of only 2 map and reduce tasks on ...
    Praveen Peddi
    Jul 1, 2011 at 8:03 pm
    Jul 6, 2011 at 2:19 am
  • Hello everyone, I'm new to Hadoop and I'm trying to figure out how to design a M/R program to parse a file and generate a PMML file as output. What I would like to do is split a file by a keyword ...
    Erik T
    Jul 11, 2011 at 6:57 pm
    Jul 18, 2011 at 8:33 pm
  • Hi, I faced a problem that the jobs are still running after executing "hadoop job -kill jobId". I rebooted the cluster but the job still can not be killed. The hadoop version is 0.20.2. Any idea? ...
    Juwei Shi
    Jul 1, 2011 at 3:53 pm
    Jul 5, 2011 at 3:48 pm
  • hello everyone, I got an exception from my jobtracker's log file as follows: 2011-07-27 01:58:04,197 INFO org.apache.hadoop.mapred.JobTracker: Cleaning up the system directory 2011-07-27 01:58:04,230 ...
    Jul 27, 2011 at 7:29 am
    Jul 28, 2011 at 2:19 am
  • I'm back to trying to add libraries to the classpath instead of handing around a fat JAR. This time I've served up my directory full of JARs on NFS, which each node in my cluster has mounted at ...
    John Armstrong
    Jul 26, 2011 at 7:29 pm
    Jul 27, 2011 at 2:56 pm
  • Hi, I am new to Hadoop, and I apologize if this was answered before, or if this is not the right list for my question. I am trying to do the following: 1- Read monitoring information from slave nodes ...
    Jul 12, 2011 at 10:02 pm
    Jul 13, 2011 at 6:17 am
  • Hey all, I'm working on a project that uses a native c library. Although I can use DistributedCache as a way to distribute the c library, I'd like to use the jar to do the job. What I mean is packing ...
    Donghan (Jarod) Wang
    Jul 9, 2011 at 7:09 pm
    Jul 11, 2011 at 2:48 pm
  • Hi, I'm using the avro format both for input and output, for a mapper and a reducer. I would like to output multiple avro items with different schemata. For sequence files I would use the ...
    Vyacheslav Zholudev
    Jul 26, 2011 at 12:47 pm
    Aug 4, 2011 at 8:42 am
  • Hi, I am trying to access distributed cache in my custom output format but it does not work and file open in custom output format fails with file does not exist even though it physically does. Looks ...
    Mapred Learn
    Jul 29, 2011 at 5:28 pm
    Jul 29, 2011 at 6:23 pm
  • Hi All, I am getting the following error on running a job on about 12 TB of data. This happens before any mappers or reducers are launched. Also the job starts fine if I reduce the amount of input ...
    Gagan Bansal
    Jul 24, 2011 at 9:07 am
    Jul 25, 2011 at 5:45 pm
  • I am a newbie. Most examples show job.setOutputKeyClass(Text.class); is it possible to use job.setOutputKeyClass(MapWritable.class)? Because my key is a combination of values (src IP, src Port, dst ...
    Choonho Son
    Jul 20, 2011 at 12:03 am
    Jul 20, 2011 at 4:11 pm
  • All, I am getting the following errors during my MR jobs (see below). Ultimately the jobs finish well enough, but these errors do slow things down. I've done some reading and I understand that this ...
    Geoffry Roberts
    Jul 18, 2011 at 10:02 pm
    Jul 19, 2011 at 1:40 pm
  • recently we had some network issues with our cluster. this job used to take only a few minutes to complete and now it is taking over half an hour. when looking at the jobtracker's log i see it slowly getting ...
    Felix gao
    Jul 14, 2011 at 7:47 pm
    Jul 14, 2011 at 9:23 pm
  • Hi, In one of my jobs I am getting the following error. java.io.IOException: File X could only be replicated to 0 nodes, instead of 1 at ...
    Sudharsan Sampath
    Jul 5, 2011 at 12:43 pm
    Jul 5, 2011 at 7:59 pm
  • Hi, In my hadoop running example, the data output is compressed using gzip. I would like to create a small java program that decompresses the output. Can anyone give an example on how to decompress the ...
    Pedro Sa Costa
    Jul 5, 2011 at 3:24 pm
    Jul 5, 2011 at 4:51 pm
  • Hi all, We are basically working on a research project and I require some help regarding this. I had a few basic doubts regarding submission of Map-Reduce jobs in Hadoop. 1. How do I submit a ...
    Narayanan K
    Jul 1, 2011 at 5:59 am
    Jul 2, 2011 at 4:11 am
  • Over the past week or two, I've been seeing an issue where hard-to-reach (i.e. hard to ssh to) instances exhibit high load but low CPU. These instances are hosted in EC2, of type c1.xlarge with 4 ...
    Kai Ju Liu
    Jul 6, 2011 at 1:37 am
    Aug 2, 2011 at 4:42 am
  • In my map function, I need to know the number of reducers; the code segment in my program looks like this: JobConf job = new JobConf(driverClass.class); int numReducer=job.getNumReduceTasks(); but the ...
    Jul 26, 2011 at 9:09 am
    Jul 26, 2011 at 9:37 am
  • Moving this to mapreduce-user (this is the right list).. Could you please look at the TaskTracker logs around the time when you see the task failure. That might have something more useful for ...
    Devaraj Das
    Jul 11, 2011 at 6:21 am
    Jul 11, 2011 at 5:08 pm
  • Hey there, i am trying to add a new datanode/tasktracker to a currently running cluster. Is this feasible? And if yes, how do i change the masters, slaves and dfs.replication(in hdfs-site.xml) ...
    Paul Rimba
    Jul 1, 2011 at 3:57 am
    Jul 1, 2011 at 5:59 am
  • Hi, I am trying out the new api and it doesn't execute the Reducer.. not sure why it is so. code snippet: [code] Job job = new Job(); job.setJarByClass(HdpTest.class); ...
    Web service
    Jul 31, 2011 at 1:22 am
    Jul 31, 2011 at 7:49 pm
  • Hi I am using *Job *( http://hadoop.apache.org/common/docs/r0.20.1/api/org/apache/hadoop/mapreduce/Job.html) to submit a MR job on Hbase. I want to pass some values from my main() to Map and Reduce ...
    Narayanan K
    Jul 30, 2011 at 4:33 pm
    Jul 31, 2011 at 2:51 am
  • hello everyone, Now, I am running hadoop-0.20.2; my question is, how can i move to hadoop-0.20-append? I packaged the hadoop-0.20-append core jar, can i just replace the current ...
    Jul 28, 2011 at 8:52 am
    Jul 29, 2011 at 7:59 am
  • So I think I've figured out how to fix my problem with putting files on the distributed classpath by digging through the code Hadoop uses to process -libjars. If I say ...
    John Armstrong
    Jul 27, 2011 at 2:39 pm
    Jul 27, 2011 at 3:05 pm
  • Hello all, I am trying to write a MR program where the output from the mappers is dependent on the previous map processes. I understand that a job scheduler exists to control such processes. Would ...
    Ross Nordeen
    Jul 25, 2011 at 6:20 pm
    Jul 26, 2011 at 7:30 am
  • Hi, I followed the below instructions to compile the MRv2 code. http://svn.apache.org/repos/asf/hadoop/common/branches/MR-279/mapreduce/INSTALL I start the resourcemanager and then the nodemanager ...
    Praveen Sripati
    Jul 21, 2011 at 2:36 pm
    Jul 25, 2011 at 4:37 pm
  • All, I have a MR program that I feed in a list of IDs and it generates the unique comparison set as a result. Example: if I have a list {1,2,3,4,5} then the resulting output would be {2x1, 3x2, 3x1, ...
    Jul 14, 2011 at 10:15 pm
    Jul 15, 2011 at 6:37 pm
  • Hi, Is it possible to upgrade to a newer version of hadoop without bringing the cluster down? To my understanding it's not. But just wondering.. Thanks Sudharsan S
    Sudharsan Sampath
    Jul 8, 2011 at 5:51 am
    Jul 8, 2011 at 2:06 pm
  • hello, My mapreduce program is slow in a multi-node configuration. The reduce task is stuck at 16%. But the same program runs much faster in pseudo-mode (single node). what can i do? I have only ...
    Ranjith k
    Jul 1, 2011 at 8:09 am
    Jul 1, 2011 at 10:05 pm
  • Hello guys, Can anybody tell me which is the relation between map task and combine tasks? I would like to know if there is a 1:1 relation between them, or is a *:1 (many to one). For example: If I ...
    Lucian Iordache
    Jul 1, 2011 at 8:56 am
    Jul 1, 2011 at 2:24 pm
  • Hello Folks, I am trying to run a simplistic Hadoop pipes program (the typical wordcount example). Unfortunately I am seeing a bunch of errors while running it as follows: $ bin/hadoop pipes ...
    Decimus Phostle
    Jul 27, 2011 at 8:56 pm
    Aug 8, 2011 at 3:53 pm
  • Hello, I am trying to setup a MapReduce job so that the task JVMs are reused on each cluster node. Libraries used by my MapReduce job have a significant initialization time, mainly creating ...
    Brandon Vargo
    Jul 29, 2011 at 6:20 pm
    Jul 29, 2011 at 7:03 pm
  • I have a generic question about how the number of mapper tasks is calculated; as far as I know, the number is primarily based on the number of splits, say if I have 5 splits and I have 10 tasktrackers ...
    Anfernee Xu
    Jul 25, 2011 at 7:23 pm
    Jul 26, 2011 at 3:54 am
  • unsubscribe
    김 준영
    Jul 19, 2011 at 3:32 am
    Jul 20, 2011 at 5:49 am
  • I'd like to spread Hadoop across two physical clusters, one which is publicly accessible and the other which is behind a NAT. The NAT'd machines will only run TaskTrackers, not HDFS, and not Reducers ...
    Ben Clay
    Jul 18, 2011 at 7:54 pm
    Jul 19, 2011 at 12:24 am
  • All, I cannot confirm if this is an issue with my code / usage or if I am actually running into a framework issue. I just ran a job that uses the exact same method and it works perfectly which makes ...
    Jul 18, 2011 at 9:54 pm
    Jul 18, 2011 at 10:17 pm
  • All, For the first time I have tried to use the class ArrayWritable. All goes well enough until the Reducer tries to do a write. Then, I get the following exception: java.lang.RuntimeException: ...
    Geoffry Roberts
    Jul 18, 2011 at 7:58 pm
    Jul 18, 2011 at 9:22 pm
  • Hi, I'm trying to use java security in hadoop map reduce. I would like to set some security policies only for reduce tasks. Does hadoop offer this feature?
    Pedro Sa Costa
    Jul 11, 2011 at 5:27 am
    Jul 11, 2011 at 3:20 pm
  • Hi, In pseudo mode, when I have set the number of reducers to 2, I get the following error from a reduce task. 11/07/06 21:15:59 INFO mapreduce.Job: Task Id : attempt_201107062043_0002_r_000000_2, ...
    Shing Hing Man
    Jul 6, 2011 at 8:37 pm
    Jul 10, 2011 at 6:55 pm
  • hi all, i have some hive sql written by others, and will modify it in the future. but i can't access the hive environment in my company, but i can use hadoop and run jobs on it, so java code ...
    Ling cao
    Jul 6, 2011 at 9:20 am
    Jul 7, 2011 at 7:21 pm
  • Hi, I am trying to set up a Hadoop cluster (using hadoop-0.20.2) using a bunch of machines each of which have 2 interfaces, a control and an internal interface. I want only the internal interface to ...
    Virajith Jalaparti
    Jul 7, 2011 at 12:39 pm
    Jul 7, 2011 at 1:38 pm
  • Hi, Does every block of files in HDFS have to be the same file format when writing map-reduce applications? A more specific question is, when dealing with CSV files, can we have a header in the file? ...
    Xiaobo Gu
    Jul 6, 2011 at 10:03 am
    Jul 6, 2011 at 10:24 am
  • Hello guys, I have a problem with the table splits generation for a MapReduce job executing on an HBase table. By default, the table splits are the regions, having a startRow, an endRow and a ...
    Lucian Iordache
    Jul 4, 2011 at 2:18 pm
    Jul 4, 2011 at 3:37 pm
  • Hi, I am playing with a single node instance hadoop on Solaris 10 x64; when running map-reduce jobs, the following error occurred on reduce tasks, but map tasks complete successfully, I am sure the ...
    Xiaobo Gu
    Jul 30, 2011 at 11:29 am
    Jul 30, 2011 at 11:29 am
  • Moving to mapreduce-user@, bcc common-user@. Use JobControl: http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html#Job+Control Arun
    Arun C Murthy
    Jul 29, 2011 at 11:34 pm
    Jul 29, 2011 at 11:34 pm
  • Hello guys! I'm a beginner in using hadoop and I got a project to do some tests on Hive... I was given a 3 node cluster and I'm trying for some days now to manage to make it work... I installed ...
    Mathew Olma
    Jul 21, 2011 at 1:44 am
    Jul 21, 2011 at 1:44 am
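A recurring theme in the threads above is where the mapred.tasktracker.map.tasks.maximum property must be set. In Hadoop 0.20.x it is a TaskTracker daemon property, read once at daemon startup from mapred-site.xml on each worker node, so setting it on an individual job's conf has no effect — which is consistent with the "still seeing max of only 2" report. A sketch of the daemon-side setting (slot counts are illustrative):

```xml
<!-- mapred-site.xml on each TaskTracker node; takes effect only after a
     TaskTracker restart. Not a per-job setting. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
</property>
```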
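The thread about decompressing gzip-compressed job output needs nothing Hadoop-specific on the consumer side: plain-text output written through the gzip codec can be read back with java.util.zip from the JDK. A self-contained round-trip sketch (class and file names are illustrative):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GunzipExample {

    // Stream-copy a gzip-compressed file into a plain file.
    static void decompress(File in, File out) throws IOException {
        try (InputStream gis = new GZIPInputStream(new FileInputStream(in));
             OutputStream os = new FileOutputStream(out)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = gis.read(buf)) > 0) {
                os.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Round trip: write one gzip-compressed record, then decompress it.
        File gz = File.createTempFile("part-00000", ".gz");
        File txt = File.createTempFile("part-00000", ".txt");
        try (OutputStream gos = new GZIPOutputStream(new FileOutputStream(gz))) {
            gos.write("key1\tvalue1\n".getBytes("UTF-8"));
        }
        decompress(gz, txt);
        System.out.print(new String(Files.readAllBytes(txt), "UTF-8"));
        // prints the original record: key1<TAB>value1
    }
}
```

The same loop works on real part files pulled out of HDFS, e.g. after `hadoop fs -get output/part-00000.gz .`.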
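The "unique comparison set" thread is, before any MapReduce mechanics, just the set of unordered pairs over n IDs, of which there are n(n-1)/2. A plain-Java sketch of the target output (the "AxB" pair format mirrors the example in the post; the class name is illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class UniquePairs {

    // All unordered pairs (i, j) with i > j from a list of ids,
    // e.g. {1,2,3,4,5} -> 2x1, 3x1, 3x2, ... (n*(n-1)/2 pairs total).
    static List<String> pairs(List<Integer> ids) {
        List<String> out = new ArrayList<>();
        for (int i = 1; i < ids.size(); i++) {
            for (int j = 0; j < i; j++) {
                out.add(ids.get(i) + "x" + ids.get(j));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> p = pairs(List.of(1, 2, 3, 4, 5));
        System.out.println(p.size()); // 10 = 5*4/2
        System.out.println(p);
    }
}
```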
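For the JVM-reuse thread, Hadoop 0.20 does support reusing task JVMs within a single job via the mapred.job.reuse.jvm.num.tasks property (also settable per job through JobConf.setNumTasksToExecutePerJvm), which amortizes expensive library initialization across tasks. A minimal fragment:

```xml
<!-- On the JobConf (or in mapred-site.xml as a default).
     -1 means no limit: each JVM is reused for as many tasks of the
     same job as the TaskTracker schedules on it. -->
<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <value>-1</value>
</property>
```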
Group Overview
Group: mapreduce-user @

68 users for July 2011

Harsh J: 20 posts
Arun C Murthy: 12 posts
Virajith Jalaparti: 12 posts
Steve Lewis: 9 posts
Juwei Shi: 6 posts
Xiaobo Gu: 6 posts
Allen Wittenauer: 5 posts
Em: 5 posts
John Armstrong: 5 posts
Robert Evans: 5 posts
Sudharsan Sampath: 5 posts
周俊清: 5 posts
Devaraj K: 4 posts
Felix gao: 4 posts
GOEKE, MATTHEW (AG/1000): 4 posts
Joey Echeverria: 4 posts
Kai Ju Liu: 4 posts
Lucian Iordache: 4 posts
Mapred Learn: 4 posts
Narayanan K: 4 posts