
53 discussions - 164 posts

  • I am attempting to speed up a mapping process whose input is GZIP compressed CSV files. The files range from 1-2GB. I am running on a cluster where each node has a total of 32GB memory available to ...
    Hans Uhlig
    Mar 11, 2012 at 4:00 am
    Mar 13, 2012 at 2:04 am
  • Hello, I have a couple of questions regarding mapreduce configurations. We install various platforms on data nodes that require a mixed set of native libraries. Part of the problem is that in general ...
    Dmitriy Lyubimov
    Mar 28, 2012 at 1:43 am
    Mar 30, 2012 at 6:00 pm
  • Hi all, We tried using mapreduce to execute simple map code which reads a text file stored in HDFS and then writes the output. The file to read is a very small one. It was not split and written ...
    Hassen Riahi
    Mar 3, 2012 at 11:53 pm
    Mar 4, 2012 at 8:32 pm
  • Hi, I am quite new to Hadoop and Java as well and have two questions. Question 1: I have an HDFS directory which contains the output files of the reducer. I want to read all the part-r-* files present ...
    Piyush Kansal
    Mar 5, 2012 at 9:47 am
    Mar 16, 2012 at 7:37 am
  • Hi, I've been trying to debug map and reduce tasks for quite a long time, and it seems that it's impossible. MR tasks are launched in a new process and there's no way to debug them. Even with IsolationRunner class ...
    Pedro Costa
    Mar 29, 2012 at 3:34 pm
    Mar 29, 2012 at 4:33 pm
  • The ReduceTask can save the file using several output formats: InternalFileOutputFormant, SequenceFileOutputFormat, TeraOutputFormat, etc... How can I read the keys and the values from the output file? ...
    Pedro Costa
    Mar 30, 2012 at 5:20 pm
    Apr 2, 2012 at 2:41 am
  • Hi all, I'd like to log and monitor average job completion times on hadoop. (For example, wall time from the perspective of the JobTracker) What might be the best way to do this? I see a lot of ...
    Bharath Ravi
    Mar 4, 2012 at 10:39 pm
    Mar 6, 2012 at 4:22 am
  • Hi, Is there any way to ensure the execution of a map on all nodes of a cluster in a way that each node runs the map once and only once? That is, I would use Hadoop to execute a method on all nodes in ...
    Luiz Carlos Muniz
    Mar 29, 2012 at 12:25 am
    Apr 3, 2012 at 11:07 am
  • We have a weird performance problem with a hadoop job on our cluster. We have a 32-node experimenting cluster of blades (2 hex-core), one dedicated job tracker, one dedicated namenode, with ...
    Mar 16, 2012 at 9:23 pm
    Mar 19, 2012 at 1:57 pm
  • I'm trying to write a Reducer which will eliminate duplicates from the list of values before writing them out. I have the following code for my Reducer: /*****************/ public class ...
    Steven Willis
    Mar 14, 2012 at 9:35 pm
    Mar 16, 2012 at 2:54 pm
  • I am using the Hadoop 0.20.2 mapreduce API. The program is running fine, just slower than it could be. I sum values and then use job.setSortComparatorClass(LongWritable.DecreasingComparator.class) to sort ...
    Henry Helgen
    Mar 8, 2012 at 11:02 pm
    Mar 9, 2012 at 9:42 pm
  • Hi all, we are looking for a way to map-reduce on *non-closed files*. We are currently able to run hadoop fs -cat <non-closed-file>. *Non-closed files* - files that are currently being written, and ...
    Niv Mizrahi
    Mar 4, 2012 at 2:37 pm
    Mar 5, 2012 at 9:29 pm
  • Hi All, I have a file in HDFS spanning many blocks. Say the file has many words in it: W1, W2, W3 ... Wn. I want to find the edit distance between all pairs of words. Is this possible ...
    Praveen Kumar K J V S
    Mar 28, 2012 at 2:13 am
    Apr 3, 2012 at 12:25 pm
  • Ashish vyas
    Mar 30, 2012 at 8:30 am
    Mar 30, 2012 at 9:56 am
  • Hi All I'm using Hadoop-0.20-append, the cluster contains 3 nodes, for each node I have 14 map and 14 reduce slots, here is the configuration: <property <name ...
    Mar 10, 2012 at 10:40 am
    Mar 12, 2012 at 8:05 am
  • Hi, I tried to configure Hadoop 0.23.1. I added all libs from the share folder to the lib directory. But I still get an error while formatting the namenode: Exception in thread "main" ...
    Raghavendhra rahul
    Mar 1, 2012 at 9:49 am
    Mar 2, 2012 at 3:35 am
  • Hi, I don't see the mapred.max.map.failures.percent property in the mapred-default.xml conf of Hadoop version 0.20.1. Is it removed? Is there any alternate property corresponding to this? -Ajit
    Ajit Ratnaparkhi
    Mar 29, 2012 at 11:53 am
    Mar 31, 2012 at 12:09 pm
  • All, sorry to bother you. As a new user, it seems that I cannot post anything. I tried twice yesterday, but I didn't receive my own post... Can anyone enlighten me? Thanks
    Fang Xin
    Mar 31, 2012 at 4:11 am
    Mar 31, 2012 at 4:16 am
  • Hi All, Can't we use special characters to get paths in HDFS using the Hadoop API? E.g. Path path = new Path("/data/input/20120321*"); I have files /data/input/20120321000000/a.txt , ...
    Thamizhannal Paramasivam
    Mar 21, 2012 at 10:31 am
    Mar 22, 2012 at 7:14 am
  • I am trying to write a map-reduce application in which the mapper function is aware of the input HDFS filename of its data split. Anyone know how to do that?
    Qu Chen
    Mar 17, 2012 at 12:35 pm
    Mar 17, 2012 at 12:57 pm
  • Hi, In MapReduce, if the locations of the split are in {HostA, HostB, HostC}, and the respective map tasks run on HostB, will the map tasks pick up the split from HostB? Who is responsible to ...
    Pedro Costa
    Mar 7, 2012 at 1:58 pm
    Mar 7, 2012 at 10:58 pm
  • I switched to the new mapreduce API. I need a replacement for job.getNumMapTasks() in the job driver.
    Radim Kolar
    Mar 4, 2012 at 4:25 pm
    Mar 4, 2012 at 7:58 pm
  • Hi all, The FileOutputFormat/FileOutputCommitter always treats an output path as a directory and writes files under it, even if there is only one Reducer. Is there any way to configure an OutputFormat ...
    Jianhui Zhang
    Mar 3, 2012 at 12:39 am
    Mar 4, 2012 at 4:28 pm
  • Hello Folks, Are there any pointers to such comparisons between Apache Pig and Hadoop Streaming Map Reduce jobs? Also there was a claim in our company that Pig performs better than Map Reduce jobs? ...
    Subir S
    Mar 2, 2012 at 4:48 am
    Mar 2, 2012 at 7:08 am
  • Hi, I have 5 dependent jobs. I'm running them with jobcontrol, and jobs 2 and 3 run at the same time (no dependency between them). Each job produces several pieces of information that are the input for the ...
    Cornelio Iñigo
    Mar 16, 2012 at 4:59 pm
    Apr 2, 2012 at 7:27 am
  • Hi, When I run the randomwriter example under the capacity scheduler it works fine. But when I run the distributed shell example under the capacity scheduler it shows the following exception. RemoteTrace ...
    Raghavendhra rahul
    Mar 27, 2012 at 9:23 am
    Apr 2, 2012 at 3:33 am
  • Hi all: I'm new to mapreduce, but familiar with Collaborative Filtering recommendation framework. I tried to use mahout to do this work. But it disappointed me. My machine work all day to do this job ...
    Chao yin
    Mar 31, 2012 at 11:17 am
    Mar 31, 2012 at 4:51 pm
  • Hi All, I just moved from Matlab to Hadoop. Can anyone kindly give me advice on how to deal with matrices? Maybe a starting point will be to calculate some stats for each column. What could a mapper and ...
    Fang Xin
    Mar 30, 2012 at 6:07 pm
    Mar 31, 2012 at 11:04 am
  • Hi All, I just moved from Matlab to Hadoop. Can anyone kindly give me advice on how to deal with matrices easily in Hadoop? Maybe a starting example will be to calculate some stats for each column. How ...
    Fang Xin
    Mar 30, 2012 at 6:15 pm
    Mar 31, 2012 at 5:08 am
  • Hello, I'm interested in writing a library, to be used with Node.js, that can ask the JobTracker for information about jobs. I see that this is possible using the Java API, with the JobClient ...
    Ryan Cole
    Mar 29, 2012 at 1:08 am
    Mar 29, 2012 at 3:32 pm
  • The current implementation of MapWritable and AbstractMapWritable does not track class usage. The class name is still serialized in write() to disk even if no instance of such class exists in stored table ...
    Radim Kolar
    Mar 25, 2012 at 10:16 am
    Mar 25, 2012 at 4:19 pm
  • I have a mapper-only job (number of reducers set to 0). It's Hadoop 0.22 and the output from the job is this: 2012-03-24 18:24:22,117 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop ...
    Radim Kolar
    Mar 24, 2012 at 5:32 pm
    Mar 24, 2012 at 5:40 pm
  • I want to add new MR applications into hadoop-0.20.2-examples.jar. How to do that? I have set up Hadoop 0.20.2 development in eclipse.
    Qu Chen
    Mar 19, 2012 at 7:05 pm
    Mar 19, 2012 at 9:49 pm
  • Hi, Is there any way I can dump the output of my Ruby Map Reduce jobs into HBase directly? In other words, does Hadoop Streaming with Ruby integrate with HBase? Like Pig has HBaseStorage etc. Thanks ...
    Subir S
    Mar 16, 2012 at 8:05 am
    Mar 17, 2012 at 1:39 pm
  • Hi people, I would like to ask something a bit more high level than programming for Hadoop. I will have some students working with Hive, Pig or HBase (I don't know which of them yet) and I would like ...
    Luiz Antonio Falaguasta Barbosa
    Mar 14, 2012 at 7:42 pm
    Mar 15, 2012 at 3:51 pm
  • Hi all, I am new to Hadoop and have just started coding in MapReduce. I've checked out the trunk and am able to build the MapReduce project. I have also imported the code into Eclipse. My very first goal is to ...
    Viney Gupta
    Mar 8, 2012 at 9:28 am
    Mar 8, 2012 at 10:16 am
  • Dear all, I'm sorry for my poor English. I'm confused about something. Recently, I read a paper named "TeraByte Sort on Apache Hadoop" which was written by Owen O'Malley. And I see the graph "Running ...
    Mar 5, 2012 at 6:42 am
    Mar 5, 2012 at 9:55 am
  • Hi, I would like to chain multiple reducers without an intervening map phase. I looked at ChainReducer but it seems to support chains of the form M+RM*. What I am looking for is M+R*. What would be ...
    IGZ Nick
    Mar 4, 2012 at 2:23 pm
    Mar 4, 2012 at 3:01 pm
  • Hi all, Consider a Hadoop cluster having 4 nodes, where in every node the maximum number of reduce slots is fixed at 5. When the mapreduce daemons are started, 1) Is there any restriction on the number of simultaneously ...
    Vamshi Krishna
    Mar 2, 2012 at 10:10 am
    Mar 2, 2012 at 10:43 am
  • Hi, In many Hadoop production environments you get gzipped files as the raw input. Usually these are Apache HTTPD logfiles. When putting these gzipped files into Hadoop you are stuck with exactly 1 ...
    Niels Basjes
    Mar 30, 2012 at 2:07 pm
    Mar 30, 2012 at 2:07 pm
  • Hi All, I am running a MapReduce program that scans HBase and takes the required data from it. Both HBase and the MR program run in the same Hadoop cluster. In this case after running 950 ...
    V, Sriram
    Mar 26, 2012 at 11:55 pm
    Mar 26, 2012 at 11:55 pm
  • Hi All, I noticed something strange in my Fair Share Scheduler monitor GUI: the sum of the Fair Share values is always about 30 even when only one M/R job is running, so I don't know ...
    Mar 21, 2012 at 1:20 am
    Mar 21, 2012 at 1:20 am
  • You are using 0.20.203 but the documentation you are looking at is for 0.21. The MultipleOutputs and MultipleInputs were ported to the mapreduce context objects API in release 0.21. ...
    Henry Helgen
    Mar 18, 2012 at 3:43 am
    Mar 18, 2012 at 3:43 am
  • Hi, We are trying to execute a mapper that performs random access while writing files. It seems that HDFS supports random seeks only during reads and not during writes (nor does it support file modification). Is it ...
    Hassen Riahi
    Mar 17, 2012 at 2:09 pm
    Mar 17, 2012 at 2:09 pm
  • Hello, Apache MRUnit 0.8.1-incubating has been released and there is a blog post describing the major changes: https://blogs.apache.org/mrunit/entry/apache_mrunit_0_8_1 Chief among the new features ...
    Brock Noland
    Mar 16, 2012 at 4:39 pm
    Mar 16, 2012 at 4:39 pm
  • (Replying to my old email sent on 1/31/2012) https://issues.apache.org/jira/browse/MAPREDUCE-4003 was opened for this issue. Uploaded a silly patch. I hope someone can pick it up from there. Koji
    Koji Noguchi
    Mar 15, 2012 at 11:23 pm
    Mar 15, 2012 at 11:23 pm
  • Hi, Are there examples of serializing an array of type float to/from C++ pipes? From the examples, I assume float floatArrayOut[arrayBytes/sizeof(float)]; // assignment of floatArrayOut entries... char ...
    Charles Earl
    Mar 13, 2012 at 12:17 pm
    Mar 13, 2012 at 12:17 pm
  • The Apache MRUnit team is pleased to announce the release of MRUnit 0.8.1-incubating from the Apache Incubator. This is the third release of Apache MRUnit, a Java library that helps developers unit ...
    Brock Noland
    Mar 11, 2012 at 9:10 pm
    Mar 11, 2012 at 9:10 pm
  • Hi, I'd like to introduce you to Pangool <http://pangool.net/>, an easier low-level MapReduce API for Hadoop. I'm one of the developers. We just open-sourced it yesterday. Pangool is a Java, low-level ...
    Pere Ferrera
    Mar 6, 2012 at 10:36 am
    Mar 6, 2012 at 10:36 am
  • Hi, I have the following issue in Hadoop 0.20.2. When I try to use inheritance with WritableComparables the job fails. Example: if I create a base writable called shape, public abstract class ...
    Madhu phatak
    Mar 5, 2012 at 5:33 am
    Mar 5, 2012 at 5:33 am
Group Overview
Group: mapreduce-user

67 users for March 2012

  • Harsh J: 20 posts
  • Joey Echeverria: 8 posts
  • Brock Noland: 5 posts
  • George Datskos: 5 posts
  • GUOJUN Zhu: 5 posts
  • Piyush Kansal: 5 posts
  • WangRamon: 5 posts
  • Bejoy Ks: 4 posts
  • Dmitriy Lyubimov: 4 posts
  • Fang Xin: 4 posts
  • Hans Uhlig: 4 posts
  • Hassen Riahi: 4 posts
  • Henry Helgen: 4 posts
  • Raghavendhra rahul: 4 posts
  • Bharath Ravi: 3 posts
  • Madhu phatak: 3 posts
  • Marcos Ortiz: 3 posts
  • Niv Mizrahi: 3 posts
  • Pedro Costa: 3 posts
  • Steven Willis: 3 posts