
48 discussions - 166 posts

  • disclaimer: a newbie!!! Howdy? Got a quick question. The -libjars option doesn't seem to work for me in pretty much my first (or maybe second) mapreduce job. Here's what I'm doing: $bin/hadoop jar ...
    Vipul Pandey
    Dec 10, 2010 at 6:30 am
    Sep 17, 2011 at 7:55 am
  • Hello everybody, I am wondering if there is a feature allowing (in my case) reduce tasks to communicate. For example by some volatile variables at some centralized point. Or maybe just notify other ...
    Martin Becker
    Dec 18, 2010 at 4:05 pm
    Jan 9, 2011 at 9:15 am
  • Hi All, Is there any way to influence where a reduce task is run? We have a case where we'd like to choose the host to run the reduce task based on the task's input key. Any suggestion is greatly ...
    Jane Chen
    Dec 18, 2010 at 5:44 pm
    Jan 9, 2011 at 9:11 am
  • Hi all, I'm running Hadoop 0.20.2 in a cluster setup built from 4 machines, and I'm trying to run my own job. Here is my configuration: core-site.xml <configuration> <property> <name ...
    Ivan Leonardi
    Dec 18, 2010 at 1:55 pm
    Dec 27, 2010 at 10:42 pm
  • Hi, 1 - I would like to understand how a partition works in the Map Reduce. I know that the Map Reduce contains the IndexRecord class that indicates the length of something. Is it the length of a ...
    Pedro Costa
    Dec 22, 2010 at 8:47 pm
    Dec 23, 2010 at 5:00 am
  • Hello everybody, is there a possibility to make sure that certain/all reduce tasks, i.e. the reducers to certain keys, are executed in a specified order? This is Job internal, so the Job Scheduler is ...
    Martin Becker
    Dec 19, 2010 at 3:40 pm
    Dec 20, 2010 at 7:29 pm
  • Hi all, I want to create an index for a bunch of logs. Our log is a line-based text file. Each line contains time, uid and some other data. I want to create an index for uid. So for each line, mapper will ...
    Dec 1, 2010 at 6:42 am
    Dec 4, 2010 at 3:17 am
  • 1 - A reduce task should start only when a map task ends ? -- Pedro
    Pedro Costa
    Dec 20, 2010 at 6:33 pm
    Jan 4, 2011 at 8:08 am
  • Hi all, I am trying to figure out what exactly happens inside the job. 1) When the jobtracker launches a task to be run, how does it impact the currently running jobs if the current running job ...
    Felix gao
    Dec 29, 2010 at 10:43 pm
    Jan 13, 2011 at 5:49 pm
  • Hi there, I have a map-reduce job that processes binary files. I'm currently using /tmp/ as a temporary location to write data to and perform operations like decompression. If a mapper fails, the ...
    Dec 10, 2010 at 9:19 am
    Dec 22, 2010 at 5:40 pm
  • This question may have been asked numerous times, and the answer will probably come down to the specific situation you are in, but I'm going to ask anyway: Which Hadoop version should I pick? I'm ...
    Dec 22, 2010 at 11:40 am
    Dec 22, 2010 at 8:32 pm
  • Hi, I'm trying to understand how the scheduler works in the Hadoop MR, and I've got the following questions: 1 - When we've two JobTrackers running simultaneously, each JobTracker is running in a ...
    Dec 7, 2010 at 2:37 pm
    Dec 15, 2010 at 1:26 am
  • Having an issue with some SequenceFiles that I generated, and I'm trying to write a M/R job to fix them. Situation is roughly this: I have a bunch of directories in HDFS, each of which contains a set ...
    David Rosenstrauch
    Dec 7, 2010 at 11:43 pm
    Dec 8, 2010 at 10:22 pm
  • Hello all, I have a few map reduce jobs that I am invoking from a GlassFish container. I am not using the "hadoop" command line tool but calling the jobs directly from GlassFish programmatically. I have a ...
    Praveen Peddi
    Dec 29, 2010 at 3:40 pm
    Dec 29, 2010 at 11:25 pm
  • Hi, 1 - Hadoop MR contains a TaskMemoryManagerThread class that is used to manage memory usage of tasks running under a TaskTracker. Why does Hadoop MR need a class to manage memory? Why couldn't it rely ...
    Pedro Costa
    Dec 9, 2010 at 11:06 am
    Dec 10, 2010 at 2:39 am
  • I am trying to run distcp between 2 hdfs clusters and I am getting a ConnectException . The Command I am trying to run is of the form (considering clusters hadoop-A & hadoop-B): ./hadoop distcp ...
    Deepika Khera
    Dec 6, 2010 at 5:51 pm
    Dec 9, 2010 at 9:59 pm
  • In Hadoop 0.21, I found InputFormat as an Interface in package mapred, and as an abstract class in package mapreduce. The APIs are slightly different. Which one should I choose to extend from or ...
    Jane Chen
    Dec 6, 2010 at 9:42 pm
    Dec 7, 2010 at 10:52 pm
  • Hi, I have two data sources of different format, one sequence file and the other text. They share the same key, so I'd like to have the following, map1: <k, v1> -> <k, v2> map2: <k, v1'> -> <k, v2'> Both ...
    Yin Lou
    Dec 30, 2010 at 4:38 pm
    Dec 30, 2010 at 5:20 pm
  • Hi, 1 - When Hadoop MR is running in cluster mode, is it possible to have several TT, a TT per machine, running simultaneously and the JT is communicating to every TT? For example, running hadoop MR ...
    Pedro Costa
    Dec 29, 2010 at 11:33 pm
    Dec 30, 2010 at 11:36 am
  • Hi, Quick question: is it possible to know the total number of mapper tasks (not attempts) in partitioner? Or, which is in many ways equivalent, the # of splits produced by the file input format? ...
    Dmitriy Lyubimov
    Dec 25, 2010 at 5:40 pm
    Dec 25, 2010 at 6:15 pm
  • Hi, I am using the setup() and cleanup() methods as follows: @Override protected void setup(Context context) throws IOException, InterruptedException { HBaseConfiguration conf = new ...
    Hari Sreekumar
    Dec 13, 2010 at 5:26 pm
    Dec 14, 2010 at 4:25 am
  • Hi, I want to introduce dynamic priority for the jobs being submitted. Rather than the default FIFO, I want to set a variable that decides the priority of a job. A situation may be even the job may be ...
    Nitin reddy
    Dec 12, 2010 at 5:51 pm
    Dec 13, 2010 at 4:01 am
  • Hi, I am new to Hadoop. I am trying to get the statistics for a task tracker, like the task tracker id, the number of tasks submitted to it, the number of tasks that were successfully run, maybe the run ...
    Nitin reddy
    Dec 6, 2010 at 5:17 pm
    Dec 12, 2010 at 1:56 pm
  • Hi, I am just starting to learn about mapreduce. Could anyone give input on how to get the statistics for the task trackers in the cluster? For example, if I run jobs on the cluster for a whole day, I want to get ...
    Nitin reddy
    Dec 11, 2010 at 11:18 pm
    Dec 12, 2010 at 10:40 am
  • Hi, This relates to a bug we had a while back. When running a reducer, if you want to buffer the values, you normally need to take a copy of each value as you iterate through them. This is because ...
    James Hammerton
    Dec 9, 2010 at 10:21 am
    Dec 10, 2010 at 11:17 am
  • I define a counter to count the bad records. The map task contains the code below: reporter.incrCounter("bad", "records", 1); When the job is completed, I print the result using the code below: long ...
    Lei liu
    Dec 26, 2010 at 6:09 am
    Jan 9, 2011 at 8:29 am
  • Hi, I'm trying to understand how can I execute the unit tests of MapReduce in eclipse IDE. I'm trying to execute the unit test class TestMapReduceLocal, but I get errors that I pasted below. How can ...
    Pedro Costa
    Dec 30, 2010 at 5:45 pm
    Dec 30, 2010 at 11:02 pm
  • All, Not sure if this is the right mailing list for this question. I am using Pig to do some data analysis and I am wondering if there is a way to tell Hadoop when it encounters a bad log file either ...
    Felix gao
    Dec 20, 2010 at 8:06 pm
    Dec 26, 2010 at 7:47 am
  • Hi all, a quick question: is the config option name "mapred.reduce.tasks.speculative.execution" valid in hadoop-0.21, or did it change since 0.20? Regards, Paweł Łoziński
    Paweł Łoziński
    Dec 22, 2010 at 12:11 pm
    Dec 22, 2010 at 5:33 pm
  • Hi, I have been using Hadoop's MapReduce only for the past few months. I use it for data mining purposes. I use a very small cluster, 4 nodes. 1- Name node, 3 - Datanodes and the Job tracker runs on ...
    Sriram Ramachandrasekaran
    Dec 12, 2010 at 9:35 am
    Dec 22, 2010 at 2:43 am
  • Hi, I'm looking for a way to pass options to my Map class. More specifically, I want my mappers to write to sequence files in a path that I can specify on the command line. I tried setting a variable ...
    Dec 13, 2010 at 9:10 pm
    Dec 13, 2010 at 9:28 pm
  • Hi, I know that Hadoop MR doesn't use Java object serialization and uses Writable objects instead, and I understand the reasons the Hadoop MR team chose that. I was doing my modifications to ...
    Dec 10, 2010 at 3:23 pm
    Dec 10, 2010 at 3:50 pm
  • Hi, I chain multiple jobs in my program. Job 1's reduce function has a counter. I want job 3's reduce function to read this Job 1's counter. How? Thanks.
    Savannah Beckett
    Dec 10, 2010 at 5:46 am
    Dec 10, 2010 at 5:51 am
  • Hi All, We have a problem in hand which we would like to solve using Distributed and Parallel Processing. *Problem context* : We have a Map (Entity, Value). The entity can have a parent which in turn ...
    Narinder Kumar
    Dec 9, 2010 at 1:02 pm
    Dec 9, 2010 at 6:46 pm
  • I am trying to run Hadoop on a multi-core architecture (e.g., 32 or more cores). I am somewhat unclear what the best configuration would be to maximize performance of Hadoop in this setting. 1) Since ...
    Jon Lederman
    Dec 4, 2010 at 12:34 am
    Dec 4, 2010 at 2:09 am
  • Hi, I'm trying to generate a Solr index from Hadoop (map/reduce), so I'm using this patch SOLR-1301 <https://issues.apache.org/jira/browse/SOLR-1301>, however I don't get it. When I try to run CSVIndexer ...
    Dec 29, 2010 at 9:26 am
    Dec 29, 2010 at 9:26 am
  • Hi, I'm creating multiple sequence files as the output of a large MR-job (with the SequenceFileOutputFormat). As expected, the keys in these sequence files are nicely ordered since the reduce step ...
    Dec 27, 2010 at 4:44 pm
    Dec 27, 2010 at 4:44 pm
  • We have a file like so..... Activityid, accountId, otherinfo 1, 1, xdf 2, 5, sdf 3, 1, sadf 4, 3, asdf 5, 1, asdf We want to read all this in each night and operate on the data. At first I was ...
    Hiller, Dean (Contractor)
    Dec 23, 2010 at 5:46 pm
    Dec 23, 2010 at 5:46 pm
  • Hi all, I'm porting an application to MapReduce (currently hadoop v0.21.0) and encountered the following problem: My application implements a cache which contains data instances and implements the ...
    Stanley Hillner
    Dec 23, 2010 at 12:01 pm
    Dec 23, 2010 at 12:01 pm
  • I am getting this weird exception in JobTracker logs right now when running example Caused by: java.sql.SQLException: socket creation error at org.hsqldb.jdbc.Util.sqlException(Unknown Source) at ...
    Hiller, Dean (Contractor)
    Dec 16, 2010 at 4:02 pm
    Dec 16, 2010 at 4:02 pm
  • Hi all, I have a couple of boxes that need to periodically copy stuff from their local boxes to HDFS using the HDFS client by issuing the hadoop fs -copyFromLocal src dest command. The file size is ...
    Felix gao
    Dec 14, 2010 at 9:15 pm
    Dec 14, 2010 at 9:15 pm
  • Hi all, I'm using hadoop for my thesis and have implemented an application for hadoop 0.20.2. On 0.20.2 everything went fine but after switching to 0.21.0, I get the following error several times: ...
    Stanley Hillner
    Dec 14, 2010 at 9:50 am
    Dec 14, 2010 at 9:50 am
  • Hello. I'm unsure if this is a bug or an oversight, but since I've not found any reference anywhere to this, I figured I might bring it to light. I've been using MultipleInputs for several of my ...
    Ghigliotti, Matthew
    Dec 9, 2010 at 9:43 pm
    Dec 9, 2010 at 9:43 pm
  • Hello, I have data processing logic implemented so that on input it receives Iterable<Some>. I.e. pretty much the same as reducer's API. But I need to use this code in Map, where each element is ...
    Alex Baranau
    Dec 8, 2010 at 9:14 pm
    Dec 8, 2010 at 9:14 pm
  • This is for hadoop-0.20-append and any other hadoop-0.20.* and later I believe. We run hadoop tasks within an embedded runtime (z2-environment) that has its own class loading hierarchy (not like OSGi ...
    Henning Blohm
    Dec 8, 2010 at 1:57 pm
    Dec 8, 2010 at 1:57 pm
  • Hello, We (Olivier, Nicolas and I) are organizing a Data Analytics DevRoom that will take place during the next edition of the FOSDEM in Brussels on Feb. 5. Here is the CFP: ...
    Isabel Drost
    Dec 7, 2010 at 4:05 pm
    Dec 7, 2010 at 4:05 pm
  • When most of the work is done by the reducer at cleanup (it takes 90% of the job time), how can I report proper progress of the overall job? By default the job tracker shows 100% right after all records ...
    Dec 6, 2010 at 5:20 pm
    Dec 6, 2010 at 5:20 pm
  • Hello Fellow Mappers and Reducers, We are meeting at 7:15 pm on December 2nd at the University Heights Community Center 5031 University Way NE Seattle WA 98105 Room #110 The meetings are informal and ...
    Sean jensen-grey
    Dec 2, 2010 at 4:58 am
    Dec 2, 2010 at 4:58 am
Group Overview
group: mapreduce-user

55 users for December 2010

  • Harsh J: 20 posts
  • David Rosenstrauch: 11 posts
  • Pedro Costa: 11 posts
  • Martin Becker: 10 posts
  • Eric: 8 posts
  • Hari Sreekumar: 7 posts
  • Jason: 7 posts
  • Allen Wittenauer: 5 posts
  • Ivan Leonardi: 5 posts
  • Jane Chen: 5 posts
  • Ted Yu: 5 posts
  • Nitin reddy: 4 posts
  • Ravi Gummadi: 4 posts
  • Felix gao: 3 posts
  • Hiller, Dean (Contractor): 3 posts
  • Li ping: 3 posts
  • Rahul patodi: 3 posts
  • Shrijeet Paliwal: 3 posts
  • Chase Bradford: 2 posts
  • Deepika Khera: 2 posts