49 discussions - 157 posts

  • Hi to all, I just subscribed to this mailing list and I'd like to ask you if anyone knows how to deal with LD_LIBRARY_PATH. I have a Java application that needs a proper setting of this environment ...
    Donatella Firmani
    Apr 29, 2011 at 10:58 am
    May 2, 2011 at 3:15 pm
  • Hi, I need to process data in a Java MR job (using 0.20.1) in a way such that the largest part of the data is manipulated in the mapper only (i.e. some simple per-record transformation without the ...
    Christoph Schmitz
    Apr 20, 2011 at 8:50 am
    Apr 20, 2011 at 1:20 pm
  • Hi I'm running on Hadoop 0.20.2 and I have a job with the following nature: * Mapper outputs very large records (50 to 200 MB) * Reducer (single) merges all those records together * Map output key is ...
    Shai Erera
    Apr 14, 2011 at 6:32 pm
    Apr 19, 2011 at 6:47 pm
  • Hi, I have a map-only mapreduce job where I want to deduce the output filename from the output key/value. I figured ...
    Hari Sreekumar
    Apr 14, 2011 at 5:40 am
    May 2, 2011 at 10:55 pm
  • Hi, I have a question about the copy speed of a MapReduce job. I have a cluster with 4 slaves and 1 master, with the computers connected to each other through one 8-port switch (up to 1000 Mbps). Copy speed is by my Job ...
    Baran cakici
    Apr 15, 2011 at 2:50 pm
    Apr 19, 2011 at 1:09 pm
  • Hi, I am running a hadoop jar command as: hadoop --config < jar <jarname <main class -conf <conf file My question is how and where can I specify the -Xmx option to increase the heap assigned to my JVM? ...
    Mapred Learn
    Apr 28, 2011 at 9:27 pm
    Apr 29, 2011 at 5:32 pm
  • Hi, when I read the mapred source, I find a lot of classes with the same names in both "org.apache.hadoop.mapreduce" and "org.apache.hadoop.mapred", such as ...
    吕鹏
    Apr 26, 2011 at 12:17 pm
    Apr 26, 2011 at 6:59 pm
  • Is there any notification mechanism that reports all task state changes (task failure, task completion, etc.)? In particular, I need to follow the execution of all map and reduce tasks.
    Francesco De Luca
    Apr 17, 2011 at 10:18 am
    Apr 18, 2011 at 2:14 pm
  • Hi, I have a job that runs fine with a small data set in pseudo-distributed mode on my desktop workstation but when I run it on our Hadoop cluster it falls over, crashing during the initialisation of ...
    James Hammerton
    Apr 26, 2011 at 5:55 pm
    Apr 27, 2011 at 1:56 pm
  • Hi, As you know, Hadoop MapReduce starts child JVM processes to run tasks. I want to start the m/r task process myself so that I can pass some OS-level parameters to the JVM process. For example, set ...
    Juwei Shi
    Apr 26, 2011 at 1:03 pm
    Apr 26, 2011 at 4:22 pm
  • Job tracker copies job jar from mapred system directory on HDFS to its local file system ( ${mapred.local.dir}/jobTracker ). What is the purpose of copying this file? It does not run any user code. ...
    Raghu Angadi
    Apr 26, 2011 at 6:57 pm
    May 10, 2011 at 10:41 am
  • Hello, is it possible to create another MapReduce task from inside the map function? Thanks in advance. -- Ranjith k
    Ranjith k
    Apr 23, 2011 at 11:04 am
    Apr 23, 2011 at 2:39 pm
  • Hi, I am coding a map-reduce program which involves several map-reduce steps. The work that my program does is only in the mapper, so I was thinking to have no reduce steps but successive mappers. ...
    Injun Joe
    Apr 15, 2011 at 7:05 pm
    Apr 15, 2011 at 9:24 pm
  • Hi All, I am trying to run a MapReduce job, and it runs perfectly from the command line using the following command: hadoop jar Processor.jar arg1 arg2 but when I schedule the same job in Oozie, it is giving me ...
    Shuja Rehman
    Apr 20, 2011 at 8:21 pm
    Apr 22, 2011 at 7:01 am
  • Hi, 1 - I would like to run Hadoop MR on my laptop, but with the cluster mode configuration. I've put a slave file with the following content: [code] 127.0.0.1 localhost 192.168.10.1 mylaptop [/code] ...
    Pedro Costa
    Apr 18, 2011 at 1:10 pm
    Apr 18, 2011 at 1:21 pm
  • Hi, what is the smallest Linux system/distro that can run Hadoop? I want to run small Linux VMs and run jobs on them. -mac
    Web service
    Apr 15, 2011 at 4:47 am
    Apr 15, 2011 at 2:49 pm
  • Hi, I was wondering what the communication protocol between MapReduce and HDFS is. Does MapReduce fetch and save data blocks from HDFS over HTTP or over RPC? Thanks, -- PSC
    Pedro Costa
    Apr 9, 2011 at 4:58 pm
    Apr 9, 2011 at 9:20 pm
  • Hello. I need to create a custom input split. I need to split my input so that each input split holds 50 lines. How can I do it? There is also another problem for me. I have a file. But it is not in ...
    Ranjith k
    Apr 7, 2011 at 4:57 am
    Apr 9, 2011 at 4:21 pm
  • I want to increase DFS capacity. Which command should I use? Using "hadoop dfsadmin -report", the following is my current configuration. Configured Capacity: 53687091200 (50 GB) Present Capacity: ...
    Zhengjun chen
    Apr 7, 2011 at 2:19 pm
    Apr 9, 2011 at 8:06 am
  • Hello. I am new to Hadoop MapReduce programming. I need to write a MapReduce program. I have an input folder that contains 10 documents in text format. My aim is to write a map reduce ...
    Ranjith k
    Apr 1, 2011 at 3:05 pm
    Apr 4, 2011 at 4:31 pm
  • So I wrote my first org.apache.hadoop.mapreduce.Job (not ...mapred.Job). Oddly enough, when the reducer is invoked, the "Iterable values" parameter actually iterates over just one value, not all the ...
    Mike Spreitzer
    Apr 28, 2011 at 7:00 pm
    Apr 28, 2011 at 11:59 pm
  • Hi, how can I generate a CRC for files on HDFS? I copied files from HDFS to a remote machine and want to verify the integrity of the files (using the copyToLocal command; I tried using the -crc option too, but it looks ...
    Giridhar Addepalli
    Apr 27, 2011 at 6:27 am
    Apr 27, 2011 at 11:18 am
  • Hi, I need to iterate over the reducer's values more than once, but it seems that only one pass is allowed. Does anybody know how to achieve this? Thanks -- Regards Shuja-ur-Rehman Baig ...
    Shuja Rehman
    Apr 26, 2011 at 8:30 am
    Apr 26, 2011 at 9:07 am
  • Hi, the Hadoop MapReduce counters include FILE_BYTES_READ, FILE_BYTES_WRITTEN, HDFS_BYTES_READ and HDFS_BYTES_WRITTEN. What do the FILE and HDFS counters represent? Thanks, -- Pedro
    Pedro Costa
    Apr 24, 2011 at 11:22 am
    Apr 24, 2011 at 1:24 pm
  • BUILD FAILED ......./branch-0.20-append/build.xml:927: The following error occurred while executing this line: ....../branch-0.20-append/build.xml:933: exec returned: 1 Total time: 1 minute 17 ...
    Alex Luya
    Apr 12, 2011 at 2:46 am
    Apr 12, 2011 at 3:03 am
  • Hi, I am a newbie to the MapReduce (in fact Hadoop as a whole) framework. I am trying to run a simple WordCount client class programmatically inside Eclipse, hence for that, I have provided the ...
    لسٹ शिराज़
    Apr 6, 2011 at 1:18 pm
    Apr 9, 2011 at 8:25 am
  • Somehow, in the long chain of shell variable fiddling that creates the command line in bin/daemon.sh, there is a small mistake. The command line for tasktracker comes out (shorn of many other items) ...
    Lance Norskog
    Apr 8, 2011 at 4:37 am
    Apr 9, 2011 at 8:13 am
  • Hi, does anyone know how I can limit my MapReduce CPU usage on Windows XP? Regards, Baran
    Baran cakici
    Apr 5, 2011 at 1:33 pm
    Apr 5, 2011 at 3:23 pm
  • Hi, First of all, I use Hadoop-0.20.2 on Windows XP Pro with Eclipse Plug-In. I have a Cluster with 1 Master (Jobtracker and Namenode) and 4 Slaves(Datanode and TaskTracker). I have some problems ...
    Baran cakici
    Apr 1, 2011 at 12:05 pm
    Apr 5, 2011 at 3:22 pm
  • Hi, Where can I get the shuffle and sort time of a reduce task? -- Pedro
    Pedro Costa
    Apr 4, 2011 at 6:16 pm
    Apr 4, 2011 at 7:29 pm
  • Hi Everyone, I have a Cluster with one Master(JobTracker and NameNode - Intel Core2Duo 2 GB Ram) and four Slaves(Datanode and Tasktracker - Celeron 2 GB Ram). My Inputdata are between 2GB-10GB and I ...
    Baran cakici
    Apr 28, 2011 at 3:22 pm
    Apr 28, 2011 at 3:22 pm
  • hi, I have 8 machines for the hadoop cluster, 1 namenode and 7 data node. I want the production jobs to have more priority than the user-defined jobs, so I use the Fair scheduler. Why sometimes my ...
    Erix Yao
    Apr 26, 2011 at 4:21 pm
    Apr 26, 2011 at 4:21 pm
  • I have a use case where I join two equi-partitioned data sets (old and new) to produce a merged set. In theory, this is trivially solvable by map-side join using CompositeInputFormat. No shuffle, ...
    Yurgis
    Apr 25, 2011 at 7:56 pm
    Apr 25, 2011 at 7:56 pm
  • Hello, I am a beginner in MapReduce and I am trying to create a forward and an inverted index for a large number of documents. I believe that parsing each document twice (once for the forward index ...
    Panayotis Antonopoulos
    Apr 25, 2011 at 1:14 am
    Apr 25, 2011 at 1:14 am
  • Room 209! Hello Fellow Hadoopists, We are meeting at 7:15 PM April 21st at the University Heights Community Center 5031 University Way NE Seattle WA 98105 Room #209 Seattle Hadoop Distributing ...
    Sean jensen-grey
    Apr 20, 2011 at 10:29 pm
    Apr 20, 2011 at 10:29 pm
  • -- Ellis R. Miller 937.830.8242 937.830.6027 <http://my.wisestamp.com/link?u=2hxhdfd4p76bkhcm&site=www.wisestamp.com/email-install Mundo Nulla Fides ...
    Ellis Miller
    Apr 19, 2011 at 11:59 pm
    Apr 19, 2011 at 11:59 pm
  • Hi, I am looking at the feature of multithreaded map tasks. I find that the new API provides org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper class to enable multi-thread in each map task. We ...
    Juwei Shi
    Apr 19, 2011 at 1:58 pm
    Apr 19, 2011 at 1:58 pm
  • Where are the map and the reduce functions of all examples of GridMix2? -- Pedro
    Pedro Costa
    Apr 18, 2011 at 2:06 pm
    Apr 18, 2011 at 2:06 pm
  • Hi, gridmix2 contains several tests, such as Combiner, StreamingSort, Webdatasort, webdatascan and monsterquery. I would like to know what these examples do. Which example uses more CPU for the ...
    Pedro Costa
    Apr 18, 2011 at 1:42 pm
    Apr 18, 2011 at 1:42 pm
  • Hi I've checked out Hadoop-0.20.2 from http://svn.apache.org/repos/asf/hadoop/common/tags/release-0.20.2, and from cygwin I run 'ant test-core -Dtestcase=TestLocalDFS'. The test fails. Nothing is ...
    Shai Erera
    Apr 18, 2011 at 12:41 pm
    Apr 18, 2011 at 12:41 pm
  • Is there any notification mechanism that reports all task state changes (task failure, task completion, etc.)? In particular, I need to follow the execution of all map and reduce tasks.
    Francesco De Luca
    Apr 17, 2011 at 10:25 am
    Apr 17, 2011 at 10:25 am
  • Hi, is there such a concept as a 'task local working directory' that the task could use for side info? In my problem, map tasks are using side info which are memory-mapped structures and as such must ...
    Dmitriy Lyubimov
    Apr 14, 2011 at 7:24 pm
    Apr 14, 2011 at 7:24 pm
  • Is it possible to change the logging level for an individual job? (As opposed to the cluster as a whole.) E.g., is there some key that I can set on the job's configuration object that would allow me ...
    David Rosenstrauch
    Apr 13, 2011 at 3:36 pm
    Apr 13, 2011 at 3:36 pm
  • Hello All, I have a question regarding configuring the number of reducers property in case of a non FIFO scheduler (either Capacity/Fair-share scheduler). As per the guidelines on the Hadoop wiki ...
    Hrishikesh Gadre
    Apr 13, 2011 at 12:59 pm
    Apr 13, 2011 at 12:59 pm
  • Hi, I'm trying to submit a streaming job using the -conf option to specify a job configuration file. One of the options in my configuration file is stream.addenvironment but this option doesn't ...
    Jeremy Lewi
    Apr 8, 2011 at 4:36 am
    Apr 8, 2011 at 4:36 am
  • Hi, I am getting a FileNotFoundException while using the distributed cache. Here are the details. Configuration config = new Configuration(); config.clear(); config.set("hbase.zookeeper.quorum", ...
    Shuja Rehman
    Apr 6, 2011 at 6:28 pm
    Apr 6, 2011 at 6:28 pm
  • Hi guys, I am trying to run Hadoop0.21.0 with PVFS2. Following the email thread http://www.mail-archive.com/[email protected]/msg04434.html, I managed to make my Hadoop cluster up and run ...
    Wantao
    Apr 6, 2011 at 1:58 pm
    Apr 6, 2011 at 1:58 pm
  • Hi All, Does anybody know about this problem? SEVERE: null java.io.FileNotFoundException: File does not exist: /home/shuja/extract/2e8baca8-67e7-4da6-8253-2ae9e6d3fb8a at ...
    Shuja Rehman
    Apr 5, 2011 at 10:44 am
    Apr 5, 2011 at 10:44 am
  • Hi All, I have implemented the distributed cache according to the following article. http://chasebradford.wordpress.com/2011/02/05/distributed-cache-static-objects-and-fast-setup/ but when I run the ...
    Shuja Rehman
    Apr 4, 2011 at 3:02 pm
    Apr 4, 2011 at 3:02 pm
Group Overview
group: mapreduce-user @
categories: hadoop
discussions: 49
posts: 157
users: 52
website: hadoop.apache.org...
irc: #hadoop

52 users for April 2011

Harsh J: 26 posts
Donatella Firmani: 10 posts
Baran cakici: 8 posts
Pedro Costa: 7 posts
Juwei Shi: 6 posts
Shuja Rehman: 6 posts
Chris Douglas: 5 posts
Francesco De Luca: 5 posts
Hari Sreekumar: 5 posts
Ranjith k: 5 posts
Shai Erera: 5 posts
Alex Kozlov: 4 posts
Christoph Schmitz: 4 posts
Jeremy Lewi: 4 posts
Panayotis Antonopoulos: 4 posts
Injun Joe: 3 posts
James Hammerton: 3 posts
Robert Evans: 3 posts
Zhengjun chen: 3 posts
Erix Yao: 2 posts