74 discussions - 226 posts

  • Hi, I ran the PI example of Hadoop, and I got the following error: [code] java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.BooleanWritable, recieved ...
    Pedro Costa
    Jan 26, 2011 at 1:47 pm
    Jan 27, 2011 at 5:30 pm
  • Hi, I'm trying to write (K,V) where K is a Text object and V is a CustomObject, but it doesn't run. I'm configuring the job output like: SequenceFileInputFormat so I have job with: ...
    Joan
    Jan 14, 2011 at 12:58 pm
    Jan 19, 2011 at 8:04 pm
  • Hi, Is there a way to drain a tasktracker? What we require is that no more map/reduce tasks are scheduled onto a tasktracker (mark it offline), while the running tasks are not affected. ...
    Rishi pathak
    Jan 28, 2011 at 10:14 am
    Jan 31, 2011 at 5:34 pm
  • Hi, I want to reduce the number of splits because I think I am getting too many. While my job is running I can see: *INFO mapreduce.Job: map ∞% reduce 0%* I'm using ...
    Joan
    Jan 19, 2011 at 4:03 pm
    Jan 20, 2011 at 9:22 am
  • Hi, I'm trying to load data from a big table in a database. I'm using DBInputFormat, but when my job tries to get all records, it throws an exception: *Exception in thread "Thread for syncLogs" ...
    Joan
    Jan 3, 2011 at 4:57 pm
    Jan 5, 2011 at 6:20 am
  • Hi, I'm trying to set an Object in Hadoop's configuration but I don't know how. I want to do: org.apache.hadoop.conf.Configuration conf = new org.apache.hadoop.conf.Configuration(); ...
    Joan
    Jan 26, 2011 at 10:44 am
    Jan 27, 2011 at 3:37 pm
  • Hi, I am trying to run chained jobs using Hadoop 0.20.2. I see that the class JobControl is provided for the same purpose in 0.20.2. However, I can only add the deprecated class ...
    Sarthak Dudhara
    Jan 1, 2011 at 11:19 pm
    Jan 18, 2011 at 8:59 am
  • Hi, If a split location contains more than one location, does it mean that this split file is replicated across all locations, or does it mean that a split is divided into several blocks, and each block is ...
    Pedro Costa
    Jan 14, 2011 at 11:10 am
    Jan 14, 2011 at 5:34 pm
  • I was looking at the distributed cache and how I need to copy local jars to HDFS. I was wondering if there were any plans to just deploy an OSGi bundle (i.e. introspect and auto-deploy jars from bundle to ...
    Hiller, Dean (Contractor)
    Jan 2, 2011 at 5:51 pm
    Jan 4, 2011 at 8:08 pm
  • Hi! I was running a Hadoop cluster on Amazon EC2 instances, then after 2 days of work, one of the worker nodes just simply died (I cannot connect to the instance either). That node also appears on ...
    Kiss Tibor
    Jan 31, 2011 at 2:16 pm
    Feb 3, 2011 at 12:16 am
  • GZIP is not splittable. Does that mean a GZIP block-compressed SequenceFile can't take advantage of MR parallelism? How can one control the size of the block to be compressed in a SequenceFile? -- Sean
    Sean Bigdatafun
    Jan 31, 2011 at 8:26 am
    Jan 31, 2011 at 5:41 pm
  • Hello all, I have set the Hadoop environment variable HADOOP_CONF_DIR and am trying to run a Hadoop job from a Java application, but the job is not looking for the Hadoop config in this HADOOP_CONF_DIR ...
    Praveen Peddi
    Jan 25, 2011 at 10:53 pm
    Jan 28, 2011 at 9:04 am
  • Hi, I want to submit my MapReduce job written with the new API, as the older one is marked deprecated. I could not find a JobClient equivalent in the new API. Has anyone managed to use a ...
    Sarthak Dudhara
    Jan 8, 2011 at 10:00 pm
    Jan 10, 2011 at 5:21 pm
  • Hi, Is there any possibility that the intermediate output might be too large to store on the local disk? If there is, what does Hadoop do to solve the problem? Thanks. -- Best regards!
    Debbie Fu
    Jan 3, 2011 at 12:28 pm
    Jan 4, 2011 at 4:32 am
  • Hi, I am backporting some code to the old API. I need to obtain the task attempt ID within mapper initialization, which is passed in through the context in the new API. However, in the old ...
    Dmitriy Lyubimov
    Jan 30, 2011 at 1:57 am
    Jan 30, 2011 at 8:22 am
  • This question has been asked before, but I tried the suggested solutions, such as calling Context.setStatus() or progress(), and neither of them helped. Please advise. My reduce task is doing some CPU-intensive ...
    Anfernee Xu
    Jan 27, 2011 at 1:48 pm
    Jan 28, 2011 at 12:22 am
  • Hi, I would like to pass one object from job1 to job2. Can someone help me, please? Thanks, Joan
    Joan
    Jan 25, 2011 at 2:31 pm
    Jan 26, 2011 at 3:42 pm
  • Hi all, I was trying to compile FairScheduler from source code. I went to: ~/hadoop-0.21.0/mapred/src/contrib/fairscheduler and ran: ant clean compile However, I get the following errors: ...
    Robert Grandl
    Jan 14, 2011 at 2:18 pm
    Jan 14, 2011 at 4:31 pm
  • By scouring various web pages and lists via google I've found some general recommendations when it comes to setting the number of map and reduce slots for a cluster. It seems to come down to setting ...
    Adam Phelps
    Jan 7, 2011 at 1:51 am
    Jan 10, 2011 at 10:26 pm
  • Hi, On the reduce side, after the RT had passed the merge phase (before the reduce phase starts), I've got the path of the map_0.out file. I'm opening this file with [code] FSDataInputStream in = ...
    Pedro Costa
    Jan 31, 2011 at 5:56 pm
    Jan 31, 2011 at 6:04 pm
  • Mighty user group, I am trying to write a streaming job that does a lot of I/O in a Python program. I know that if I don't report back every x minutes the job will be terminated. How do I report back to ...
    Felix gao
    Jan 28, 2011 at 11:51 pm
    Jan 31, 2011 at 5:08 pm
  • Hi, I'm running the Terasort problem in cluster mode, and I got a RuntimeException in a reduce task. java.lang.RuntimeException: problem advancing post rec#499959 (Please, see attachment) What ...
    Pedro Costa
    Jan 28, 2011 at 5:51 pm
    Jan 30, 2011 at 6:59 pm
  • Hi, I am getting this error while submitting the job to a hadoop cluster. HadoopJobLaunch; Unable to submit job to hadoop; [Possible Cause]JT is perhaps unreacheable; ; [NESTED ...
    Anurag Kumar Nilesh
    Jan 29, 2011 at 10:35 am
    Jan 30, 2011 at 10:28 am
  • Hi, My cluster contains 22 DataNodes and TaskTrackers, each with 8 mapper slots and 4 reduce slots, each with 1.5G max heap size. I use Cloudera CDH 2. I have a specific job that is constantly failing ...
    David Ginzburg
    Jan 23, 2011 at 2:26 pm
    Jan 24, 2011 at 8:11 am
  • I have two files, A and D, containing (vectorId, vector) on each line. Now I want to execute the following: for eachItem in A: for eachElem in D: dot_product = eachItem * eachElem save(dot_product) ...
    Rohit Kelkar
    Jan 19, 2011 at 11:35 am
    Jan 19, 2011 at 6:54 pm
  • Hi, I have a basic question. How does partitioning work? Following is a scenario I created to pose my question. i) A partition function is defined as partitioning map output based on alphabetical ...
    Mapred Learn
    Jan 18, 2011 at 8:25 pm
    Jan 18, 2011 at 8:33 pm
  • Hello, I need to run just one JVM (with several threads with mappers) on every node, because I have some data I need to share between mappers, is there any way to do it with the new API? Thanks ...
    Eduard Veleba
    Jan 12, 2011 at 11:17 am
    Jan 14, 2011 at 3:00 pm
  • I am writing a MapReduce job for converting web pages into attributes such as terms, ngrams, domains, regexes etc. These attributes (terms, ngrams, domains etc.) are kept in separate files and are pretty ...
    Vipul sharma
    Jan 12, 2011 at 8:52 pm
    Jan 12, 2011 at 8:59 pm
  • I stopped a job that was running very slowly; it was running in its reduce (phase:reduce) part. However, I still want its output and I cannot run this job again. So I have to stick with the ...
    Ferdy Galema
    Jan 10, 2011 at 4:38 pm
    Jan 11, 2011 at 11:58 am
  • Hi, I am a newbie and am trying to set up Hadoop in a single-user setup on my Windows 7 machine. I followed the steps at: ...
    Mapred Learn
    Jan 10, 2011 at 10:35 pm
    Jan 10, 2011 at 10:55 pm
  • I see this on my datanode 2011-01-02 12:32:26,230 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_DENVER-DHILLER.jsq.bsg.ad.a ...
    Hiller, Dean (Contractor)
    Jan 2, 2011 at 7:38 pm
    Jan 2, 2011 at 9:12 pm
  • Hi, When the reducer fetches a map output of 1GB from the mappers and does the merge, is it possible that part of the map output is saved on disk and the other part in memory? Or a map output must ...
    Pedro Costa
    Jan 31, 2011 at 6:52 pm
    Feb 1, 2011 at 5:33 am
  • Hi, I am running a simple inverted-index generating program in Hadoop which will emit every word in a text file as well as its offsets. So the output key is Text and output value is a list of ...
    Exception
    Jan 30, 2011 at 10:06 am
    Jan 30, 2011 at 10:43 am
  • Hi, I'm trying to access my custom configuration file (myconfig.xml) from MyMapper. So I'm doing: *File configurationFile = new File("./conf/", "myconfig.xml");* But when I see the absolute path ...
    Joan
    Jan 28, 2011 at 10:47 am
    Jan 29, 2011 at 4:26 am
  • Hello all, I am having issues with accessing HDFS and I figured it's due to a version mismatch. I know my jar files have multiple copies of Hadoop (Pig has its own, I have Hadoop 0.20.2 and Whirr had ...
    Praveen Peddi
    Jan 28, 2011 at 7:25 pm
    Jan 29, 2011 at 12:24 am
  • Hi, My cluster contains 5 DataNodes, each with 8 map slots and 2 reduce slots. So there are up to 40 slots in my cluster and 40 tasks can run in parallel. But when running a particular job, I have ...
    Exception
    Jan 25, 2011 at 1:04 pm
    Jan 25, 2011 at 3:08 pm
  • Hi, Is it possible to add a new tasktracker node from the Hadoop code directly? For example, I have in the slaves a list of nodes but only some of them are running at the moment. I want to start a ...
    Robert Grandl
    Jan 18, 2011 at 7:12 pm
    Jan 23, 2011 at 11:07 am
  • There are two input directories: /user/test1/ and /user/test2/. I want to join the content of the two directories. In order to join the two directories, I need to identify which content is handled by the mapper ...
    Lei liu
    Jan 21, 2011 at 2:24 pm
    Jan 21, 2011 at 2:50 pm
  • Hi, I am using hadoop-0.20.2+228 version of hadoop. Want to use Oozie for managing workflows. Trying to install oozie-2.2.1+82.tar.gz version of Oozie. I could see oozie console at ...
    Giridhar Addepalli
    Jan 20, 2011 at 11:16 am
    Jan 21, 2011 at 6:32 am
  • Hi, I am seeing lots of leftover directories going back as far as 12 days in the task trackers' "mapred.local.dir". These directories are for "M/R task attempts". How do these directories end up in ...
    Rakesh kothari
    Jan 19, 2011 at 1:20 am
    Jan 20, 2011 at 10:59 pm
  • As defined in http://hadoop.apache.org/mailing_lists.html , please send user questions to mapreduce-user@hadoop.apache.org. -- Owen
    Owen O'Malley
    Jan 20, 2011 at 6:33 pm
    Jan 20, 2011 at 8:23 pm
  • Hi, There is a gzipped file that needs to be processed by a Map-only hadoop job. If the size of this file is more than the space reserved for non-dfs use on the tasktracker host processing this file ...
    Rakesh kothari
    Jan 18, 2011 at 11:37 pm
    Jan 19, 2011 at 3:47 am
  • Hi all, I just compiled CapacityTaskScheduler in src/contrib/capacity-scheduler: ant compile, ant package. Then I copied the tar file into the lib directory. I stopped and restarted Hadoop but I ...
    Robert Grandl
    Jan 15, 2011 at 3:50 pm
    Jan 18, 2011 at 8:03 pm
  • Hi, I have Hadoop installed in a cluster and I would like the JT to work out from the network topology which input files in HDFS are closer to it, and which are further away. So, how can a JT know if an ...
    Pedro Costa
    Jan 13, 2011 at 10:49 pm
    Jan 14, 2011 at 6:58 am
  • Hi, I'm trying to build a Solr index with MapReduce (Hadoop) and I'm using https://issues.apache.org/jira/browse/SOLR-1301 but I have a problem with the Hadoop version and this patch. When I compile this patch, ...
    Joan
    Jan 13, 2011 at 12:19 pm
    Jan 13, 2011 at 1:02 pm
  • Can someone tell me how to compile the Hadoop project? I downloaded trunk from svn and I tried to follow the instructions at http://wiki.apache.org/hadoop/EclipseEnvironment but I'm using Eclipse on Windows ...
    Joan
    Jan 13, 2011 at 12:34 pm
    Jan 13, 2011 at 12:52 pm
  • Hello Friends, I am seeing Hadoop log timestamps & file timestamps that are not the same as system time. I found that this problem was discussed on the mailing list earlier but there was no solution posted. ...
    Ravi Phulari
    Jan 11, 2011 at 7:24 am
    Jan 11, 2011 at 6:42 pm
  • Hello, I'm running a very simple job that returns the input with a null key and uses no reducer (see below). I'm using MultipleSequenceFileOutputFormat to "split" the input into different files, but ...
    Brett Hoerner
    Jan 6, 2011 at 8:52 pm
    Jan 8, 2011 at 11:48 pm
  • Hi, I would like to run a sequence of jobs, where the output of a job is the input for the next one. Does Hadoop Pig help to do this? Thanks, -- Pedro
    Pedro Costa
    Jan 6, 2011 at 11:03 pm
    Jan 6, 2011 at 11:48 pm
  • Hello all, I am trying to run Hadoop on Rackspace and I am having issues with starting up servers. I have configured everything on the cloud exactly the same as my local Hadoop (which is working) but I can't ...
    Praveen Peddi
    Jan 6, 2011 at 8:40 pm
    Jan 6, 2011 at 9:07 pm
Group Overview
group: mapreduce-user @
categories: hadoop
discussions: 74
posts: 226
users: 69
website: hadoop.apache.org...
irc: #hadoop

69 users for January 2011

  • Harsh J: 28 posts
  • Pedro Costa: 21 posts
  • Joan: 18 posts
  • Hiller, Dean (Contractor): 11 posts
  • Arun C Murthy: 8 posts
  • Koji Noguchi: 8 posts
  • David Rosenstrauch: 6 posts
  • Chase Bradford: 5 posts
  • Robert Grandl: 5 posts
  • Sarthak Dudhara: 5 posts
  • Sonal Goyal: 5 posts
  • Praveen Peddi: 4 posts
  • Felix gao: 4 posts
  • Hari Sreekumar: 4 posts
  • Mapred Learn: 4 posts
  • Owen O'Malley: 4 posts
  • Ravi Phulari: 4 posts
  • Rishi pathak: 4 posts
  • Tsz Wo (Nicholas), Sze: 4 posts
  • Allen Wittenauer: 3 posts