48 discussions - 135 posts

  • Hi, While running MR Jobs over a yarn cluster I keep on getting: Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/v2/app/MRAppMaster Caused by ...
    Jun 5, 2012 at 8:45 am
    Jul 13, 2012 at 8:18 pm
  • I have a MapReduce job that reads in several gigs of log files and separates the records based on who generated them. My MapReduce job looks like this: InputFormat: NLineInputFormat - Reads N lines ...
    Berry, Matt
    Jun 28, 2012 at 9:37 pm
    Jul 2, 2012 at 4:00 pm
  • Hi All, We are running a mapreduce job in a fully distributed cluster. The output of the job is written to HBase. While running this job we are getting an error: *Caused by ...
    Manu S
    Jun 6, 2012 at 2:25 pm
    Jul 3, 2012 at 6:46 am
  • Our job tracker has been seizing up with Out of Memory (heap space) errors for the past 2 nights. After the first night's crash, I doubled the heap space (from the default of 1GB) to 2GB before ...
    David Rosenstrauch
    Jun 8, 2012 at 3:27 pm
    Jun 12, 2012 at 9:23 pm
  • Hi, I am trying to override mapred-site.xml (more specifically mapred.compress.map.output and mapred.output.compression.codec) from the command line when I execute the jar. I have been using hadoop ...
    Sid Kumar
    Jun 6, 2012 at 10:41 pm
    Jun 7, 2012 at 9:39 pm
  • Hi, Hadoop mapreduce can be used for streaming. But what is streaming from the point of view of mapreduce? To me, streaming means video and audio data. Why does mapreduce support streaming? Can anyone ...
    Pedro Costa
    Jun 15, 2012 at 10:46 pm
    Jun 16, 2012 at 4:09 pm
  • Hi All, When I run hadoop jobs, I observe the following errors. Also, I notice that data node dies every time the job is initiated. Does any one know what may be causing this and how to solve this? ...
    Shamshad Ansari
    Jun 14, 2012 at 8:10 pm
    Jul 27, 2012 at 5:05 pm
  • Hi I wanted to check what exactly we gain when JVM reusability is enabled in mapped job. My doubt was regarding the setup() method of mapper. Is it called for a mapper even if it is using the JVM for ...
    Arpit Wanchoo
    Jun 4, 2012 at 12:12 pm
    Jun 5, 2012 at 8:22 pm
  • Hi, If i have few jobs added to a controller, and i explicitly killed a job in between (assuming all the other jobs failed due to dependency). Can i have the control back to perform some operations ...
    Kasi Subrahmanyam
    Jun 30, 2012 at 10:01 am
    Jul 1, 2012 at 6:48 am
  • Hi all, I'm new to Hadoop MR and decided to make a go at using only the new API. I have a series of log files (who doesn't?), where a different date is encoded in each filename. The log files are so ...
    Michael Parker
    Jun 14, 2012 at 2:55 am
    Jun 14, 2012 at 9:32 am
  • Hi All, I am thinking of a condition where the data in two log files are to be compared, can I use Map-Reduce to do this? I have one log file (LOG1) which has user ID and dept ID and another log file ...
    Girish Ravi
    Jun 12, 2012 at 1:01 pm
    Jun 12, 2012 at 1:07 pm
  • Hi, I have Cloudera's CDH3u3 installed on my cluster and mapred.child.env set to "LD_LIBRARY_PATH=/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64" (with libsnappy.so in the folder) in ...
    Marek Miglinski
    Jun 11, 2012 at 1:55 pm
    Jun 11, 2012 at 4:26 pm
  • Hi All, I am using Mapreduce to scan HBase region to get the rowkey_list that related with one query. In Map period, each mapper outputs partial rowkey_list. In reduce period, the reducer will ...
    Liu, Keyan (NSN - CN/Beijing)
    Jun 11, 2012 at 3:10 am
    Jun 11, 2012 at 3:36 am
  • Hi guys! Please, which is the best recipe to install and configure Hadoop in multi-node cluster? And what about to use Eucalyptus under it, where could I find a good tutorial for all of these? -- ...
    Luiz Antonio Falaguasta Barbosa
    Jun 1, 2012 at 4:58 pm
    Jun 1, 2012 at 5:48 pm
  • I have a task with a folder as input: FileInputFormat.setInputPaths(job, new Path("/folder")); What happens when the task is running and I write new files in the folder? The task receive the new ...
    Félix López
    Jun 27, 2012 at 5:45 pm
    Jun 28, 2012 at 6:37 am
  • How can I send jobs remotely? I have a cluster running and I would like to execute a mapreduce task from another machine (outside the cluster) without having to do this: bin/hadoop jar ...
    Félix López
    Jun 27, 2012 at 4:55 pm
    Jun 28, 2012 at 2:31 am
  • Hi, I want to run a job on all of nodes and if one job was completed, the node must wait until the jobs on the other nodes finish. For that, every node must signal to all nodes and when every node ...
    Hamid Oliaei
    Jun 26, 2012 at 9:00 am
    Jun 26, 2012 at 9:52 am
  • Hi, I see that every job has a _logs dir that has a history dir taking 1 block. Is it safe to delete such _logs directories, as we have a lot of them? Thanks, JJ
    Mapred Learn
    Jun 23, 2012 at 12:53 am
    Jun 23, 2012 at 6:50 am
  • Hello, I have a sequence of MR Jobs that are using the SequenceFile for their output and input format. If I run them without any compression enabled they work fine. If I use the LzoCodec they also ...
    Jun 15, 2012 at 6:04 am
    Jun 18, 2012 at 6:35 am
  • Hi All, I am trying to implement some MapReduce. At one point I see there are different ways of creating Job: org.apache.hadoop.mapreduce.Job and org.apache.hadoop.mapred.jobcontrol.Job What's the ...
    Girish Ravi
    Jun 16, 2012 at 6:29 am
    Jun 16, 2012 at 4:07 pm
  • Hi, We are trying to solve the travelling salesman problem using hadoop. our input files contain just a single line that has the euclidean coordinates of the cities. we need to pass this single line ...
    Sharat attupurath
    Jun 10, 2012 at 3:02 pm
    Jun 11, 2012 at 6:54 am
  • I am running hadoop-1.0.1 in Ubuntu. This is a single node cluster. When I run a jar file that contains my Mapper, I get the following exception. Could you please help solve this issue? The command ...
    Shamshad Ansari
    Jun 6, 2012 at 1:11 pm
    Jun 10, 2012 at 5:47 am
  • Hi Following is the hadoop directory structure after extracting the tar ball. I would like to know where and to which folder I need to set the HADOOP_MAPRED_HOME, ...
    Jun 5, 2012 at 1:43 pm
    Jun 6, 2012 at 8:05 am
  • Hi, I have been trying to setup up Hadoop logging at the task level but with no success so far. I have modified log4j.properties and set many parameters to DEBUG level ...
    Sherif Akoush
    Jun 27, 2012 at 12:23 pm
    Jun 27, 2012 at 2:26 pm
  • After about a week of researching, logging, etc. I have finally discovered what is happening, but I have no idea why. I have created my own WritableComparable object so I can emit it as the key from ...
    Dave Shine
    Jun 26, 2012 at 2:20 pm
    Jun 26, 2012 at 2:42 pm
  • Hello, I am building from $PIG_HOME under Hadoop 23: ant eclipse-files and I get the following error: BUILD FAILED /home/kereno/hadoop23/pig10/build.xml:301 ...
    Keren Ouaknine
    Jun 21, 2012 at 8:11 am
    Jun 21, 2012 at 10:56 am
  • Hi, I am going through some samples of using MapReduce with HBase. My question is concerning the importance of the KEYOUT type of a TableReducer. Does the output key really matter if the output value ...
    Jun 18, 2012 at 12:48 pm
    Jun 19, 2012 at 6:00 am
  • Hello, I would like to query about the list of datanode's ips and ports. I am aware the default port is 50075, but on this scenario I might have two versions of Hadoop with datanode running on ...
    Keren Ouaknine
    Jun 18, 2012 at 3:07 pm
    Jun 19, 2012 at 5:39 am
  • We are planning to run a next generation of Hadoop ecosystem components in our production in a few months. We plan to use HDFS 2.0 for the HA NameNode work. The platform will also include YARN but ...
    Andrew Purtell
    Jun 16, 2012 at 10:23 pm
    Jun 17, 2012 at 5:22 am
  • Hi, all, RMContainerRequestor creates more than one resource request for each task attempt. If we have an HDFS file with 3 replicas, on which a task attempt would run in a single-rack cluster, ...
    Min Zhou
    Jun 16, 2012 at 3:10 pm
    Jun 16, 2012 at 5:05 pm
  • Hi all, One more question. I have two jobs to run serially using a JobControl. The key-value types for the outputs of the reducer of the first job are <ActiveDayKey, Text>, where ActiveDayKey is a ...
    Michael Parker
    Jun 14, 2012 at 5:42 pm
    Jun 15, 2012 at 5:09 am
  • When processing 65 billion records and using LZO or Snappy codecs, disk IO is at 100% because mappers are spilling all the time, but CPU is at 40%. Is there a setting where I can raise compression ...
    Marek Miglinski
    Jun 14, 2012 at 8:39 am
    Jun 14, 2012 at 3:56 pm
  • Hello Team, I have started to understand about Hadoop Mapreduce and was able to set-up a single cluster single node execution environment. I want to now extend this to a multi node environment. I ...
    Girish Ravi
    Jun 12, 2012 at 6:57 am
    Jun 12, 2012 at 7:08 am
  • Hi I have been trying to setup a map reduce job with hadoop Scenario : My mapper is writing key value pairs where I have total 13 types of keys and corresponding value classes. For each ...
    Arpit Wanchoo
    Jun 4, 2012 at 12:06 pm
    Jun 11, 2012 at 2:33 am
  • Hi, I executed a Pig Java program loading data from HBase in local mode in the Eclipse IDE correctly. But when executing from the command line with the following commands, it shows an error. Input :- ...
    Bajeesh rahman
    Jun 5, 2012 at 9:49 am
    Jun 5, 2012 at 12:47 pm
  • The folder contains files with text and other folders with text files. The text is not key/value, it's just text. Something like this: Lorem Ipsum is simply dummy text of the printing and typesetting ...
    Félix López
    Jun 29, 2012 at 7:37 am
    Jun 29, 2012 at 7:37 am
  • I'm starting to organize a Hadoop/Big Data User Group in the Central Florida area. If you are geographically close and are interested in professional growth and networking in the area of Big Data, ...
    Dave Shine
    Jun 28, 2012 at 3:03 pm
    Jun 28, 2012 at 3:03 pm
  • Hello, Huahin Framework 0.1.0 is released. Huahin is simple Java framework for Hadoop MapReduce. I have been working in data mining company in Japan. Although we were using the internal framework, we ...
    Ryu Kobayashi
    Jun 28, 2012 at 9:08 am
    Jun 28, 2012 at 9:08 am
  • Hi all, YourKit is a great profiling tool and provides free licence to open source community. But the problem is Hadoop profiling supports a "%s" for the output *file*, while YourKit only supports ...
    Jie Li
    Jun 24, 2012 at 6:33 am
    Jun 24, 2012 at 6:33 am
  • Hi, I have a RecordReader implementation which reads the records asynchronously and caches them in memory(In a BlockingQueue). When TrackingRecordReader calls for next Record, the internal ...
    Jun 21, 2012 at 10:53 pm
    Jun 21, 2012 at 10:53 pm
  • Hi All I have a Map-reduce which writes some data to the DFS. In my reduce I am using MultipleFormats. In the Reducer:setUp, I am creating the MultipleOutputs. Then in the Reducer:reduce, I am using ...
    Girish Ravi
    Jun 18, 2012 at 1:58 pm
    Jun 18, 2012 at 1:58 pm
  • Thank you. Artem Ervits New York Presbyterian Hospital This electronic message is intended to be for the use only of the named recipient, and may contain information that is confidential or ...
    Artem Ervits
    Jun 12, 2012 at 7:17 pm
    Jun 12, 2012 at 7:17 pm
  • Hi everyone, I have to generate sequence numbers for my data in Pig. I'm using the RANDOM() function to generate them. I'm storing the random number value in an alias. But if I'm using this alias further in my pig ...
    Avnish pundir
    Jun 8, 2012 at 4:54 am
    Jun 8, 2012 at 4:54 am
  • Hi, I am facing a problem where the AM goes down once it is finished, but the API keeps polling the AM for JobStatus/Report and hits a SocketTimeout. By default ...
    Jun 7, 2012 at 12:46 pm
    Jun 7, 2012 at 12:46 pm
  • Hi Users I was making a Table on Hadoop on the base of Hbase and I wrote a mapReduce Program for this Table But I can't get Answer from this program. I want to read from my file and write on the ...
    Mohamad hosein jafari
    Jun 6, 2012 at 10:54 am
    Jun 6, 2012 at 10:54 am
  • Hi, I think I met with a possible deadlock situation. Not sure whether it is actually a deadlock or not :-) Here is my scenario: Run a Job and call JobClient.monitorAndPrintJob to monitor the job and ...
    Jun 6, 2012 at 9:47 am
    Jun 6, 2012 at 9:47 am
  • hi,,, i am a beginner to hadoop map-reduce...plz help me with a link to any large data-set and corresponding map-reduce code to understand the implementation... (other than wordcount program) thanks ...
    Rahmat inamdar
    Jun 5, 2012 at 8:22 am
    Jun 5, 2012 at 8:22 am
  • Hello, Apologies in advance 'cause this is not directly related to Hadoop, but I thought many on this mailing list may have some experience with both these tools: Flume & Kafka. We are looking for a ...
    Something Something
    Jun 5, 2012 at 5:41 am
    Jun 5, 2012 at 5:41 am
Group Overview
group: mapreduce-user @

52 users for June 2012

  • Harsh J: 20 posts
  • Subroto: 10 posts
  • GUOJUN Zhu: 7 posts
  • Girish Ravi: 5 posts
  • Arpit Wanchoo: 4 posts
  • Bejoy KS: 4 posts
  • David Rosenstrauch: 4 posts
  • Félix López: 4 posts
  • Manu S: 4 posts
  • Mayank Bansal: 4 posts
  • Sid Kumar: 4 posts
  • Arun C Murthy: 3 posts
  • Berry, Matt: 3 posts
  • Devaraj k: 3 posts
  • Jagat Singh: 3 posts
  • Michael Parker: 3 posts
  • Dave Shine: 2 posts
  • Hamid Oliaei: 2 posts
  • JOAQUIN GUANTER GONZALBEZ: 2 posts
  • Kasi Subrahmanyam: 2 posts