153 discussions - 560 posts

  • Hi! I just wondered what other people use to access the hadoop webservers, when running on EC2? Ideas that I had: 1.) opening ports 50030 and so on = not good, data goes unprotected over the ...
    Andreas Kostyrka
    May 28, 2008 at 8:22 pm
    Jun 4, 2008 at 3:31 pm
  • I have a question for users: how do they ensure their client apps have configuration XML files that are kept up to date? I know how I do it to date (get the site config off the site team, have my ...
    Steve Loughran
    May 15, 2008 at 12:06 pm
    May 20, 2008 at 5:21 pm
  • Hi all, I'm looking for a way to force Streaming to shut down the whole job when one of its subprocesses exits with a non-zero error code. We have the following situation. Sometimes either mapper or reducer ...
    Andrey Pankov
    May 13, 2008 at 2:46 pm
    Dec 24, 2008 at 6:53 am
  • One of the things we had discussed at the Hadoop summit was to set up monthly user group meetings to discuss topics of interest to the hadoop community. We have scheduled the first of these meetings ...
    Ajay Anand
    May 6, 2008 at 4:55 pm
    May 21, 2008 at 6:58 pm
  • Hi, My datanode and jobtracker are started by user "hadoop". And user "Test" needs to submit the job. So if the user "Test" copies file to HDFS, there is a permission error. ...
    Natarajan, Senthil
    May 7, 2008 at 9:36 pm
    May 9, 2008 at 8:46 pm
  • Hello, I am not sure if this is a genuine hadoop question or more towards a core-java question. I am hoping to create a wrapper over Lucene Document, so that this wrapper can be used for the value ...
    Jim the Standing Bear
    May 28, 2008 at 2:19 am
    May 28, 2008 at 8:21 am
  • What is this bit of the log trying to tell me, and what sorts of things should I be looking at to make sure it doesn't happen? I don't think the network has any basic configuration issues - I can ...
    James Moore
    May 7, 2008 at 8:29 pm
    May 13, 2008 at 9:11 pm
  • Hi, I have a case of a corrupt HDFS (according to bin/hadoop fsck) and I'm trying not to lose the precious data in it. I accidentally ran bin/hadoop namenode -format on a *new DN* that I just added ...
    Otis Gospodnetic
    May 9, 2008 at 3:35 am
    May 9, 2008 at 6:36 pm
  • Hello, I'm still getting my head around how Hadoop works. A survey question: what kind of serialization do you use to output structured data from your map/reduce jobs? When both key and value are ...
    Stuart Sierra
    May 22, 2008 at 8:55 pm
    May 27, 2008 at 9:56 am
  • Hi, I was wondering whether it is possible to submit a MapReduce job to a remote Hadoop cluster (i.e., submitting the job from a machine which doesn't have Hadoop installed and submitting to a different machine ...
    Natarajan, Senthil
    May 23, 2008 at 8:47 pm
    May 23, 2008 at 10:52 pm
  • Hi All: We had a primary node failure over the weekend. When we brought the node back up and I ran Hadoop fsck, I see the file system is corrupt. I'm unsure how best to proceed. Any advice is greatly ...
    C G
    May 12, 2008 at 3:24 am
    May 16, 2008 at 2:14 pm
  • In the system I am working on, we have 6 million blocks total, the namenode heap size is about 600 MB, and it takes about 5 minutes for the namenode to leave safemode. I try to estimate what would be ...
    Cagdas Gerede
    May 2, 2008 at 5:25 pm
    May 3, 2008 at 12:20 am
  • Hello. I'm trying to figure out why I need to use HOD vs. trying to run multiple jobs at the same time on the same set of resources. Is it possible to run multiple hadoop jobs at the same time on the ...
    Kayla Jay
    May 22, 2008 at 8:46 pm
    May 24, 2008 at 10:54 am
  • hello. why do we have to set the map and reduce classes as static? i need inside them to access some data which is not static. what i should do? non static map or reduce classes generates the ...
    Deyaa Adranale
    May 21, 2008 at 10:18 am
    May 27, 2008 at 5:58 pm
  • I'm trying to bring up a cluster on EC2 using (http://wiki.apache.org/hadoop/AmazonEC2) and it seems that 0.17 is the version to use because of the DNS improvements, etc. Unfortunately, I cannot find ...
    Jeff Eastman
    May 14, 2008 at 7:49 pm
    May 22, 2008 at 4:15 pm
  • Earlier this week I wrote about a master node crash and our efforts to recover from the crash. We recovered from the crash and all systems are normal. However, I have a concern about what fsck is ...
    C G
    May 15, 2008 at 7:16 pm
    May 15, 2008 at 10:00 pm
  • Hi All, I was looking for how multiple inputs can be written to the same output, and at different intervals of time (i.e., I want to re-open the same file to append data to it). This link did not ...
    May 5, 2008 at 1:28 am
    May 6, 2008 at 10:46 am
  • Should the installation paths be the same in all the nodes? Most documentation seems to suggest that it is recommended to have the same paths in all the nodes. But what is the workaround, ...
    Sridhar Raman
    May 30, 2008 at 8:36 am
    Jun 2, 2008 at 7:25 pm
  • Hi, After I start the cluster as user1, I submit a job as a different user (user2) and get the following error. It seems that the job submitter still tries to act as user1 and locate the job.xml from ...
    Rui Shi
    May 30, 2008 at 6:23 am
    May 31, 2008 at 12:39 pm
  • What is the right way to use a jar file within my map reduce program. I want to use the simmetrics code for double metaphone, but I'm not sure how to include it so that my map/reduce code can see it. ...
    Tanton Gibbs
    May 29, 2008 at 5:37 pm
    May 30, 2008 at 2:59 pm
  • Hello, i am considering using hadoop map/reduce but have some difficulties getting around the basic concepts of chunks distribution. How does the 'distributed' processing on large files account for ...
    May 26, 2008 at 11:33 am
    May 29, 2008 at 5:21 pm
  • I have installed hadoop on cygwin , I am running windows XP. My Java directory is C:\Program Files\Java\jre1.6.0_06 I am not able to run hadoop as it complains of "no such file or directory error". I ...
    May 24, 2008 at 12:41 am
    May 24, 2008 at 4:03 pm
  • Dear all, I am trying to use the DistributedCache class for distributing files required for running my jobs. While the API documentation provides good guidelines, are there any tips or usage examples (e.g. ...
    Taeho Kang
    May 22, 2008 at 5:45 am
    May 23, 2008 at 9:58 am
  • I uploaded the slides from my Mahout overview to our wiki (http://cwiki.apache.org/confluence/display/MAHOUT/FAQ) along with another recent talk by Isabel Drost. Both are similar in content but their ...
    Jeff Eastman
    May 22, 2008 at 4:37 pm
    May 22, 2008 at 6:18 pm
  • Hi all, Hadoop is a great project and a growing niche. As it becomes even more popular, there will be increasing demand for experts in the field. I am compiling a contact list of Hadoop experts who ...
    Jim R. Wilson
    May 15, 2008 at 9:23 pm
    May 22, 2008 at 4:38 am
  • Hi, I'm currently trying to make the case for using Hadoop (or more precisely HDFS) as part of a storage architecture for a large media asset repository. HDFS will be used for storing up to total of ...
    Robert Krüger
    May 16, 2008 at 8:23 am
    May 16, 2008 at 11:00 pm
  • I'm trying to create a java application that writes to HDFS. I have it set up such that hadoop-0.16.3 is on my machine, and the env variables HADOOP_HOME and HADOOP_CONF_DIR point to the correct ...
    Bryan Duxbury
    May 13, 2008 at 6:28 pm
    May 15, 2008 at 3:17 pm
  • Some of the details that might reveal something more about the problem I posted http://mail-archives.apache.org/mod_mbox/hadoop-core-user/200805.mbox/%3c482281CB.9000106@cse.iitb.ac.in%3e Hadoop ...
    Amit Kumar Singh
    May 8, 2008 at 8:51 pm
    May 9, 2008 at 9:52 am
  • Hi, I want to get the "URL" paths of files that are stored in dfs. Is there any way to get it? Thank you
    Chaitanya krishna
    May 8, 2008 at 9:38 am
    May 8, 2008 at 6:16 pm
  • Hi, How does one do a join operation in map reduce? Is there more than one way to do a join? Which way works better and why? Thanks, Shirley
    Shirley Cohen
    May 21, 2008 at 6:16 pm
    Jun 30, 2008 at 4:56 pm
  • Hi everyone, I am using hadoop (17) to try and do some large scale user comparisons and although the programs are all written, it's taking incredibly long to run and it seems like it should be going ...
    May 26, 2008 at 4:52 am
    May 29, 2008 at 2:07 pm
  • Hi, Is it correct that an intermediate key from a mapper goes to only 1 reducer? If yes, then if I have to sum up values of some col in a log file, a reducer will consume a lot of memory - I have a ...
    Tarandeep Singh
    May 27, 2008 at 9:54 pm
    May 27, 2008 at 11:17 pm
  • We've recently upgraded from 0.15.0 to 0.16.4. Two nights ago we had a problem where DFS nodes could not communicate. After not finding anything obviously wrong we decided to shut down DFS and ...
    C G
    May 23, 2008 at 12:48 pm
    May 24, 2008 at 2:26 am
  • Hi, I have a working 0.15.3 install and am trying to upgrade to 0.16.4. I want to start clean with an empty filesystem, so I just reformatted the filesystem instead of using the upgrade option. When ...
    Adam Wynne
    May 14, 2008 at 3:09 pm
    May 22, 2008 at 8:11 am
  • How does Hadoop manage the failure of the JobTracker (Master Node)? For example, Google Map/Reduce version "aborts the MapReduce computation if the master fails". As NameNode and SecondaryNameNode, ...
    Fabrizio detto Mario
    May 19, 2008 at 8:37 am
    May 20, 2008 at 1:30 pm
  • Hi list, I want to output my reduced results into several files according to the types the results belong to. How can I implement this? Thx, Jeremy -- My research interests are distributed systems, ...
    Jeremy Chow
    May 8, 2008 at 8:36 am
    May 15, 2008 at 6:55 am
  • Can someone familiar with permissions offer an opinion on the below? Thanks, St.Ack
    May 9, 2008 at 3:45 am
    May 9, 2008 at 10:44 pm
  • Hi, I want to process the information in tif images using hadoop. For this, a BufferedImage object has to be created. For JPEG images, ImageIO is used along with the ByteArrayOutputStream which ...
    May 8, 2008 at 8:03 am
    May 9, 2008 at 5:04 pm
  • Hey, -Derek
    Derek Shaw
    May 7, 2008 at 3:27 am
    May 7, 2008 at 11:38 pm
  • Hi, I have a problem with counters being updated after I upgraded my hadoop from 0.15.1 to 0.16.4, and I tried 0.17.0 too. The counters are first updated only after the first map task completes. The ...
    Ion Badita
    May 21, 2008 at 9:13 am
    Jul 22, 2008 at 3:22 pm
  • Hi, I looked over the class org.apache.hadoop.metrics.spi.AbstractMetricsContext and I have a question: why in the update(MetricsRecordImpl record) metricUpdates Map is not cleared after the updates ...
    Ion Badita
    May 28, 2008 at 12:23 pm
    May 31, 2008 at 11:11 am
  • Hi All: I'm seeing an inability to run one of our applications over a reasonably small dataset (~200G input) while running 0.16.4. Previously we were on 0.15.0 and the same application ran fine with ...
    C G
    May 27, 2008 at 3:58 pm
    May 28, 2008 at 9:50 pm
  • Hi, we wrote a program that uses a Writer to append keys and values to a file. If we do an fsck during these writes, the open files are reported as corrupt and the file size is zero until they are ...
    Martin Schaaf
    May 22, 2008 at 11:15 pm
    May 22, 2008 at 11:48 pm
  • Hello, We have a cluster that we initially configured using IP addresses instead of hostnames (i.e., all namenode and datanode references are given as IP addresses rather than hostnames). We did this ...
    May 20, 2008 at 8:15 pm
    May 20, 2008 at 9:01 pm
  • How does one learn to program in Hadoop? What do you suggest? Where can I start? ...
    May 18, 2008 at 11:03 pm
    May 20, 2008 at 12:53 pm
  • Hi Everyone, I am working on a project which takes in data from a lot of text files, and although there are a lot of ways to do it, it is not clear to me which is the best/fastest. I am working on an ...
    May 17, 2008 at 11:13 pm
    May 19, 2008 at 3:52 am
  • Hi, what are the options to keep a copy of data from an HDFS instance in sync with a backup file system which is not HDFS? Are there Rsync-like tools that allow only to transfer deltas or would one ...
    Robert Krüger
    May 16, 2008 at 1:34 pm
    May 16, 2008 at 10:56 pm
  • Hi all, I've set up a standalone hadoop server, and when I run bin/hadoop dfs namenode -format I get the following message (repeating 10 times): ipc.Client: Retrying connect to server: ...
    May 14, 2008 at 4:14 pm
    May 15, 2008 at 4:50 pm
  • Hi, I have been working on a problem where I have to process a particular data and return three varieties of data and then, I have to process each of them and store each variety of data into separate ...
    Novice user
    May 14, 2008 at 9:45 am
    May 15, 2008 at 12:22 pm
  • Hi all: I use two computers, A and B, as a hadoop cluster. A is the JobTracker and NameNode; both A and B are slaves. The input data size is about 80 MB, including 100,000 records. The job is to read one ...
    May 13, 2008 at 3:13 pm
    May 13, 2008 at 7:39 pm
Group Overview
group: common-user @

163 users for May 2008

  • Ted Dunning: 51 posts
  • Otis Gospodnetic: 18 posts
  • Arun C Murthy: 15 posts
  • C G: 15 posts
  • Steve Loughran: 13 posts
  • Hairong Kuang: 12 posts
  • Natarajan, Senthil: 12 posts
  • Sridhar Raman: 11 posts
  • Cagdas Gerede: 10 posts
  • Doug Cutting: 10 posts
  • Lohit: 10 posts
  • S29752-Hadoopuser: 10 posts
  • Andreas Kostyrka: 9 posts
  • Dhruba Borthakur: 9 posts
  • Tarandeep Singh: 9 posts
  • Jason Venner: 8 posts
  • Tanton Gibbs: 8 posts
  • Amar Kamat: 7 posts
  • James Moore: 7 posts
  • Kayla Jay: 7 posts