207 discussions - 847 posts

  • Hi. I have an HDFS client and an HDFS datanode running on the same machine. When I'm trying to access a dozen files at once from the client, several times in a row, I'm starting to receive the following ...
    Stas Oskin
    Jun 21, 2009 at 10:07 am
    Aug 3, 2009 at 8:52 pm
  • Hi, I am creating a small Hadoop (0.19.1) cluster (2 nodes to start), each of the machines has 2 NIC cards (1 external facing, 1 internal facing). It is important that Hadoop run and communicate on ...
    John Martyniak
    Jun 9, 2009 at 4:27 am
    Jun 22, 2009 at 7:15 am
  • http://googleresearch.blogspot.com/2009/06/large-scale-graph-computing-at-google.html -- It sounds like Pregel is a computing framework based on dynamic programming for graph operations. I ...
    Edward J. Yoon
    Jun 22, 2009 at 6:50 am
    Jul 3, 2009 at 5:39 am
  • I am evaluating hadoop for a problem that does a Cartesian product of input from one file of 600K (File A) with another set of files (FileB1, FileB2, FileB3) with 2 million lines in total. Each line ...
    Jun 18, 2009 at 5:56 pm
    Dec 8, 2009 at 10:52 pm
  • Hi All! I want a directory to be present in the local working directory of the task for which I am using the following statements: DistributedCache.addCacheArchive(new ...
    Jun 25, 2009 at 5:32 pm
    Jul 3, 2009 at 7:32 am
  • Hi, I want to share some data structures for the map tasks on a same node(not through files), I mean, if one map task has already initialized some data structures (e.g. an array or a list), can other ...
    Jun 16, 2009 at 12:06 am
    Jun 17, 2009 at 4:40 am
  • Hello! I have the following queries related to Hadoop: - Once I place my data in HDFS, it gets replicated and chunked automatically over the datanodes. Right? Hadoop takes care of all those things. - ...
    Sugandha Naolekar
    Jun 5, 2009 at 7:31 am
    Jun 10, 2009 at 5:27 pm
  • Hi. Wonder how one should improve the startup times of a hadoop job. Some of my jobs which have a lot of dependencies in terms of many jar files take a long time to start in hadoop up to 2 minutes ...
    Marcus Herou
    Jun 28, 2009 at 7:44 pm
    Jul 7, 2009 at 9:49 pm
  • Hi All, Can I map/reduce logs that have the .bz2 extension in Hadoop 18.3? I tried but interestingly the output was not what i expected versus what i got when my data was in uncompressed format. ...
    Usman Waheed
    Jun 24, 2009 at 10:48 am
    Jun 24, 2009 at 6:22 pm
  • Hi All, I am running my mapred program in local mode by setting mapred.jobtracker.local to local mode so that I can debug my code. The mapred program is a direct porting of my original sequential ...
    Jun 16, 2009 at 5:51 pm
    Jun 19, 2009 at 3:07 pm
  • Hello! If I try to transfer a 5GB VDI file from a remote host(not a part of hadoop cluster) into HDFS, and get it back, how much time is it supposed to take? No map-reduce involved. Simply Writing ...
    Sugandha Naolekar
    Jun 10, 2009 at 6:26 am
    Jun 13, 2009 at 1:55 am
  • Hi, all I have a large csv file ( larger than 10 GB ), I'd like to use a certain InputFormat to split it into smaller part thus each Mapper can deal with piece of the csv file. However, as far as I ...
    Wenrui Guo
    Jun 10, 2009 at 12:07 pm
    Jun 12, 2009 at 7:21 am
  • Hi, Is there any restriction on the amount of putting files? I tried to put/copyFromLocal about 50,573 files to HDFS, but I faced a problem: ...
    Jun 22, 2009 at 3:52 am
    Jul 3, 2009 at 8:27 pm
  • CloudBase is a data warehouse system for Terabyte & Petabyte scale analytics. It is built on top of hadoop. It allows you to query flat files using ANSI SQL. We have released 1.3.1 version of ...
    Leo Dagum
    Jun 19, 2009 at 7:19 pm
    Jun 22, 2009 at 5:47 pm
  • Hey all, I'm currently tasked to come up with a web/flex-based visualization/monitoring system for a cloud system using hadoop as part of a university research project. I was wondering if I could ...
    Anthony McCulley
    Jun 5, 2009 at 2:07 pm
    Jun 12, 2009 at 3:32 am
  • Hi, I'm a Hadoop 17 user who is doing research with Prof. Magda Balazinska at the University of Washington on an improved progress indicator for Pig Latin. We have a question regarding how Hadoop ...
    Kristi Morton
    Jun 5, 2009 at 2:48 am
    Jun 8, 2009 at 9:48 am
  • I'm backporting some code I wrote for 0.19.1 to 0.18.3 (long story), and I'm finding that when I run a job and try to pass options with -D on the command line, that the option values aren't showing ...
    Ian Soboroff
    Jun 3, 2009 at 5:20 pm
    Jun 4, 2009 at 9:12 pm
  • Hello, I have a problem getting the map input file name. Here is what I tried: public class Map extends Mapper<Object, Text, LongWritable, Text> { public void map(Object key, Text value, Context ...
    Rares Vernica
    Jun 2, 2009 at 4:45 pm
    Jun 4, 2009 at 3:06 pm
  • Am I a total moron or has the Subversion repo gone fishing? I noticed that yesterday when I did an svn up. I get a 404 on this url: http://svn.apache.org/repos/asf/hadoop/core/ which is referred to ...
    Marcus Herou
    Jun 30, 2009 at 9:25 pm
    Jul 1, 2009 at 5:07 pm
  • Hi, I've just installed a new test cluster and I'm trying to give it a quick smoke test with RandomWriter and Sort. I can run these fine with the superuser account. When I try to run them as another ...
    Stephen mulcahy
    Jun 26, 2009 at 11:40 am
    Jun 30, 2009 at 11:20 am
  • Hi all, How does one handle a mount running out of space for HDFS? We have two disks mounted on /mnt and /mnt2 respectively on one of the machines that are used for HDFS, and /mnt is at 99% while ...
    Kris Jirapinyo
    Jun 22, 2009 at 5:21 pm
    Jun 22, 2009 at 10:25 pm
  • Hi All, I am trying to build a distributed system to build and serve lucene indexes. I came across the Distributed Lucene project- http://wiki.apache.org/hadoop/DistributedLucene ...
    Tarandeep Singh
    Jun 1, 2009 at 4:55 pm
    Jun 3, 2009 at 4:00 am
  • when i try to execute the command bin/start-dfs.sh , i get the following error . I have checked the hadoop-site.xml file on all the nodes , and they are fine .. can some-one help me out! ...
    Bharath vissapragada
    Jun 23, 2009 at 1:42 pm
    Jun 23, 2009 at 5:02 pm
  • Hi Group, I was having trouble getting through an example Hadoop program. I have searched the mailing list but could not find any thing useful. Below is the issue: 1) Executed below command to submit ...
    Shravan Mahankali
    Jun 22, 2009 at 10:15 am
    Jun 23, 2009 at 4:37 am
  • Hello, I wasn't able to find this anywhere, so I'm sorry if this has been asked before. I am wondering whether there is a practical limit of the amount of bytes that an emitted Map/Reduce value can ...
    Leon Mergen
    Jun 18, 2009 at 3:24 pm
    Jun 18, 2009 at 5:45 pm
  • I am working with the wordcount example of Hadoop Pipes (0.20.0). I have a 7 machine cluster. When I look at MapContext.getInputSplit() in my map function, I see that it returns the empty string. I ...
    Roshan James
    Jun 12, 2009 at 11:02 pm
    Jun 17, 2009 at 7:10 pm
  • Hi all, I wrote a java code(map-reduce). Can anyone tell me in detail, how to run it on hadoop (right from how to create a jar file).. or send me a link specifying the same. Thanks in advance
    Bharath vissapragada
    Jun 13, 2009 at 10:29 am
    Jun 15, 2009 at 4:12 pm
  • Hi all, In the remove lzo JIRA ticket https://issues.apache.org/jira/browse/HADOOP-4874 Tatu mentioned he was going to port fastlz from C to Java and provide a patch. Has there been any updates on ...
    Kris Jirapinyo
    Jun 3, 2009 at 6:11 pm
    Jun 5, 2009 at 1:07 am
  • As per a previous list question (http://mail-archives.apache.org/mod_mbox/hadoop-core-user/200804.mbox/%3Ce75c02ef0804011433x144813e6x2450da7883de3aca@mail.gmail.com%3E) it looks as though it's not ...
    David Rosenstrauch
    Jun 2, 2009 at 8:22 pm
    Jun 4, 2009 at 3:06 pm
  • Hi. We have a deployment of 10 hadoop servers and I now need more mapping capability (no not just add more mappers per instance) since I have so many jobs running. Now I am wondering what I should ...
    Marcus Herou
    Jun 26, 2009 at 9:44 pm
    Jun 28, 2009 at 8:51 pm
  • Hi, When i am trying to run a Jar in Hadoop, it is giving me the following error hadoop@krishna-dev:/usr/local/hadoop$ bin/hadoop jar /user/hadoop/hadoop-0.18.0-examples.jar java.io.IOException: ...
    Krishna prasanna
    Jun 25, 2009 at 7:31 am
    Jun 25, 2009 at 9:49 am
  • Hi All, Do you know if the tmp directory on every map/reduce task will be deleted automatically after the map task finishes or will do I have to delete them? I mean the tmp directory that ...
    Qin Gao
    Jun 22, 2009 at 7:16 pm
    Jun 22, 2009 at 8:46 pm
  • Hi Group, I have trouble running couple of examples provided by Hadoop. Below are the error messages I have from the console, could you please advise what could be the problem and probable solution? ...
    Shravan Mahankali
    Jun 17, 2009 at 12:36 pm
    Jun 18, 2009 at 7:10 am
  • Hi everyone, Is there any way to sort the "keys" before Reduce but after Map? I also thought of sorting keys myself in the Reduce function, but it might take too much memory once the number of results ...
    Kunsheng Chen
    Jun 15, 2009 at 11:23 pm
    Jun 17, 2009 at 11:13 pm
  • Hi! I am writing an application which has two jobs: the second job uses the same input data source as the first job, plus the output (some objects) of the first job. Can I transfer some objects from one job ...
    Jun 15, 2009 at 1:31 pm
    Jun 15, 2009 at 4:26 pm
  • Hi, I am trying to understand the effects of increasing block size or minimum split size. If I increase them, then a mapper will process more data, effectively reducing the number of mappers that ...
    Tarandeep Singh
    Jun 11, 2009 at 6:06 pm
    Jun 13, 2009 at 11:17 am
  • Hi, Our architecture team wants to run Hadoop/Hbase and the mapreduce jobs using OSGi container. This is to take advantages of the OSGi framework to have a pluggable architecture. I have searched ...
    Ninad Raut
    Jun 11, 2009 at 8:10 am
    Jun 12, 2009 at 10:35 am
  • Hi, I've been running some tests with hadoop in pseudo-distributed mode. My config includes the following in conf/hdfs-site.xml <property> <name>dfs.data.dir</name> <value>/hdfs/disk1, /hdfs/disk2, ...
    Stephen mulcahy
    Jun 10, 2009 at 12:56 pm
    Jun 10, 2009 at 11:17 pm
  • In honor of the Hadoop Summit on June 10th(tomorrow), Apress has agreed to provide some conference swag, in the form of a 50% off coupon Purchase the book at http://eBookshop.apress.com and use code ...
    Jason hadoop
    Jun 10, 2009 at 2:15 am
    Jun 10, 2009 at 4:42 am
  • Hello all, I'm trying to setup a two node cluster < remote using the following tutorials { NOTE : i'm ignoring the tmp directory property in hadoop-site.xml suggested by Michael } Running Hadoop On ...
    Asif md
    Jun 4, 2009 at 7:39 pm
    Jun 4, 2009 at 11:00 pm
  • Hello all, I'm trying to run the hadoop-1.19.1-examples.jar teragen and terasort programs on a cluster. I have two problems with these programs: 1. The data is generated in a fashion to where it is ...
    Gross, Danny
    Jun 29, 2009 at 9:04 pm
    Jul 1, 2009 at 2:57 am
  • Hi all, I am a student and I am trying to install the Hadoop on a cluster, I have one machine running namenode, one running jobtracker, two slaves. When I run the /bin/start-dfs.sh , there is ...
    Boyu Zhang
    Jun 26, 2009 at 8:26 pm
    Jun 26, 2009 at 9:45 pm
  • Hi, One of our test clusters is running HADOOP 15.3 with replication level set to 2. The datanodes are not balanced at all. Datanode_1: 52% Datanode_2: 82% Datanode_3: 30% 15.3 does not have the ...
    Usman Waheed
    Jun 25, 2009 at 9:37 am
    Jun 25, 2009 at 10:32 am
  • hello everyone, I have added 7 nodes to my 3 node cluster. I followed the following steps to do this 1. added the node's ip to conf/slaves at master 2. ran bin/start-balance.sh at each node As i ...
    Asif md
    Jun 24, 2009 at 9:50 pm
    Jun 25, 2009 at 12:37 am
  • Hi, Just wanted to know if multicluster communication is possible in hadoop for example i have 10 nodes. Hadoop cluster1 node1 - Master 1 node2 - slave of master1 node3 - slave of master1 node4 - ...
    Rakhi Khatwani
    Jun 19, 2009 at 10:36 am
    Jun 19, 2009 at 5:30 pm
  • Hi, Can I restrict the output of mappers running on a node to go to reducer(s) running on the same node? Let me explain why I want to do this- I am converting huge number of XML files into ...
    Tarandeep Singh
    Jun 17, 2009 at 11:40 pm
    Jun 19, 2009 at 3:07 pm
  • Hi all I "dfs put" a large dataset onto a 10-node cluster. When I observe the Hadoop progress (via web:50070) and each local file system (via df -k), I notice that my master node is hit 5-10 times ...
    Jun 18, 2009 at 7:57 pm
    Jun 19, 2009 at 2:04 am
  • Hello, I am doing my master.... my final year project is on Hadoop ...so I would like to know some thing about Hadoop cluster i.e, Do new version of Hadoop are able to handle heterogeneous ...
    Ashish pareek
    Jun 18, 2009 at 2:47 pm
    Jun 18, 2009 at 5:35 pm
  • Hi, I was wondering if it's possible to get a hold of the task id inside a mapper? I can't seem to find a way by trawling through the API reference. I'm trying to implement a Map Reduce version of ...
    Mark Desnoyer
    Jun 18, 2009 at 2:12 pm
    Jun 18, 2009 at 3:02 pm
  • Ran into an issue with running hadoop on a cluster that also has AFS installed. When a user ssh's in they get an 'extra' group id, I think it is called a 'pag', Problem is when you try to start ...
    Brock Palen
    Jun 17, 2009 at 3:35 pm
    Jun 17, 2009 at 4:13 pm
Group: common-user
Period: Jun 2009

219 users for June 2009

  • Jason hadoop: 49 posts
  • Aaron Kimball: 26 posts
  • Steve Loughran: 23 posts
  • Alex Loddengaard: 21 posts
  • Sugandha Naolekar: 21 posts
  • Akhil langer: 19 posts
  • Brian Bockelman: 16 posts
  • Raghu Angadi: 16 posts
  • Tarandeep Singh: 15 posts
  • Usman Waheed: 15 posts
  • Roshan James: 13 posts
  • Stas Oskin: 13 posts
  • Harish Mallipeddi: 12 posts
  • John Martyniak: 12 posts
  • Marcus Herou: 12 posts
  • Todd Lipcon: 12 posts
  • Asif md: 11 posts
  • Tom White: 11 posts
  • Owen O'Malley: 10 posts
  • Pmg: 10 posts