24 discussions - 70 posts

  • Hi there, Howdy. I've been using Hadoop to parse and index XML documents. It's a two-step process similar to Nutch. I parse the XML and create field-value tuples written to a file. I read this file and ...
    Venkat Seeth
    Feb 20, 2007 at 6:02 am
    Feb 22, 2007 at 9:45 pm
  • I am part of a working group that is developing a Bigtable-like structured storage system for Hadoop HDFS (see http://wiki.apache.org/lucene-hadoop/Hbase). I am interested in learning about large ...
    Jim Kellerman
    Feb 2, 2007 at 7:10 pm
    Feb 6, 2007 at 8:02 pm
  • I may be missing something silly here. I have an MR job that generates output of type (Text, Text). When another MR job consumes that output, it is read as a plain text file, so the input is (LongWritable, Text) ...
    Alejandro Abdelnur
    Feb 2, 2007 at 3:53 am
    Feb 4, 2007 at 4:31 pm
  • Hi all, In the hadoop-0.11.1-core.jar, there is already a log4j.properties file. I am using commons-logging and log4j now, so I tried to remove the commons-logging.properties and log4j.properties ...
    Andrew Jsyqf
    Feb 17, 2007 at 8:14 am
    Feb 17, 2007 at 6:08 pm
  • Hi! I tried to run the wordcount example with a 1.9 GB input file on 6 nodes. Hadoop split the job into 31 maps and one reduce. All the maps fail with this stack trace: java.lang.NullPointerException ...
    Ion Badita
    Feb 2, 2007 at 12:08 pm
    Mar 15, 2007 at 4:10 am
  • Hi there, I'm a new user of Hadoop and I am having trouble getting the single-node cluster to run. I'm following the instructions on the wiki, but I'm getting the following error in the log file: ...
    Eugene Weinstein
    Feb 15, 2007 at 12:54 am
    Feb 15, 2007 at 1:16 am
  • Hi, I have a question regarding task allocation to TaskTrackers (could not find an answer in the docs). When a MapReduce job is run, does the system attempt to schedule a Map task on a machine that ...
    Vasiliy Baranov
    Feb 14, 2007 at 2:14 pm
    Feb 14, 2007 at 10:10 pm
  • Hi there, Howdy. I observe at times that a few of the reduce tasks hang during the copy phase without resulting in failures. Hence these tasks never complete and are never rerun on timeout. reduce copy ...
    Venkat Seeth
    Feb 24, 2007 at 5:32 pm
    Feb 24, 2007 at 7:42 pm
  • Hello there, Howdy. I have a quick question on configuring mapred.tasktracker.tasks.maximum for a node in a Hadoop cluster. I have N nodes with 4-Dual-proc CPUs, 64 GB RAM and M nodes with 2-Dual-proc ...
    Venkat Seeth
    Feb 22, 2007 at 9:51 pm
    Feb 23, 2007 at 12:02 am
  • Hi, I want to use Hadoop on Windows for a demonstration. Has anyone been able to run it on Windows? I tried to modify the startup script and it is not working. We need a simple config where ...
    Feb 7, 2007 at 4:18 am
    Feb 8, 2007 at 11:50 am
  • such as Coda or GFS (RHEL), I think their performance or features will be more mature? I have run the random example to generate 10 GB of data; it seems HDFS is currently the bottleneck? Regards, howa
    Howard chen
    Feb 1, 2007 at 5:20 pm
    Feb 5, 2007 at 5:36 pm
  • Hi: Although I haven't found the solution to the problem, at least I could minimize the effects. First of all I'll try to explain what I think is happening here. Let's have a look at a thread dump when ...
    Alvaro Cabrerizo
    Feb 14, 2007 at 5:03 pm
    Feb 14, 2007 at 9:34 pm
  • Hi list, I need a solution where it is possible to have three fileservers in three different locations. All of them have the same data. This data should be replicated over all of those three ...
    Achim Stumpf
    Feb 1, 2007 at 2:54 pm
    Feb 1, 2007 at 7:00 pm
  • Hi. I've run into problems with the tasktrackers dying when I run a fairly big job. The input data is roughly 290 GB and the tasktrackers die because of out-of-memory errors. This job is run on a 19 ...
    Johan Oskarsson
    Feb 28, 2007 at 2:16 pm
    Feb 28, 2007 at 2:16 pm
  • Hello there, Howdy. I've seen in the past that mapred.system.dir needs to be a directory shared across all the slaves, or else I get a "No such file or directory" exception. I'm adding a few more slaves to ...
    Venkatesh Seetharam
    Feb 27, 2007 at 12:30 am
    Feb 27, 2007 at 12:30 am
  • Hi there, I am storing web log files in HDFS and running a grep map/reduce task. The web log files are very big (x GB per day), so I compressed them. Unfortunately, if a file is compressed with something like gzip, Hadoop ...
    Feb 26, 2007 at 2:03 am
    Feb 26, 2007 at 2:03 am
  • Hello there, Howdy. I've configured 45 tasks per node and I only see 2 map tasks per node. Number of Maps is only 4. I've configured 160 maps but I see 1540 maps created. When I look for the number ...
    Venkat Seeth
    Feb 25, 2007 at 10:58 pm
    Feb 25, 2007 at 10:58 pm
  • Hi. I've been trying to get the Nutch fetcher to work, but it always hangs on one of the reduce processes, and the job fails. I am using 160 map tasks and 16 reduce tasks during fetch on an 8 machine ...
    Seok keun oh
    Feb 24, 2007 at 2:33 am
    Feb 24, 2007 at 2:33 am
  • I just ran a simple map reduce task (identity mapper, identity reducer). Job progress reaches map 100%, reduce 89~92%, and then the job hangs. What's wrong? version : hadoop 0.8.0 os : FreeBSD 6.2 64bit data ...
    Seok keun oh
    Feb 15, 2007 at 1:46 am
    Feb 15, 2007 at 1:46 am
  • Can folks please add details of their Hadoop usage to the following Wiki page? http://wiki.apache.org/lucene-hadoop/PoweredBy Please tell whatever you can. This helps folks who are considering Hadoop ...
    Doug Cutting
    Feb 14, 2007 at 11:14 pm
    Feb 14, 2007 at 11:14 pm
  • I tried to use Metadata to store the count of entries in a SequenceFile. I know the entry count only after all the data is appended to the file. But the Metadata is written in the "header" ...
    Ion Badita
    Feb 14, 2007 at 1:33 pm
    Feb 14, 2007 at 1:33 pm
  • unsubscribe
    Feb 8, 2007 at 3:09 am
    Feb 8, 2007 at 3:09 am
  • Hi, We are looking for someone who knows Hadoop well to help implement a Nutch-Hadoop project that allows us to search, index, add and delete by taking advantage of the Hadoop distributed filesystem. If ...
    Smart sm
    Feb 6, 2007 at 8:46 pm
    Feb 6, 2007 at 8:46 pm
  • Hi, Is there any way to detect the duration between the time when all the map tasks have been assigned to a machine and the time when all the reduce tasks finish? Currently, I can only track the ...
    Lei Chen
    Feb 5, 2007 at 5:51 pm
    Feb 5, 2007 at 5:51 pm
Group Overview
group: common-user @

32 users for February 2007

Venkat Seeth: 13 posts; Bryan A. P. Pendleton: 5 posts; Andrzej Bialecki: 4 posts; Doug Cutting: 4 posts; Dennis Kubes: 3 posts; Eugene Weinstein: 3 posts; Jim Kellerman: 3 posts; Alejandro Abdelnur: 2 posts; Alvaro Cabrerizo: 2 posts; Devaraj Das: 2 posts; Feng Jiang: 2 posts; Ion Badita: 2 posts; Konstantin Shvachko: 2 posts; Nigel Daley: 2 posts; Senthil: 2 posts; Seok keun oh: 2 posts; Vasiliy Baranov: 2 posts; Meenali: 1 post; 김형준: 1 post; Achim Stumpf: 1 post