109 discussions - 406 posts

  • Who is doing multiplication of large dense matrices using Hadoop? What is a good way to do that computation using Hadoop? Thanks, Mike
    Mike Spreitzer
    Nov 18, 2011 at 4:59 pm
    Nov 24, 2011 at 5:39 pm
  • I used the command : $HADOOP_PREFIX_HOME/bin/hdfs start namenode --config $HADOOP_CONF_DIR to start HDFS. This command is in Hadoop document (here ...
    Cat fa
    Nov 29, 2011 at 10:28 am
    Nov 30, 2011 at 5:12 pm
  • Hi, I've just installed a small hadoop cluster (4 nodes) and am trying to configure it such that different users can run jobs on it (rather than having everyone submit jobs as the ...
    Stephen mulcahy
    Nov 9, 2011 at 6:13 pm
    Nov 17, 2011 at 8:34 am
  • For your reading pleasure! PDF 3.3MB uploaded at (the mailing list has a cap of 1MB attachments): https://docs.google.com/open?id=0B-zw6KHOtbT4MmRkZWJjYzEtYjI3Ni00NTFjLWE0OGItYTU5OGMxYjc0N2M1 ...
    Maneesh varshney
    Nov 30, 2011 at 8:29 pm
    Dec 1, 2011 at 7:20 pm
  • Hi, I'm trying to modify the word count example (http://wiki.apache.org/hadoop/WordCount) using the new api (org.apache.hadoop.mapreduce.*). I run the job on a remote pseudo-distributed cluster. It ...
    Denis Kreis
    Nov 24, 2011 at 10:51 am
    Nov 28, 2011 at 4:35 pm
  • Hi, I have a big file consisting of XML data. The XML is not represented as a single line in the file. If we stream this file using ./hadoop dfs -put command to a hadoop directory, how the ...
    Nov 22, 2011 at 1:20 am
    Nov 22, 2011 at 12:49 pm
  • Hi, I have written a fairly straightforward Hadoop program, modelled after the PiEstimator example which is shipped with the distro. 1) I write a series of files to HDFS, each containing the input ...
    Andy Doddington
    Nov 10, 2011 at 10:55 am
    Nov 16, 2011 at 12:09 pm
  • All I assumed that the input splits for a streaming job will follow the same logic as a map reduce java job but I seem to be wrong. I started out with 73 gzipped files that vary between 23MB to ...
    Raj V
    Nov 10, 2011 at 10:41 pm
    Nov 14, 2011 at 11:42 pm
  • Hi I have problem in running hadoop under cygwin 1.7 only tasktracker ran by cyg_server user and so make some problems, so any idea please??? BS. Masoud.
    Nov 1, 2011 at 5:36 am
    Nov 4, 2011 at 9:30 am
  • Hello folks, seems like you deal here with HBase-questions. Below you will find my question. Thanks! Em -------- original message -------- Hello list, I was asked whether it is a good idea to replace ...
    Nov 15, 2011 at 4:28 pm
    Nov 15, 2011 at 8:31 pm
  • Hi, In the jobs running on my cluster of 20 machines, I used to run jobs (via "hadoop jar ...") that would spawn around 4000 map tasks. Now when I run the same jobs, that number is 20; and I notice ...
    Brendan W.
    Nov 4, 2011 at 2:33 pm
    Nov 7, 2011 at 4:32 pm
  • I'm trying to prove that my cluster will in fact support multiple reducers, the wordcount example doesn't seem to spawn more than one (1). Is that correct? Is there a sure fire way to prove my ...
    Hoot Thompson
    Nov 29, 2011 at 2:34 pm
    Dec 8, 2011 at 5:27 am
  • Hi All, I'm new to hadoop. I know I can use "hadoop jar" to submit my M/R job, but we need to submit a lot of jobs in my real environment, there is priority requirement for each jobs, so is there any ...
    Nov 20, 2011 at 9:44 am
    Nov 22, 2011 at 3:22 am
  • hi,all I modify some code in hadoop. but i'm not good at ant or maven.what cmd should i enter in linux shell to build a new hadoop-*.jar to test my code?
    Seven garfee
    Nov 17, 2011 at 10:28 am
    Nov 20, 2011 at 8:57 pm
  • Hi, I am evaluating different solutions for massive phrase query execution. I need to execute millions of greps or more precise phrase queries consisting of 1-4 terms against millions of documents. ...
    Oliver Krohne
    Nov 3, 2011 at 10:47 am
    Nov 3, 2011 at 6:40 pm
  • I can't seem to get past this heapsize error. Relevant settings are as follows: # The maximum amount of heap to use, in MB. Default is 1000. export HADOOP_HEAPSIZE=10000 <name ...
    Hoot Thompson
    Nov 28, 2011 at 12:09 pm
    Nov 28, 2011 at 6:42 pm
  • Hi, I'm wondering how to delete files older than X days with HDFS/Hadoop. On linux we can do it with the following command: find ~/datafolder/* -mtime +7 -exec rm {} \; Any ideas?
    Raimon Bosch
    Nov 26, 2011 at 3:02 pm
    Nov 28, 2011 at 4:38 am
  • Hello We are currently using hadoop 0.20.203 on a 10 node cluster. We are considering upgrading to a newer version and I have two questions in this regard. 1) It seems 0.21 is unlikely to become a ...
    Niranjan Balasubramanian
    Nov 22, 2011 at 8:05 pm
    Nov 22, 2011 at 10:37 pm
  • Hi all. Can anyone give me an example how i can programmatically authenticate from a remote client as a particular user? Thanks! Denis
    Denis Kreis
    Nov 18, 2011 at 11:47 am
    Nov 18, 2011 at 4:23 pm
  • Hi guys : In a shared cluster environment, whats the best way to reduce the number of mappers per job ? Should you do it with inputSplits ? Or simply toggle the values in the JobConf (i.e. increase ...
    Jay Vyas
    Nov 16, 2011 at 6:06 pm
    Nov 17, 2011 at 10:43 am
  • Hello, Friends I am using Hadoop 0.20.2 version, My problem is whenever I kill the tasktracker and start it again, jobtracker shows one extra tasktracker (the one which is killed & the other which ...
    Mohmmadanis moulavi
    Nov 14, 2011 at 7:39 am
    Nov 14, 2011 at 1:39 pm
  • What sorts of causes might be responsible for a long or slow shuffle stage? For example, I have a job of 266 maps (each emitting 4 records) and 17 reduces (each ingesting about 60 records) that takes ...
    Keith Wiley
    Nov 11, 2011 at 1:20 am
    Nov 11, 2011 at 4:44 pm
  • Hi Anybody ran hadoop on cygwin for development purpose??? Did you have any problem in running tasktracker? Thanks
    Nov 1, 2011 at 8:54 am
    Nov 4, 2011 at 1:43 am
  • Hi, I'm having trouble setting up Hadoop 0.20.2 with Ganglia 3.1. Ganglia is running, and I am getting standard metrics, but I am not seeing any of the Hadoop metrics. BTW, I'm running this in EC2. I ...
    Marc Limotte
    Nov 3, 2011 at 4:22 pm
    Nov 3, 2011 at 6:08 pm
  • Hi, I have a big file consisting of XML data. The XML is not represented as a single line in the file. If we stream this file using ./hadoop dfs -put command to a hadoop directory, how the ...
    Nov 22, 2011 at 1:21 am
    May 15, 2012 at 6:12 pm
  • Hi All, I am trying to run a mapreduce job to process the Amazon S3 logs. However, the code hangs at INFO mapred.JobClient: map 0% reduce 0% and does not even attempt to launch the tasks. The sample ...
    Nitika Gupta
    Nov 28, 2011 at 10:42 pm
    Dec 2, 2011 at 12:58 am
  • Hi All, I am just beginning to learn how to deploy a small cluster (a 3 node cluster) on EC2. After some quick Googling, I see the following approaches: 1. Use Whirr for quick deployment and tearing ...
    Nov 29, 2011 at 8:29 pm
    Nov 30, 2011 at 1:51 am
  • Hi, I need to implement distributed sorting using Hadoop. I am quite new to Hadoop and I am getting confused. If I want to implement Merge sort, what my Map and reduce should be doing. ? Should all ...
    Nov 26, 2011 at 1:05 pm
    Nov 29, 2011 at 7:12 pm
  • Hi all, I need some info related to the code section which handles the following operations. Basically DataXceiver.c on the client side transmits the block in packets and on the data node side we ...
    Kartheek muthyala
    Nov 3, 2011 at 5:53 am
    Nov 4, 2011 at 3:15 am
  • hi all, in order to run DFSIO in my cluster, do i need to run JobTracker, and TaskTracker, or just running HDFS is enough? Many thanks, Thanh
    Thanh Do
    Nov 24, 2011 at 8:15 pm
    Mar 2, 2012 at 7:09 am
  • HI , I have successfully setup Hadoop 0.23.0 in a single m/c. When i post a job, it gets posted successfully (i can see the job in UI), but the job is never "ASSIGNED" and waits forever. Here are ...
    Nitin Khandelwal
    Nov 30, 2011 at 4:37 am
    Dec 9, 2011 at 7:51 am
  • Hi, I was trying to setup Hadoop 0.23.0 with help of http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/SingleCluster.html. After starting resourcemanager and nodemanager, I ...
    Nitin Khandelwal
    Nov 28, 2011 at 12:03 pm
    Nov 29, 2011 at 5:38 am
  • hey, I just started using oceansync hadoop management beta and i was wondering how to get the hadoop JobTracker URI information? i found the hdfs uri but not the jobtracker uri. it says there should ...
    Rock On
    Nov 25, 2011 at 7:19 am
    Nov 25, 2011 at 10:44 pm
  • Hi , Are there some a bit more sophisticated example of MapReduce than the wordcounting available? I work for a database department and just the word count is not very impressive. Regards Hans-Peter ...
    Sloot, Hans-Peter
    Nov 24, 2011 at 12:03 pm
    Nov 24, 2011 at 1:26 pm
  • Besides SCM from Cloudera, and from Oceansync, wonder if any of management console that actually fully free and open source ? If not, it's kind of inspiring to build one. -P
    Patai Sangbutsarakum
    Nov 22, 2011 at 1:56 am
    Nov 22, 2011 at 5:18 am
  • Hi Bangalore Area Hadoop Developers and Users, There is a lot of interest in Hadoop and Big Data space in Bangalore. Many folks have been asking for Bangalore meetups for long. I have just created ...
    Sharad Agarwal
    Nov 17, 2011 at 1:01 pm
    Nov 19, 2011 at 10:29 am
  • Hi all, I know that while writing the file, a hdfs client writes the blocks in sequential fashion, writing one block only at one time and when this block is acknowledged as a complete write then the ...
    Kartheek muthyala
    Nov 19, 2011 at 5:57 am
    Nov 19, 2011 at 9:16 am
  • Hi, I'm testing out native lib support on our test amd64 test cluster running 0.20.205 running the following ./bin/hadoop jar hadoop-test- testsequencefile -seed 0 -count 1000 ...
    Stephen mulcahy
    Nov 16, 2011 at 11:51 am
    Nov 16, 2011 at 3:50 pm
  • A question related to standing up cloud infrastructure for running Hadoop/HDFS. We are building up an infrastructure using Openstack which has its own storage management redundancy. We are planning ...
    Edmon Begoli
    Nov 12, 2011 at 2:59 am
    Nov 14, 2011 at 10:33 am
  • Dear all, I'm trying to install Hadoop (0.20.2) in pseudo distributed mode to run some tests on a Linux machine (Fedora 8) . I have followed the installation steps in the guide available here ...
    Paolo Di Tommaso
    Nov 8, 2011 at 10:55 am
    Nov 8, 2011 at 12:20 pm
  • Hi, Running 0.20.2: A job with about 4000 map tasks quickly blew through all but 3 in a couple of hours, with the tasks taking about two minutes each. The remaining three, however, inched along, with ...
    Brendan W.
    Nov 3, 2011 at 1:18 pm
    Nov 3, 2011 at 2:56 pm
  • Hi I am trying to run the matrix multiplication example mentioned(with source code) on the following link: http://www.norstad.org/matrix-multiply/index.html I have hadoop setup in pseudodistributed ...
    Nov 30, 2011 at 11:23 am
    Dec 28, 2011 at 8:47 pm
  • Hi guys ! I see that hadoop doesn't capture the Map task I/O time and Reduce task I/O time and captures only map runtime and reduce runtime. Am i right ? By I/O time for map task i meant time taken ...
    Nov 29, 2011 at 2:26 pm
    Dec 3, 2011 at 7:59 am
  • Friends, I want to know, how Jobtracker stores information about tasktracker & their tasks? and also where does Jobtracker store ...
    Mohmmadanis moulavi
    Nov 28, 2011 at 11:18 am
    Nov 30, 2011 at 1:36 pm
  • Hey everyone, First time posting to the list. I'm currently writing a hadoop job that will run daily and whose output will be part of the next day's input. Also, the output will ...
    Leonardo Urbina
    Nov 26, 2011 at 7:47 pm
    Nov 27, 2011 at 3:49 am
  • Hi, I'm developing on top of Hadoop for my Master final thesis, which aims to connect theoretical MapReduce modelling to a more practical ground. A part of the project is about analyzing the impact ...
    Paolo Rodeghiero
    Nov 17, 2011 at 6:35 pm
    Nov 23, 2011 at 10:12 am
  • Hello, I wasn't able to find this information in the documentation anywhere, but are the part-* output files guaranteed to be sorted? As in, when traversing the files as part-00000, part-00001, ...
    Leon Mergen
    Nov 22, 2011 at 2:28 pm
    Nov 22, 2011 at 9:09 pm
  • Hi guys : I followed the exact directions on the hadoop installation guide for pseudo-distributed mode here http://hadoop.apache.org/common/docs/current/single_node_setup.html#Configuration However, ...
    Jay Vyas
    Nov 17, 2011 at 8:07 pm
    Nov 20, 2011 at 8:58 pm
  • Hi guys : I do not see a "conf/hadoop-env.sh" file, which is required for hadoop installation, according to standard hadoop install directions which I find online and in the hadoop elephant ...
    Jay Vyas
    Nov 17, 2011 at 6:41 pm
    Nov 17, 2011 at 7:05 pm
  • Friends, Where can i find the source code of hadoop 0.20.2 version, i specifically want the source code of jobtracker. I am using hadoop which comes along with the nutch-1.2. Regards, Mohmmadanis ...
    Mohmmadanis moulavi
    Nov 15, 2011 at 12:30 pm
    Nov 15, 2011 at 5:05 pm
Group Overview
group: common-user @

140 users for November 2011

  • Harsh J: 29 posts
  • Uma Maheswara Rao G 72686: 18 posts
  • Joey Echeverria: 13 posts
  • Bejoy KS: 12 posts
  • Prashant Sharma: 12 posts
  • Stephen mulcahy: 12 posts
  • Mohmmadanis moulavi: 11 posts
  • Andy Doddington: 10 posts
  • ArunKumar: 9 posts
  • Masoud: 9 posts
  • Michel Segel: 9 posts
  • Cat fa: 7 posts
  • Kartheek muthyala: 7 posts
  • Raj V: 7 posts
  • Brendan W.: 6 posts
  • Nitin Khandelwal: 6 posts
  • Alexander C.H. Lorenz: 5 posts
  • Denis Kreis: 5 posts
  • Jay Vyas: 5 posts
  • Ayon Sinha: 4 posts