-
Hi, I've created a cluster w/o bootstrapping R but selecting --ami-version 3.0.4 (w/ R 3.0.1) and --hadoop-version 2.2.0. Executing an R script connecting via SSH (from Win7 box) to the MasterNode ...
Rob
May 16, 2014 at 4:38 pm
May 16, 2014 at 5:02 pm -
Hi All, I am getting an error while using an MR job's output in native format as input to another MR job. I am getting the following error in the JT logs. Error in as(unlist(x[[i]]), class(template[[i]])) : no ...
Abhishek Gayakwad
May 14, 2014 at 4:07 pm
May 16, 2014 at 8:51 pm -
Hello, I am not sure about passing a data frame as a key to the reduce function. If the data frame contains a numeric variable and a factor variable, in the local backend it works, but it fails in ...
Yo
May 14, 2014 at 2:32 pm
May 14, 2014 at 5:21 pm -
[RHadoop:#1442] Call to mapreduce() is throwing errors even though Hadoop Streaming is working fine
I am having a persistent problem with using RHadoop where even a call to the simplest mapreduce() function causes a failure like this. ...
Calcutta
May 13, 2014 at 2:28 pm
May 13, 2014 at 2:35 pm -
Hi, We are thinking of writing an article based on our latest work in the field of computational genetics. This time we have used RHadoop for structuring the computational environment. My source of ...
Salman Toor
May 12, 2014 at 8:54 am
May 12, 2014 at 5:38 pm -
Hi, I have many huge CSV files (more than 20 GB) on my Hortonworks HDP 2.0.6.0 GA cluster. I use the following code to read a file from HDFS ...
James Chang
May 7, 2014 at 4:03 pm
May 16, 2014 at 4:29 pm -
Issue when from.hdfs() function used. Please let me know what could be wrong. *Packages and Versions:* R version 3.0.3 (2014-03-06) -- "Warm Puppy" rmr2_2.3.0 rhdfs_1.0.8 *R command prompt:* Loading ...
Ambika J
Apr 29, 2014 at 4:09 pm
Apr 29, 2014 at 5:38 pm -
Hello RHadoop experts, My program has several MR functions chained in series. Output of one is used by the next as the input. The error occurs on the 3rd MR function. 1st and 2nd MR functions are ...
Mukul Biswas
Apr 28, 2014 at 4:30 pm
May 1, 2014 at 5:38 pm -
Good evening. This is the scenario: Cluster with head nodes: Namenode in H.A. Namenode server #1 Namenode server #2 Jobtracker Cloudera Manager Service node Hue R-Studio + R + Rhadoop Slaves nodes: N ...
Lorenzo Ramírez Hernández
Apr 24, 2014 at 2:44 pm
May 8, 2014 at 11:34 am -
Hi, It seems that NLineInputFormat is available: http://hadoop.apache.org/docs/r1.0.4/api/org/apache/hadoop/mapreduce/lib/input/NLineInputFormat.html Can I use it in RHadoop? I want to pass a fixed number of ...
Salman Toor
Apr 10, 2014 at 9:58 pm
Apr 11, 2014 at 2:31 am -
Hi, I have this problem when setting up a second account for remote access via RStudio Server. Here are my current settings: I have a super user A, and a hadoop user B. On HDFS, I have drwxr-xr-x - A ...
Lishu Liu
Apr 9, 2014 at 9:18 pm
Apr 10, 2014 at 5:06 pm -
Hi, I am new to the Rhadoop package and I have an issue where I am trying to control the inputs to my mappers. Let me detail the setup. Main Objective: I have a large number of individual files that ...
Reddy
Apr 8, 2014 at 4:51 pm
Apr 8, 2014 at 9:51 pm -
I am doing a simple Principal Component transformation on a dataset for each sample, the data looks like Sample X Y Z 1 ... (some entries for X Y and Z) 1 ... 1 ... 1 ... 2 ... 2 ... 2 ... 2 ... 3 ...
Liang Zhou
Apr 7, 2014 at 9:45 pm
Apr 8, 2014 at 2:34 am -
Plyrmr 0.2.0 released! Simplified API, loads of new features, bugs exterminated, you name it. I normally link to the Changelog, but you've got to read the new in this ...
Antonio Piccolboni
Mar 31, 2014 at 11:56 pm
Mar 31, 2014 at 11:56 pm -
Hi, can the input be a folder of Rdata files? I realized that it works for csv files and for a single R object with to.dfs(). However, it does not seem to work for a folder of Rdata files in HDFS. Any ...
Elizabeth Tang
Mar 31, 2014 at 3:48 pm
Apr 3, 2014 at 2:44 am -
Is it possible to run rmr2 3.1.0 and rhdfs 1.0.8 on a machine with CentOS 6.5 and Cloudera's CDH 4.6 in pseudo-distributed mode? ...
Zoran Djordjevic
Mar 31, 2014 at 3:47 pm
Mar 31, 2014 at 4:38 pm -
It's available! Short story in the changelog (https://github.com/RevolutionAnalytics/RHadoop/wiki/Changelog), with link to long story and downloads from there. Given the number of bug fixes and the ...
Antonio Piccolboni
Mar 27, 2014 at 11:22 pm
Mar 27, 2014 at 11:22 pm -
I am trying to install rhdfs in a Sandbox HDP2 (virtual box), but it does not recognise the environment variable HADOOP_CMD (see below) Any suggestion? Thank you! Emilio # which hadoop ...
Emilio Torres
Mar 27, 2014 at 11:12 pm
Apr 3, 2014 at 3:44 pm -
Hi, I want to install RHadoop in a Sandbox HDP2 (Virtual Box) (http://d1ekw1iaw660be.cloudfront.net/Hortonworks+Sandbox+2.0+VirtualBox.ova ) Inside Sandbox: # yum install R curl-devel # R In R ...
Yo
Mar 27, 2014 at 3:37 pm
Mar 27, 2014 at 11:15 pm -
Hi, I am running the sample code a <- to.dfs(seq(from=1, to=500, by=3), output="/user/cloudera/numbers") b <- mapreduce(input=a, map=function(k,v){keyval(v,v*v)}) I got the following error, ...
Liang Zhou
Mar 26, 2014 at 9:58 pm
Mar 31, 2014 at 2:58 am -
Hello. I am a student who has just started with RHadoop. 1. I have set up a Hadoop 2.0 cluster on virtual servers (Master: namenode, nodemanager, resourcemanager, secondarynamenode, datanode) (Slave: nodemanager, datanode). On top of this, rhdfs, rmr, ...
Soonyong Hong
Mar 21, 2014 at 3:28 pm
Mar 21, 2014 at 3:43 pm -
The hadoop streaming job that I am running via Rhadoop (rmr2) fails with the following out-of-memory error. 14/03/18 21:02:49 INFO mapreduce.Job: map 100% reduce 66% 14/03/18 21:02:54 INFO ...
Harsha V.
Mar 19, 2014 at 3:27 pm
Mar 20, 2014 at 1:15 pm -
I am trying to run a rather large rmr job where the file data is partitioned as follows: /data/folder/partition1 /data/folder/partition2 /data/folder/partitionN When running the following job I am ...
Martin Eggenberger
Mar 18, 2014 at 8:01 pm
Mar 18, 2014 at 11:52 pm -
Hi All, I am trying to run a basic mapreduce example in R using the rmr2 and rhdfs libraries, but I am getting an error. A single-node Hadoop cluster is installed properly and all five daemons are starting ...
King horse
Mar 17, 2014 at 4:21 pm
Mar 17, 2014 at 4:28 pm -
Hi, My apologies if this has already been asked, but what are some of the advantages of using RMR2 vs regular R streaming? Are there any performance considerations? Thanks ...
Jan B
Mar 14, 2014 at 11:28 pm
Mar 15, 2014 at 10:39 pm -
Hi! My Hadoop version is 2.2.0 and my R version is 3.0.3. When I run R CMD INSTALL /app/rmr2_3.0.0.tar.gz, it shows some errors: ((which hbase && (mkdir -p ../inst; cd hbase-io; sh build_linux.sh; cp ...
Gao quan
Mar 14, 2014 at 2:42 pm
Mar 14, 2014 at 2:46 pm -
Good day, regarding https://github.com/RevolutionAnalytics/rmr2/blob/master/pkg/tests/kmeans.R, I have managed to run the code successfully. I am trying to get the labelling of each data(iris) entry ...
Banana
Mar 14, 2014 at 2:42 pm
Mar 14, 2014 at 2:47 pm -
Hi, I installed both rmr and rhbase. I ran a map-reduce program in R and it works; I used the rhbase commands and they work very well too. The problem is when I run a sample of MapReduce with HBase, I have a ...
Hoang Le
Mar 11, 2014 at 3:59 pm
Mar 12, 2014 at 4:53 pm -
Hi, I'm trying to run a task in rmr that is element-wise multiplication of two big matrices. Suppose I have two big data matrices A and B and I have sent them to HDFS with to.dfs(). To make it simple, ...
Lan Li
Mar 10, 2014 at 10:46 pm
Mar 11, 2014 at 12:00 am -
I think rmr should clean up the tmp files it creates in the local file system. If I understand correctly, there are global, local, map, combine and reduce environments created, saved and disseminated ...
Saar Golde
Mar 7, 2014 at 4:56 pm
Mar 7, 2014 at 7:46 pm -
Hi all, after a major refactoring in my project I'm facing massive runtime changes and I'm a little clueless what causes them. For development I divided my project into two MR-jobs: 1. The first just ...
Claudio Hartmann
Mar 5, 2014 at 4:45 pm
Mar 5, 2014 at 4:54 pm -
Hi all, I am new to Hadoop, and RHadoop as well. (Before RHadoop, I only had experience using Hadoop streaming to run some simple map/reduce jobs in python.) And now I'm practicing using RHadoop to ...
Yen-Ping Lin
Mar 5, 2014 at 4:12 am
Mar 5, 2014 at 4:42 pm -
Hello all. I'm using the newest rmr version (3.0.0) with a 'simple' MR job to check a new install: input.size=1000 input.ga = to.dfs(cbind(1:input.size, rnorm(input.size))) group = function(x) x%%10 ...
Tom tom
Mar 2, 2014 at 5:18 pm
Mar 16, 2014 at 10:03 pm -
Hello, I would like to ask a few questions related to the input of rmr2. My questions are: 1. in the tutorial ( https://github.com/RevolutionAnalytics/rmr2/blob/master/docs/tutorial.md ) it ...
Panagiotis Tzirakis
Mar 2, 2014 at 6:30 am
Mar 2, 2014 at 6:55 am -
I have several Hadoop clusters. I use RHadoop with one setup and in that system, rhdfs works as expected from the help files. I can use commands such as hdfs.ls(path="/", fs=hdfs.defaults("fs")) and ...
Dataquerent
Feb 25, 2014 at 7:29 am
Feb 26, 2014 at 1:48 am -
Hi, I am trying to run an R map reduce job and am getting the error below. Where does rmr2_3.0.0 expect the YARN jars to be present? Thanks! *Error* Exception in thread "main" java.lang.NoClassDefFoundError ...
Ravi
Feb 24, 2014 at 10:09 am
Feb 25, 2014 at 5:54 pm -
I am facing an issue when interacting with HDFS from the R shell. rhdfs is properly installed. Does rhdfs support Kerberos? The underlying cluster is using Pivotal HD as the Hadoop distribution and its ...
Anoop Kumar KM
Feb 21, 2014 at 3:33 pm
May 7, 2014 at 5:05 pm -
Hi, release 3.0.0 of rmr2 is available. Please keep in mind that the package is still called rmr2, despite being version 3. This release is mostly devoted to efficiency, see the ...
Antonio Piccolboni
Feb 10, 2014 at 7:59 pm
Feb 10, 2014 at 7:59 pm -
Hi, I have installed all the necessary components of R on the Hortonworks sandbox environment. The only problem is when I attempt to install the rmr2 package: sudo R CMD INSTALL rmr2_2.3.0.tar.gz ...
Marcus potter
Feb 7, 2014 at 5:49 pm
Feb 7, 2014 at 6:26 pm -
Hi All, I have one dataset named test.txt: A,2,2,6.0 A,2,3,7.0 A,2,4,8.0 A,2,5,9.0 B,1,1,0 B,1,2,1.0 B,1,3,2.0 B,2,1,3.0 B,2,2,4.0 B,2,3,5.0 The map function is: map123<-function(k, values) { ...
S.H. Chou
Feb 5, 2014 at 6:37 pm
Feb 5, 2014 at 8:30 pm -
Hi all, I'm trying to understand the output of the mapreduce function without using a reduce function. Here is my input data, named test_out, which is stored in HDFS: 1 2 3 4 5 6 7 8 9 10 11 12 and here is my ...
S.H. Chou
Feb 5, 2014 at 4:57 am
Feb 5, 2014 at 5:42 am -
Hi All, I have a task to do: it needs to pull the text from a file, do checks on those words, and count the number of words (it is almost the same as word count, with a few conditions added). For this I ...
Sivaji
Feb 2, 2014 at 9:40 am
Feb 3, 2014 at 7:18 pm -
All, Is there a "preferred" way to get my data to HDFS? I'm thinking about rhdfs::hdfs.write versus rmr2::to.dfs. What are the conceptual differences between the two? Thanks, M ...
Michael Smith
Jan 30, 2014 at 11:05 am
Feb 10, 2014 at 8:46 am -
Error while executing to.dfs from the R console. Works as root user from the client but not as ypalrecha. DEPRECATED: Use of this script to execute hdfs command is deprecated. Instead use the hdfs command for it ...
YPalrecha
Jan 23, 2014 at 12:00 am
Jan 23, 2014 at 12:23 am -
Hi, I'm testing rmr2 with the following setup: - Hadoop 1.2.1, 5 nodes, each 3.7 GB RAM + 1 GB swap - R 3.0.2 - latest versions of all the RHadoop packages - a 1.6 GB CSV file that contains exactly 1 ...
Daniel Kvasnička
Jan 15, 2014 at 10:44 pm
Feb 10, 2014 at 8:18 pm -
Hi Team, I am new to RHadoop. I have installed R version 3.0.2. I am trying to install the three RHadoop R packages: rmr, rhdfs and rhbase. - First I installed the R base package using $ sudo apt-get ...
veerendra Pipuru
Jan 15, 2014 at 6:45 am
Jan 15, 2014 at 6:56 am -
I have a MR program which will duplicate given data set (a matrix) K times and for each duplication run some classifier. It was tested and run correctly with Hadoop 1.2.1. I observed that each ...
Lishu Liu
Jan 8, 2014 at 8:30 pm
Feb 19, 2014 at 9:26 pm -
mapreduce(input = small.ints, map = function(k, v) cbind(v, v^2)) 14/01/08 11:10:09 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead. packageJobJar ...
한종빈
Jan 8, 2014 at 5:36 am
Jan 10, 2014 at 4:05 am -
Hi guys, I'd like to know whether it is possible to run mapreduce jobs in rmr in an asynchronous fashion. When I run the mapreduce() function it always waits for the job to finish. We would like to put ...
Daniel Kvasnička
Jan 7, 2014 at 1:45 pm
Jan 7, 2014 at 2:24 pm -
Hi, I am quite new to the Hadoop world. I recently configured RHadoop and started to use a qtl library function inside the mapper function, and based on certain parameters sometimes there are warning ...
Salman Toor
Jan 7, 2014 at 1:45 pm
Jan 8, 2014 at 8:07 pm
Group Overview
group | rhadoop |
discussions | 326 |
posts | 1,398 |
users | 208 |
Archives
- May 2014 (37)
- April 2014 (31)
- March 2014 (90)
- February 2014 (32)
- January 2014 (33)
- December 2013 (62)
- November 2013 (76)
- October 2013 (112)
- September 2013 (25)
- August 2013 (25)
- July 2013 (145)
- June 2013 (27)
- May 2013 (38)
- April 2013 (32)
- March 2013 (103)
- February 2013 (96)
- January 2013 (28)
- December 2012 (58)
- November 2012 (98)
- October 2012 (33)
- September 2012 (32)
- August 2012 (44)
- July 2012 (59)
- June 2012 (47)
- May 2012 (33)
- April 2012 (2)