Hi Jen,

It's generally best to keep cc'ing R-help so others can lend a hand

when I step away from my computer:

On Thu, Aug 9, 2012 at 11:49 AM, Jennifer Hobbs wrote:

Hi Michael -

thanks for the advice - I did find merge() just after posting but I'm having

difficulty with using it. I've loaded both datasets; then I tried

CombinedData<-merge(MethyData1,ExprData1)

but when I looked at CombinedData, I found there was no actual data in it:

str(CombinedData)

'data.frame': 0 obs. of 20 variables

Take a look at

?merge.data.frame

in particular since there are many different forms of merges. Your

original post suggests you may want to set

all = TRUE

by = "Location"
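
As a rough sketch of what that call might look like: the data frames below are made-up stand-ins for the two data sets (the names DataSet1/DataSet2, the column names Sample1/Sample2, and all values are assumptions for illustration, not the real MethyData1/ExprData1). The suffixes argument keeps the identically named sample columns apart.

```r
# Toy stand-ins for the two data sets (values are invented for illustration)
DataSet1 <- data.frame(
  Location = c("A", "A", "A", "B"),
  Part     = c(1, 2, 3, 1),
  Sample1  = c(0.1, 0.2, 0.3, 0.4),
  Sample2  = c(1.1, 1.2, 1.3, 1.4)
)
DataSet2 <- data.frame(
  Location = c("A", "B", "C"),
  Sample1  = c(10, 20, 30),
  Sample2  = c(11, 21, 31)
)

# Join on Location only; all = TRUE also keeps locations found in just one
# data set (here "C"), filling the missing side with NA.
combined <- merge(DataSet1, DataSet2, by = "Location",
                  all = TRUE, suffixes = c(".ds1", ".ds2"))
print(combined)
```

If only locations present in both data sets are wanted, dropping all = TRUE gives the default inner join.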

Hope that helps,

Michael

I thought this might be due to the fact that my column names, as well as the

row names, in both data sets were the same, so I renamed the column names in

ExprData1 and tried again:

colnames(ExprData1)<-NewExprNames

merge(ExprData1,MethyData1)

Error: cannot allocate vector of size 4.2 Gb

In addition: Warning messages:

1: In expand.grid(seq_len(nx), seq_len(ny)) :

Reached total allocation of 8055Mb: see help(memory.size)

2: In expand.grid(seq_len(nx), seq_len(ny)) :

Reached total allocation of 8055Mb: see help(memory.size)

3: In expand.grid(seq_len(nx), seq_len(ny)) :

Reached total allocation of 8055Mb: see help(memory.size)

4: In expand.grid(seq_len(nx), seq_len(ny)) :

Reached total allocation of 8055Mb: see help(memory.size)
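
A likely explanation for both results, sketched on toy data (the frames x, y, y2 and their column names are hypothetical): when by is not given, merge() joins on every column name the two data frames share. With identical column names it therefore tries to match on the sample values too, and nothing matches. Once all columns are renamed there are no shared names left, and merge() builds the Cartesian product of the rows, which is what the expand.grid() calls in the warnings point to.

```r
# Two tiny frames with identical column names but different values
x <- data.frame(Location = c("A", "B"), Sample1 = c(0.1, 0.2))
y <- data.frame(Location = c("A", "B"), Sample1 = c(10, 20))

# Default key is intersect(names(x), names(y)): Location AND Sample1.
# No row agrees on both, so the result is empty.
n_default <- nrow(merge(x, y))

# With every column renamed there is no common name, so merge() returns
# the Cartesian product: nrow(x) * nrow(y) rows.
y2 <- setNames(y, c("Loc2", "Expr1"))
n_cross <- nrow(merge(x, y2))

# Naming the key explicitly gives one row per matching Location.
n_keyed <- nrow(merge(x, y, by = "Location"))

print(c(default = n_default, cross = n_cross, keyed = n_keyed))
```

With thousands of rows on each side the Cartesian product grows to millions of rows, which would account for the allocation failure.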

I was surprised about this, as I'm using a 64-bit computer and it's managed

You'll also need to be using a 64-bit build of R. Merging is pretty

memory-expensive, so if you're right on the edge of what R can handle

you might have to look into a more specialized solution (such as an

SQL backend).

to deal with much larger data sets before now (I know that's not the only

criterion, but my understanding of computers isn't extensive). I had

previously run up against a memory problem because I hadn't transposed my

data (I thought I was looking at columns, the computer was looking at rows)

so I tried transposing both data sets and merging again, but I end up with

another empty data frame:

tED1<-t(ExprData1)

tMD1<-t(MethyData1)

CombineData<-merge(tED1,tMD1)

str(CombineData)

'data.frame': 0 obs. of 152247 variables:

This is where I'm stuck. Any advice would be hugely appreciated!

Jen

On Thu, Aug 9, 2012 at 5:28 PM, R. Michael Weylandt wrote:

Perhaps load them both and ?merge can show you the way.

Michael

On Thu, Aug 9, 2012 at 9:54 AM, JenniferH wrote:

Hello everyone,

I have two sets of data, with the following structure:

DataSet1

Location  Part  Sample 1  Sample 2

A         1     value     value

A         2     value     value

A         3     value     value

B         1     value     value

DataSet2

Location  Sample 1  Sample 2

A         value     value

B         value     value

C         value     value

I would like to look at the correlations between DataSet1 and DataSet2, such

that each row in Location A from DataSet1 is paired with the Location A row

from DataSet2, and so forth. So far, my only ideas involve trying to

copy-paste each of the rows in DataSet2 the number of times each occurs in

DataSet1 on a spreadsheet before loading the sets into R; however, as I have

approaching 8000 rows in DataSet2, this is clearly not a workable solution!
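
One way the pairing described above could be done without copying rows by hand is to look up, for each row of DataSet1, the row of DataSet2 with the same Location using match(). This is only a sketch under assumptions: the values are invented, the column names Sample1..Sample3 are placeholders, and three samples are used because a correlation over two points is always +1 or -1.

```r
# Toy data; column names and values are assumptions for illustration
DataSet1 <- data.frame(
  Location = c("A", "A", "A", "B"),
  Part     = c(1, 2, 3, 1),
  Sample1  = c(1, 3, 1, 5),
  Sample2  = c(2, 2, 3, 6),
  Sample3  = c(3, 1, 2, 7)
)
DataSet2 <- data.frame(
  Location = c("A", "B", "C"),
  Sample1  = c(2, 1, 9),
  Sample2  = c(4, 2, 8),
  Sample3  = c(6, 3, 7)
)

samples <- c("Sample1", "Sample2", "Sample3")

# For every row of DataSet1, the index of the DataSet2 row with that Location
idx <- match(DataSet1$Location, DataSet2$Location)

# Sample values as matrices, with DataSet2 rows repeated to line up row by row
m1 <- as.matrix(DataSet1[, samples])
m2 <- as.matrix(DataSet2[idx, samples])

# Pearson correlation across samples for each paired row
DataSet1$r <- vapply(seq_len(nrow(m1)),
                     function(i) cor(m1[i, ], m2[i, ]),
                     numeric(1))
print(DataSet1[, c("Location", "Part", "r")])
```

Because match() only builds an index vector, this avoids materialising a large merged table, which may help when memory is tight.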

I'm sure there's a simple solution to this, so I'm sorry if this seems like

a really silly question.

Thanks for your help!

Jen

