contingency matrix. That is, given two vectors of categorical
variables (e.g., species and soil type) and a third vector of a
quantitative variable (e.g., biomass), it calculates the sum of the
quantitative variable for each pair (e.g., the total biomass for each
species in each soil type).

With your data, as you have just one categorical variable, set the
second one to a constant to calculate the sum of foo for each subject:

matriz<-cbind(sub,foo,bar)
matriz
     sub foo bar
[1,]   2 1.7 3.2
[2,]   2 2.3 4.1
[3,]   3 7.6 2.3
[4,]   3 7.1 3.3
[5,]   3 7.3 2.3
[6,]   3 7.4 1.3
[7,]   5 6.2 6.1
[8,]   5 3.4 6.9

a <- w.conti(matriz[,1],rep(1,nrow(matriz)),matriz[,2])
a
   v2
v1     1
  2  4.0
  3 29.4
  5  9.6

Then, dividing by the row counts that table() returns for each
subject, you can calculate the mean from the sum:

a/as.vector(table(matriz[,1]))
   v2
v1     1
  2 2.00
  3 7.35
  5 4.80
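
The two steps above can be reproduced in a self-contained sketch. It assumes that sub, foo and bar hold the values printed in matriz, and it defines w.conti inline as the xtabs() wrapper described further down in this message. The tapply() call is only a cross-check and is not part of the approach described here.

```r
# Data assumed to match the matrix printed above
sub <- c(2, 2, 3, 3, 3, 3, 5, 5)
foo <- c(1.7, 2.3, 7.6, 7.1, 7.3, 7.4, 6.2, 3.4)
bar <- c(3.2, 4.1, 2.3, 3.3, 2.3, 1.3, 6.1, 6.9)
matriz <- cbind(sub, foo, bar)

# Wrapper around xtabs(): sum of z for each (v1, v2) pair
w.conti <- function(v1, v2, z) {
  xtabs(z ~ v1 + v2)
}

# Sum of foo per subject; the second factor is a constant
sums <- w.conti(matriz[, 1], rep(1, nrow(matriz)), matriz[, 2])

# Mean of foo per subject: divide each sum by that subject's row count
means <- sums / as.vector(table(matriz[, 1]))

# Cross-check with tapply(), which computes the group means directly
check <- tapply(foo, sub, mean)
```

If only the per-subject mean is needed, the tapply() line alone may be enough; the xtabs() route is more useful when there really are two categorical variables.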

From your question I understand that you want to define new subjects
according to their number of rows, so that subjects 2 and 5 would
become a single new subject:

new.sub <- as.vector(table(matriz[,1]))
new.sub
[1] 2 4 2
new.sub <- rep(new.sub,new.sub)
new.sub
[1] 2 2 4 4 4 4 2 2
a <- w.conti(new.sub,rep(1,nrow(matriz)),matriz[,2])
a
   v2
v1     1
  2 13.6
  4 29.4

a/as.vector(table(new.sub))
   v2
v1     1
  2 3.40
  4 7.35
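
Note that rep(new.sub,new.sub) lines up with the rows only when the rows are sorted by subject, as they are here. A sketch that avoids this assumption uses ave() to attach each subject's row count to its own rows, and tapply() to take the means. The variable name n.rows is illustrative, and the data are assumed to match the matrix above.

```r
# Data assumed to match the matrix printed above
sub <- c(2, 2, 3, 3, 3, 3, 5, 5)
foo <- c(1.7, 2.3, 7.6, 7.1, 7.3, 7.4, 6.2, 3.4)

# Row count of each subject, repeated on every row of that subject
n.rows <- ave(foo, sub, FUN = length)

# Mean of foo over all subjects that share the same row count
means <- tapply(foo, n.rows, mean)
```

Here means is named by row count ("2" and "4"), so subjects 2 and 5 are pooled into the first group.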


w.conti is simply:

w.conti <- function (v1,v2,z)
{
    # sum of z for each combination of v1 and v2
    xtabs(z~v1+v2)
}

(I could use xtabs() directly, but I never remember that expression,
while w.conti is easier to remember.)

Of course, if you always need the mean, just add

the second step to w.conti.
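
A minimal sketch of such a combined function might look as follows. The name w.mean is hypothetical, and dividing by table(v1, v2) is just one possible way to obtain the count for each pair; pairs that never occur give NaN (0/0).

```r
# Hypothetical variant of w.conti that returns means instead of sums
w.mean <- function(v1, v2, z) {
  sums <- xtabs(z ~ v1 + v2)   # sum of z for each (v1, v2) pair
  counts <- table(v1, v2)      # number of rows for each pair
  sums / counts                # pairs with no rows yield NaN
}

# Usage with the data from this thread (values assumed from the matrix above)
sub <- c(2, 2, 3, 3, 3, 3, 5, 5)
foo <- c(1.7, 2.3, 7.6, 7.1, 7.3, 7.4, 6.2, 3.4)
m <- w.mean(sub, rep(1, length(sub)), foo)
```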

Agus

Dr. Agustin Lobo

Instituto de Ciencias de la Tierra (CSIC)

Lluis Sole Sabaris s/n

08028 Barcelona SPAIN

tel 34 93409 5410

fax 34 93411 0012

alobo at ija.csic.es

On 4 Jun 2002, Russell Senior wrote:

I've got a data frame that looks like this:

subject  foo  bar
      2  1.7  3.2
      2  2.3  4.1
      3  7.6  2.3
      3  7.1  3.3
      3  7.3  2.3
      3  7.4  1.3
      5  6.2  6.1
      5  3.4  6.9
    ...

That is, I've got multiple rows per subject. I need to compute

summaries within categories where the subject has the same number of

rows. For example, subject 2 and 5 both have two rows. I need to

compute mean for those four values of foo. This looks like a good

candidate for index vectors, but I need some help. I've tried

something like:

table(data) -> tmp

and:

tmp[tmp == 2]

and even:

as.numeric(attr(tmp[tmp == 2],"names"))

to get a vector of subject numbers that have two rows in the original

data frame. But I am getting stuck there. I want some kind of

"is.member" function to use in a subsequent index vector expression,

like:

i <- as.numeric(attr(tmp[tmp == 2],"names"))

data[is.member($subject,i)]$foo

but there isn't an is.member() function. Can someone please give me a

pointer on the canonical way to do this?

Thanks!

--

Russell Senior ``The two chiefs turned to each other.

seniorr at aracnet.com Bellison uncorked a flood of horrible

profanity, which, translated meant, `This is

extremely unusual.' ''

