Grokbase Groups R r-help June 2002
I would use a small function, w.conti, that calculates a weighted
contingency matrix. That is, given two vectors of categorical
variables (e.g., species and soil type) and a third vector holding a
quantitative variable (e.g., biomass), it calculates the sum of the
quantitative variable for each pair of categories (e.g., the total
biomass for each species in each soil type).
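To make the description above concrete, here is a minimal sketch using xtabs() directly. The species, soil and biomass values are invented purely for illustration and are not from the original data.

```r
# Hypothetical data, made up for this illustration only
species <- c("oak", "oak", "pine", "pine", "pine")
soil    <- c("clay", "sand", "clay", "clay", "sand")
biomass <- c(1.5, 2.0, 3.0, 1.0, 4.0)

# Weighted contingency table: sum of biomass for each species/soil pair
tab <- xtabs(biomass ~ species + soil)
tab
```

Each cell of tab holds the total biomass for one species on one soil type; the two pine/clay rows, for instance, are added together.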
With your data, since you have only one categorical
variable, just set the second one to a constant to
calculate the sum of foo for each subject:
matriz<-cbind(sub,foo,bar)
matriz
     sub foo bar
[1,]   2 1.7 3.2
[2,]   2 2.3 4.1
[3,]   3 7.6 2.3
[4,]   3 7.1 3.3
[5,]   3 7.3 2.3
[6,]   3 7.4 1.3
[7,]   5 6.2 6.1
[8,]   5 3.4 6.9
a <- w.conti(matriz[,1],rep(1,nrow(matriz)),matriz[,2])
a
   v2
v1     1
  2  4.0
  3 29.4
  5  9.6

Then, using the result of table you can calculate the mean from
the sum:
a/as.vector(table(matriz[,1]))
   v2
v1     1
  2 2.00
  3 7.35
  5 4.80
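As an aside, the same per-subject means can be obtained in one step with tapply(). This is only a sketch of an alternative route, not part of w.conti; the vectors sub and foo are retyped here from the matrix shown above so that the snippet runs on its own.

```r
# Data retyped from the example matrix
sub <- c(2, 2, 3, 3, 3, 3, 5, 5)
foo <- c(1.7, 2.3, 7.6, 7.1, 7.3, 7.4, 6.2, 3.4)

# Mean of foo within each subject
means <- tapply(foo, sub, mean)
means
```

The result is a named vector with one mean per subject, matching the sum divided by the row count computed above.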
From your question I understand that you want to group subjects according
to their number of rows, so
that subjects 2 and 5 (two rows each) would be pooled into one new subject:
new.sub <- as.vector(table(matriz[,1]))
new.sub
[1] 2 4 2
new.sub <- rep(new.sub,new.sub)
new.sub
[1] 2 2 4 4 4 4 2 2
a <- w.conti(new.sub,rep(1,nrow(matriz)),matriz[,2])
a
   v2
v1     1
  2 13.6
  4 29.4
a/as.vector(table(new.sub))
   v2
v1     1
  2 3.40
  4 7.35
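Note that rep(new.sub,new.sub) lines up with the data only when the rows are already sorted by subject, as they are in this example. A sketch of a variant that does not rely on row order uses ave() to attach the row count to every row; the name n.rows is introduced here for illustration only.

```r
# Data retyped from the example matrix
sub <- c(2, 2, 3, 3, 3, 3, 5, 5)
foo <- c(1.7, 2.3, 7.6, 7.1, 7.3, 7.4, 6.2, 3.4)

# For every row, the number of rows its subject has
n.rows <- ave(foo, sub, FUN = length)
n.rows

# Mean of foo within each row-count class
class.means <- tapply(foo, n.rows, mean)
class.means
```

Here n.rows plays the role of new.sub, and class.means gives the mean of foo for the pooled two-row subjects and for the four-row subject.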

w.conti is simply:

w.conti <- function (v1,v2,z)
{
    # sum of z for each combination of v1 and v2
    xtabs(z~v1+v2)
}

(I could use xtabs() directly, but I never remember that expression,
while w.conti is easier to remember)

Of course, if you always need the mean, just add
the second step to w.conti.
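One possible way to fold the second step into the function is sketched below. The name w.conti.mean is hypothetical, and the cell counts are taken from a one-sided xtabs() formula rather than from table(), which is a choice made here and not something stated above.

```r
# Hypothetical variant of w.conti that returns cell means instead of sums
w.conti.mean <- function(v1, v2, z) {
  sums   <- xtabs(z ~ v1 + v2)  # sum of z in each (v1, v2) cell
  counts <- xtabs(~ v1 + v2)    # number of rows in each cell
  sums / counts                 # empty cells give NaN (0/0)
}

# Usage with the example data
sub <- c(2, 2, 3, 3, 3, 3, 5, 5)
foo <- c(1.7, 2.3, 7.6, 7.1, 7.3, 7.4, 6.2, 3.4)
m <- w.conti.mean(sub, rep(1, length(sub)), foo)
m
```

With a constant second vector, as here, m has a single column holding the mean of foo for each subject.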

Agus


Dr. Agustin Lobo
Instituto de Ciencias de la Tierra (CSIC)
Lluis Sole Sabaris s/n
08028 Barcelona SPAIN
tel 34 93409 5410
fax 34 93411 0012
alobo at ija.csic.es

On 4 Jun 2002, Russell Senior wrote:


I've got a data frame that looks like this:

subject foo bar
2 1.7 3.2
2 2.3 4.1
3 7.6 2.3
3 7.1 3.3
3 7.3 2.3
3 7.4 1.3
5 6.2 6.1
5 3.4 6.9
...

That is, I've got multiple rows per subject. I need to compute
summaries within categories where the subject has the same number of
rows. For example, subjects 2 and 5 both have two rows. I need to
compute the mean of those four values of foo. This looks like a good
candidate for index vectors, but I need some help. I've tried
something like:

table(data) -> tmp

and:

tmp[tmp == 2]

and even:

as.numeric(attr(tmp[tmp == 2],"names"))

to get a vector of subject numbers that have two rows in the original
data frame. But I am getting stuck there. I want some kind of
"is.member" function to use in a subsequent index vector expression,
like:

i <- as.numeric(attr(tmp[tmp == 2],"names"))
data[is.member($subject,i)]$foo

but there isn't an is.member() function. Can someone please give me a
pointer on the canonical way to do this?
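For reference, base R's %in% operator performs the membership test that the imagined is.member() would do. A minimal sketch under stated assumptions: the data frame is called dat here (a name chosen for illustration, to avoid masking the data() function), and its contents are retyped from the listing above.

```r
# Data frame rebuilt from the listing in the question
dat <- data.frame(subject = c(2, 2, 3, 3, 3, 3, 5, 5),
                  foo = c(1.7, 2.3, 7.6, 7.1, 7.3, 7.4, 6.2, 3.4),
                  bar = c(3.2, 4.1, 2.3, 3.3, 2.3, 1.3, 6.1, 6.9))

tmp <- table(dat$subject)              # number of rows per subject
i <- as.numeric(names(tmp)[tmp == 2])  # subjects with exactly two rows

# %in% returns TRUE for rows whose subject is among those in i
two.row.foo <- dat$foo[dat$subject %in% i]
mean(two.row.foo)
```

The logical vector dat$subject %in% i can be used as an index for any column of dat, not just foo.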

Thanks!

--
Russell Senior         ``The two chiefs turned to each other.
seniorr at aracnet.com Bellison uncorked a flood of horrible
                       profanity, which, translated meant, `This is
                       extremely unusual.' ''
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._




Discussion Overview
group: r-help
categories: r
posted: Jun 5, '02 at 1:00a
active: Jun 5, '02 at 8:19a
posts: 6
users: 6
website: r-project.org
irc: #r