Grokbase Groups R r-help June 2012

[R] Selecting rows by maximum value of one variable in a data frame, nested by another variable

Miriam
Jun 26, 2012 at 9:21 pm
How can I select the rows of a dataset that have the maximum value in one variable, nested within another variable? It is a data frame in long format with repeated measures per subject.
I was not successful using aggregate, because one of the columns has character values (and/or possibly for another reason).
I would like to transform something like this:
subject time.ms V3
1       1       stringA
1       12      stringB
1       22      stringC
2       1       stringB
2       14      stringC
2       25      stringA
...
into something like this:
subject time.ms V3
1       22      stringC
2       25      stringA
...

Thank you very much for your help!
Miriam

3 responses

  • Petr PIKAL at Jun 27, 2012 at 8:30 am
    Hi
    You could do it with aggregate and then select the matching rows from
    your data frame, but it is a perfect example for powerful list operations:

    do.call("rbind", lapply(split(test, test$subject), function(x)
      x[which.max(x[, 2]), ]))

    which gives

      subject time.ms      V3
    1       1      22 stringC
    2       2      25 stringA

    split() splits the data frame test by the subject variable into a list
    of sub data frames.
    The anonymous function(x) finds the position of the maximum value in the
    second column of each sub data frame and selects that row.
    do.call() takes the resulting list and rbinds it into one final data frame.
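
The three steps described above can also be run one at a time to inspect the intermediate objects. This is only a minimal sketch: it assumes `test` holds the example data from the original question, and the `read.table` call and the names `pieces`, `rows` and `result` are illustrative rather than taken from the post.

```r
# Assumed example data, copied from the question; `test` is the name used above.
test <- read.table(text = "
subject time.ms V3
1 1 stringA
1 12 stringB
1 22 stringC
2 1 stringB
2 14 stringC
2 25 stringA
", header = TRUE, stringsAsFactors = FALSE)

# Step 1: split() returns a list with one sub data frame per subject.
pieces <- split(test, test$subject)

# Step 2: in each piece, keep the row where time.ms is largest.
rows <- lapply(pieces, function(x) x[which.max(x$time.ms), ])

# Step 3: stack the one-row data frames back into a single data frame.
result <- do.call("rbind", rows)
result
```

Referring to the column by name (`x$time.ms`) instead of by position (`x[, 2]`) is a small design choice that keeps the sketch working if the column order changes.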

    Regards
    Petr

    ______________________________________________
    R-help at r-project.org mailing list
    https://stat.ethz.ch/mailman/listinfo/r-help
    PLEASE do read the posting guide
    http://www.R-project.org/posting-guide.html
    and provide commented, minimal, self-contained, reproducible code.
  • Rui Barradas at Jun 27, 2012 at 10:09 am
    Hello,

    Here's a solution using aggregate and merge. I've kept it in two steps
    for clarity.



    d <- read.table(text="
    subject time.ms V3
    1 1 stringA
    1 12 stringB
    1 22 stringC
    2 1 stringB
    2 14 stringC
    2 25 stringA
    ", header=TRUE)

    ag <- aggregate(time.ms~subject, data=d, max)
    merge(ag, d)

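    # It also works if the maximum is not unique.
    # Append the tied row as a data frame: c(1, 22, "stringA") would be
    # coerced to character and turn the numeric columns into character.
    d2 <- rbind(d, data.frame(subject=1, time.ms=22, V3="stringA"))
    ag2 <- aggregate(time.ms~subject, data=d2, max)
    merge(ag2, d2)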


    The split version would have to be modified slightly to use 'which' and
    'max' separately, because 'which.max' returns only the first maximum.


    do.call("rbind",lapply(split(d2, d2$subject), function(x)
    x[which(x[, 2] == max(x[, 2])), ]))
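
The difference between the two calls shows up only when a group has tied maxima. A small sketch on a plain vector may make this concrete; the vector `x` and the names `first_only` and `all_ties` are made up for illustration.

```r
# Hypothetical vector in which the maximum (22) appears twice.
x <- c(1, 12, 22, 22)

# which.max() reports the position of the first maximum only.
first_only <- which.max(x)

# which(x == max(x)) reports the position of every maximum.
all_ties <- which(x == max(x))

first_only
all_ties
```

Used as a row index inside the split version, the first form keeps one row per subject, while the second keeps every tied row.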

    Hope this helps,

    Rui Barradas

  • Arun at Jun 27, 2012 at 11:53 am
    Hi,

    Try this:
    dat1 <- read.table(text="
    subject time.ms V3
    1 1 stringA
    1 12 stringB
    1 22 stringC
    2 1 stringB
    2 14 stringC
    2 25 stringA
    ", sep="", header=TRUE)
    dat2 <- aggregate(dat1$time.ms, list(dat1$subject), max)
    colnames(dat2) <- c("subject", "time.ms")

    merge(dat2, dat1)
      subject time.ms      V3
    1       1      22 stringC
    2       2      25 stringA
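
The colnames() step matters because aggregate(), when given a bare vector and an unnamed list, labels its output columns Group.1 and x, while merge() by default joins on the column names the two data frames share. The sketch below illustrates this and shows one possible alternative that keeps the original names by passing one-column data frames. The names `agg_default`, `agg_named` and `merged` are illustrative and not part of the original reply.

```r
# Example data from the thread.
dat1 <- read.table(text = "
subject time.ms V3
1 1 stringA
1 12 stringB
1 22 stringC
2 1 stringB
2 14 stringC
2 25 stringA
", header = TRUE, stringsAsFactors = FALSE)

# Bare vector + unnamed list: the result gets generic column names.
agg_default <- aggregate(dat1$time.ms, list(dat1$subject), max)
names(agg_default)

# One-column data frames carry their names through, so no renaming is needed.
agg_named <- aggregate(dat1["time.ms"], by = dat1["subject"], FUN = max)
names(agg_named)

# merge() joins on the shared columns subject and time.ms.
merged <- merge(agg_named, dat1)
merged
```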

    A.K.




