Hi,

I would like to select the most frequent value level in a set of three variables.

Three different observers have judged hair color in study subjects. Mostly they judge the same color, but sometimes there is a slight difference. I want to know what most of the observers (so at least 2 of the 3) have chosen. E.g. if two out of three observers decide the hair is black, then it is unlikely to be brown.

Let's say that I have 3 variables: color1, color2, color3. Each has 4 possible levels (fair up to black, 1-4). I would like a new variable containing this 'most frequent judgement'.

I have already searched through the knowledge base and many posts but I haven't found what I'm looking for.

Is this possible?

Sam

at Aug 20, 2012 at 11:47 am
I found a solution to this problem myself by searching further and experimenting with some functions.

I ended up with a for-loop which iterates over the cases. It generates a table per case and selects the most frequent choice (i.e. the value with the highest frequency), which is checked to be non-NA and saved to the new variable. I still haven't found a way to re-factor the result using the same attributes as the original color variables... Does anyone have an idea?

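# Loop over the cases (one row per subject).
for (i in seq_along(ss$name)) {
  # Tabulate the three ratings (NAs are dropped) and take the most frequent value.
  color <- names(which.max(table(c(ss$color1[i], ss$color2[i], ss$color3[i]))))

  # No non-NA rating for this case: leave the result missing.
  if (length(color) == 0) {
    ss$color_med[i] <- NA
  } else {
    ss$color_med[i] <- as.integer(color)
  }
}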

Greetings!
Sam
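On the open question of re-factoring: one possible approach, sketched below, is to compute the row-wise mode on the integer codes and then rebuild a factor with the levels of the original variables. The data frame, the two middle level labels and the helper name row_mode are made up for illustration; only color1, color2, color3 and color_med come from the posts above.

```r
# Hypothetical example data: three observers' ratings on a 1-4 scale.
# The two middle labels are assumptions; the thread only says "fair up to black".
hair_levels <- c("fair", "light brown", "dark brown", "black")
ss <- data.frame(
  color1 = factor(c(1, 2, 4, NA, 3), levels = 1:4, labels = hair_levels),
  color2 = factor(c(1, 3, 4, NA, 3), levels = 1:4, labels = hair_levels),
  color3 = factor(c(2, 3, 4, NA, 1), levels = 1:4, labels = hair_levels)
)

# Work on the integer codes so the result does not depend on how c() treats factors.
codes <- sapply(ss[c("color1", "color2", "color3")], as.integer)

# Most frequent code in one row; NA when every rating is missing.
# Ties are resolved in favour of the lowest code (which.max takes the first maximum).
row_mode <- function(x) {
  counts <- table(x)                       # NA values are dropped by default
  if (length(counts) == 0) return(NA_integer_)
  as.integer(names(counts)[which.max(counts)])
}

codes_med <- apply(codes, 1, row_mode)

# Rebuild a factor that shares its levels with the original variables.
ss$color_med <- factor(codes_med,
                       levels = seq_along(levels(ss$color1)),
                       labels = levels(ss$color1))
ss$color_med
```

Because color_med is built from levels(ss$color1), it can be compared or tabulated against the original three variables directly.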

at Aug 20, 2012 at 2:08 pm
Hi

It is really a typical example of a question which probably has a very simple solution but to which hardly anybody can give you a reasonable answer.

What is the structure of your data?

set.seed(1)
x<-sample(1:4, 60, replace=T)
mat<-as.factor(x)
dim(mat) <- c(20,3)
sapply(apply(mat,1, table), max)
[1] 2 1 2 1 1 2 2 2 2 1 2 1 2 2 2 1 1 1 2 2
names(sapply(apply(mat,1, table), which.max))
[1] "4" "1" "3" "1" "1" "4" "1" "2" "3" "1" "2" "1" "2" "1" "4" "1" "2" "1" "3"
[20] "2"

The first call gives the highest count in each row of matrix mat; the second gives the corresponding most frequent value.

Petr
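Two things are worth noting about the approach above. Where the highest count is 1, all three values differ and which.max simply reports the first one. Also, apply(mat, 1, table) returns a list only when the rows have differing numbers of distinct values. A minimal sketch that instead returns NA when no value reaches a majority is shown below; the matrix contents and the helper name majority are hypothetical.

```r
# Hypothetical ratings matrix: one row per subject, one column per observer.
mat <- matrix(c(4, 4,  1,
                1, 2,  3,
                3, 3,  3,
                2, NA, 2),
              ncol = 3, byrow = TRUE)

# Return the value chosen by at least `min_votes` observers, or NA if there is none.
majority <- function(x, min_votes = 2) {
  counts <- table(x)                                   # NAs are dropped
  winner <- names(counts)[as.vector(counts) >= min_votes]
  if (length(winner) == 1) as.integer(winner) else NA_integer_
}

# vapply always yields an integer vector with one element per row.
result <- vapply(seq_len(nrow(mat)),
                 function(i) majority(mat[i, ]),
                 integer(1))
result
```

With three observers and min_votes = 2 there can be at most one winner per row, so the NA branch is reached only when all ratings differ or too few are non-missing.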

at Aug 20, 2012 at 4:04 pm
Hi,

Slightly different way:
unlist(lapply(apply(mat,1,count),function(x) max(x[2])))
#[1] 2 1 2 1 1 2 2 2 2 1 2 1 2 2 2 1 1 1 2 2


at Aug 24, 2012 at 10:03 am
Hi,

Shortly after my first post I posted an answer including the fix I found, which seems to work. From the archives I see that my code snippet was filtered out and appended as an attachment (which was not my intent).

This was my suggestion:

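# Loop over the cases (one row per subject).
for (i in seq_along(ss$name)) {
  # Tabulate the three ratings (NAs are dropped) and take the most frequent value.
  color <- names(which.max(table(c(ss$color1[i], ss$color2[i], ss$color3[i]))))

  # No non-NA rating for this case: leave the result missing.
  if (length(color) == 0) {
    ss$color_med[i] <- NA
  } else {
    ss$color_med[i] <- as.integer(color)
  }
}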

Sam


at Aug 22, 2012 at 8:33 am

On 08/20/2012 06:48 PM, Sam Dekeyser wrote:
> I would like to select the most frequent value level in a set of three variables.
> [...]
> Is this possible?

Hi Sam,
Are you looking for the Mode function in the prettyR package?

Jim

