Dear Group,

I am trying to simulate a dataset with 200 individuals with random
assignment of Sex (1,0) and Weight from lognormal distribution specific to
Sex. I am intrigued by the behavior of rlnorm function to impute a value
of Weight from the specified distribution. Here is the code:
ID<-1:200
Sex<-sample(c(0,1),200,replace=TRUE,prob=c(0.4,0.6))
fulldata<-data.frame(ID,Sex)
fulldata$Wt<-ifelse(fulldata$Sex==1,
    rlnorm(100, meanlog = log(85.1), sdlog = sqrt(0.0329)),
    rlnorm(100, meanlog = log(73), sdlog = sqrt(0.0442)))

mean(fulldata$Wt[fulldata$Sex==0]) # to check the mean is close to 73
mean(fulldata$Wt[fulldata$Sex==1]) # to check the mean is close to 85

I see that the number of simulated values has an effect on the mean
calculated after imputation. That is, the code rlnorm(100, meanlog =
log(73), sdlog = sqrt(0.0442)) gives much better match compared to
rlnorm(1, meanlog = log(73), sdlog = sqrt(0.0442)) in ifelse statement in
the code above.

My understanding is that ifelse will be imputing only one value where the
condition is met as specified. I appreciate your insights on the behavior
for better performance of increasing sample number. I appreciate your

Regards,
Ayyappa


•  at Jun 14, 2016 at 3:15 pm ⇧
Dear Ayyappa,

ifelse works on a vector. See the example below.

ifelse(
sample(c(TRUE, FALSE), size = length(letters), replace = TRUE),
letters,
LETTERS
)

However, note that it will recycle short vectors when they are not of equal
length.

ifelse(
sample(c(TRUE, FALSE), size = 2 * length(letters), replace = TRUE),
letters,
LETTERS
)

In your code the length of the condition vector is 200, the length of the
two other vectors is 100.
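
To make the recycling concrete, here is a minimal sketch with small, hand-picked vectors (the names `test`, `yes`, `no`, `res` and `one` are only illustrative and do not come from the original code). It suggests that `ifelse` uses element i of the recycled value vector for position i of the condition, so a shorter vector repeats and a length-1 vector supplies the same value everywhere. That would explain why `rlnorm(1, ...)` matches poorly: every selected row receives the same single draw.

```r
test <- rep(c(TRUE, FALSE), times = 4)  # condition of length 8
yes  <- c(10, 20, 30, 40)               # length 4, recycled to length 8
no   <- c(-1, -2, -3, -4)               # length 4, recycled to length 8

# Position i of the result is yes[i] (after recycling) where test[i] is
# TRUE, and no[i] (after recycling) where test[i] is FALSE.
res <- ifelse(test, yes, no)
print(res)  # 10 -2 30 -4 10 -2 30 -4

# A length-1 value is recycled to every position where it is selected.
one <- ifelse(test, 99, no)
print(one)  # 99 -2 99 -4 99 -2 99 -4
```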

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

•  at Jun 14, 2016 at 3:42 pm ⇧

Yes. Have a look at this example

ifelse(
sample(c(TRUE, FALSE), size = 0.5 * length(letters), replace = TRUE),
letters,
LETTERS
)
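
A smaller version of the same idea, as a sketch: when the value vectors are longer than the condition, the result has the length of the condition, and only the leading elements of `letters` and `LETTERS` can ever be selected. The fixed condition `test` is an assumption made here so that the output is predictable.

```r
test <- c(TRUE, FALSE, TRUE)           # condition of length 3
res  <- ifelse(test, letters, LETTERS) # both value vectors have length 26

length(res)  # 3: the result follows the length of the condition
print(res)   # "a" "B" "c": elements 4 to 26 are never used
```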

2016-06-14 17:31 GMT+02:00 Ayyappa Chaturvedula <ayyappach@gmail.com>:

Thank you very much for your kind support. The length of my condition
vector is ~80 because I want only Sex==1 and else will be the other. I
understand now how ifelse works. If the simulated vector is longer than
the condition vector, does it take the first few elements to match the
length of the condition vector and discard the rest?

Regards,
Ayyappa

•  at Jun 14, 2016 at 3:47 pm ⇧
I am sorry, I missed that. I think I have now made it more appropriate by
not generating unnecessary simulated values. Thank you for your help.

fulldata$Wt<-ifelse(fulldata$Sex==1,
    rlnorm(length(fulldata$Sex[fulldata$Sex==1]), meanlog = log(85.1), sdlog = sqrt(0.0329)),
    rlnorm(length(fulldata$Sex[fulldata$Sex==0]), meanlog = log(73), sdlog = sqrt(0.0442)))
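
A quick way to check whether this version still recycles is to look for repeated weights. The sketch below is only an illustration: it replaces the random `Sex` column with a fixed pattern (an assumption, so the result does not depend on the seed), and the names `n1`, `n0` and `Wt` are introduced here.

```r
# Fixed Sex pattern (an assumption for this sketch): 120 ones, 80 zeros.
Sex <- rep(c(1, 0, 1, 1, 0), times = 40)
n1  <- sum(Sex == 1)  # 120
n0  <- sum(Sex == 0)  # 80

set.seed(1)  # arbitrary seed, only for reproducibility
Wt <- ifelse(Sex == 1,
             rlnorm(n1, meanlog = log(85.1), sdlog = sqrt(0.0329)),
             rlnorm(n0, meanlog = log(73),   sdlog = sqrt(0.0442)))

# Both value vectors are shorter than the 200-row condition, so they are
# recycled. Row i takes element i of the recycled vector, which means rows
# i and i + n1 share one draw whenever both have Sex == 1.
sum(duplicated(Wt[Sex == 1]))  # greater than 0: some weights are reused
```

With a randomly sampled `Sex` the number of repeats varies, but the mechanism is the same.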

•  at Jun 14, 2016 at 4:08 pm ⇧
You need to study my examples and the helpfile of ifelse more carefully.
Then you'll understand why your code is wrong.
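
For readers looking for a way around the recycling, here are two possible approaches, offered as a sketch and not as the answer given in this thread. Both avoid passing short vectors to `ifelse`. The helper `grp1` and the column `Wt2` are names introduced for this illustration.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
n   <- 200
Sex <- sample(c(0, 1), n, replace = TRUE, prob = c(0.4, 0.6))
fulldata <- data.frame(ID = seq_len(n), Sex = Sex)

grp1 <- fulldata$Sex == 1  # TRUE for rows with Sex == 1

# Option 1: assign by logical index, so each group receives exactly as
# many draws as it has rows and nothing is recycled.
fulldata$Wt <- NA_real_
fulldata$Wt[grp1]  <- rlnorm(sum(grp1),  meanlog = log(85.1),
                             sdlog = sqrt(0.0329))
fulldata$Wt[!grp1] <- rlnorm(sum(!grp1), meanlog = log(73),
                             sdlog = sqrt(0.0442))

# Option 2: one draw per row, with per-row parameters chosen by ifelse().
# Here ifelse() only recycles the scalar parameters, which is intended.
fulldata$Wt2 <- rlnorm(
  n,
  meanlog = ifelse(grp1, log(85.1), log(73)),
  sdlog   = ifelse(grp1, sqrt(0.0329), sqrt(0.0442))
)

anyDuplicated(fulldata$Wt)  # 0 means no weight is shared between rows
```

Note that `meanlog = log(85.1)` sets the median of the lognormal distribution to 85.1; its mean is `exp(meanlog + sdlog^2 / 2)`, which is slightly larger, so the sample means need not land exactly on 85.1 and 73.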

Discussion Overview
group: r-help
categories: r
posted: Jun 14, '16 at 3:02p
active: Jun 14, '16 at 4:08p
posts: 5
users: 2
website: r-project.org
irc: #r