FAQ
Dear all,

I found a discrepancy while performing a power calculation for a two-sample
t-test in R and in S-PLUS.
For given values of the sample size (5 per group), sd (0.2), significance level
(0.01), and desired power (80%), I solved for the detectable difference in means.
The results differ: 0.5488882 in R and 0.4322771 in S-PLUS (see dump
below).

Did I overlook any detail or confuse some parameters?

Joern Quedenau

Here are the commands & outputs from both tools:

R 1.4.0
power.t.test(n=5, sd=0.2, sig.level=0.01, power=0.8, type="two.sample",
alternative="two.sided")

Two-sample t test power calculation

n = 5
delta = 0.5488882
sd = 0.2
sig.level = 0.01
power = 0.8
alternative = two.sided

NOTE: n is number in *each* group

S-PLUS 2000 Professional Release 2:
normal.sample.size(n1=5, n2=5, mean=0, sd1=0.2, sd2=0.2, power=0.8,
alpha=0.01, alternative="two.sided")
mean1 sd1 mean2 sd2 delta alpha power n1 n2 prop.n2
1 0 0.2 0.4322771 0.2 0.4322771 0.01 0.8 5 5 1

------------------------------------------
Dr. Jörn Quedenau
Coordinator Data Management Bioinformatics
Metanomics GmbH & Co. KGaA
Tegeler Weg 33, D-10589 Berlin, Germany
Tel +49 30 34807 125, Fax +49 30 34807 300

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._


•  Prof Brian D Ripley at Mar 1, 2002 at 10:29 am

On Fri, 1 Mar 2002 joern.quedenau at metanomics.de wrote:

Dear all,

I found a discrepancy while performing a power calculation for a two sample
t-test in R and S-PLUS, respectively.
For given values of sample number (5 each), sd (0.2) , significance level
(0.01), and a desired power (80%) I looked for the difference in means.
These values differ: 0.5488882 in R and 0.4322771 in S-PLUS (see dump
below).

Did I overlook any detail or confuse some parameters?

Yes. normal.sample.size is not for a Student's t test, as its name might
suggest (at least, it did to me). It uses a normal distribution and
assumes the variances are known. On the other hand, power.t.test appears to
be for a conventional equal-variance t-test, and, as it needs to estimate
the variance, it has lower power and hence selects a larger minimum delta.

BTW, the t-test that power.t.test computes the power of is not the default
t-test in R, as I understand the code. (?power.t.test is silent on which
two-sample t-test but the calculations look right for the one that uses the
pooled variance. Not that it necessarily matters.)

This is only evident because n1 and n2 are so small: at those sample sizes
you are relying critically on normality of the samples, and for
power.t.test on equal variances. For n1 = n2 = 25, the difference is much
smaller (0.193 vs 0.200).
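The known-variance calculation that normal.sample.size appears to perform can be checked outside S-PLUS. Here is a minimal sketch in plain Python (a port, not S-PLUS code; it assumes the standard two-sided z-based formula for the detectable difference with known variances):

```python
# Reproduce the S-PLUS answer with the known-variance (z-based) formula.
# Assumption: normal.sample.size solves
#   delta = (z_{1-alpha/2} + z_{power}) * sqrt(sd1^2/n1 + sd2^2/n2)
from math import sqrt
from statistics import NormalDist  # Python 3.8+

n1 = n2 = 5
sd1 = sd2 = 0.2
alpha, power = 0.01, 0.8

z = NormalDist().inv_cdf  # standard normal quantile function
delta = (z(1 - alpha / 2) + z(power)) * sqrt(sd1**2 / n1 + sd2**2 / n2)
print(round(delta, 7))  # ≈ 0.4322771, the S-PLUS value
```

The agreement with 0.4322771 supports the reading that normal.sample.size ignores the cost of estimating the variance, which is exactly where it diverges from power.t.test at n = 5.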
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK
Tel: +44 1865 272861 (self), +44 1865 272860 (secr); Fax: +44 1865 272595

•  Peter Dalgaard at Mar 1, 2002 at 11:51 am

Prof Brian D Ripley <ripley@stats.ox.ac.uk> writes:

BTW, the t-test that power.t.test computes the power of is not the default
t-test in R, as I understand the code. (?power.t.test is silent on which
two-sample t-test but the calculations look right for the one that uses the
pooled variance. Not that it necessarily matters.)

Exactly. For equal sample sizes the equal- and unequal-variance
t-tests are essentially the same unless the variance difference is
huge (the t statistic is identical in that case, although the degrees
of freedom can differ). The important cases for deciding between
the Welch test and the equal-variance one arise when you are comparing a
small group with a large variance to a large group with a small
variance.
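To make the equal-n case concrete, here is a quick check (a plain-Python sketch with made-up data) that the pooled-variance and Welch t statistics coincide when the group sizes are equal:

```python
# With n1 = n2 = n, the pooled denominator sqrt(((vx+vy)/2) * (2/n))
# equals the Welch denominator sqrt(vx/n + vy/n); only the degrees of
# freedom differ between the two tests.
from math import sqrt
from statistics import mean, variance

x = [0.1, 0.3, 0.2, 0.5, 0.4]  # made-up data, any equal-length lists work
y = [0.9, 0.6, 1.1, 0.8, 1.2]
n = len(x)

vx, vy = variance(x), variance(y)
t_pooled = (mean(x) - mean(y)) / sqrt(((vx + vy) / 2) * (2 / n))
t_welch = (mean(x) - mean(y)) / sqrt(vx / n + vy / n)
print(t_pooled, t_welch)  # identical up to floating-point rounding
```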

The Welch test relies on an asymptotic expansion, and I wouldn't know
how it behaves in very small sample cases, so it did seem best to
stick with the plain "classical theory" t-test where all the
distributions can be worked out exactly.

The help page does seem to be improvable (said the author...)
This is only evident because n1 and n2 are so small: at those sample sizes
you are relying critically on normality of the samples, and for
power.t.test on equal variances. For n1 = n2 = 25, the difference is much
smaller (0.193 vs 0.200).

Yes. This is mostly to do with the degrees-of-freedom issue, though.
Neither procedure is any good at correcting for
non-normality. The central limit theorem does that to some extent, and
for equal sample sizes skewness tends to cancel out (in the two-sample
case, of course).

Anyway, in my line of business it is quite common for people to
design studies with single-digit numbers per group, and the
traditional normal approximations can be 10-20% off target on the
sample sizes relative to the exact power calculation using the
noncentral t, so I thought it would be a good thing to include that
correction, even though multiple other factors could bias the
calculations.

--
Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
Blegdamsvej 3, 2200 Cph. N, Denmark
(p.dalgaard at biostat.ku.dk) Ph: (+45) 35327918, FAX: (+45) 35327907
•  Thomas Lumley at Mar 1, 2002 at 5:19 pm

On Fri, 1 Mar 2002 joern.quedenau at metanomics.de wrote:

Dear all,

I found a discrepancy while performing a power calculation for a two sample
t-test in R and S-PLUS, respectively.
For given values of sample number (5 each), sd (0.2) , significance level
(0.01), and a desired power (80%) I looked for the difference in means.
These values differ: 0.5488882 in R and 0.4322771 in S-PLUS (see dump
below).

Did I overlook any detail or confuse some parameters?

The S-PLUS routine references Fisher & van Belle. In that book the
authors use a unified approximate power-calculation method that works for
a wide range of studies but is not very accurate for tiny sample sizes.
In most cases this doesn't matter, because the assumptions going into a
study design aren't any more accurate, and at tiny sample sizes the power
is sensitive to the assumption that the data are Normally distributed.

The power.t.test formula uses the non-central t distribution and so will
give more accurate, lower power values for small samples.

You can see which one is correct by simulation (which is how I typically
do power calculations):

table(sapply(1:10000, function(i)
  t.test(rnorm(5, 0, sd = 0.2), rnorm(5, 0.5488882, sd = 0.2),
         var.equal = TRUE)$p.value) <= 0.01)

FALSE  TRUE
 2023  7977

table(sapply(1:10000, function(i)
  t.test(rnorm(5, 0, sd = 0.2), rnorm(5, 0.4322771, sd = 0.2),
         var.equal = TRUE)$p.value) <= 0.01)

FALSE  TRUE
 4526  5474

So at delta = 0.5488882 there is about 80% power, and at 0.4322771 there is about 55%
power (with sampling uncertainties of about +/- 2% in each number).

It is interesting to note that a simulation shows the unequal-variance
t-test to have only 75% power at 0.5488882, indicating the sensitivity of
the power calculations at this sample size.
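The equal-variance simulation can also be run without R. Here is a plain-Python sketch (a port, not the R code above), which avoids p-value computation by comparing the pooled t statistic with the fixed two-sided 1% critical value t_{0.995, df=8} ≈ 3.3554:

```python
# Simulate the power of the pooled-variance two-sample t test at the
# delta reported by power.t.test (n = 5 per group, sd = 0.2, alpha = 0.01).
import random
from math import sqrt
from statistics import mean, variance

random.seed(1)
T_CRIT = 3.3554  # qt(0.995, df = 8), two-sided 1% critical value
n, sd, delta = 5, 0.2, 0.5488882
reps = 10000

hits = 0
for _ in range(reps):
    x = [random.gauss(0, sd) for _ in range(n)]
    y = [random.gauss(delta, sd) for _ in range(n)]
    sp2 = (variance(x) + variance(y)) / 2        # pooled variance (equal n)
    t = (mean(x) - mean(y)) / sqrt(sp2 * 2 / n)  # pooled t statistic
    hits += abs(t) > T_CRIT                      # reject at the 1% level?
power = hits / reps
print(power)  # close to the nominal 0.8
```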

-thomas

Thomas Lumley Asst. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle

•  Alan T. Arnholt at Mar 1, 2002 at 9:14 pm
Could someone shed some light on Thomas's statement that the estimates
come "(with sampling uncertainties of about +/- 2% in each
number)"? What exactly is being said about the accuracy of
the simulation, and how is that +/- 2% figure
determined?

Thanks,

Alan

----------------------
Alan T. Arnholt
Associate Professor
Department of Mathematical Sciences
Appalachian State University

Tel: (828) 262-2863
Fax: (828) 265-8617

http://www1.appstate.edu/~arnholta/

•  Thomas Lumley at Mar 1, 2002 at 10:29 pm

On Fri, 1 Mar 2002, Alan T. Arnholt wrote:

Could someone shed some light on the statement from Thomas
that "(with sampling uncertainties of about +/- 2% in each
number)". What exactly is being said about the accuracy of
the simulation and how is that number +/-2% being
determined.

The power is estimated by doing 10000 simulated experiments and finding
that 7977 of them give a p-value at or below 0.01.

The number of p-values less than 0.01 is random (these are simulations),
and has a Binomial(10000, P) distribution, where P is the power. It is
possible to construct an exact 95% confidence interval for P based on the
observed results of the simulations. The binom.test function does this and
says that a 95% CI for the power is (0.790, 0.806). Similarly, for the
delta specified by S-PLUS the 95% CI for the power is (0.537, 0.557). The
sampling error is actually about +/- 1% (I rounded incorrectly in my
previous email).
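For reference, the simpler normal approximation to this interval can be computed directly. This is a sketch, not binom.test: the exact Clopper-Pearson interval from binom.test will differ slightly, though it is close at these counts.

```python
# Normal-approximation 95% CI for a binomial proportion:
#   p_hat +/- z_{0.975} * sqrt(p_hat * (1 - p_hat) / reps)
from math import sqrt
from statistics import NormalDist

successes, reps = 7977, 10000       # simulations with p-value <= 0.01
p = successes / reps
z = NormalDist().inv_cdf(0.975)     # ≈ 1.96 for a two-sided 95% interval
half_width = z * sqrt(p * (1 - p) / reps)
print(round(p - half_width, 3), round(p + half_width, 3))
# roughly (0.790, 0.806), agreeing with the exact interval quoted above
```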

If you had good parameter values to use in a power calculation and wanted
to be as precise in your claims as possible you might do a simulation like
this and quote a suitable lower confidence limit for the power. You can
get very tight limits if you want. For example, a lower 99.99% confidence
limit for the power based on this simulation is 0.782.

It matters more when the problem is more complicated and you can't readily
do 10000 replicates in the simulation.

-thomas


## Discussion Overview

group: r-help • posted: Mar 1, '02 at 10:04a • last active: Mar 1, '02 at 10:29p • posts: 6 • users: 5 • website: r-project.org
