Hi,
I'm trying to compute the minimum sample size needed to have at least 80% power, with alpha = 0.05. The problem is that the empirical proportions are really small: 0.00154 in one case and 0.00234 in the other. These are the estimated failure proportions of two medical treatments.
Thomas and Conlon (1992) suggested Fisher's exact test and proposed a computational method, which according to their table gives a sample size of roughly 20000. Unfortunately I cannot find any software that applies their method.
- Does anyone know how to estimate the sample size for Fisher's exact test using R?
- Even better, does anybody know other, maybe optimal, methods for such a situation (small p1 and p2) and the corresponding R software?

Giulio


• at Nov 8, 2010 at 4:16 pm ⇧ Not with R, but look for G*Power3, a free tool for power calculations that
includes Fisher's exact test.

http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.
• at Nov 8, 2010 at 4:37 pm ⇧ On Nov 8, 2010, at 11:16 AM, Mitchell Maltenfort wrote:

> Not with R,

Really?

    require(sos)
    findFn("power exact test")
    found 54 matches; retrieving 3 pages
    2 3

These look on point:
http://finzi.psych.upenn.edu/R/library/statmod/html/power.html

http://finzi.psych.upenn.edu/R/library/binom/html/cloglog.sample.size.html

I would also think that methods based on a Poisson model of rare events
could be informative:

http://finzi.psych.upenn.edu/R/library/asypow/html/asypow.n.html

--
David.
David Winsemius, MD
West Hartford, CT
• at Nov 8, 2010 at 5:13 pm ⇧ Hi,

I don't have access to the article, but must presume that they are doing something "radically different" if you are "only" getting a total sample size of 20,000. Or is that 20,000 per arm?

Using the G*Power app that Mitchell references below (which I have used previously, since they have a Mac app):

Exact - Proportions: Inequality, two independent groups (Fisher's exact test)

Options:  Exact distribution

Analysis: A priori: Compute required sample size
Input:    Tail(s)                = Two
          Proportion p1          = 0.00154
          Proportion p2          = 0.00234
          α err prob             = 0.05
          Power (1-β err prob)   = 0.8
          Allocation ratio N2/N1 = 1
Output:   Sample size group 1    = 49851
          Sample size group 2    = 49851
          Total sample size      = 99702
          Actual power           = 0.8168040
          Actual α               = 0.0462658
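
For readers who want to stay in R, the power of Fisher's exact test at a given per-arm size can in principle be computed by enumerating the plausible event counts in each arm and weighting each rejection by its binomial probability. The sketch below is an illustration under that assumption. It is not G*Power's algorithm and not the Thomas and Conlon method, `fisher_p()` and `exact_power()` are hypothetical helper names, and the result need not match G*Power's reported power exactly.

```r
# Two-sided Fisher exact p-value for x1 events out of n1 versus x2 out of n2,
# using the same "sum of tables no more likely than the observed one" rule
# that stats::fisher.test() applies to 2 x 2 tables.
fisher_p <- function(x1, x2, n1, n2) {
  k  <- x1 + x2                      # total number of events (fixed margin)
  lo <- max(0, k - n2)               # smallest possible count in arm 1
  hi <- min(k, n1)                   # largest possible count in arm 1
  d  <- dhyper(lo:hi, n1, n2, k)     # null distribution of the arm-1 count
  sum(d[d <= d[x1 - lo + 1] * (1 + 1e-7)])
}

# Power with n subjects per arm: sum the joint binomial probability of every
# (x1, x2) pair whose Fisher p-value is <= alpha. Pairs in the extreme tails
# (probability below eps on each side) are ignored.
exact_power <- function(n, p1, p2, alpha = 0.05, eps = 1e-8) {
  r1 <- qbinom(eps, n, p1):qbinom(1 - eps, n, p1)
  r2 <- qbinom(eps, n, p2):qbinom(1 - eps, n, p2)
  w1 <- dbinom(r1, n, p1)
  w2 <- dbinom(r2, n, p2)
  power <- 0
  for (i in seq_along(r1)) {
    pv <- vapply(r2, function(b) fisher_p(r1[i], b, n, n), numeric(1))
    power <- power + w1[i] * sum(w2[pv <= alpha])
  }
  power
}

# Power at the per-arm size suggested by G*Power
exact_power(49851, p1 = 0.00154, p2 = 0.00234)
```

Calling `exact_power()` over a grid of `n` values and taking the smallest `n` that reaches the target power would give a sample-size estimate; the power curve of an exact test is not perfectly monotone, so a simple root finder may be unreliable here.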

Using the base R power.prop.test() function:

    power.prop.test(p1 = 0.00154, p2 = 0.00234, power = 0.8)

         Two-sample comparison of proportions power calculation

                  n = 47490.34
                 p1 = 0.00154
                 p2 = 0.00234
          sig.level = 0.05
              power = 0.8
        alternative = two.sided

    NOTE: n is number in *each* group

Using Frank's bsamsize() function in Hmisc:

    bsamsize(p1 = 0.00154, p2 = 0.00234, fraction = .5, alpha = .05, power = .8)
          n1       n2
    47490.34 47490.34
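
The 47,490 figure can also be reproduced by hand from the usual normal-approximation formula for two proportions without continuity correction. The sketch below assumes that formula; `n_per_arm()` is a hypothetical helper name, not part of base R or Hmisc.

```r
# Per-arm sample size for a two-sided, two-sample comparison of proportions
# (normal approximation, equal allocation, no continuity correction).
n_per_arm <- function(p1, p2, alpha = 0.05, power = 0.80) {
  z_alpha <- qnorm(1 - alpha / 2)                  # critical value under H0
  z_beta  <- qnorm(power)                          # quantile for target power
  p_bar   <- (p1 + p2) / 2                         # pooled proportion under H0
  sd_null <- sqrt(2 * p_bar * (1 - p_bar))         # SD term under H0
  sd_alt  <- sqrt(p1 * (1 - p1) + p2 * (1 - p2))   # SD term under H1
  ((z_alpha * sd_null + z_beta * sd_alt) / abs(p1 - p2))^2
}

n_closed <- n_per_arm(0.00154, 0.00234)
n_base   <- power.prop.test(p1 = 0.00154, p2 = 0.00234, power = 0.8)$n
c(closed_form = n_closed, power.prop.test = n_base)
```

Because both proportions are tiny, the required n is driven almost entirely by the ratio (p1 + p2) / (p1 - p2)^2, which is why such a small absolute difference needs tens of thousands of subjects per arm.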

Finally, throwing together a quick Monte Carlo simulation using the FET, I get:

    TwoSampleFET <- function(n, p1, p2, power = 0.85, R = 5000)
    {
      MCSim <- function(n, p1, p2)
      {
        # Fixed factor levels keep each table at two rows even when an
        # arm happens to have no events
        Control <- factor(rbinom(n, 1, p1), levels = 0:1)
        Treat <- factor(rbinom(n, 1, p2), levels = 0:1)
        fisher.test(cbind(table(Control), table(Treat)))$p.value
      }

      # Run MC Replicates
      MC.res <- replicate(R, MCSim(n, p1, p2))

      # Get p value at power quantile
      quantile(MC.res, power)
    }

    # 50,000 per arm
    TwoSampleFET(50000, p1 = 0.00154, p2 = 0.00234, power = 0.8, R = 500)
           80%
    0.04628263

So all four of these are coming back with numbers in the 47,500 to 50,000 range ***per arm***.
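
A variation on the simulation idea is to estimate the power directly, as the share of simulated trials whose Fisher p-value falls at or below alpha. The sketch below draws the event count of each arm with a single binomial draw instead of n individual 0/1 outcomes, which is equivalent in distribution and much cheaper. `fet_power_mc()` is a hypothetical name, and the estimate carries Monte Carlo error of roughly sqrt(power * (1 - power) / R).

```r
# Monte Carlo estimate of the power of Fisher's exact test with n per arm.
fet_power_mc <- function(n, p1, p2, alpha = 0.05, R = 500) {
  x1 <- rbinom(R, n, p1)   # simulated event counts, arm 1
  x2 <- rbinom(R, n, p2)   # simulated event counts, arm 2
  pvals <- vapply(seq_len(R), function(i) {
    # 2 x 2 table: rows = event / no event, columns = arm 1 / arm 2
    tab <- matrix(c(x1[i], n - x1[i], x2[i], n - x2[i]), nrow = 2)
    fisher.test(tab, conf.int = FALSE)$p.value
  }, numeric(1))
  mean(pvals <= alpha)     # share of simulated trials that reject H0
}

set.seed(2010)
fet_power_mc(50000, p1 = 0.00154, p2 = 0.00234, R = 500)
```

Increasing `R` tightens the estimate at the cost of run time; repeating the call over several values of `n` shows where the estimated power crosses 80%.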

HTH,

Marc Schwartz

• at Nov 8, 2010 at 6:00 pm ⇧ Yep, it is 20,000 per arm, sorry. The reference is about an application of the method, and I cannot download the paper with the main algorithm, so I don't know exactly how they did it.
Thanks everybody for the rich and interesting suggestions. Through free web software (PS, others) I also found an N of around 47,000 per arm. I guess these are the right values (having also seen Marc's Monte Carlo).
Maybe the Poisson model approach suggested by David can be an alternative, even if I guess at this point I won't get big differences in the numbers. Would I?

Thanks a lot everybody again for your suggestions,
if anybody has other comments, they are always welcome.

Best,

Giulio

• at Nov 8, 2010 at 7:27 pm ⇧ On Nov 8, 2010, at 1:00 PM, Giulio Di Giovanni wrote:

> Maybe the Poisson models approach suggested by David can be an
> alternative, even if I guess at this point I won't get big
> differences in numbers. Would I?

I certainly would not expect remarkable differences. With 50,000/arm
you would be expecting:

    c(p1 = 0.00154, p2 = 0.00234)*50000
     p1  p2
     77 117
    # with a rate ratio of:
    0.00234/0.00154
    [1] 1.519481

A difference of 40 in expected counts would seem to give fairly
substantial power. It seems that a Poisson structured test might give
you smaller numbers, but probably not as small as 20,000:

    c(p1 = 0.00154, p2 = 0.00234)*20000
      p1   p2
    30.8 46.8

(The sd() of a Poisson variable is sqrt(mean()), so the standard error
of the difference is about sqrt(30.8 + 46.8) = 8.8, and an expected
difference of 16 is less than two standard errors.)

If you look up Table 7.5 in Breslow and Day (vol. 2, page 283) with a
relative risk of 1.5, the necessary expected value in the control
group using an equal-sized control group (for 80% power at 5%
significance) is 64.9. So that is a bit lower than the 77 above, and
implies that about 42,143 per arm (64.9/0.00154) would be needed.
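
The arithmetic above, plus one way of turning the Poisson idea into a per-arm sample size, can be sketched as follows. The normal approximation to the difference of two Poisson counts used for `n_poisson` is an assumption made here for illustration only. It is not the method behind the Breslow and Day table, and the variable names are hypothetical.

```r
p1 <- 0.00154
p2 <- 0.00234

# Expected event counts per arm at two candidate sample sizes
counts_50k <- c(p1 = p1, p2 = p2) * 50000
counts_20k <- c(p1 = p1, p2 = p2) * 20000

# Rate ratio of the two failure proportions
rate_ratio <- p2 / p1

# Treat each arm's count as Poisson with mean n * p. The difference of the
# counts then has mean n * (p2 - p1) and variance n * (p1 + p2), so a
# two-sided 5% test with 80% power needs approximately:
n_poisson <- (qnorm(0.975) + qnorm(0.80))^2 * (p1 + p2) / (p2 - p1)^2

# Per-arm size implied by an expected control count of 64.9
n_table <- 64.9 / p1

round(c(rate_ratio = rate_ratio, n_poisson = n_poisson, n_table = n_table), 2)
```

Under this approximation the Poisson answer lands within about a hundred subjects of the binomial one, which is consistent with not expecting remarkable differences when the proportions are this small.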

--
David.


Discussion Overview
group: r-help, categories: r, posted: Nov 8, '10 at 3:52p, active: Nov 8, '10 at 7:27p, posts: 6, users: 4, website: r-project.org, irc: #r
