Recap the concept of statistical power
Introduce power analysis for study planning
Differentiate analytic from simulation-based power calculations
Introduce R tools for both analytic and simulation power calculations
In planning, power analysis often boils down to understanding what sample size is big enough for a study.
What is big enough?
Power analysis can help us work out how much data we should collect for our studies:
- so you don't waste your time
- so you don't waste other people's time
- so you can answer the question(s) you're interested in
Analytical solutions essentially take known numbers and solve for an unknown number.
If you have any three of sample size, effect size, alpha, and power, you can solve for the fourth.
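As a minimal sketch of this idea, base R's `stats::power.t.test()` solves for whichever of these quantities you leave unspecified (here assuming a raw mean difference of 0.5 with SD 1):

```r
# Supply any three of n, delta, sig.level, and power;
# power.t.test() solves for the one you leave out.
power.t.test(n = 20, delta = 0.5, sd = 1, sig.level = .05)      # solves for power
power.t.test(delta = 0.5, sd = 1, sig.level = .05, power = .80) # solves for n
```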
For complex designs, simulation can be used to estimate power.
If enough simulations are conducted, simulation-based power estimates will approximate those from analytical solutions (where the latter exist).
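As a sketch of the simulation approach (assuming normally distributed scores with SD 1 and a true mean difference of 0.5, i.e. d = 0.5, with 20 participants per group), we repeatedly generate data, run the test, and count how often p falls below alpha:

```r
set.seed(123)                 # for reproducibility
n_sims <- 10000               # number of simulated studies
p_values <- replicate(n_sims, {
  group1 <- rnorm(20, mean = 0,   sd = 1)  # comparison group
  group2 <- rnorm(20, mean = 0.5, sd = 1)  # true effect of d = 0.5
  t.test(group1, group2)$p.value
})
mean(p_values < .05)          # proportion of significant results = estimated power
```

With 10,000 simulations, this estimate should land close to the analytic answer for this design (about .34).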
Why is power analysis needed?
People commonly talk about underpowered studies.
What we conduct a power analysis on is a particular combination of design, statistical test, and effect size.
Often researchers are asked to calculate power after they have collected data.
This is generally not a meaningful thing to do.
Consider a definition of power (bold added):
"The power of a test to detect a correct alternative hypothesis is the pre-study probability that the test will reject the test hypothesis (e.g., the probability that P will not exceed a pre-specified cut-off such as 0.05)." (Greenland et al., 2016, p. 345)
Conventionally, alpha is fixed, most commonly at .05.
A common conventional value for power is .8.
However, both of these conventions are arbitrary cut-offs on continuous scales.
As such, researchers will sometimes calculate power curves, which show how power changes across a range of sample sizes or effect sizes.
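For example, a simple power curve over per-group sample sizes can be sketched with the pwr package (assuming d = 0.5 and alpha = .05):

```r
library(pwr)

ns <- seq(10, 200, by = 10)   # candidate per-group sample sizes
powers <- sapply(ns, function(n)
  pwr.t.test(n = n, d = 0.5, sig.level = .05)$power)

plot(ns, powers, type = "b", xlab = "n per group", ylab = "Power")
abline(h = .80, lty = 2)      # the conventional .80 benchmark
```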
It is important to justify your decisions, including the alpha, effect size, and power levels you choose (see Lakens et al., 2017).
The pwr package
Let's use an independent samples t-test.
In our imaginary study, we will compare the high striker scores of two population-representative groups, one of which has received training.
We hypothesise that the training group will have higher scores.
To begin, let's work with typical sample and effect sizes.
The median total sample size across four APA journals in 2006 was 40 (Marszalek, Barber, Kohlhart, & Holmes, 2011, Table 1).
A Cohen's d of 0.5 has conventionally been considered a medium effect size (Cohen, 1992, Table 1). We can plug these values in and work out what our statistical power would be at the conventional alpha level of .05.
We will use the pwr package.

```r
library(pwr)
pwr.t.test(n = 20, d = 0.5, sig.level = .05,
           type = "two.sample", alternative = "two.sided")
```

```
## 
##      Two-sample t test power calculation 
## 
##               n = 20
##               d = 0.5
##       sig.level = 0.05
##           power = 0.337939
##     alternative = two.sided
## 
## NOTE: n is number in *each* group
```
Our power is only .34!
So if there is a true standardised effect of .5 in the population, studies with 20 participants per group should only expect to reject the null hypothesis about 34% of the time.
Since we hypothesised a direction (the training group scores higher), we could use a one-sided test:

```r
pwr.t.test(n = 20, d = 0.5, sig.level = .05,
           type = "two.sample", alternative = "greater")
```

```
## 
##      Two-sample t test power calculation 
## 
##               n = 20
##               d = 0.5
##       sig.level = 0.05
##           power = 0.4633743
##     alternative = greater
## 
## NOTE: n is number in *each* group
```
We can instead specify the power we want and solve for the required sample size:

```r
pwr.t.test(power = 0.95, d = 0.5, sig.level = .05,
           type = "two.sample", alternative = "greater")
```

```
## 
##      Two-sample t test power calculation 
## 
##               n = 87.26261
##               d = 0.5
##       sig.level = 0.05
##           power = 0.95
##     alternative = greater
## 
## NOTE: n is number in *each* group
```

So we would need 88 participants per group (always round n up).
With a stricter alpha level of .005, the required sample size increases:

```r
pwr.t.test(power = 0.95, d = 0.5, sig.level = .005,
           type = "two.sample", alternative = "greater")
```

```
## 
##      Two-sample t test power calculation 
## 
##               n = 144.1827
##               d = 0.5
##       sig.level = 0.005
##           power = 0.95
##     alternative = greater
## 
## NOTE: n is number in *each* group
```
We can also plot the result, which shows the power curve for this design:

```r
res <- pwr.t.test(power = 0.95, d = 0.5, sig.level = .005,
                  type = "two.sample", alternative = "greater")
plot(res)
```
pwr for correlations
In pwr, for correlations we use the pwr.r.test() function:

```r
pwr.r.test(n = , r = , sig.level = , power = )
```

Here r is the effect size (the correlation coefficient), and the other three arguments are as we have defined previously. For example, to find the sample size needed to detect r = 0.15 with 90% power:

```r
pwr.r.test(r = 0.15, sig.level = .05, power = .90)
```

```
## 
##      approximate correlation power calculation (arctangh transformation) 
## 
##               n = 462.0711
##               r = 0.15
##       sig.level = 0.05
##           power = 0.9
##     alternative = two.sided
```
pwr for F-tests
For linear models, we use the pwr.f2.test() function:

```r
pwr.f2.test(u = ,         # numerator degrees of freedom (model)
            v = ,         # denominator degrees of freedom (residual)
            f2 = ,        # effect size (calculated below)
            sig.level = ,
            power = )
```

u and v come from the study design:
- u = number of predictors in the model (k)
- v = n − k − 1

There are two versions of the f2 effect size.
The first is:

f2 = R2 / (1 − R2)
This should be used when we want to see the overall power of a set of predictors
For example, if we wanted sample size for an overall R2 of 0.10, with 5 predictors, power of 0.8 and α = .05
```r
pwr.f2.test(u = 5,                    # numerator degrees of freedom (model)
            f2 = 0.10/(1 - 0.10),     # effect size
            sig.level = .05,
            power = .80)              # v is left out, so it is solved for
```

```
## 
##      Multiple regression power calculation 
## 
##               u = 5
##               v = 115.1043
##              f2 = 0.1111111
##       sig.level = 0.05
##           power = 0.8
```

Since v = n − k − 1, this implies n = 115.1 + 5 + 1, i.e. 122 participants.
The second version is:

f2 = (R2AB − R2A) / (1 − R2AB)

This is the power for the incremental F-test, i.e. the difference between a restricted model (R2A) and a full model (R2AB).
For example, if we wanted sample size for a difference between 0.10 (model with 2 predictors) and 0.15 (model with 5 predictors), power of 0.8 and α = .05
```r
pwr.f2.test(u = 3,                        # numerator df: the 3 added predictors
            f2 = (0.15 - 0.10)/(1 - 0.15), # effect size
            sig.level = .05,
            power = .80)                  # v is left out, so it is solved for
```

```
## 
##      Multiple regression power calculation 
## 
##               u = 3
##               v = 185.2968
##              f2 = 0.05882353
##       sig.level = 0.05
##           power = 0.8
```
WebPower
I want to do no more than tell you that WebPower exists: it is an R package that covers all of the simple analytic tests.
Power analysis is an important step in study design
For simple models, we can make use of the pwr.* functions for analytic solutions.
For complex models, we can estimate power by simulation.
Both can be done in R.