April \(30^{th}\) 2020

- It says *when* you should do your stats
- It gives an idea of the actual resources we should put into a project
**It tells us when the numerosity of our sample is actually representative of the statistical population**

In a census, the sample size is the whole population!

Generally speaking, the larger the sample size, the more accurate your estimates and statistics are.

However, we have a lot of limitations: availability of participants, time constraints, money constraints, etc.

Therefore the *optimum* sample size is the minimum number of participants needed to have statistically representative data.

In frequentist statistics (NHST), some fundamental concepts are:

- Type I error (\(\alpha\)) – typically set at 5% (0.05) (false positives)
- Type II error (\(\beta\)) (false negatives)

The power of a statistical test is given by \(1-\beta\).

**It is the probability of rejecting the null hypothesis when it is false.**

Generally, accepted values of power are: 80%, 90%, 95% and 99%, which means \(\beta = 20\%, 10\%, 5\%, 1\%\).

**The sample size is the number of observations required to obtain a statistically significant result in power% of the cases, given a specific effect size.**

Effect sizes were first introduced by Cohen (1988) and demonstrated their importance in a short time.

The effect size of a test is a (usually) standardized measure of the magnitude of an effect (difference between groups, covariation, etc.) that should always be presented together with a (significant) p-value.

If the p-value says that there is an effect, the effect size tells us whether the effect is worth mentioning.

The effect size is an indication of how much the control observations are lower (or higher) than the experimental observations.

| Nominal size | Effect size | % control < experimental observations |
|---|---|---|
| | 0.0 | 50% |
| Small | 0.2 | 58% |
| Medium | 0.5 | 69% |
| Large | 0.8 | 79% |
| | 1.4 | 92% |
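These percentages correspond to Cohen's U3 and, assuming normally distributed observations, can be recovered from the standard normal CDF; a quick check in R:

```r
# % of control observations below the average experimental observation,
# for effect sizes d = 0, 0.2, 0.5, 0.8, 1.4 (assuming normality)
round(100 * pnorm(c(0, 0.2, 0.5, 0.8, 1.4)))   # 50 58 69 79 92
```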

https://rpsychologist.com/d3/cohend/ \(\leftarrow\) interesting visualization

The sample size is computed as a function of the (desired) power (\(1-\beta\)) and effect size (\(ES\)).

It is the *optimum* sample size needed to have the desired power, specifying a certain effect size.

Taking as an example the case of a one-sample, one-tailed t-test with Cohen's \(d\) as effect size, we solve the following system for \(n\):

\[ \begin{cases} Pr_t(q,\nu,c) = 1-\beta\\ \nu = (n-1)\\q = Q_t(\alpha , \nu)\\c = \sqrt{n}\times d \end{cases} \]

\(\nu = n-1\) are the degrees of freedom of the Student's t distribution

\(q = Q_t(\alpha,\nu)\) is the critical quantile of the t distribution at significance level \(\alpha\) (usually 0.05) with those degrees of freedom (for a one-tailed test, the \(1-\alpha\) quantile)

\(c = \sqrt{n}\times d\) is the non-centrality parameter of the t distribution: the larger the sample, the more shifted the distribution

\(Pr_t(q, \nu, c)\) is the area under the non-central t distribution beyond quantile \(q\), i.e. the power
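As a minimal sketch in base R (the helper name is ours, not a standard function), the system can be solved numerically by increasing \(n\) until the target power is reached:

```r
# Sketch: smallest integer n whose one-sample, one-tailed t-test power
# reaches the target, given Cohen's d
sample_size_t <- function(d, power = 0.8, alpha = 0.05) {
  n <- 2
  repeat {
    nu <- n - 1                    # degrees of freedom
    q  <- qt(1 - alpha, df = nu)   # one-tailed critical quantile
    cc <- sqrt(n) * d              # non-centrality parameter
    if (pt(q, df = nu, ncp = cc, lower.tail = FALSE) >= power) return(n)
    n <- n + 1
  }
}
sample_size_t(d = 0.3)   # 71, the first integer n reaching 80% power
```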

*EXAMPLE*

\(d = 0.3\); \(1-\beta = 0.8\); \(\alpha = 0.05\)

\(n = 70\)

\(\nu = 69\)

\(q = Q_t(\alpha,\nu) = Q_t(0.05,69) \simeq 1.67\)

\(c = \sqrt{n}\times d = \sqrt{69}\times 0.3 \simeq 8.31\times0.3 \simeq 2.49\)

\(Pr_t(q, \nu, c) = Pr_t(1.67, 69, 2.49) = 0.7932814\)

- Values are mirrored.
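The printed value can be reproduced (up to the rounding of \(q\) and \(c\) used above) with R's non-central t distribution:

```r
# Upper-tail probability of the non-central t: the power of the test
pt(1.67, df = 69, ncp = 2.49, lower.tail = FALSE)   # ~ 0.79
```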

Recalling that: *the sample size is the number of observations required to obtain a statistically significant result in power% of the cases, given a specific effect size.*

Do you think it is OK if I reach a statistically significant result with fewer observations than those required by the sample size computation?

Why?

**No, it is not OK.**

The motivation is that the sample size is the minimum number of observations (subjects) required to have a representative sample of the statistical population.

Smaller samples are more prone to (less obvious) outliers; therefore, there is not only the risk of a greater Type II error, but also of a Type I error.

Here I present three main ways to compute the sample size:

- By means of the functions first proposed by Cohen (1988)
- By means of the functions proposed by Chow SC, Shao J, Wang H. (2008)
- By means of simulations
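The simulation route is not developed further below, but as a hedged sketch (our own helper, for a one-sample, one-tailed t-test): simulate many datasets with the assumed effect, and the proportion of significant results estimates the power.

```r
# Monte Carlo power estimate: draw nsim samples of size n with true effect d
# (in SD units), run a one-tailed t-test on each, count significant results
set.seed(1)
sim_power <- function(n, d, alpha = 0.05, nsim = 2000) {
  p <- replicate(nsim,
                 t.test(rnorm(n, mean = d), alternative = "greater")$p.value)
  mean(p < alpha)
}
sim_power(n = 70, d = 0.3)   # close to the analytical power of ~0.8
```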

When we compute the sample size, we need to think about our hypotheses and about the specific contrasts of interest.

We also need to think about the possible effect size.

How to determine it?

- pilot data
- data in literature
- standard tables

In order to use Cohen's formulas, there is the *pwr* package in R.

In this package there are several functions that allow us to compute the sample size in different cases.

| function | power calculations for | ES standard values |
|---|---|---|
| `pwr.2p.test` | two proportions (equal n) | h - S: 0.2; M: 0.5; L: 0.8 |
| `pwr.2p2n.test` | two proportions (unequal n) | h - S: 0.2; M: 0.5; L: 0.8 |
| `pwr.anova.test` | balanced one-way ANOVA | f - S: 0.1; M: 0.25; L: 0.4 |
| `pwr.chisq.test` | chi-square test | w - S: 0.1; M: 0.3; L: 0.5 |
| `pwr.f2.test` | general linear model | f2 - S: 0.02; M: 0.15; L: 0.35 |

| function | power calculations for | ES standard values |
|---|---|---|
| `pwr.p.test` | proportion (one sample) | h - S: 0.2; M: 0.5; L: 0.8 |
| `pwr.r.test` | correlation | r - S: 0.1; M: 0.3; L: 0.5 |
| `pwr.t.test` | t-tests (one sample, two samples, paired) | d - S: 0.2; M: 0.5; L: 0.8 |
| `pwr.t2n.test` | t-test (two samples with unequal n) | d - S: 0.2; M: 0.5; L: 0.8 |

With the function *cohen.ES* you can obtain all the standard effect sizes.
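For example (assuming the *pwr* package is installed):

```r
library(pwr)
# Conventional medium effect size for a one-way ANOVA (f = 0.25)
cohen.ES(test = "anov", size = "medium")
```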

More or less, all the `pwr.?.test` functions have the same parameters to be specified:

- power: a value between 0 and 1, stating the desired power
- sig.level: the \(\alpha\). If you do not specify it, it defaults to 0.05
- alternative: "two.sided", "greater", "less"
- (different for each function): the effect size
- (only in some functions) type: "two.sample" (comparing two independent groups), "one.sample" (one group against \(\mu\)), "paired" (comparing the same group at two different times)

Let's compute the sample size seen previously:

```r
library(pwr)
pwr.t.test(d = 0.3, power = 0.8, sig.level = 0.05,
           type = "one.sample", alternative = "greater")
```

```
## 
##      One-sample t test power calculation 
## 
##               n = 70.06791
##               d = 0.3
##       sig.level = 0.05
##           power = 0.8
##     alternative = greater
```

The parameters are:

- d: the effect size
- sig.level
- power
- type: "two.sample", "one.sample", "paired"
- alternative: "two.sided", "less", "greater"

*EXAMPLE*

We want to train black and white cats in jumping at a command.

If our hypothesis is that black cats will jump more, how many cats do we have to train?

Let's use a power of 80%, and a medium effect size.

*EXAMPLE*
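The output below presumably comes from a call like this (parameters read off the printout: two-sample, one-tailed, medium \(d\)):

```r
library(pwr)
# Two independent groups (black vs white cats), medium effect size,
# one-tailed hypothesis ("black cats will jump more")
pwr.t.test(d = 0.5, power = 0.8, sig.level = 0.05,
           type = "two.sample", alternative = "greater")
```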

```
## 
##      Two-sample t test power calculation 
## 
##               n = 50.1508
##               d = 0.5
##       sig.level = 0.05
##           power = 0.8
##     alternative = greater
## 
## NOTE: n is number in *each* group
```

The function is `pwr.anova.test`, and it works only with balanced one-way designs.

That means that if we have a multifactorial design, we have to force it as a one-way design.

The null hypothesis is that the means are equal in all levels of the factor; the alternative hypothesis is that at least two levels are statistically different from each other.

*EXAMPLE*

We have an "Empathy for pain" experiment, in which we collect physiological data during the viewing of 3 types of videos:

- A control video
- A video of a syringe penetrating a hand
- A video of a Q-tip touching a hand

In two perspectives:

- 1PP, 3PP

The colour of the Q-tip or syringe can change:

- Blue, green, pink

Therefore, the design is a \(3\times2\times3\). If we translate it into a one-way design, the total number of "groups" (\(k\)) is \(18\).

However, our hypothesis is that there is a difference between the syringe videos in 1PP and 3PP, therefore involving two factors: \(3\times2\). The total number of groups to take into account is \(6\).

*EXAMPLE*

Let's use a medium effect size \(f = 0.25\)

```r
pwr.anova.test(k = 6, f = 0.25, power = 0.8, sig.level = 0.05)
```

```
## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 6
##               n = 35.14095
##               f = 0.25
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group
```

The function is `pwr.f2.test`.

Formally, it is for multiple regression, but it does the same things seen in `pwr.anova.test` with much more flexibility.

You do not need to force your design into a "one-way" study, and you can take into account covariates.

The parameters that this function takes are:

- u: degrees of freedom for numerator
- v: degrees of freedom for denominator
- f2: effect size
- sig.level
- power

You need to give the function the number of degrees of freedom for the numerator. The function will then return the number of degrees of freedom of the denominator. From this value we can estimate the sample size.

\(u\) is equal to the number of levels of the factor/interaction minus 1. \(v\) is the degrees of freedom of the residuals.

The sample size is \(u + v + 1\).

*EXAMPLE*

To the previous study, we add an evaluation of the embodiment of the hand seen in the videos, on a Likert scale.

The d.o.f. at the numerator are now: \(2\times3\times1 - 1\)

Let's use a medium effect size \(f2 = 0.15\)

```r
pwr.f2.test(u = 5, f2 = 0.15, power = 0.8, sig.level = 0.05)
```

```
## 
##      Multiple regression power calculation 
## 
##               u = 5
##               v = 85.21369
##              f2 = 0.15
##       sig.level = 0.05
##           power = 0.8
```
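From this printout the sample size follows as \(u + v + 1\); a quick computation (assuming *pwr* is available):

```r
library(pwr)
res <- pwr.f2.test(u = 5, f2 = 0.15, power = 0.8, sig.level = 0.05)
# total sample size: u + v + 1, rounded up to the next whole subject
ceiling(res$u + res$v + 1)   # 92
```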

Compute the required sample size to understand if females are more creative than males, using a small effect size and a power of 80%.

Estimate the sample size necessary to understand if people feel more energetic after a hot shower, with a moderate effect size and a power of 90%.

Repeat, with a small effect size and a power of 80%.

- We have three groups of people: volleyball players, basketball players and normal people. Your hypothesis is that normal people are less reactive in a go/no-go experiment. Find the total number of participants required with a medium effect size and a power of 99%.

- In an experiment concerning memory we have three groups: the coffee group, the tea group and the water group, and two between-subjects conditions: sleeping deprivation and normal sleeping. Find the sample size per group, with a power of 80% and a large effect size.
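As a hedged sketch of how the first exercise could be set up (our reading: two independent groups, one-tailed alternative, small \(d\)):

```r
library(pwr)
# "Females are more creative than males": two-sample, one-tailed t-test,
# small effect size (d = 0.2), power = 80%; n is the size of each group
pwr.t.test(d = 0.2, power = 0.8, sig.level = 0.05,
           type = "two.sample", alternative = "greater")$n
```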

We can use the *TrialSize* package.

It has 80 different functions to compute the sample size.

Some of interest are:

`ANOVA.Repeat.Measure`

`CrossOver.ISV.Equality`

`CrossOver.ISV.NIS`

\(H_0: \mu_1 = \mu_2 = \dots = \mu_k\); \(H_1: \exists\, i \neq j : \mu_i \neq \mu_j\)

This function needs:

- alpha
- beta: be careful, the power is \(1-\beta\). If you want power = \(80\%\), beta = \(0.2\)
- sigma: the sum of the variance components
- delta: the difference that we consider as meaningful
- m: the total number of Bonferroni adjustments needed for post-hoc tests

**No space for standard ES!**

*Suggestion: in terms of Cohen's d, delta = ES \(\times \sqrt{sigma}\)*

*EXAMPLE*

We use again the same example seen before.

Our physiological data have been transformed into z-scores, therefore a meaningful difference may be 1.5

We set the sum of the variances as \(\sqrt{delta\div ES^2}\) = \(\sqrt{1.5\div0.5^2} \simeq 2.45\)

```r
s.size <- ANOVA.Repeat.Measure(alpha = 0.05, beta = 0.2, sigma = 2.45,
                               delta = 1.5, m = 6)
s.size
```

```
## [1] 64.6112
```

\(\sigma^2_T\) is the within-subject variance for treatment \(T\)

\(H_0: \sigma^2_{T_1} = \sigma^2_{T_2}\); \(H_1: \sigma^2_{T_1} \neq \sigma^2_{T_2}\)

This function needs:

- alpha
- beta: be careful, the power is \(1-\beta\). If you want power = \(80\%\), beta = \(0.2\)
- sigma1: within-subject variance of treatment 1
- sigma2: within-subject variance of treatment 2
- m: for each subject, there are m replicates

*Suggestion: think of the sigmas in terms of percentages or z-scores*

*EXAMPLE*

Cross-Over design with treatment and placebo condition, 5 trainings per week, done for two weeks.

Data are in z-scores

```r
CrossOver.ISV.Equality(alpha = 0.05, beta = 0.2, sigma1 = 1, sigma2 = 2, m = 10)
```

```
## [1]  2.0000000 0.2573179
## [1]  3.0000000 0.3880039
## [1]  4.0000000 0.4632526
## ...
## [1] 36.0000000 0.7997869
## [1] 37.0000000 0.8022938
## ...
```

The function prints (n, power) pairs for increasing n; the target power of 0.8 is first reached at \(n = 37\) (output truncated).