Jihong Zhang, Ph.D. – Lecture 10: Two-way ANOVA II

F-statistics for Two-Way ANOVA

F-statistics for Two-Way ANOVA: R code

Interpretation

Source	df	F
Main Effect of Tutoring	2	89.065***
Main Effect of School	2	0.453
Interaction Effect of School x Tutoring	4	1.953
Residual	261	—

F-statistics for Two-Way ANOVA: Degrees of Freedom

IVs:
1. Tutoring Programs: (1) No tutor; (2) Once a week; (3) Daily
2. Types of schools: (1) Public (2) Private-secular (3) Private-religious
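Degrees of freedom (DF):
- $df_{\text{Tutoring}}$ = (Number of levels of Tutoring) - 1 = 3 - 1 = 2
- $df_{\text{School}}$ = (Number of levels of School) - 1 = 3 - 1 = 2
- $df_{\text{Interaction}} = df_{\text{Tutoring}} \times df_{\text{School}} = 2 \times 2 = 4$
- $df_{\text{Total}} = N - 1 = 270 - 1 = 269$
- $df_{\text{Residual}} = df_{\text{Total}} - df_{\text{Tutoring}} - df_{\text{School}} - df_{\text{Interaction}} = 269 - 2 - 2 - 4 = 261$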

Calculation of p-values

Based on the DFs and the observed F-statistic, we can locate the observed F-statistic on the F-distribution and calculate the p-value.
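As a minimal sketch of this step, the p-values can be obtained in base R with pf(). The F values and DFs are the ones reported in the full ANOVA table of this lecture; the object names (f_obs, df_num, df_res) are only illustrative.

```r
# Observed F values and numerator DFs from the two-way ANOVA table
f_obs  <- c(Tutoring = 89.065, School = 0.453, Interaction = 1.953)
df_num <- c(Tutoring = 2, School = 2, Interaction = 4)
df_res <- 261  # residual (denominator) DF

# p-value = area to the right of the observed F under F(df_num, df_res)
p_values <- pf(f_obs, df1 = df_num, df2 = df_res, lower.tail = FALSE)
round(p_values, 3)
```

The upper tail (lower.tail = FALSE) is used because only large F values count as evidence against the null.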

F-statistics for Two-Way ANOVA: SS and DF

Why is the residual DF calculated by subtracting the effects’ DFs from the total DF?

Because DF is linked to SS.

Source	SS	df
Main Effect of Tutoring	4240	2
Main Effect of School	22	2
Interaction Effect of School x Tutoring	186	4
Residual	6213	261
Total	4240 + 22 + 186 + 6213	2 + 2 + 4 + 261

sum((data$Grades - mean(data$Grades))^2) # manually calculate the total sum of squares
nrow(data) - 1 # nrow() returns the number of rows (observations); total DF = N - 1

[1] 10660.77
[1] 269

4240 + 22 + 186 + 6213
2 + 2 + 4 + 261

[1] 10661
[1] 269

Difference between one-way and two-way ANOVA

One-way ANOVA

Total = Between + Within

Two-way ANOVA

Total = FactorA + FactorB + A*B + Residual

Within-group variability (one-way ANOVA) and residuals (two-way ANOVA) have the same meaning: both are the part of the DV's variability that the model leaves unexplained.

In a factorial ANOVA design, the overall between-group variability is divided up into variance terms for each unique source of the factors.

Source	2 IVs	3 IVs	4 IVs
Main effect	A, B	A, B, C	A, B, C, D
Interaction effect	AB	AB, BC, AC, ABC	AB, AC, AD, BC, BD, CD, ABC, ABD, BCD, ACD, ABCD

Calculate the number of interaction effects

## Number of interaction effects for 4 IVs
choose(4, 2) + choose(4, 3) + choose(4, 4)

[1] 11
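The same count can be generalized to any number of IVs. The sketch below assumes a full factorial model, where every subset of two or more factors contributes one interaction term; n_interactions() is a hypothetical helper name, not part of the lecture code.

```r
# Number of interaction terms for k IVs: every subset of 2 or more factors
n_interactions <- function(k) sum(choose(k, 2:k))

counts <- sapply(2:4, n_interactions)  # 2, 3 and 4 IVs
counts

# Closed form: all 2^k subsets, minus k main effects, minus the empty set
closed_form <- 2^(2:4) - (2:4) - 1
closed_form
```

Both lines should agree with the table above (1, 4, and 11 interaction effects).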

F-statistics for Two-way ANOVA: F values

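F-observed values:
- Main effect of factor A: $F_A = MS_A / MS_{\text{Residual}} = (SS_A / df_A) / (SS_{\text{Residual}} / df_{\text{Residual}})$
- Main effect of factor B: $F_B = MS_B / MS_{\text{Residual}} = (SS_B / df_B) / (SS_{\text{Residual}} / df_{\text{Residual}})$
- Interaction effect of AxB: $F_{AB} = MS_{AB} / MS_{\text{Residual}} = (SS_{AB} / df_{AB}) / (SS_{\text{Residual}} / df_{\text{Residual}})$
Degrees of freedom:
- $df_A$ = (Number of levels of Tutoring) - 1
- $df_B$ = (Number of levels of School) - 1
- $df_{AB} = df_A \times df_B$
- $df_{\text{Total}}$ = N - 1
- $df_{\text{Residual}} = df_{\text{Total}} - df_A - df_B - df_{AB}$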
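To make the ratio concrete, here is a small sketch that recomputes the F values from the SS and DF table of the tutoring example. Because the SS in that table are rounded, the results may differ slightly from the F values reported by R (most visibly for School). The object names are illustrative.

```r
# SS and DF from the tutoring-by-school ANOVA table in this lecture
ss <- c(Tutoring = 4240, School = 22, Interaction = 186)
df <- c(Tutoring = 2, School = 2, Interaction = 4)
ss_res <- 6213
df_res <- 261

ms     <- ss / df          # mean square of each effect
ms_res <- ss_res / df_res  # mean square of the residual
f_obs  <- ms / ms_res      # F = MS_effect / MS_residual
round(f_obs, 2)
```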

Exercise

A new sample with N = 300
- Tutoring still has three levels,
- School type has only two levels: Public vs. Private.

Based on the SS provided below, calculate DF and F values for three effects.


F-statistics for Two-way ANOVA: Sum of squares

Note

In the old-fashioned way, you can calculate the sums of squares from the marginal means and the sample size of each cell.

Interpretation

Source	Df	Sum Sq	Mean Sq	F value	P
Tutoring	2	4240	2120.1	89.065	<2e-16 ***
School	2	22	10.8	0.453	0.636
Tutoring:School	4	186	46.5	1.953	0.102
Residuals	261	6213	23.8

Main Effect A: Ignoring type of school, are there differences in grades across type of tutoring?
- Under alpha=.05 level, because F-observed (F_observed=89.065) exceeds the critical value (p < .001), we reject the null that all means are equal across tutoring type (ignoring the effect of school type).
- There is a significant main effect of tutoring type on grade.
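The comparison with the critical value can be sketched in base R with qf(). The DFs (2 and 261) and the observed F come from the ANOVA table above; alpha = .05 is the level used in this lecture.

```r
alpha  <- 0.05
f_obs  <- 89.065                               # observed F for Tutoring
f_crit <- qf(1 - alpha, df1 = 2, df2 = 261)    # critical value of F(2, 261)
f_crit

reject <- f_obs > f_crit                       # TRUE means we reject the null
reject

p_val <- pf(f_obs, df1 = 2, df2 = 261, lower.tail = FALSE)  # matching p-value
p_val
```

Comparing F to the critical value and comparing the p-value to alpha always lead to the same decision.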

Assumptions for conducting 2-way ANOVA

In order to compare our sample to the null distribution, we need to make sure we meet some assumptions for each CELL:
1. Variance of the DV in each cell is about equal. → Homogeneity of variance
2. DV is normally distributed within each cell. → Normality
3. Observations are independent. → Independence
Robustness to assumption violations:
1. Violations of the independence assumption: bad news! → Not robust to this!
2. Having a large N and equal cell sizes protects you against violations of the normality assumption.
   → Rough suggestion: have at least 15 participants per cell.
   → If you don’t have a large N or equal groups, check cell normality in 2 ways: (1) skew/kurtosis values, (2) histograms.
3. Use Levene’s test to check the homogeneity of cell variance assumption.
   → If you can’t assume equal variances, use Welch or Brown-Forsythe.
   → However, F is somewhat robust to violations of HOV as long as within-cell standard deviations are approximately equal.
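A rough sketch of the homogeneity check in base R follows. It uses the small tattoo/piercing data set from the worked example in this lecture, re-entered here so the block runs on its own. Levene's test is computed by hand as a one-way ANOVA on the absolute deviations from the cell means; this is the mean-centered version, and other software may default to median centering, so results can differ. The column names cell and abs_dev are illustrative.

```r
# Tattoo/piercing data, re-entered so the block runs on its own
dat <- data.frame(
  tatto_piercing = rep(c(TRUE, FALSE), each = 8),
  gender = rep(c("Male", "Female", "Male", "Female"), each = 4),
  outcome = c(5.4, 6.7, 1.8, 6.1, 5.9, 4.6, 2.7, 3.8,
              2.6, 5.8, 1.5, 2.1, .6, .7, .7, 1.8)
)

# One label per cell of the 2 x 2 design
dat$cell <- interaction(dat$tatto_piercing, dat$gender)

# Cell sizes and within-cell standard deviations
cell_n  <- tapply(dat$outcome, dat$cell, length)
cell_sd <- tapply(dat$outcome, dat$cell, sd)
cell_n
cell_sd

# Levene's test (mean-centered): ANOVA on absolute deviations from cell means
dat$abs_dev <- abs(dat$outcome - ave(dat$outcome, dat$cell))
levene <- anova(lm(abs_dev ~ cell, data = dat))
levene
```

A non-significant F in the levene table would be consistent with equal cell variances, although with only 4 observations per cell the test has very little power.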

Two-way ANOVA: Calculation based on Grand & Marginal & Cell Means

Background

This research study is an adaptation of Gueguen (2012) described in Andy Field’s text. Specifically, the researchers hypothesized that people with tattoos and piercings were more likely to engage in riskier behavior than those without tattoos and piercings. In addition, the researcher wondered whether this difference varied, depending on whether a person was male or female.

Question: How many IVs are there? How many levels does each have?

Answer: 2 IVs, each with 2 levels: (1) whether or not the person has tattoos and piercings; (2) male or female. DV: frequency of risky behaviors.

Data input in R

You can either import a CSV file or enter the data points manually (for small samples).

tatto_piercing = rep(c(TRUE, FALSE), each = 8)
gender = rep(c("Male", "Female", "Male", "Female"), each = 4)
outcome = c(5.4, 6.7, 1.8, 6.1, 5.9, 4.6, 2.7, 3.8,
            2.6, 5.8, 1.5, 2.1, .6, .7, .7, 1.8)

dat <- data.frame(
  tatto_piercing = tatto_piercing,
  gender = gender,
  outcome = outcome
)
dat

   tatto_piercing gender outcome
1            TRUE   Male     5.4
2            TRUE   Male     6.7
3            TRUE   Male     1.8
4            TRUE   Male     6.1
5            TRUE Female     5.9
6            TRUE Female     4.6
7            TRUE Female     2.7
8            TRUE Female     3.8
9           FALSE   Male     2.6
10          FALSE   Male     5.8
11          FALSE   Male     1.5
12          FALSE   Male     2.1
13          FALSE Female     0.6
14          FALSE Female     0.7
15          FALSE Female     0.7
16          FALSE Female     1.8

Your turn

Import the following data set into R manually.

Tutor	AM	PM
Bobby	19	16
Julia	15	10
Monique	14	18
Ned	13	11

Hint:

rep(c("A", "B"), 2)
rep(c("A", "B"), each = 2)

[1] "A" "B" "A" "B"
[1] "A" "A" "B" "B"

Answer

Tutor = rep(c("Bobby", "Julia", "Monique", "Ned"), 2)
Time = rep(c("AM", "PM"), each = 4)
Grade = c(19, 15, 14, 13, 16, 10, 18, 11)

dat_ex11 <- data.frame(
  Tutor = Tutor,
  Time = Time, 
  Grade = Grade
)
dat_ex11

    Tutor Time Grade
1   Bobby   AM    19
2   Julia   AM    15
3 Monique   AM    14
4     Ned   AM    13
5   Bobby   PM    16
6   Julia   PM    10
7 Monique   PM    18
8     Ned   PM    11

Step 2: Sum of Squares of main effects

$SS_A = \sum_{j} N_{A_j} (M_{A_j} - M_T)^2$
$M_{A_j}$: marginal mean of main effect A at each level
$N_{A_j}$: sample size for factor A at each level
$M_T$: grand mean of outcome

library(dplyr) # provides group_by(), summarise(), and n()

M_A_table <-  dat |> 
  group_by(tatto_piercing) |>  
  summarise(
    N = n(),
    Mean = mean(outcome)
  ) 
M_T = mean(dat$outcome) # grand mean
M_A_table

# A tibble: 2 × 3
  tatto_piercing     N  Mean
  <lgl>          <int> <dbl>
1 FALSE              8  1.98
2 TRUE               8  4.62

M_A_table$N[1] * (M_A_table$Mean[1] - M_T)^2 + 
  M_A_table$N[2] * (M_A_table$Mean[2] - M_T)^2

[1] 28.09

or

sum(M_A_table$N * (M_A_table$Mean - M_T)^2)

[1] 28.09

Your turn: Sum of squares of Gender

The dat object has already been loaded into R. Please calculate the sum of squares for gender:
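One possible answer, sketched in base R so that it does not depend on any package. It follows the same marginal-mean logic as the tattoo/piercing main effect; the data are re-entered so the block runs on its own, and N_B, M_B, and SS_gender are illustrative names.

```r
# Tattoo/piercing data, re-entered so the block runs on its own
dat <- data.frame(
  tatto_piercing = rep(c(TRUE, FALSE), each = 8),
  gender = rep(c("Male", "Female", "Male", "Female"), each = 4),
  outcome = c(5.4, 6.7, 1.8, 6.1, 5.9, 4.6, 2.7, 3.8,
              2.6, 5.8, 1.5, 2.1, .6, .7, .7, 1.8)
)

M_T <- mean(dat$outcome)                         # grand mean
N_B <- tapply(dat$outcome, dat$gender, length)   # sample size per gender
M_B <- tapply(dat$outcome, dat$gender, mean)     # marginal mean per gender

# Weighted squared distance of each marginal mean from the grand mean
SS_gender <- sum(N_B * (M_B - M_T)^2)
SS_gender
```

The result should match the Sum Sq reported for gender in the ANOVA table of this example (7.84).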

Step 3: Sum of Squares of Interaction Effect

cell_means <- dat |> 
  group_by(tatto_piercing, gender) |> 
  summarise(
    N = n(),
    Mean = mean(outcome)
  )
cell_means

# A tibble: 4 × 4
# Groups:   tatto_piercing [2]
  tatto_piercing gender     N  Mean
  <lgl>          <chr>  <int> <dbl>
1 FALSE          Female     4  0.95
2 FALSE          Male       4  3   
3 TRUE           Female     4  4.25
4 TRUE           Male       4  5
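From the cell means, the interaction sum of squares can be sketched as the between-cell sum of squares minus the two main-effect sums of squares. This shortcut assumes a balanced design (equal cell sizes), as in this example. The helper ss_between() is a hypothetical name, and the data are re-entered so the block runs on its own.

```r
# Tattoo/piercing data, re-entered so the block runs on its own
dat <- data.frame(
  tatto_piercing = rep(c(TRUE, FALSE), each = 8),
  gender = rep(c("Male", "Female", "Male", "Female"), each = 4),
  outcome = c(5.4, 6.7, 1.8, 6.1, 5.9, 4.6, 2.7, 3.8,
              2.6, 5.8, 1.5, 2.1, .6, .7, .7, 1.8)
)
M_T <- mean(dat$outcome)  # grand mean

# Sum of N * (group mean - grand mean)^2 over the groups defined by `g`
ss_between <- function(g) {
  n <- tapply(dat$outcome, g, length)
  m <- tapply(dat$outcome, g, mean)
  sum(n * (m - M_T)^2)
}

SS_A     <- ss_between(dat$tatto_piercing)                           # main effect A
SS_B     <- ss_between(dat$gender)                                   # main effect B
SS_cells <- ss_between(interaction(dat$tatto_piercing, dat$gender))  # all 4 cells

# What the cells explain beyond the two main effects
SS_AB <- SS_cells - SS_A - SS_B
SS_AB
```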

Your turn


Step 4: Degrees of freedom and F-statistics

Your turn

Interpretation

#                       Df Sum Sq Mean Sq F value Pr(>F)   
# gender                 1   7.84   7.840   2.942  0.112   
# tatto_piercing         1  28.09  28.090  10.540  0.007 **
# gender:tatto_piercing  1   1.69   1.690   0.634  0.441   
# Residuals             12  31.98   2.665
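A table like the one above can be reproduced with aov(), and the F values can be checked by hand from the sums of squares and degrees of freedom. This is a sketch on the re-entered example data; the object names fit, tab, and F_by_hand are illustrative.

```r
# Tattoo/piercing data, re-entered so the block runs on its own
dat <- data.frame(
  tatto_piercing = rep(c(TRUE, FALSE), each = 8),
  gender = rep(c("Male", "Female", "Male", "Female"), each = 4),
  outcome = c(5.4, 6.7, 1.8, 6.1, 5.9, 4.6, 2.7, 3.8,
              2.6, 5.8, 1.5, 2.1, .6, .7, .7, 1.8)
)

# Two-way ANOVA with both main effects and their interaction
fit <- aov(outcome ~ gender * tatto_piercing, data = dat)
tab <- summary(fit)[[1]]   # rows: gender, tatto_piercing, interaction, residuals
tab

# Check by hand: df_residual = (N - 1) - df_A - df_B - df_AB
df_res <- (nrow(dat) - 1) - 1 - 1 - 1
MS_res <- tab[["Sum Sq"]][4] / df_res

# F = MS_effect / MS_residual for the three effects
F_by_hand <- (tab[["Sum Sq"]][1:3] / tab[["Df"]][1:3]) / MS_res
round(F_by_hand, 3)
```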

Main Effect of Tattoo-Piercing: Reject the null → Ignoring gender, there is a significant main effect of tattoo on risky behavior.
Main Effect of Gender: Retain the null → Ignoring tattoo, there is NO significant main effect of gender on risky behavior.
Interaction: Retain the null → Under alpha=.05 level, because F-observed (F=0.63) does not exceed the critical value (F=4.75), we fail to reject the null that the effect of Factor A does not depend on Factor B. → “There is NO significant interaction between gender and tattoo on risky behavior.”
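The critical value quoted for the interaction can be checked with qf(), using the interaction DF (1) and residual DF (12) from the table above. A short sketch:

```r
alpha  <- 0.05
f_crit <- qf(1 - alpha, df1 = 1, df2 = 12)   # critical value of F(1, 12)
round(f_crit, 2)

# p-value for the observed interaction F
p_int <- pf(0.634, df1 = 1, df2 = 12, lower.tail = FALSE)
round(p_int, 3)
```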


Other extensions about 2-way ANOVA: I

Type I? Type II? Type III?
- Effect sums of squares (SSA, SSB, SSAB) are a decomposition of the total sum of squared deviations from the overall mean (SST).
- How the SST is decomposed depends on characteristics of the data as well as the hypotheses of interest to the researcher.
Type I sums of squares (SS) are based on a sequential decomposition.
- For example, if your ANOVA model statement is “MODEL Y = A B A*B”, then the sums of squares are considered in the effect order A, B, A*B, with each effect adjusted for all preceding effects in the model.
- Thus, any variance that is shared between the various effects will be subsumed by the variable entered earlier.

Pros:

- Nice property: balanced or not, the SS for all the effects add up to the model SS, giving a complete decomposition of the predicted sums of squares for the whole model. This is not generally true for any other type of sums of squares.
- Preferable when some factors (such as nesting) should be taken out before other factors. For example, with unequal numbers of males and females, the factor “gender” should precede “subject” in an unbalanced design.

Cons:

- Order matters!
- Not appropriate for factorial designs, but might be OK for blocking designs.
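The order dependence can be seen directly in R, because anova() on an lm fit reports sequential (Type I) sums of squares. The sketch below is only an illustration: it makes the tattoo/piercing data unbalanced by dropping two rows, which is an assumption made for this demonstration and not part of the original study. The helper ss_gender() is a hypothetical name.

```r
# Tattoo/piercing data, re-entered so the block runs on its own
dat <- data.frame(
  tatto_piercing = rep(c(TRUE, FALSE), each = 8),
  gender = rep(c("Male", "Female", "Male", "Female"), each = 4),
  outcome = c(5.4, 6.7, 1.8, 6.1, 5.9, 4.6, 2.7, 3.8,
              2.6, 5.8, 1.5, 2.1, .6, .7, .7, 1.8)
)
unbal <- dat[-(1:2), ]  # drop two rows: unequal cell sizes (illustration only)

# Sequential (Type I) SS for gender under a given model formula
ss_gender <- function(formula, data) {
  anova(lm(formula, data = data))["gender", "Sum Sq"]
}

# Balanced data: the order of the factors does not matter
bal_first  <- ss_gender(outcome ~ gender * tatto_piercing, dat)
bal_second <- ss_gender(outcome ~ tatto_piercing * gender, dat)
c(bal_first, bal_second)

# Unbalanced data: the SS for gender changes with its position
unbal_first  <- ss_gender(outcome ~ gender * tatto_piercing, unbal)
unbal_second <- ss_gender(outcome ~ tatto_piercing * gender, unbal)
c(unbal_first, unbal_second)
```

In the unbalanced case, gender entered first absorbs the variance it shares with tattoo/piercing, while gender entered second is adjusted for it.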

Other extensions about 2-way ANOVA: II

With Type II SS, each main effect is considered as though it were added after the other main effects but before the interaction.
- Any interaction effects are calculated based on a model already containing the main effects.
- Any variance that is shared between A and B is not considered part of A or B.
- Thus, interaction variance that is shared with A or with B will be counted as part of the main effect, and not as part of the interaction effect.

Pros:

- Appropriate for model building, and the natural choice for regression
- Most powerful when there is no interaction
- Invariant to the order in which effects are entered into the model

Cons:

- For factorial designs with unequal cell sizes, Type II sums of squares test hypotheses that are complex functions of the cell sizes and that ordinarily are not meaningful.
- Not appropriate for factorial designs.

Other extensions about 2-way ANOVA: III

Type III SS considers all effects as though they are added last.
- Any shared variance ends up not being counted in any of the effects.
- In ANOVA, when the data are balanced (equal cell sizes), the factors are orthogonal and all three types of sums of squares are identical.
- Orthogonal, or independent, indicates that there is no variance shared across the various effects, and the separate sums of squares can be added to obtain the model sums of squares.

Pros:

Not sample size dependent: effect estimates are not a function of the frequency of observations in any group (i.e., for unbalanced data, where we have unequal numbers of observations in each group).

Cons:

Not appropriate for designs with missing cells: for ANOVA designs with missing cells, Type III sums of squares generally do not test hypotheses about least squares means, but instead test hypotheses that are complex functions of the patterns of missing cells in higher-order containing interactions and that are ordinarily not meaningful.

In general, for factorial designs, we usually report Type III SS unless there are missing cells in the design.
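One way to obtain Type III tests without extra packages is to fit the model with sum-to-zero contrasts and then drop each term in turn with drop1(). This is a sketch under those assumptions, not the only approach; car::Anova() with type = 3 is a common alternative if the car package is installed. The function type3() is a hypothetical helper, and the unbalanced data are created only for illustration.

```r
# Tattoo/piercing data, re-entered so the block runs on its own
dat <- data.frame(
  tatto_piercing = rep(c(TRUE, FALSE), each = 8),
  gender = rep(c("Male", "Female", "Male", "Female"), each = 4),
  outcome = c(5.4, 6.7, 1.8, 6.1, 5.9, 4.6, 2.7, 3.8,
              2.6, 5.8, 1.5, 2.1, .6, .7, .7, 1.8)
)

# Type III table: every effect is tested as if it were entered last
type3 <- function(d) {
  d$gender <- factor(d$gender)
  d$tatto_piercing <- factor(d$tatto_piercing)
  fit <- lm(outcome ~ gender * tatto_piercing, data = d,
            contrasts = list(gender = "contr.sum",
                             tatto_piercing = "contr.sum"))
  # scope = . ~ . asks drop1() to drop main effects as well as the interaction
  drop1(fit, scope = . ~ ., test = "F")
}

t3_balanced   <- type3(dat)           # balanced: same SS as the Type I table
t3_unbalanced <- type3(dat[-(1:2), ]) # unbalanced (illustration only)
t3_balanced
t3_unbalanced
```

The sum-to-zero contrasts matter here: with R's default treatment contrasts, dropping a main effect while keeping the interaction would test a different hypothesis.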

Final discussion

In this example,

The interaction effect shows several issues:
- Violation of the independence assumption
- Violation of the normality assumption in the “Male” and “Tattoo=Yes” condition
These issues might be caused by several things (e.g., too small sample sizes, non-random assignment of the samples, a non-significant interaction effect, etc.).
- From the two-way ANOVA results, we found that the interaction effect is not significant.
Thus, in this case, it is more reasonable to conduct
1. a one-way ANOVA for the “Group” variable, and
2. a one-way ANOVA for the “Gender” variable, separately.
- For each of the one-way ANOVAs, we should check the assumptions, run the ANOVA, and conduct post-hoc tests separately as well.
