Lecture 09: Two-way ANOVA

Experimental Design in Education

Author: Jihong Zhang, Ph.D.

Affiliation: Educational Statistics and Research Methods (ESRM) Program, University of Arkansas

Published: March 7, 2025

Modified: March 18, 2025

Overview of Lecture 09 & Lecture 10

  1. The rest of the semester
  2. Advantage of Factorial Design
  3. Two-way ANOVA:
    • Steps for conducting Two-way ANOVA
    • Hypothesis testing
    • Assumptions for two-way ANOVA
    • Visualization
    • Difference between one-way and two-way ANOVA

Question: What about research scenarios with more than two independent variables?

1 Previous lectures

Note
  • In experimental design, when we are interested in comparing the means of a dependent variable across groups, the test depends on what is being compared:

    • z-test
      1. A group vs. the population
    • t-test
      1. One group vs. another group
    • One-way ANOVA
      1. More than two groups
Examples of z-test, t-test, and one-way ANOVA

Z-test: A school claims that the average score of students on a national exam is 75. A sample of 40 students has a mean score of 78. A z-test is used to compare the sample mean to the population mean of 75.

T-test: A researcher wants to compare the test scores of two different groups of students: one using online learning and the other using traditional classroom learning. A t-test is used to determine if there is a significant difference between the two groups’ scores.

One-way ANOVA: A scientist is testing the effectiveness of three different drug treatments on blood pressure reduction. Three groups of patients receive different treatments, and a one-way ANOVA is used to determine if there are significant differences in blood pressure reduction among the three groups.
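The three scenarios above can be sketched in base R. This is only an illustration: the population standard deviation for the z-test (`sigma = 10`) and all simulated scores are assumed values, not data from the lecture.

```r
set.seed(1)

# z-test: sample mean 78 (n = 40) vs. claimed population mean 75.
# sigma = 10 is an assumed population SD; the scenario does not state one.
sigma <- 10
z <- (78 - 75) / (sigma / sqrt(40))
p_z <- 2 * pnorm(-abs(z))          # two-sided p-value

# t-test: two independent groups (simulated scores)
online      <- rnorm(30, mean = 80, sd = 8)
traditional <- rnorm(30, mean = 76, sd = 8)
t_res <- t.test(online, traditional)   # Welch two-sample t-test by default

# One-way ANOVA: three drug groups (simulated blood pressure reductions)
bp <- data.frame(
  drug      = factor(rep(c("A", "B", "C"), each = 20)),
  reduction = c(rnorm(20, 10, 4), rnorm(20, 12, 4), rnorm(20, 15, 4))
)
aov_res <- summary(aov(reduction ~ drug, data = bp))

z        # about 1.90 under the assumed sigma
t_res
aov_res
```

All three tests compare means; what changes is the number of groups and whether a known population value is involved.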

2 Block Design Review: IV and Extraneous Factor

Important
  1. When comparing the means of the DV among several groups while accounting for more than one factor, we can use:
    1. Blocking Design
      • Independent variable: the factor of interest (→ the factor that we expect to have a meaningful impact on the dependent variable)
      • Extraneous factor → any other factor that also affects the DV
Note

Independent Variable: In a study investigating the impact of different teaching methods (e.g., traditional, online, and blended) on student performance, the independent variable would be the type of teaching method, as it’s the factor being manipulated to assess its effect on student performance.

Extraneous Variable: In the same study, an extraneous variable could be the students’ prior knowledge or their baseline academic performance, as this could influence their learning outcomes but is not the main focus of the research. This variable might affect the dependent variable (student performance) but is not part of the experimental manipulation.

3 Block Design Review: IV and Extraneous Factor

Important
  1. If an extraneous factor is expected to affect the IV as well as the DV, it is treated as a “confounding” factor.
  2. If an extraneous factor is NOT expected to affect the IV, it is treated as a “nuisance” factor.
    1. If the nuisance factor is known and controllable, we control it by including a “blocking” factor in our experiment.
    2. If the nuisance factor is known but uncontrollable, we can sometimes use analysis of covariance (ANCOVA) to remove its effect from the analysis.
    3. If the nuisance factor is unknown and uncontrollable (sometimes called a “lurking” variable; e.g., teachers’ personality/emotions), we use randomization to balance out its impact.
Examples of IV, Confounding Factor, and Nuisance Factor
  1. Confounding Factor
    • A study examines the effect of a new medication (IV) on blood pressure (DV). Dietary habits influence both medication adherence and blood pressure, making diet a confounding factor.
  2. Nuisance Factors
    • Known and Controllable (Blocking Factor)
      • A study on fertilizer effectiveness (IV) on crop yield (DV) includes different farms as a blocking factor to account for variations in soil quality.
    • Known but Uncontrollable (Covariate in ANCOVA)
      • A study on the effect of a math intervention (IV) on test scores (DV) includes students’ prior math knowledge as a covariate in ANCOVA.
    • Unknown and Uncontrollable (Lurking Variable, Balanced via Randomization)
      • A study on online learning effectiveness (IV) on student performance (DV) may have students’ intrinsic motivation as an unknown, uncontrollable nuisance factor. Randomization helps mitigate its influence.
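The three strategies for nuisance factors can be sketched in base R. Everything below is simulated under assumed effect sizes; the variable names (`crop`, `math`, `assignment`) are hypothetical and chosen only to mirror the examples above.

```r
set.seed(2)

# 1. Blocking: fertilizer is the IV, farm is the blocking factor
crop <- expand.grid(
  fertilizer = factor(c("F1", "F2", "F3")),
  farm       = factor(paste0("Farm", 1:4))
)
crop$yield <- 50 + 2 * as.integer(crop$fertilizer) +
  3 * as.integer(crop$farm) + rnorm(nrow(crop), sd = 1)
block_fit <- aov(yield ~ fertilizer + farm, data = crop)

# 2. ANCOVA: math intervention is the IV, prior knowledge is the covariate
n <- 60
math <- data.frame(
  group = factor(rep(c("control", "intervention"), each = n / 2)),
  prior = rnorm(n, mean = 50, sd = 10)
)
math$score <- 20 + 0.8 * math$prior +
  5 * (math$group == "intervention") + rnorm(n, sd = 5)
ancova_fit <- lm(score ~ prior + group, data = math)

# 3. Randomization: shuffle condition labels across 20 students so that
#    unknown factors are balanced across conditions on average
assignment <- sample(rep(c("online", "in-person"), each = 10))

summary(block_fit)
coef(ancova_fit)
table(assignment)
```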

4 Types of Factorial Design

  1. When we are interested in comparing the means of a dependent variable among several groups defined by two or more independent variables:
  2. Factorial Design
    • Independent or between-subject design
      • Each individual is in only 1 group or receives only 1 treatment
    • Dependent and/or repeated-measure design
      • Each individual is in each of the groups or receives each of the treatments
    • Mixed design (e.g., split-plot design, etc.)
      • For one IV, each individual is in 1 group or receives 1 treatment
      • For the other IV, each individual is in each of the groups or receives each of the treatments
Examples of multiple-way ANOVA
  • For 2 IVs, where each person is in only 1 group = two-way independent ANOVA
  • For 3 IVs, where each person receives each treatment = three-way repeated measures ANOVA
  • For 2 IVs, where each person receives 1 treatment for one IV and each treatment for the other IV = two-way mixed ANOVA

Note. The naming pattern is “A (number of IVs)-way (type of design) ANOVA”. The number of IVs changes the name with each additional factor (e.g., one-way, two-way, three-way).

5 Difference of Factorial Design

| Design Type | Key Feature | Example | Statistical Analysis |
|---|---|---|---|
| Split-Plot | Hierarchical treatment assignment (whole plots and subplots) | Agriculture: Irrigation (whole plot) & Fertilizer (subplot) | Mixed-effects model |
| Independent | Each participant/unit is exposed to one condition only | Teaching Method 1 vs. Method 2 | Independent t-test, ANOVA |
| Dependent | Same participant/unit exposed to all conditions | Measuring reaction time with different caffeine levels | Repeated-measures ANOVA, mixed-effects model |

6 Next lectures III

Example of Factorial Design
  • Research Question: Is there an interaction between school type and tutoring program on academic achievement?
    • For this RQ, we might think that the effect of tutoring program depends on the type of school the students attend.
    • As shown below, there are three levels for the tutoring program (e.g., no tutor, once a week, and daily) and three levels for the school type (e.g., public, private-secular, and private-religious).
    • Then, we examine ALL combinations of the levels of these two IVs.
| School \ Program | No Tutor | Once Per Week | Daily |
|---|---|---|---|
| Public | Y11 | Y12 | Y13 |
| Private-secular | Y21 | Y22 | Y23 |
| Private-religious | Y31 | Y32 | Y33 |

  • This would be a 3 x 3 design: two factors, each with 3 levels
    • 3 levels of tutoring program
    • 3 levels of school type
  • Complete term: 3 x 3 two-way ANOVA or 3 x 3 independent factorial ANOVA
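As a small illustration, the full crossing of the two factors can be enumerated in R with `expand.grid()`. The object name `design` is hypothetical; the level labels follow the table above.

```r
# Every combination of tutoring program and school type
design <- expand.grid(
  Tutoring = c("No Tutor", "Once Per Week", "Daily"),
  School   = c("Public", "Private-secular", "Private-religious")
)

nrow(design)                           # 9 cells in a 3 x 3 design
table(design$School, design$Tutoring)  # each cell appears exactly once
```

In a between-subject study, each student would belong to exactly one of these nine cells.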

7 Next lectures IV: Exercise

Example of Factorial Design

Question: What kind of design is used in the following research question:

  • Does the effect of three different drug treatments (separate groups for Drug A, Drug B, Placebo) on the level of self-reported positive mood depend on whether the participant received psychotherapy (treatment group) or not (control group)?
    A. One-way repeated-measure ANOVA
    B. 2 x 3 split-plot factorial ANOVA
    C. 2 x 3 independent factorial ANOVA
    D. 2 x 3 repeated-measure factorial ANOVA
  • Answer: C
    • We have two IVs: drug treatment and therapy.
    • One factor, therapy, has two levels: treatment (therapy = yes) and control (therapy = no).
    • The other factor, drug treatment, has three levels: Drug A, Drug B, and Placebo.
    • This design has no dependent (or repeated-measure) component.
    • By convention, the factor with fewer levels is listed first.
    • So, this is an example of a 2 x 3 independent factorial ANOVA.

8 Next lectures V: Exercise

Example of Factorial Design

Question: What kind of design is used in the following research question:

  • Is there a difference in the number of hours studied depending on which university was attended (UA, UAFS, UALR) and whether it was the Fall, Spring, or Summer semester across academic cohorts (2012 – 2015)? A different sample of students was taken each semester.
    A. 3 x 3 x 4 independent factorial ANOVA
    B. 3 x 3 x 4 split-plot factorial ANOVA
    C. 3 x 4 independent factorial ANOVA
    D. 3 x 4 repeated-measure factorial ANOVA
  • Answer: A
    • We have three IVs: school, semester, and cohort. School has three levels (UA, UAFS, UALR), semester has three levels (Fall, Spring, Summer), and cohort has four levels (2012, 2013, 2014, 2015).
    • This design has no dependent (or repeated-measure) component, because “a different sample of students was taken each semester.”
    • By convention, the factor with fewer levels is listed first.
    • So, this is an example of a 3 x 3 x 4 independent factorial ANOVA.

9 Advantages of Factorial Design I

  • Factorial design offers several advantages over a one-way ANOVA:
    1. It allows for greater generalizability of results
      • More realistic (external validity)
      • Example: a study whose participants include both Youth and Childhood age groups vs. a study using only Youth
    2. It allows for investigation of interactions (answering more complicated RQs)
      • We can ask whether the effect of one variable depends on the level of another variable
      • Example: cancer trials. Does the effect of treatment method depend on the stage of cancer?
    3. It requires fewer participants than two one-way treatment designs for the same level of statistical power.
      • Smaller error terms mean more statistical power!

9.1 Example: Cancer Trials

  • A factorial design for a cancer trial can examine whether the effect of a treatment method depends on the stage of cancer (i.e., testing for an interaction effect).
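A minimal R sketch of such a trial is given here. It is not the lecture's original script: the data are simulated, the cell means and error SD are assumed values, and the names `trial` and `outcome` are hypothetical. The factor levels (Standard vs. New, Early vs. Advanced) follow the 2×2 example used later in the lecture.

```r
set.seed(3)

# 2 x 2 between-subject design with 25 simulated patients per cell
trial <- expand.grid(
  treatment = factor(c("Standard", "New"), levels = c("Standard", "New")),
  stage     = factor(c("Early", "Advanced"), levels = c("Early", "Advanced")),
  rep       = 1:25
)

# Assumed cell means: the new treatment helps more in the early stage,
# which builds an interaction into the simulated data
cell_mean <- with(trial,
  50 + 5 * (treatment == "New") - 10 * (stage == "Advanced") +
    8 * (treatment == "New" & stage == "Early")
)
trial$outcome <- cell_mean + rnorm(nrow(trial), sd = 6)

# treatment * stage expands to both main effects plus their interaction
fit <- aov(outcome ~ treatment * stage, data = trial)
summary(fit)

# Cell means: the treatment difference should look larger for Early than Advanced
with(trial, tapply(outcome, list(treatment, stage), mean))
```

The `treatment:stage` row of the ANOVA table is the test of whether the treatment effect depends on cancer stage.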

10 Interpretation of the Results

Interpretation of Results
  • If the main effect of Tutoring is significant, then grades differ across tutoring conditions, regardless of school type.

  • If the main effect of School Type is significant, then grades differ across school types, regardless of tutoring.

  • If the interaction effect (Tutoring × School) is significant, then the impact of tutoring depends on school type, suggesting that different tutoring programs work better in different school settings.

11 Advantages of Factorial Design II

  1. The first advantage of factorial design is that it reduces the error variance, which gives us higher power to detect group differences.
Example: the effect of level of arousal on math performance
  • Let’s say we are interested in testing the effect of level of arousal (e.g., anxiety) on math performance, but we already think that gender is associated with performance.
    • Adding gender as an IV explains more variance, thus, reducing the amount of error variance of DV.
    • Note. We define error as anything NOT related to (explained by) the IVs.
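The reduction in error variance can be illustrated with simulated data. In this sketch the arousal and gender effects are assumed values and the object names (`math_perf`, `one_way`, `two_way`) are hypothetical; the point is only to compare the error sum of squares with and without gender in the model.

```r
set.seed(4)

# Balanced design: 3 arousal levels x 2 genders x 20 students per cell
math_perf <- expand.grid(
  arousal = factor(c("Low", "Medium", "High"), levels = c("Low", "Medium", "High")),
  gender  = factor(c("Woman", "Man")),
  rep     = 1:20
)

# Assumed effects used only to generate the data
arousal_eff <- c(Low = 0, Medium = 4, High = -2)[as.character(math_perf$arousal)]
gender_eff  <- c(Woman = 0, Man = 6)[as.character(math_perf$gender)]
math_perf$score <- 70 + arousal_eff + gender_eff + rnorm(nrow(math_perf), sd = 5)

one_way <- aov(score ~ arousal, data = math_perf)           # gender left in the error
two_way <- aov(score ~ arousal + gender, data = math_perf)  # gender explained by the model

ss_error_one <- sum(residuals(one_way)^2)
ss_error_two <- sum(residuals(two_way)^2)
c(one_way = ss_error_one, two_way = ss_error_two)

summary(one_way)
summary(two_way)
```

Because the variance due to gender moves out of the error term, the F statistic for arousal is computed against a smaller mean square error in the second model.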

12 Advantages of Factorial Design III

  1. One key advantage of factorial design is that it allows us to consider more complicated scenarios by including multiple factors in a single experiment. This means we can analyze not only the main effects of each factor but also their interactions, providing deeper insights into real-world complexities.
Cancer Trial Example: Expanding the Factorial Design
  • In the previous example, we considered a 2×2 factorial design with two factors:

    • Treatment Method (Standard vs. New)
    • Cancer Stage (Early vs. Advanced)
  • However, real-world cancer treatments involve more than two factors. For instance, we might also want to consider:

    • Dosage Level (Low vs. High)
    • Patient Age Group (Young vs. Elderly)
  • By extending the factorial design to a 2×2×2×2 factorial structure, we can investigate more complex research questions, such as:

    1. Does the effectiveness of a treatment vary not only by cancer stage but also by dosage level?
    2. Do younger and older patients respond differently to a specific combination of treatment and dosage?
    3. Is there a three-way interaction, where the best treatment strategy depends on a combination of treatment type, cancer stage, and dosage level?
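To see how quickly a full factorial grows, the following sketch counts the cells and model terms of the 2×2×2×2 design. The factor names are shorthand assumed for this illustration.

```r
# All combinations of the four two-level factors
cells <- expand.grid(
  Treatment = c("Standard", "New"),
  Stage     = c("Early", "Advanced"),
  Dosage    = c("Low", "High"),
  Age       = c("Young", "Elderly")
)
nrow(cells)   # 16 cells

# Terms in the full factorial model: main effects and all interactions
term_labels <- attr(terms(y ~ Treatment * Stage * Dosage * Age), "term.labels")
length(term_labels)                        # 15 terms
table(lengths(strsplit(term_labels, ":"))) # 4 main, 6 two-way, 4 three-way, 1 four-way
```

Each added factor doubles the number of cells here, so the required sample size and the number of effects to interpret both grow quickly.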

12.1 Cancer Trial Example: Why is this important

Why Is This Important?
  • Better Understanding of Individual Differences:
    • A one-factor design (e.g., comparing only treatment types) would ignore how different patient groups respond differently to the same treatment.
    • By incorporating multiple factors, we can identify which subgroups benefit most from a specific intervention.
  • More Realistic Clinical Applications:
    • In real-world medicine, treatment effectiveness depends on multiple interacting factors.
    • A factorial design helps us replicate complex clinical settings more accurately than a simple one-factor experiment.
  • Efficient Use of Resources:
    • Instead of conducting multiple separate studies to test each factor independently, factorial design allows us to study all factors simultaneously in a single experiment.
    • This reduces costs and increases statistical power by leveraging shared data across conditions.

13 Steps for conducting two-way ANOVA

  • Similar to one-way ANOVA:
    1. Set the null hypothesis and the alternative hypothesis → Research question?
    2. Find the critical value of the test statistic (i.e., F-critical) based on alpha and df
    3. Calculate the observed value of the test statistic (i.e., F-observed) from the collected data (i.e., the sample)
    4. Make the statistical conclusion → either reject or retain the null hypothesis
    5. State the research conclusion regarding the research question
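Steps 2 to 4 can be sketched in R as follows. The data are simulated with assumed effects, and `grades` and the other object names are hypothetical; `qf()` supplies F-critical and `anova()` supplies F-observed.

```r
set.seed(5)

# Simulated 3 x 3 between-subject data, 10 students per cell
tutor_levels  <- c("No Tutor", "Once a Week", "Daily")
school_levels <- c("Public", "Private-Secular", "Private-Religious")
grades <- expand.grid(
  Tutoring = factor(tutor_levels, levels = tutor_levels),
  School   = factor(school_levels, levels = school_levels),
  rep      = 1:10
)
grades$Grades <- 75 + 5 * (as.integer(grades$Tutoring) - 1) +
  rnorm(nrow(grades), sd = 6)

tab <- anova(lm(Grades ~ Tutoring * School, data = grades))

alpha    <- 0.05
effects  <- c("Tutoring", "School", "Tutoring:School")
df_error <- tab["Residuals", "Df"]

# Step 2: F-critical for each effect, from alpha and the two df
f_crit <- qf(1 - alpha, df1 = tab[effects, "Df"], df2 = df_error)

# Step 3: F-observed from the sample
f_obs <- tab[effects, "F value"]

# Step 4: statistical conclusion
decision <- ifelse(f_obs > f_crit, "reject H0", "retain H0")
data.frame(effect = effects, f_obs, f_crit, decision)
```

Comparing F-observed with F-critical gives the same decision as comparing the p-value with alpha.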

14 Hypothesis testing for two-way ANOVA (I)

  • For two-way ANOVA, there are three distinct hypothesis tests:
    • Main effect of Factor A
    • Main effect of Factor B
    • Interaction of A and B
Definition
  • F-test: Three separate F-tests are conducted, one for each effect!
  • Main effect: Occurs when there is a difference between the levels of one factor.
  • Interaction: Occurs when the effect of one factor on the DV depends on the particular level of the other factor.
    • Said another way, the difference across the levels of one factor is moderated by the other factor.
    • Said a third way, the difference between levels of one factor changes depending on the level of the other factor.
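For the simplest 2 × 2 case, the interaction can be written as a “difference of differences”. The notation below is assumed for illustration: \mu_{ij} is the population mean of the cell at level i of Factor A and level j of Factor B.

```latex
% Interaction in a 2 x 2 design: is the A difference the same at both levels of B?
H_0: (\mu_{11} - \mu_{21}) - (\mu_{12} - \mu_{22}) = 0
\qquad
H_1: (\mu_{11} - \mu_{21}) - (\mu_{12} - \mu_{22}) \neq 0
```

If the difference between the two levels of A is the same at both levels of B, the contrast is zero and there is no interaction.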

15 Hypothesis testing for two-way ANOVA (II)

Example

Background: A researcher is investigating how study method (Factor A: Lecture vs. Interactive) and test format (Factor B: Multiple-Choice vs. Open-Ended) affect student performance (dependent variable: test scores). The study involves randomly assigning students to one of the two study methods and then assessing their performance on one of the two test formats.

  • Main Effect of Study Method (Factor A):
    • H0: There is no difference in test scores between students who used the Lecture method and those who used the Interactive method.
    • H1: There is a significant difference in test scores between the two study methods.
  • Main Effect of Test Format (Factor B):
    • H0: There is no difference in test scores between students taking a Multiple-Choice test and those taking an Open-Ended test.
    • H1: There is a significant difference in test scores between the two test formats.
  • Interaction Effect (Study Method × Test Format):
    • H0: The effect of study method on test scores is the same regardless of test format.
    • H1: The effect of study method on test scores depends on the test format (i.e., there is an interaction).

16 Hypothesis testing for two-way ANOVA (III)

  1. Hypothesis test for the main effect with more than 2 levels
    • Mean differences among levels of one factor:

      1. Differences are tested for statistical significance
      2. Each factor is evaluated independently of the other factor(s) in the study
    • Factor A’s main effect: “Controlling for Factor B, are there differences in the DV across the levels of Factor A?”

      H_0: \mu_{A_1}=\mu_{A_2}=\cdots=\mu_{A_k}

      H_1: At least one \mu_{A_i} differs from another level mean of Factor A

    • Factor B’s main effect: “Controlling for Factor A, are there differences in the DV across the levels of Factor B?”

      H_0: \mu_{B_1}=\mu_{B_2}=\cdots=\mu_{B_m}

      H_1: At least one \mu_{B_j} differs from another level mean of Factor B
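The level means in these hypotheses are marginal means, that is, cell means averaged over the levels of the other factor. A sketch of the definition, under assumed notation (\mu_{ij} is the cell mean for level i of A and level j of B, with k levels of A, m levels of B, and cells weighted equally):

```latex
% Marginal means as equally weighted averages of cell means
\mu_{A_i} = \frac{1}{m}\sum_{j=1}^{m}\mu_{ij},
\qquad
\mu_{B_j} = \frac{1}{k}\sum_{i=1}^{k}\mu_{ij}
```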

Caution

Question: What does “controlling” mean in “Controlling for Factor A, …”?

Answer: “Controlling” for Factor A means including Factor A in the model, so that the variance in the outcome explained by Factor A is removed from the error variance.

17 Hypothesis testing for two-way ANOVA (IV)

Example: Tutoring Program and Types of Schools on Grades
  • IVs:
    1. Tutoring Programs: (1) No tutor; (2) Once a week; (3) Daily
    2. Types of schools: (1) Public (2) Private-secular (3) Private-religious
  • Research purpose: to examine the effect of tutoring program (no tutor, once a week, and daily) AND types of school (e.g., public, private-secular, and private-religious) on the students’ grades
  • Question: What are the null and alternative hypotheses for the main effects in the example?
    • Factor A’s main effect: “Controlling for school type, are there differences in students’ grades across the three tutoring programs?”

      H_0: \mu_{\mathrm{no\ tutor}}=\mu_{\mathrm{once\ a\ week}}=\mu_{\mathrm{daily}}

      H_1: At least one tutoring-program mean differs from another

    • Factor B’s main effect: “Controlling for tutoring program, are there differences in students’ grades across the three school types?”

      H_0: \mu_{\mathrm{public}}=\mu_{\mathrm{private-secular}}=\mu_{\mathrm{private-religious}}

      H_1: At least one school-type mean differs from another

18 Visualization of Two-Way ANOVA (No Interaction)

  • The main-effects-only (no-interaction) model:

\mathrm{Grade} = \beta_0 + \beta_1 \mathrm{Tutoring_{Once}} + \beta_2 \mathrm{Tutoring_{Daily}} + \beta_3 \mathrm{SchoolType_{PvtS}} + \beta_4 \mathrm{SchoolType_{PvtR}}

  • Main effect of Tutoring Programs
    • Collapsing across School Type
    • Ignoring the different levels of School Type
    • Averaging the DV for each Tutoring Program across the levels of School Type
  • Main effect of School Type
    • Collapsing across Tutoring Program
    • Ignoring the different levels of Tutoring Program
    • Averaging the DV for each School Type across the levels of Tutoring Program
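The link between the model coefficients and “collapsing across” the other factor can be sketched with simulated data. The data set `grades_sim` and its effect sizes are assumptions made for this illustration and are not the lecture's data.

```r
set.seed(6)

tutor_levels  <- c("No Tutor", "Once a Week", "Daily")
school_levels <- c("Public", "Private-Secular", "Private-Religious")

# Balanced simulated data: 20 students in each of the 9 cells
grades_sim <- expand.grid(
  Tutoring = factor(tutor_levels, levels = tutor_levels),
  School   = factor(school_levels, levels = school_levels),
  student  = 1:20
)
grades_sim$Grades <- 76 +
  c(0, 5, 10)[as.integer(grades_sim$Tutoring)] +   # assumed tutoring effects
  c(0, 1, 0.5)[as.integer(grades_sim$School)] +    # assumed school effects
  rnorm(nrow(grades_sim), sd = 4)

# Main-effects-only model: intercept plus two dummy-coded terms per factor
main_fit <- lm(Grades ~ Tutoring + School, data = grades_sim)
coef(main_fit)   # five coefficients, matching beta_0 to beta_4

# Marginal means: average grade per level, collapsing across the other factor
tutor_means  <- tapply(grades_sim$Grades, grades_sim$Tutoring, mean)
school_means <- tapply(grades_sim$Grades, grades_sim$School, mean)
tutor_means
school_means
```

With balanced data, each tutoring coefficient equals the difference between that level's marginal mean and the marginal mean of the reference level (No Tutor).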

R code
library(dplyr)
library(ggplot2)

# `data` is assumed to hold one row per student with the columns
# Grades (numeric), Tutoring (factor), and School (factor)
school_levels <- c("Public", "Private-Secular", "Private-Religious")

# Compute the mean grade for each tutoring group, then repeat each mean
# once per school type so it can be drawn across the whole x-axis
tutoring_summary <- data |>
  group_by(Tutoring) |>
  summarise(Mean_Grade_byTutor = mean(Grades)) |>
  cross_join(tibble(School = factor(school_levels, levels = school_levels)))

# Compute the mean grade for each school type
school_summary <- data |>
  group_by(School) |>
  summarise(Mean_Grade_bySchool = mean(Grades))

total_summary <- tutoring_summary |>
  left_join(school_summary, by = "School")

# Plot main effect of tutoring
ggplot(total_summary, aes(x = School)) +
  geom_point(aes(y = Mean_Grade_byTutor, color = Tutoring), size = 5) +
  geom_hline(aes(yintercept = Mean_Grade_byTutor, color = Tutoring), linewidth = 1.3) +
  scale_y_continuous(limits = c(70, 90), breaks = seq(70, 90, 5)) +
  labs(title = "Main Effect of Tutoring on Student Grades",
       x = "School Types",
       y = "Mean Grade") +
  theme_minimal() +
  theme(legend.position = "none")  # Remove legend for cleaner visualization

# Marginal mean grade for each tutoring condition
distinct(tutoring_summary, Tutoring, Mean_Grade_byTutor)
# A tibble: 3 × 2
  Tutoring    Mean_Grade_byTutor
  <fct>                    <dbl>
1 No Tutor                  76.6
2 Once a Week               81.1
3 Daily                     86.3
  • The x-axis represents three school types.
  • The y-axis represents the mean student grades for the different tutoring programs: No Tutor, Once a Week, and Daily.
  • If the main effect of tutoring is significant, we expect to see noticeable differences in mean grades across tutoring conditions.

19 Your turn: Main Effect of School Type

20 Combined Visualization

  • There is a LARGE effect of Tutoring and a very small effect of School Type:
    • Effect of Factor Tutoring: the LARGE vertical distance
    • Effect of Factor School Type: the very small horizontal distance

21 Summary

  1. Overview of factorial design
    • Independent / between-subject (today’s focus)
    • Dependent / within-subject
    • Mixed design
  2. Advantages of factorial design
    • Reduction of residual (error) variance and higher statistical power
    • Consider more complicated scenarios and answer more complex research questions
    • Better understanding of group differences depending on other characteristics
    • Save research resources
  3. Key components of Two-way ANOVA
    • Main Effects of Single Factors (today’s focus)
    • Interaction Effects