Educational Statistics and Research Methods (ESRM) Program*
University of Arkansas
Published
March 7, 2025
Modified
March 18, 2025
Overview of Lecture 09 & Lecture 10
The rest of the semester
Advantage of Factorial Design
Two-way ANOVA:
Steps for conducting Two-way ANOVA
Hypothesis testing
Assumptions for two-way ANOVA
Visualization
Difference between one-way and two-way ANOVA
Question: How about the research scenario when more than two independent variables?
1 Previous lectures
Note
In experimental design, when we are interested in comparing the means of dependent variable among several groups:
z-test
A group vs. the population
t-test
One group vs. another group
One-way ANOVA
More than two groups
Examples of z-test, t-test, and one-way ANOVA
Z-test:. A school claims that the average score of students on a national exam is 75. A sample of 40 students has a mean score of 78. A z-test is used to compare the sample mean to the population mean of 75.
T-test:. A researcher wants to compare the test scores of two different groups of students: one using online learning and the other using traditional classroom learning. A t-test is used to determine if there is a significant difference between the two groups’ scores.
One-way ANOVA:. A scientist is testing the effectiveness of three different drug treatments on blood pressure reduction. Three groups of patients receive different treatments, and a one-way ANOVA is used to determine if there are significant differences in blood pressure reduction among the three groups.
2 Block Design Review: IV and Extraneous Factor
Important
When comparing the means of DV among several groups with more than two IVs, we can use :
Blocking Design
Independent variable: the factor of the interest (→ the factor that we expect to have any meaningful impact on the dependent variable)
Extraneous factor → the factors that impacts on DV
Note
Independent Variable: In a study investigating the impact of different teaching methods (e.g., traditional, online, and blended) on student performance, the independent variable would be the type of teaching method, as it’s the factor being manipulated to assess its effect on student performance.
Extraneous Variable: In the same study, an extraneous variable could be the students’ prior knowledge or their baseline academic performance, as this could influence their learning outcomes but is not the main focus of the research. This variable might affect the dependent variable (student performance) but is not part of the experimental manipulation.
3 Block Design Review: IV and Extraneous Factor
Important
If one of extraneous factors is expected to have any impact on IV as well, then, it is treated as “confounding” factor
If one of extraneous factors is NOT expected to have any impact on IV, then, it is treated as “nuisance” factor
If the nuisance variable is known and controllable, we use blocking and control it by including a “blocking” factor in our experiment.
If the nuisance factor is known but uncontrollable, sometimes we can use analysis of covariance (ANCOVA) to remove the effect of the nuisance factor from the analysis.
If the nuisance factors that are unknown and uncontrollable (sometimes called a “lurking” variable; e.g., teachers’ personality/emotions). We use randomization to balance out their impact.
Examples of IV, Confounding Factor, and Nuisance Factor
Confounding Factor
A study examines the effect of a new medication (IV) on blood pressure (DV). Dietary habits influence both the medication adherence and blood pressure, making it a confounding factor.
Nuisance Factors
Known and Controllable (Blocking Factor)
A study on fertilizer effectiveness (IV) on crop yield (DV) includes different farms as a blocking factor to account for variations in soil quality.
Known but Uncontrollable (Covariate in ANCOVA)
A study on the effect of a math intervention (IV) on test scores (DV) includes students’ prior math knowledge as a covariate in ANCOVA.
Unknown and Uncontrollable (Lurking Variable, Balanced via Randomization)
A study on online learning effectiveness (IV) on student performance (DV) may have students’ intrinsic motivation as an unknown, uncontrollable nuisance factor. Randomization helps mitigate its influence.
4 Types of Factorial Design
When we are interested in comparing the means of dependent variable among several groups with more than two independent variables:
Factorial Design
Independent or between-subject design
Each individual is in only 1 group or receives only 1 treatment
Dependent and/or repeated-measure design
Each individual is in each of the groups or receives each of the treatments
Mixed design (e.g., split-plot design, etc.)
For one IV, each individual is in 1 group or receives 1 treatment
For the other IV, each individual is in each of the groups or receives each of the treatments
Examples of multiple-way ANOVA
For 2 IVs, each person only 1 group = two-way independent ANOVA
For 3 IVs, each person receives each treatment = three-way repeated measures ANOVA
For 2 IVs, each person receives 1 treatment for 1 IV and each treatment for other IV = two-way mixed ANOVA
Note. “A (number of IVs)=way (type of design) ANOVA” → Number of IVs- the name changes with each additional factor:
5 Difference of Factorial Design
Design Type
Key Feature
Example
Statistical Analysis
Split-Plot
Hierarchical treatment assignment (whole plots and subplots)
Each participant/unit is exposed to one condition only
Teaching Method 1 vs. Method 2
Independent t-test, ANOVA
Dependent
Same participant/unit exposed to all conditions
Measuring reaction time with different caffeine levels
Repeated-measures ANOVA, mixed-effects model
6 Next lectures III
Example of Factorial Design
Research Question: Is there an interaction between school type and tutoring program on academic achievement?
For this RQ, we might think that the effect of tutoring program depends on the type of school the students attend.
As shown below, there are three levels for the tutoring program (e.g., no tutor, once a week, and daily) and three levels for the school type (e.g., public, private-secular, and private-religious).
Then, we examine ALL combinations of the levels of these two IVs.
School\Program
No Tutor
Once Per Week
Daily
Public
Y11
Y12
Y13
Private-secular
Y21
Y22
Y23
Private-religious
Y31
Y32
Y33
This would be a 3 x 3 design: two factors, one with 3 levels and one with 3 levels
3 levels of tutoring program
3 levels of school type
Complete term: 3 x 3 two-way ANOVA or 3 x 3 independent factorial ANOVA
7 Next lectures IV: Exercise
Example of Factorial Design
Question: What kind of design is used in the following research question:
Does the effect of three different drug treatments (separate groups for Drug A, Drug B, Placebo) on the level of self-reported positive mood depend on whether the participant received psychotherapy (treatment group) treatment or not (control group)?
One-way repeated-measure ANOVA
2 x 3 split-plot factorial ANOVA
2 x 3 independent factorial ANOVA
2 x 3 repeated-measure factorial ANOVA
Answer: C
We have two IVs; drug treatment and therapy.
One factor, the therapy, has two levels; treatment (therapy=yes), control (therapy=no)
The other factor, the drug treatment, has three levels; Drug A, Drug B, and Placebo
And this design does not have the dependent (or repeated-measure) components.
Generally, the smaller number of levels comes the first.
So, this is an example of 2 x 3 independent factorial ANOVA.
8 Next lectures V: Exercise
Example of Factorial Design
Question: What kind of design is used in the following research question:]
Is there a difference in the number of hours studied depending on what university was attended (UA, UAFS, UALR) and whether it was Fall, Spring, or Summer semester across academic cohorts (2012 – 2015)? A different sample of students was taken each semester?
3 x 3 x 4 independent factorial ANOVA
3 x 3 x 4 split-plot factorial ANOVA
3 x 4 independent factorial ANOVA
3 x 4 repeated-measure factorial ANOVA
Answer: A
We have three IVs; school, semester, and cohort; school has three levels (UA, UAFS, UALR), semester has three levels (Fall, Spring, Summer), and cohort has four levels (2012, 2013, 2014, 2015).
And this design does not have the dependent (or repeated-measure) components, because “the different sample of students was taken from each semester.”
Generally, the smaller number of levels comes the first.
So, this is an example of 3 x 3 x 4 independent factorial ANOVA.
9 Advantages of Factorial Design I
Factorial design offers several advantages over a one-way ANOVA:
It allows for greater generalizability of results
More realistic (external validity)
Example: One study of using participants with the age groups including Youth and Childhood vs. One study using only Youth
It allows for investigation of interactions (answer more complicated RQs)
We can ask whether the effect of one variable depends on the level of another variable
Example: cancer trials; Does the effect of treatment method depends on stage of cancer?
It requires fewer participants than two 1-way treatment designs for the same level of statistical power.
Smaller error terms means more statistical power!
9.1 Example: Cancer Trials
Below is an R script demonstrating a factorial design for a cancer trial where the goal is to examine whether the effect of a treatment method depends on the stage of cancer (i.e., testing for an interaction effect).
10 Interpretation of the Results
Interpretation of Results
If the main effect of Tutoring is significant, then grades differ across tutoring conditions, regardless of school type.
If the main effect of School Type is significant, then grades differ across school types, regardless of tutoring.
If the interaction effect (Tutoring × School) is significant, then the impact of tutoring depends on school type, suggesting that different tutoring programs work better in different school settings.
11 Advantages of Factorial Design II
First advantage of Factorial Design is reducing the error variance so that we can have higher power to examine group differences.
Example: the effect of level of arousal on math performance
Let’s say we are interested in testing the effect of level of arousal (e.g., anxiety) on math performance, but we already think that gender is associated with performance.
Adding gender as an IV explains more variance, thus, reducing the amount of error variance of DV.
Note. We define error as anything NOT related to (explained by) the IVs.
12 Advantages of Factorial Design III
One key advantage of factorial design is that it allows us to consider more complicated scenarios by including multiple factors in a single experiment. This means we can analyze not only the main effects of each factor but also their interactions, providing deeper insights into real-world complexities.
Cancer Trial Example: Expanding the Factorial Design
In the previous example, we considered a 2×2 factorial design with two factors:
Treatment Method (Standard vs. New)
Cancer Stage (Early vs. Advanced)
However, real-world cancer treatments involve more than two factors. For instance, we might also want to consider:
Dosage Level (Low vs. High)
Patient Age Group (Young vs. Elderly)
By extending the factorial design to a 2×2×2×2 factorial structure, we can investigate more complex research questions, such as:
Does the effectiveness of a treatment vary not only by cancer stage but also by dosage level?
Do younger and older patients respond differently to a specific combination of treatment and dosage?
Is there a three-way interaction, where the best treatment strategy depends on a combination of treatment type, cancer stage, and dosage level?
12.1 Cancer Trial Example: Why is this important
Why Is This Important?
Better Understanding of Individual Differences:
A one-factor design (e.g., comparing only treatment types) would ignore how different patient groups respond differently to the same treatment.
By incorporating multiple factors, we can identify which subgroups benefit most from a specific intervention.
More Realistic Clinical Applications:
In real-world medicine, treatment effectiveness depends on multiple interacting factors.
A factorial design helps us replicate complex clinical settings more accurately than a simple one-factor experiment.
Efficient Use of Resources:
Instead of conducting multiple separate studies to test each factor independently, factorial design allows us to study all factors simultaneously in a single experiment.
This reduces costs and increases statistical power by leveraging shared data across conditions.
13 Steps for conducting two-way ANOVA
Similar to one-way ANOVA:
Set the null hypothesis and the alternative hypothesis → Research question?
Find the critical value of test statistics (i.e., F-critical) based on alpha and df
Calculate the observed value of test statistics (i.e., F-observed) based on the information about the collected data (i.e., the sample)
Make the statistical conclusion → either reject or retain the null hypothesis
State the research conclusion regarding the research question
14 Hypothesis testing for two-way ANOVA (I)
For two-way ANOVA, there are three distinct hypothesis tests :
Main effect of Factor A
Main effect of Factor B
Interaction of A and B
Definition
F-test
Three separate F-tests are conducted for each!
Main effect
Occurs when there is a difference between levels for one factor
Interaction
Occurs when the effect of one factor on the DV depends on the particular level of the other factor
Said another way, when the difference in one factor is moderated by the other
Said a third way, if the difference between levels of one factor is different, depending on the other factor
15 Hypothesis testing for two-way ANOVA (II)
Example
Background: A researcher is investigating how study method (Factor A: Lecture vs. Interactive) and test format (Factor B: Multiple-Choice vs. Open-Ended) affect student performance (dependent variable: test scores). The study involves randomly assigning students to one of the two study methods and then assessing their performance on one of the two test formats.
Main Effect of Study Method (Factor A):
H0: There is no difference in test scores between students who used the Lecture method and those who used the Interactive method.
H1: There is a significant difference in test scores between the two study methods.
Main Effect of Test Format (Factor B):
H0: There is no difference in test scores between students taking a Multiple-Choice test and those taking an Open-Ended test.
H1: There is a significant difference in test scores between the two test formats.
Interaction Effect (Study Method × Test Format):
H0: The effect of study method on test scores is the same regardless of test format.
H1: The effect of study method on test scores depends on the test format (i.e., there is an interaction).
16 Hypothesis testing for two-way ANOVA (III)
Hypothesis test for the main effect with more than 2 levels
Mean differences among levels of one factor:
Differences are tested for statistical significance
Each factor is evaluated independently of the other factor(s) in the study
Factor A’s Main effect: “Controlling Factor B, are there differences in the DV across Factor A?”
H_0: \mu_{A_1}=\mu_{A_2}=\cdots=\mu_{A_k}
H_1: At least one \mu_{A_i} is different from the control group in Factor A
Factor B’s Main effect : “Controlling Factor A, are there differences in the DV across Factor B?”
H_0: \mu_{B_1}=\mu_{B_2}=\cdots=\mu_{B_k}
H_1: At least one \mu_{B_i} is different from the control group in Factor A
Caution
Question: what dose “controlling” mean in “Controlling Factor A, …”?
Answer: “Controlling” means the decrease of error variances by incorporating the effects of factor A on outcome?
17 Hypothesis testing for two-way ANOVA (IV)
Example: Tutoring Program and Types of Schools on Grades
IVs:
Tutoring Programs: (1) No tutor; (2) Once a week; (3) Daily
Types of schools: (1) Public (2) Private-secular (3) Private-religious
Research purpose: to examine the effect of tutoring program (no tutor, once a week, and daily) AND types of school (e.g., public, private-secular, and private-religious) on the students’ grades
Question: What are the null and alternative hypotheses for the main effects in the example?:
Factor A’s Main effect: “Controlling school types, are there differences in the students’ grade across three tutoring programs?”
---title: "Lecture 09: Two-way ANOVA"subtitle: "Experimental Design in Education"date: "2025-03-07"date-modified: "`{r} Sys.Date()`"execute: eval: true echo: true warning: false message: falseformat: html: code-tools: true code-line-numbers: false code-fold: false code-summary: "Click to see R code" number-offset: 1 fig.width: 10 fig-align: center message: false grid: sidebar-width: 350px uark-revealjs: chalkboard: true embed-resources: false code-fold: false number-sections: true number-depth: 1 footer: "ESRM 64503" slide-number: c/t tbl-colwidths: auto scrollable: true output-file: slides-index.html mermaid: theme: forest ---::: objectives## Overview of Lecture 09 & Lecture 10 {.unnumbered}1. The rest of the semester2. Advantage of Factorial Design3. Two-way ANOVA: - Steps for conducting Two-way ANOVA - Hypothesis testing - Assumptions for two-way ANOVA - Visualization - Difference between one-way and two-way ANOVA**Question**: How about the research scenario when more than two independent variables?:::## Previous lectures::::: columns::: {.callout-note .column width="40%"}- In experimental design, when we are interested in comparing the means of dependent variable among several groups: - **z-test** i) A group vs. the population - **t-test** i) One group vs. another group - **One-way ANOVA** i) More than two groups:::::: {.column .callout-note width="60%"}## Examples of z-test, t-test, and one-way ANOVA**Z-test:**. A school claims that the average score of students on a national exam is 75. A sample of 40 students has a mean score of 78. A z-test is used to compare the sample mean to the population mean of 75.**T-test:**. A researcher wants to compare the test scores of two different groups of students: one using online learning and the other using traditional classroom learning. A t-test is used to determine if there is a significant difference between the two groups' scores.**One-way ANOVA:**. A scientist is testing the effectiveness of three different drug treatments on blood pressure reduction. Three groups of patients receive different treatments, and a one-way ANOVA is used to determine if there are significant differences in blood pressure reduction among the three groups.::::::::## Block Design Review: IV and Extraneous Factor:::::: columns:::: {.column width="50%"}::: callout-important1. When comparing the means of DV among several groups *with more than two IVs*, we can use : i) Blocking Design - **Independent variable**: the factor of the interest (→ the factor that we expect to have any meaningful impact on the dependent variable) - **Extraneous factor** → the factors that impacts on DV:::::::::: {.column .callout-note width="50%"}**Independent Variable:** In a study investigating the impact of different teaching methods (e.g., traditional, online, and blended) on student performance, the **independent variable** would be the type of teaching method, as it's the factor being manipulated to assess its effect on student performance.**Extraneous Variable:** In the same study, an **extraneous variable** could be the students' prior knowledge or their baseline academic performance, as this could influence their learning outcomes but is not the main focus of the research. This variable might affect the dependent variable (student performance) but is not part of the experimental manipulation.:::::::::## Block Design Review: IV and Extraneous Factor::::::: columns:::: {.column width="50%"}::: callout-important1. If one of [extraneous factors]{style="color:tomato"} is expected to have any impact on IV as well, then, it is treated as "[confounding]{style="color:purple"}" factor2. If one of [extraneous factors]{style="color:tomato"} is NOT expected to have any impact on IV, then, it is treated as "[nuisance]{style="color:royalblue"}" factor i) If the [nuisance]{style="color:royalblue"} variable is [known]{style="color:green; font-weight:bold"} and [controllable]{style="color:green; font-weight:bold"}, we use blocking and control it by including a “blocking” factor in our experiment. ii) If the [nuisance]{style="color:royalblue"} factor is [known]{style="color:green; font-weight:bold"} but [uncontrollable]{style="color:red; font-weight:bold"}, sometimes we can use analysis of covariance (ANCOVA) to remove the effect of the [nuisance]{style="color:royalblue"} factor from the analysis. iii) If the [nuisance]{style="color:royalblue"} factors that are [unknown]{style="color:red; font-weight:bold"} and [uncontrollable]{style="color:red; font-weight:bold"} (sometimes called a “lurking” variable; e.g., teachers' personality/emotions). We use randomization to balance out their impact.::::::::::: {.column width="50%"}::: callout-note## Examples of IV, Confounding Factor, and Nuisance Factor1. **Confounding Factor** - A study examines the effect of a new medication (**IV**) on blood pressure (**DV**). **Dietary habits** influence both the medication adherence and blood pressure, making it a confounding factor.2. **Nuisance Factors** - **Known and Controllable (Blocking Factor)** - A study on fertilizer effectiveness (**IV**) on crop yield (**DV**) includes **different farms** as a blocking factor to account for variations in soil quality.\ - **Known but Uncontrollable (Covariate in ANCOVA)** - A study on the effect of a math intervention (**IV**) on test scores (**DV**) includes **students' prior math knowledge** as a covariate in ANCOVA.\ - **Unknown and Uncontrollable (Lurking Variable, Balanced via Randomization)** - A study on online learning effectiveness (**IV**) on student performance (**DV**) may have **students’ intrinsic motivation** as an unknown, uncontrollable nuisance factor. Randomization helps mitigate its influence.\::::::::::::::## Types of Factorial Design2. When we are interested in comparing the means of dependent variable among several groups **with more than two independent variables**:3. **Factorial Design** - Independent or between-subject design - Each individual is in only 1 group or receives only 1 treatment - Dependent and/or repeated-measure design - Each individual is in each of the groups or receives each of the treatments - Mixed design (e.g., split-plot design, etc.) - For one IV, each individual is in 1 group or receives 1 treatment - For the other IV, each individual is in each of the groups or receives each of the treatments::: callout-tip## Examples of multiple-way ANOVA- For 2 IVs, each person only 1 group = two-way independent ANOVA- For 3 IVs, each person receives each treatment = three-way repeated measures ANOVA- For 2 IVs, each person receives 1 treatment for 1 IV and each treatment for other IV = two-way mixed ANOVANote. “A (number of IVs)=way (type of design) ANOVA” → Number of IVs- the name changes with each additional factor::::## Difference of Factorial Design| Design Type | Key Feature | Example | Statistical Analysis ||------------------|------------------|------------------|------------------|| Split-Plot | Hierarchical treatment assignment (whole plots and subplots) | Agriculture: Irrigation (whole plot) & Fertilizer (subplot) | Mixed-effects model || Independent | Each participant/unit is exposed to one condition only | Teaching Method 1 vs. Method 2 | Independent t-test, ANOVA || Dependent | Same participant/unit exposed to all conditions | Measuring reaction time with different caffeine levels | Repeated-measures ANOVA, mixed-effects model |:::::: columns::: {.column width="33%"}:::::: {.column width="33%"}:::::: {.column width="33%"}:::::::::## Next lectures III:::::: callout-important## Example of Factorial Design- Research Question: Is there an interaction between school type and tutoring program on academic achievement? - For this RQ, we might think that the effect of tutoring program depends on the type of school the students attend. - As shown below, there are three levels for the tutoring program (e.g., no tutor, once a week, and daily) and three levels for the school type (e.g., public, private-secular, and private-religious). - Then, we examine ALL combinations of the levels of these two IVs.::::: columns::: column| School\\Program | No Tutor | Once Per Week | Daily ||-------------------|----------|---------------|-------|| Public | Y11 | Y12 | Y13 || Private-secular | Y21 | Y22 | Y23 || Private-religious | Y31 | Y32 | Y33 |:::::: column- This would be a 3 x 3 design: two factors, one with 3 levels and one with 3 levels - 3 levels of tutoring program - 3 levels of school type- Complete term: 3 x 3 two-way ANOVA or 3 x 3 independent factorial ANOVA::::::::::::::## Next lectures IV: Exercise::: callout-caution## Example of Factorial Design*Question*: **What kind of design is used in the following research question:**- Does the effect of three different [drug treatments]{style="color:red; font-weight:bold"} (separate groups for Drug A, Drug B, Placebo) on the level of self-reported [positive mood]{style="color:green;font-weight:bold"} depend on whether the participant received [psychotherapy]{style="color:purple; font-weight:bold"} (treatment group) treatment or not (control group)? A. One-way repeated-measure ANOVA B. 2 x 3 split-plot factorial ANOVA C. 2 x 3 independent factorial ANOVA D. 2 x 3 repeated-measure factorial ANOVA- Answer: [C]{.mohu} - [We have two IVs; drug treatment and therapy.]{.mohu} - [One factor, the therapy, has two levels; treatment (therapy=yes), control (therapy=no)]{.mohu} - [The other factor, the drug treatment, has three levels; Drug A, Drug B, and Placebo]{.mohu} - [And this design does not have the dependent (or repeated-measure) components.]{.mohu} - [Generally, the smaller number of levels comes the first.]{.mohu} - [So, this is an example of 2 x 3 independent factorial ANOVA.]{.mohu}:::## Next lectures V: Exercise::: callout-caution## Example of Factorial Design*Question*: **What kind of design is used in the following research question:**\]- Is there a difference in the [number of hours]{style="color:purple; font-weight:bold"} studied depending on what university was attended (UA, UAFS, UALR) and whether it was Fall, Spring, or Summer semester across academic cohorts (2012 – 2015)? A different sample of students was taken each semester? A. 3 x 3 x 4 independent factorial ANOVA B. 3 x 3 x 4 split-plot factorial ANOVA C. 3 x 4 independent factorial ANOVA D. 3 x 4 repeated-measure factorial ANOVA- Answer: [A]{.mohu} - [We have three IVs; school, semester, and cohort; school has three levels (UA, UAFS, UALR), semester has three levels (Fall, Spring, Summer), and cohort has four levels (2012, 2013, 2014, 2015).]{.mohu} - [And this design does not have the dependent (or repeated-measure) components, because “the different sample of students was taken from each semester.”]{.mohu} - [Generally, the smaller number of levels comes the first.]{.mohu} - [So, this is an example of 3 x 3 x 4 independent factorial ANOVA.]{.mohu}:::## Advantages of Factorial Design I- Factorial design offers several advantages over a one-way ANOVA: 1. It allows for greater generalizability of results - More realistic (**external validity**) - Example: One study of using participants with the age groups including Youth and Childhood vs. One study using only Youth 2. It allows for investigation of interactions (**answer more complicated RQs**) - We can ask whether the effect of one variable depends on the level of another variable - Example: cancer trials; *Does the effect of treatment method depends on stage of cancer?* 3. It requires fewer participants than two 1-way treatment designs for the same level of statistical power. - Smaller error terms means more statistical power!------------------------------------------------------------------------### Example: Cancer Trials- Below is an R script demonstrating a factorial design for a cancer trial where the goal is to examine whether the effect of a **treatment method** depends on the **stage of cancer** (i.e., testing for an interaction effect).```{webr-r}#| context: setup# Load necessary packageslibrary(tidyverse)# Set seed for reproducibilityset.seed(123)# Simulate data for a factorial design (Treatment Method × Cancer Stage)n <- 30 # Number of participants per group# Define factorstreatment <- rep(c("Standard", "New"), each = 2 * n) # Two treatment methodsstage <- rep(c("Early", "Advanced"), times = n * 2) # Two cancer stages# Simulate survival rates (DV) with main effects & interactionsurvival_rate <- c( rnorm(n, mean = 70, sd = 10), # Standard treatment, Early stage rnorm(n, mean = 60, sd = 10), # Standard treatment, Advanced stage rnorm(n, mean = 75, sd = 10), # New treatment, Early stage rnorm(n, mean = 55, sd = 10) # New treatment, Advanced stage)# Create dataframecancer_data <- data.frame(Treatment = treatment, Stage = stage, Survival = survival_rate)# Convert to factorscancer_data$Treatment <- factor(cancer_data$Treatment, levels = c("Standard", "New"))cancer_data$Stage <- factor(cancer_data$Stage, levels = c("Early", "Advanced"))``````{webr-r}table(cancer_data$Treatment, cancer_data$Stage)# Perform two-way ANOVAanova_model <- aov(Survival ~ Treatment * Stage, data = cancer_data)summary(anova_model)# Interaction plotinteraction.plot(cancer_data$Stage, cancer_data$Treatment, cancer_data$Survival, col = c("blue", "red"), lty = 1, pch = 19, main = "Interaction Between Treatment and Cancer Stage", xlab = "Cancer Stage", ylab = "Mean Survival Rate")```## Interpretation of the Results::: callout-note### Interpretation of Results- If the main effect of Tutoring is significant, then grades differ across tutoring conditions, regardless of school type.- If the main effect of School Type is significant, then grades differ across school types, regardless of tutoring.- If the interaction effect (Tutoring × School) is significant, then the impact of tutoring depends on school type, suggesting that different tutoring programs work better in different school settings.\:::## Advantages of Factorial Design II1. First advantage of Factorial Design is reducing the error variance so that we can have higher power to examine group differences.::: callout-important### Example: the effect of level of arousal on math performance- Let’s say we are interested in testing the effect of level of arousal (e.g., anxiety) on math performance, but we already think that gender is associated with performance. - Adding gender as an IV explains more variance, thus, reducing the amount of error variance of DV. - **Note.** We define error as anything NOT related to (explained by) the IVs.\:::## Advantages of Factorial Design III1. One key advantage of factorial design is that it allows us to consider more complicated scenarios by including multiple factors in a single experiment. This means we can analyze not only the main effects of each factor but also their interactions, providing deeper insights into real-world complexities.::: callout-note### Cancer Trial Example: Expanding the Factorial Design- In the previous example, we considered a 2×2 factorial design with two factors: - Treatment Method (Standard vs. New) - Cancer Stage (Early vs. Advanced)- However, real-world cancer treatments involve more than two factors. For instance, we might also want to consider: - Dosage Level (Low vs. High) - Patient Age Group (Young vs. Elderly)- By extending the factorial design to a 2×2×2×2 factorial structure, we can investigate more complex research questions, such as: 1. Does the effectiveness of a treatment vary not only by cancer stage but also by dosage level? 2. Do younger and older patients respond differently to a specific combination of treatment and dosage? 3. Is there a three-way interaction, where the best treatment strategy depends on a combination of treatment type, cancer stage, and dosage level?:::------------------------------------------------------------------------### Cancer Trial Example: Why is this important::: callout-important### Why Is This Important?- **Better Understanding of Individual Differences:** - A one-factor design (e.g., comparing only treatment types) would ignore how different patient groups respond differently to the same treatment. - By incorporating multiple factors, we can identify which subgroups benefit most from a specific intervention.- **More Realistic Clinical Applications:** - In real-world medicine, treatment effectiveness depends on multiple interacting factors. - A factorial design helps us replicate complex clinical settings more accurately than a simple one-factor experiment.- **Efficient Use of Resources:** - Instead of conducting multiple separate studies to test each factor independently, factorial design allows us to study all factors simultaneously in a single experiment. - This reduces costs and increases statistical power by leveraging shared data across conditions.:::## Steps for conducting two-way ANOVA- **Similar to one-way ANOVA**: 1. Set the null hypothesis and the alternative hypothesis → Research question? 2. Find the critical value of test statistics (i.e., F-critical) based on alpha and df 3. Calculate the observed value of test statistics (i.e., F-observed) based on the information about the collected data (i.e., the sample) 4. Make the statistical conclusion → either reject or retain the null hypothesis 5. State the research conclusion regarding the research question## Hypothesis testing for two-way ANOVA (I)- For two-way ANOVA, there are three distinct hypothesis tests : - Main effect of Factor A - Main effect of Factor B - Interaction of A and B::: callout-note### DefinitionF-test: Three separate F-tests are conducted for each!Main effect: Occurs when there is a difference between levels for one factorInteraction: Occurs when the effect of one factor on the DV depends on the particular level of the other factor: Said another way, when the difference in one factor is moderated by the other: Said a third way, if the difference between levels of one factor is different, depending on the other factor:::## Hypothesis testing for two-way ANOVA (II)::: callout-important## Example**Background**: A researcher is investigating how study method (Factor A: Lecture vs. Interactive) and test format (Factor B: Multiple-Choice vs. Open-Ended) affect student performance (dependent variable: test scores). The study involves randomly assigning students to one of the two study methods and then assessing their performance on one of the two test formats.- **Main Effect of Study Method (Factor A):** - [H0:]{style="color:royalblue; font-weight:bold"} There is no difference in test scores between students who used the Lecture method and those who used the Interactive method. - [H1:]{style="color:tomato; font-weight:bold"} There is a significant difference in test scores between the two study methods.- **Main Effect of Test Format (Factor B):** - [H0:]{style="color:royalblue; font-weight:bold"} There is no difference in test scores between students taking a Multiple-Choice test and those taking an Open-Ended test. - [H1:]{style="color:tomato; font-weight:bold"} There is a significant difference in test scores between the two test formats.- **Interaction Effect (Study Method × Test Format):** - [H0:]{style="color:royalblue; font-weight:bold"} The effect of study method on test scores is the same regardless of test format. - [H1:]{style="color:tomato; font-weight:bold"} The effect of study method on test scores depends on the test format (i.e., there is an interaction).:::## Hypothesis testing for two-way ANOVA (III)1. Hypothesis test for the main effect with more than 2 levels - Mean differences among levels of one factor: i) Differences are tested for statistical significance ii) Each factor is evaluated independently of the other factor(s) in the study - Factor A’s Main effect: “Controlling Factor B, are there differences in the DV across Factor A?” $H_0$: $\mu_{A_1}=\mu_{A_2}=\cdots=\mu_{A_k}$ $H_1$: At least one $\mu_{A_i}$ is different from the control group in Factor A - Factor B’s Main effect : “Controlling Factor A, are there differences in the DV across Factor B?” $H_0$: $\mu_{B_1}=\mu_{B_2}=\cdots=\mu_{B_k}$ $H_1$: At least one $\mu_{B_i}$ is different from the control group in Factor A::: callout-caution**Question**: what dose "controlling" mean in “Controlling Factor A, ..."?**Answer**: ["Controlling" means the decrease of error variances by incorporating the effects of factor A on outcome?]{.mohu}:::## Hypothesis testing for two-way ANOVA (IV)::: callout-note## Example: Tutoring Program and Types of Schools on Grades- IVs: 1. Tutoring Programs: (1) No tutor; (2) Once a week; (3) Daily 2. Types of schools: (1) Public (2) Private-secular (3) Private-religious- **Research purpose**: to examine the effect of tutoring program (no tutor, once a week, and daily) AND types of school (e.g., public, private-secular, and private-religious) on the students’ grades- **Question**: What are the null and alternative hypotheses for the main effects in the example?: - Factor A’s Main effect: “[Controlling school types, are there differences in the students’ grade across three tutoring programs?]{.mohu}” $H_0$: $\mu_{\mathrm{no\ tutor}}=\mu_{\mathrm{once\ a\ week}}=\mu_{\mathrm{daily}}$ - Factor B’s Main effect : “[Controlling tutoring programs, are there differences in the students’ grades across three school types?]{.mohu}” $H_0$: $\mu_{\mathrm{public}}=\mu_{\mathrm{private-religious}}=\mu_{\mathrm{private-secular}}$:::## Visualization of Two-Way ANOVA (No Interaction)- The main-effect-only no-interaction statistical form$$\mathrm{Grade} = \beta_0 + \beta_1 \mathrm{Toturing_{Once}} + \beta_2 \mathrm{Toturing_{Daily}} \\ + \beta_3 \mathrm{SchoolType_{PvtS}} + \beta_4 \mathrm{SchoolType_{PvtR}}$$```{webr-r}#| context: setuplibrary(tidyverse)# Set seed for reproducibilityset.seed(123)# Define sample size per groupn <- 30 # Define factor levelstutoring <- rep(c("No Tutor", "Once a Week", "Daily"), each = 3 * n)school <- rep(c("Public", "Private-Secular", "Private-Religious"), times = n * 3)# Simulate student grades with assumed effectsgrades <- c( rnorm(n, mean = 75, sd = 5), # No tutor, Public rnorm(n, mean = 78, sd = 5), # No tutor, Private-Secular rnorm(n, mean = 76, sd = 5), # No tutor, Private-Religious rnorm(n, mean = 80, sd = 5), # Once a week, Public rnorm(n, mean = 83, sd = 5), # Once a week, Private-Secular rnorm(n, mean = 81, sd = 5), # Once a week, Private-Religious rnorm(n, mean = 85, sd = 5), # Daily, Public rnorm(n, mean = 88, sd = 5), # Daily, Private-Secular rnorm(n, mean = 86, sd = 5) # Daily, Private-Religious)# Create a dataframedata <- data.frame( Tutoring = factor(tutoring, levels = c("No Tutor", "Once a Week", "Daily")), School = factor(school, levels = c("Public", "Private-Secular", "Private-Religious")), Grades = grades)``````{r}#| include: falselibrary(tidyverse)# Set seed for reproducibilityset.seed(123)# Define sample size per groupn <-30# Define factor levelstutoring <-rep(c("No Tutor", "Once a Week", "Daily"), each =3* n)school <-rep(c("Public", "Private-Secular", "Private-Religious"), times = n *3)# Simulate student grades with assumed effectsgrades <-c(rnorm(n, mean =75, sd =5), # No tutor, Publicrnorm(n, mean =78, sd =5), # No tutor, Private-Secularrnorm(n, mean =76, sd =5), # No tutor, Private-Religiousrnorm(n, mean =80, sd =5), # Once a week, Publicrnorm(n, mean =83, sd =5), # Once a week, Private-Secularrnorm(n, mean =81, sd =5), # Once a week, Private-Religiousrnorm(n, mean =85, sd =5), # Daily, Publicrnorm(n, mean =88, sd =5), # Daily, Private-Secularrnorm(n, mean =86, sd =5) # Daily, Private-Religious)# Create a dataframedata <-data.frame(Tutoring =factor(tutoring, levels =c("No Tutor", "Once a Week", "Daily")),School =factor(school, levels =c("Public", "Private-Secular", "Private-Religious")),Grades = grades)```- Main effect of Tutoring Programs - Collapsing across School Type - Ignoring the difference levels of School Type - Averaging DV regarding Tutoring Programs across the levels of School Type- Main effect of School Type - Collapsing across Tutoring Program - Ignoring the difference levels of Tutoring Program - Averaging DV regarding School Type across the levels of Tutoring Programs------------------------------------------------------------------------::: panel-tabset### R Plot```{r}#| code-fold: true# Compute mean and standard error for each tutoring grouptutoring_summary <- data |>group_by(Tutoring) |>summarise(Mean_Grade_byTutor =mean(Grades),School =factor(c("Public","Private-Secular", "Private-Religious"), levels =c("Public","Private-Secular", "Private-Religious")) ) school_summary <- data |>group_by(School) |>summarise(Mean_Grade_bySchool =mean(Grades) ) total_summary <- tutoring_summary |>left_join(school_summary, by ="School")# Plot main effect of tutoringggplot(total_summary, aes(x = School)) +geom_point(aes(y = Mean_Grade_byTutor, color = Tutoring), size =5) +geom_hline(aes(yintercept = Mean_Grade_byTutor, color = Tutoring), linewidth =1.3) +scale_y_continuous(limits =c(70, 90), breaks =seq(70, 90, 5)) +labs(title ="Main Effect of Tutoring on Student Grades",x ="School Types",y ="Mean Grade") +theme_minimal() +theme(legend.position ="none") # Remove legend for cleaner visualization```### Summary Statistics```{r}distinct(tutoring_summary[, c("Tutoring", "Mean_Grade_byTutor")])```### Interpretation of the Plot- The x-axis represents three school types.- The y-axis represents the mean student grades for the different tutoring programs: No Tutor, Once a Week, and Daily.- If the main effect of tutoring is significant, we expect to see noticeable differences in mean grades across tutoring conditions.:::## You turn: Main Effect of School Type::: panel-tabset### Marginal Means for each tutoring / school type group```{webr-r}School_Levels <- c("Public","Private-Secular", "Private-Religious")tutoring_summary <- data |> group_by(Tutoring) |> summarise( Mean_Grade_byTutor = mean(Grades), School = factor(School_Levels, levels = School_Levels) ) school_summary <- data |> group_by(School) |> summarise( Mean_Grade_bySchool = mean(Grades) ) total_summary <- tutoring_summary |> left_join(school_summary, by = "School")```### Plot main effect of School Type```{webr-r}ggplot(total_summary, aes(x = Tutoring)) + geom_point(aes(y = Mean_Grade_bySchool, color = School)) + geom_hline(aes(yintercept = Mean_Grade_bySchool, color = School), linewidth = 1.3) + scale_y_continuous(limits = c(70, 90), breaks = seq(70, 90, 5)) + labs(title = "Main Effect of School Type on Student Grades", x = "Tutoring Type", y = "Mean Grade") + theme_minimal() + theme(legend.position = "none") # Remove legend for cleaner visualization```:::## Combined Visualization::: panel-tabset### R Plot```{webr-r}fit <- lm(Grades ~ Tutoring + School, data = data)Line_No_Tutor <- c(fit$coefficients[1], fit$coefficients[1] + c(fit$coefficients[4], fit$coefficients[5]))Line_Once_A_Week <- Line_No_Tutor + fit$coefficients[2]Line_Daily <- Line_No_Tutor + fit$coefficients[3]School_Levels <- c("Public","Private-Secular", "Private-Religious")combined_vis_data <- data.frame( School_Levels, Line_No_Tutor, Line_Once_A_Week, Line_Daily)ggplot(data = combined_vis_data, aes(x = School_Levels)) + geom_path(aes(y = as.numeric(Line_No_Tutor)), group = 1, color = "red", linewidth = 1.4) + geom_path(aes(y = as.numeric(Line_Once_A_Week)), group = 1, color = "green", linewidth = 1.4) + geom_path(aes(y = as.numeric(Line_Daily)), group = 1, color = "blue", linewidth = 1.4) + geom_point(aes(y = as.numeric(Line_No_Tutor))) + geom_point(aes(y = as.numeric(Line_Once_A_Week))) + geom_point(aes(y = as.numeric(Line_Daily))) + labs(title = "Main Effect of Tutoring and School Type on Student Grades", x = "Tutoring Type", y = "Mean Grade") + theme_minimal() + theme(legend.position = "none") ```### In the graph- There is the LARGE effect of Factor [Tutoring]{.mohu} and very small effect of [School Type]{.mohu}: - Effect of Factor [Tutoring]{.mohu}: the LARGE vertical distance - Effect of Factor [School Type]{.mohu}: the very small horizontal distance:::## Summary1. Overview of factorial design - **Independent / between-subject** (today's focus) - Dependent / within-subject - Mixed design2. Advantages of factor design - Reduction of residuals and higher statistical power - Consider more complicated scenarios and answer more complex research questions - Better understanding group differences depending on other characteristics - Save research resources3. Key components of Two-way ANOVA - **Main Effects of Single Factors** (today's focus) - Interaction Effects