Experimental Design in Education
Educational Statistics and Research Methods (ESRM) Program
University of Arkansas
2025-02-17
Class Outline
Planned contrasts are pre-defined: weights are assigned to the group means before the analysis, and the contrast value is the weighted sum of the means

\[ D = \sum_i w_i \mu_i \]
Imagine a study comparing the effects of three different study methods (A, B, C) on test scores.
One planned contrast might compare the average score of method A (the “experimental” method) against the combined average of methods B and C (the “control” conditions),
testing the hypothesis that method A leads to significantly higher scores than the traditional methods:
\(H_0: \mu_{A} = \frac{\mu_B+\mu_C}{2}\). This is also called a complex contrast because it involves more than two group means.
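Equivalently, the same hypothesis can be written as a weighted sum whose weights sum to zero:

\[
C = (1)\mu_A + \left(-\tfrac{1}{2}\right)\mu_B + \left(-\tfrac{1}{2}\right)\mu_C, \qquad H_0: C = 0
\]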
When to use planned contrasts:
Note
We should not test all possible combinations of groups. Instead, justify your comparison plan before performing the statistical analysis.
Orthogonal Contrasts: Independent of each other; the sum of the products of their weights equals zero.
Non-Orthogonal Contrasts: Not independent; they can lead to inflated Type I error rates.
Note
Orthogonal contrasts allow clear interpretation without redundancy.
Orthogonal contrasts form a series of group comparisons whose explained variances do not overlap.
The Helmert contrast, for example:
Group | Contrast 1 | Contrast 2 | Product |
---|---|---|---|
G1 | +1 | -1 | -1 |
G2 | +1 | +1 | +1 |
G3 | -2 | 0 | 0 |
Sum | 0 | 0 | 0 |
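The orthogonality shown in the table can be checked numerically in R; a minimal sketch (`cm` is an assumed name for the contrast matrix):

```r
# Contrast matrix: columns are Contrast 1 and Contrast 2 from the table
cm <- cbind(c(1, 1, -2), c(-1, 1, 0))
cm
crossprod(cm[, 1], cm[, 2])  # sum of products of weights: 0 => orthogonal
crossprod(cm)                # zero off-diagonals confirm orthogonality
cov2cor(crossprod(cm))       # scaled to correlations: the identity matrix
```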
[,1] [,2]
[1,] 1 -1
[2,] 1 1
[3,] -2 0
[,1]
[1,] 0
[,1] [,2]
[1,] 6 0
[2,] 0 2
[,1] [,2]
[1,] 1 0
[2,] 0 1
library(tidyverse)
library(kableExtra)
library(here)
# Set seed for reproducibility
set.seed(42)
dt <- read.csv(here("teaching/2025-01-13-Experiment-Design/Lecture05","week5_example.csv"))
options(digits = 5)
summary_tbl <- dt |>
group_by(group) |>
summarise(
N = n(),
Mean = mean(score),
SD = sd(score),
shapiro.test.p.values = shapiro.test(score)$p.value
)
kable(summary_tbl)
group | N | Mean | SD | shapiro.test.p.values |
---|---|---|---|---|
g1 | 28 | 4.2500 | 3.15054 | 0.07759 |
g2 | 28 | 2.7589 | 2.19478 | 0.07605 |
g3 | 28 | 3.5446 | 2.86506 | 0.00623 |
g4 | 28 | 3.8568 | 0.58325 | 0.03023 |
g5 | 28 | 2.0243 | 1.30911 | 0.06147 |
 | Df | F value | Pr(>F) |
---|---|---|---|
group | 4 | 12.966 | 0 |
Residuals | 135 | NA | NA |
Even though the assumption checks did not pass using the original categorical levels, we may still be interested in different group contrasts.
For example, the four Helmert contrasts:
Summary Statistics:
group | N | Mean | SD | shapiro.test.p.values | Ctras1 | Ctras2 | Ctras3 | Ctras4 |
---|---|---|---|---|---|---|---|---|
g1 | 28 | 4.2500 | 3.15054 | 0.07759 | -1 | -1 | -1 | -1 |
g2 | 28 | 2.7589 | 2.19478 | 0.07605 | 1 | -1 | -1 | -1 |
g3 | 28 | 3.5446 | 2.86506 | 0.00623 | 0 | 2 | -1 | -1 |
g4 | 28 | 3.8568 | 0.58325 | 0.03023 | 0 | 0 | 3 | -1 |
g5 | 28 | 2.0243 | 1.30911 | 0.06147 | 0 | 0 | 0 | 4 |
Orthogonal contrast matrix
Ctras1 Ctras2 Ctras3 Ctras4
0 0 0 0
Ctras1 Ctras2 Ctras3 Ctras4
Ctras1 2 0 0 0
Ctras2 0 6 0 0
Ctras3 0 0 12 0
Ctras4 0 0 0 20
The relationship between planned contrasts in ANOVA and coding in regression lies in how categorical variables are represented and interpreted in statistical models.
Both approaches aim to test specific hypotheses about group differences, but their implementation varies with the framework.
\(t = \frac{C}{\sqrt{MSE \sum \frac{c_i^2}{n_i}}}\)
# Helmert contrast matrix (columns = Ctras1..Ctras4), as shown above
cH <- cbind(Ctras1 = c(-1, 1, 0, 0, 0),
            Ctras2 = c(-1, -1, 2, 0, 0),
            Ctras3 = c(-1, -1, -1, 3, 0),
            Ctras4 = c(-1, -1, -1, -1, 4))
Sum_C2_n <- colSums(cH^2 / summary_tbl$N)  # sum(c_i^2 / n_i) for each contrast
C <- crossprod(summary_tbl$Mean, cH)       # contrast estimates: sum(c_i * mean_i)
MSE <- 5.0                                 # mean squared error from the one-way ANOVA
t <- as.numeric(C / sqrt(MSE * Sum_C2_n))
t
[1] -2.495040 0.077632 0.694597 -3.340639
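The p-values in the tibble that follows are lower-tail probabilities from the t distribution with \(N - k = 140 - 5 = 135\) residual degrees of freedom; a minimal sketch, with the t statistics copied from the output above:

```r
# t statistics for the four Helmert contrasts (from the output above)
t_vals <- c(-2.495040, 0.077632, 0.694597, -3.340639)
# lower-tail p-values with 135 residual degrees of freedom
p_vals <- pt(t_vals, df = 135)
round(p_vals, 5)
```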
# A tibble: 4 × 2
t_value p_value
<dbl> <dbl>
1 -2.50 0.00690
2 0.0776 0.531
3 0.695 0.756
4 -3.34 0.000541
g1 vs. g2: We reject the null and conclude that the mean growth mindset score of Education differs from that of Engineering (p = 0.007).
\(\frac{g1+g2}{2}\) vs. g3: We retain the null and conclude that the mean growth mindset score of Chemistry is not significantly different from the combined mean of Education and Engineering (p = 0.531).
Recall the planned contrast g1 vs. g2 from the Helmert contrasts:
dt$group <- factor(dt$group)  # ensure group is a factor before assigning contrasts
contrasts(dt$group) <- "contr.helmert"
fit_helmert <- lm(score ~ group, dt)
contr.helmert(levels(dt$group))
[,1] [,2] [,3] [,4]
g1 -1 -1 -1 -1
g2 1 -1 -1 -1
g3 0 2 -1 -1
g4 0 0 3 -1
g5 0 0 0 4
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.287 0.189 17.391 0.000
group1 -0.746 0.299 -2.495 0.014
group2 0.013 0.173 0.078 0.938
group3 0.085 0.122 0.695 0.489
group4 -0.316 0.095 -3.340 0.001
Planned contrasts can be carried out using linear regression combined with custom contrasts.
Let’s look at the default contrasts plan: treatment contrasts == dummy coding
g2 g3 g4 g5
g1 0 0 0 0
g2 1 0 0 0
g3 0 1 0 0
g4 0 0 1 0
g5 0 0 0 1
[,1] [,2] [,3] [,4]
g1 1 0 0 0
g2 0 1 0 0
g3 0 0 1 0
g4 0 0 0 1
g5 -1 -1 -1 -1
[,1] [,2] [,3] [,4]
g1 -1 -1 -1 -1
g2 1 -1 -1 -1
g3 0 2 -1 -1
g4 0 0 3 -1
g5 0 0 0 4
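The three matrices above come directly from R's built-in contrast generators; a minimal sketch (the level names `g1`–`g5` are assumed):

```r
lv <- paste0("g", 1:5)   # assumed level names
contr.treatment(lv)      # dummy coding: each group vs. the reference g1
contr.sum(lv)            # effect coding: deviations from the grand mean
contr.helmert(lv)        # Helmert: each group vs. the mean of earlier groups
```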
For treatment contrasts, four dummy variables are created to compare:

Intercept
: G1's mean

group2
: G2 vs. G1

group3
: G3 vs. G1

group4
: G4 vs. G1

group5
: G5 vs. G1

(Intercept) groupg2 groupg3 groupg4 groupg5 group
1 1 0 0 0 0 1
29 1 1 0 0 0 2
57 1 0 1 0 0 3
85 1 0 0 1 0 4
113 1 0 0 0 1 5
Another type of coding is effect coding. In R, the corresponding contrast type is the so-called sum contrast.
A detailed post about sum contrasts can be found here
With sum contrasts the reference level is in fact the grand mean.
[,1] [,2] [,3] [,4]
g1 1 0 0 0
g2 0 1 0 0
g3 0 0 1 0
g4 0 0 0 1
g5 -1 -1 -1 -1
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.28693 0.18900 17.39087 2.8188e-36
group1 0.96307 0.37801 2.54777 1.1962e-02
group2 -0.52800 0.37801 -1.39680 1.6476e-01
group3 0.25771 0.37801 0.68177 4.9655e-01
group4 0.56986 0.37801 1.50753 1.3401e-01
Note
Effect coding is a method of encoding categorical variables in regression models, similar to dummy coding, but with a different interpretation of the resulting coefficients. It is particularly useful when researchers want to compare each level of a categorical variable to the overall mean rather than to a specific reference category.
In effect coding, categorical variables are transformed into numerical variables, typically using values of -1, 0, and 1. The key difference from dummy coding is that the reference category is represented by -1 instead of 0, and the coefficients indicate deviations from the grand mean.
For a categorical variable with \(k\) levels, effect coding requires \(k-1\) coded variables. If we have a categorical variable \(X\) with three levels \(A, B, C\), the effect coding scheme could be:
Category | \(X_1\) | \(X_2\) |
---|---|---|
A | 1 | 0 |
B | 0 | 1 |
C (reference) | -1 | -1 |
The last category (\(C\)) is the reference group, coded as -1 for all indicator variables.
When effect coding is used in a regression model:
\[ Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \epsilon \]
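Substituting the codes from the table gives the model-implied group means, which shows why \(\beta_0\) equals the grand mean:

\[
\begin{aligned}
\mu_A &= \beta_0 + \beta_1, \qquad \mu_B = \beta_0 + \beta_2, \qquad \mu_C = \beta_0 - \beta_1 - \beta_2,\\
\beta_0 &= \frac{\mu_A + \mu_B + \mu_C}{3}.
\end{aligned}
\]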
library(ggplot2)
# Create a data frame for text labels
text_data <- data.frame(
x = rep(0.25, 3), # Repeating the same x-coordinate
y = c(0.3, 0.7, 0.9), # Different y-coordinates
label = c("C: beta[0] - beta[1] - beta[2]",
"A: beta[0] + 1*'×'*beta[1] + 0*'×'*beta[2]",
"B: beta[0] + 0*'×'*beta[1] + 1*'×'*beta[2]") # Labels
)
# Create an empty ggplot with defined limits
ggplot() +
geom_text(data = text_data, aes(x = x, y = y, label = label), parse = TRUE, size = 11) +
# Add a vertical line at x = 0.5
# geom_vline(xintercept = 0.5, color = "blue", linetype = "dashed", linewidth = 1) +
# Add two horizontal lines at y = 0.3 and y = 0.7
geom_hline(yintercept = c(0.35, 0.75, 0.95), color = "red", linetype = "solid", linewidth = 1) +
geom_hline(yintercept = 0.5, color = "grey", linetype = "solid", linewidth = 1) +
geom_text(aes(x = .25, y = .45, label = "grand mean of Y"), color = "grey", size = 11) +
# Set axis limits
xlim(0, 1) + ylim(0, 1) +
labs(y = "Y", x = "") +
# Theme adjustments
theme_minimal() +
theme(text = element_text(size = 20))
For comparison, dummy coding represents the reference category as 0 on all indicator variables:

Category | \(X_1\) | \(X_2\) |
---|---|---|
A | 1 | 0 |
B | 0 | 1 |
C (reference) | 0 | 0 |
Effect coding is beneficial when researchers want to compare each group to the grand mean rather than to a specific reference category. Effect coding can be set in R using the `contr.sum` function:
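A minimal sketch of attaching sum contrasts to a five-level factor (the factor `g` and its levels are assumed for illustration):

```r
g <- factor(paste0("g", rep(1:5, each = 2)))  # assumed five-level factor
contrasts(g) <- contr.sum(nlevels(g))          # attach effect (sum) coding
contrasts(g)                                   # last level is coded -1 throughout
```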
group N Mean SD shapiro.test.p.values Contrasts
1 g1 28 4.2500 3.15054 0.0775874 0.50000
2 g2 28 2.7589 2.19478 0.0760542 -0.33333
3 g3 28 3.5446 2.86506 0.0062253 0.50000
4 g4 28 3.8568 0.58325 0.0302312 -0.33333
5 g5 28 2.0243 1.30911 0.0614743 -0.33333
\[ H_0: \frac{\mu_{Engineering}+\mu_{Chemistry}}{2} = \frac{\mu_{Education}+\mu_{PoliSci}+\mu_{Psychology}}{3} \]
weighted mean difference:
\[
\begin{aligned}
C &= c_1\mu_{Eng}+c_2\mu_{Edu}+c_3\mu_{Chem}+c_4\mu_{PoliSci}+c_5\mu_{Psych}\\
  &= \tfrac{1}{2}(4.2500)+\left(-\tfrac{1}{3}\right)(2.7589)+\tfrac{1}{2}(3.5446)+\left(-\tfrac{1}{3}\right)(3.8568)+\left(-\tfrac{1}{3}\right)(2.0243)\\
  &= 1.0173
\end{aligned}
\]
\[ \sum\frac{c^2}{n} = \frac{(\frac12)^2}{28}+\frac{(-\frac13)^2}{28}+\frac{(\frac12)^2}{28}+\frac{(-\frac13)^2}{28}+\frac{(-\frac13)^2}{28} \]
[1] 0.029762
[1] 5.0011
[1] 2.6369
\[ t = \frac{C}{\sqrt{MSE\sum\frac{c^2}{n}}} = \frac{1.0173}{\sqrt{5.0011 \times 0.029762}} = 2.6369 \]
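The hand computation can be verified numerically; a sketch using the group means, \(n = 28\) per group, and MSE = 5.0011 reported above:

```r
w     <- c(1/2, -1/3, 1/2, -1/3, -1/3)              # contrast weights (sum to 0)
means <- c(4.2500, 2.7589, 3.5446, 3.8568, 2.0243)  # group means from the table
C     <- sum(w * means)                             # weighted mean difference
se    <- sqrt(5.0011 * sum(w^2 / 28))               # sqrt(MSE * sum(c^2 / n))
c(C = C, t = C / se)
```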
Note
Many psychology journals require the reporting of effect sizes.
Df Sum Sq Mean Sq F value Pr(>F)
group 4 89.368 22.3420 4.4674 0.0020173
Residuals 135 675.149 5.0011 NA NA
Interpretation: \(\eta^2 = 89.368 / (89.368 + 675.149) = .1169\); 11.69% of the variance in the DV is due to group differences.
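\(\eta^2\) can be computed directly from the sums of squares in the ANOVA table above:

```r
ss_group <- 89.368   # between-groups sum of squares (from the ANOVA table)
ss_resid <- 675.149  # residual sum of squares
eta_sq   <- ss_group / (ss_group + ss_resid)  # proportion of variance explained
round(eta_sq, 4)     # 0.1169
```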
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.25000 0.42262 10.0562 4.2275e-18
groupg2 -1.49107 0.59768 -2.4948 1.3810e-02
groupg3 -0.70536 0.59768 -1.1802 2.4001e-01
groupg4 -0.39321 0.59768 -0.6579 5.1172e-01
groupg5 -2.22571 0.59768 -3.7239 2.8718e-04
[1] 0.654 0.210 0.101 0.057 0.305
ESRM 64503