NEET Analysis

Author

Jihong Zhang

1 Analysis Plan

  1. We can compute descriptive statistics for the three NEET groups to examine group differences in demographic information, STG proportions, and CLD outcomes.
  2. We can use odds ratio (OR) significance tests to assess whether the groups differ with respect to STGs. This lets us test the hypothesis that specific STGs are associated with long-term NEET or long-term exit.
  3. To test the difference between a specific pair of groups (such as long-term NEET vs. long-term exit), we can fit a multinomial logistic regression, in which the regression coefficient (b1) is the estimated increase in the log odds of belonging to one group rather than the reference group.
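
As a reference for the quantities named in the plan, the odds ratio and the multinomial logit can be written in generic notation as below. This is a sketch, not the fitted model: the cell counts a, b, c, d, the single predictor x, the group index k, and the choice of reference group are assumptions introduced for illustration; only b1 comes from the plan itself.

```latex
% 2x2 table for one STG: rows = STG present / STG absent,
% columns = group A / group B, with counts a, b (present) and c, d (absent)
\mathrm{OR} = \frac{a/b}{c/d} = \frac{ad}{bc}

% Multinomial logit for group k against a reference group
\log \frac{P(Y = k \mid x)}{P(Y = \mathrm{ref} \mid x)} = b_{0k} + b_{1k}\, x,
\qquad
\exp(b_{1k}) = \text{odds ratio for a one-unit increase in } x
```

Under this form, an OR of 1 (equivalently, b1 = 0) corresponds to no association between the predictor and group membership.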

2 Data Analysis

YM = Young mothers; HY = Hidden Youth; SD = School dropouts; PSD = Potential school dropouts; YASB = Youth with anti-social behavior(s); EMY = Ethnic minority youth; YRCS = Youth living in residential care settings; YO = Youth offender; YCR = Youth with criminal record(s); SEN = Youth with special education needs;

R Code
source("Code/data_Preparation_NEET0609.R")

2.1 Descriptive statistics

The parental education variables (EduF and EduM, highlighted in Table 1) have relatively high missing rates because many respondents chose option 7, “none of the above”. These cases will be excluded from the subsequent OR analysis and multinomial regression analysis.

R Code
kable(res$desc, digits = 2) |> 
  row_spec(which(rownames(res$desc) %in% c("EduF", "EduM")), background = softcolors[3]) |> 
  kable_styling()
Table 1: Sample Size for demographic variables
Variable      N  (NA)   Mean    SD | Median  Min  Max  Skewness  Kurtosis
Male        443     0   1.50  0.50 |      2    1    2      0.00     -2.00
Age         443     0  18.73  3.26 |     18   14   29      0.72     -0.05
EduF        170   273   2.56  1.31 |      2    1    6      0.88      0.11
EduM        184   259   2.44  1.17 |      2    1    6      0.84      0.40
EduS        439     4   2.91  0.85 |      3    1    6      1.24      2.39
Residence   443     0  17.60  4.48 |     17    1   29     -0.70      1.98
Assistance  438     5   1.84  0.37 |      2    1    2     -1.80      1.26

Cases with missing values on EduS (4 cases) or Assistance (5 cases) are removed; Residence has no missing values. The final sample is N = 434.

Table 2: Frequency table for demographic variables
Var         Level    N      %
Male        1      216   49.8
            2      218   50.2
Education   1        2    0.5
            2      130   30.0
            3      240   55.3
            4       35    8.1
            5       19    4.4
            6        8    1.8
Assistance  1       70   16.1
            2      364   83.9

Table 3: Sample size by groups
Subgroup           N     Prop(%)
Long-Term NEET     153   35.25
Long-Term Exit     50    11.52
Temporary NEET     231   53.23
Missing            1709  79.96

In Table 3, the last column, Prop(%), means different things for the NEET subgroups and for the Missing row. For the Missing row, it is the proportion of cases with a missing subgroup out of the whole sample (N = 2143). For the NEET subgroups, it is each group's share of the NEET sample (N = 434).
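
The within-NEET shares can be reproduced from the counts alone. The following is a minimal base-R sketch, assuming only the three subgroup counts reported in Table 3; the object names `neet_counts`, `n_neet`, and `neet_prop` are hypothetical and do not come from the analysis script.

```r
# Subgroup sizes copied from Table 3 (NEET subgroups only; the Missing row
# uses a different denominator and is not computed here).
neet_counts <- c("Long-Term NEET" = 153,
                 "Long-Term Exit" = 50,
                 "Temporary NEET" = 231)

# Denominator for the NEET rows: the analysis sample (N = 434).
n_neet <- sum(neet_counts)

# Share of each subgroup within the NEET sample, in percent.
neet_prop <- round(100 * neet_counts / n_neet, 2)
print(neet_prop)
```

Summing the three counts recovers the analysis sample size, which is a quick consistency check on the table.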

R Code
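# Keep the demographic variables, the YM-to-SEN and CE1-to-CLDH1 columns, and
# the subgroup label; drop cases with missing EduS or Assistance.
datNeetWide <- NEET_Subgroup_Cleaned_basic |> 
  dplyr::select(Male, Age, EduS, 
                Residence, Assistance,
                YM:SEN, CE1:CLDH1, Subgroup) |> 
  filter(!is.na(EduS), !is.na(Assistance))
# Reshape to long format: one row per respondent (id) and variable, with all
# responses coerced to numeric.
datNeetLong <- datNeetWide |> 
  rownames_to_column('id') |> 
  pivot_longer(-c(id, Subgroup), names_to = "Vars", 
               values_to = "Resp", values_transform = as.numeric)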

datNeetLongTbl <- datNeetLong |> 
  group_by