Coding for the Future. Effects of a computer-assisted learning program on STEM and socio-emotional learning outcomes for girls.
Research Questions and Hypotheses

Question 1: Is there an effect of the “Coding for the Future” computer-assisted learning program on girls’ choice to pursue a STEM degree in college?
Question 2: Is there an effect of the “Coding for the Future” program on girls’ self-esteem and self-efficacy in middle school?
Question 3: To what extent does the relationship with self-esteem and self-efficacy persist across time?
Question 4: How do these relationships vary by race?

H1: Given existing literature that supports a positive relationship between computer-assisted learning (CAL) programs and STEM-related skills (Escueta et al., 2020; Malamud & Pop-Eleches, 2011; Campuzano et al., 2009; Rutherford et al., 2014; Van Klaveren et al., 2017), coupled with literature indicating that the younger girls are when they develop STEM-related skills, the more likely they are to pursue a STEM-related field (Unkovic et al., 2016; Chetty & Barlow-Jones, 2018; Eagly and Karau, 2002), we hypothesize that a CAL program targeted at young girls will positively affect the decision to pursue a STEM degree in college.

H2: Existing studies of the STEM gender gap point out that self-efficacy and self-esteem during the school years are critical determinants of the choice to pursue STEM fields (Unkovic et al., 2016; Chetty & Barlow-Jones, 2018). We hypothesize that exposing girls to STEM skills at a young age will increase their self-esteem and self-efficacy.

H3: Given that the long-term relationship between CAL and non-cognitive traits such as self-esteem and self-efficacy has not yet been explored, this hypothesis will remain non-directional. However, if H1 is not refuted, we would expect these non-cognitive traits to be higher.

H4: Among women, who are already underrepresented in STEM fields, Black and Hispanic women are particularly underrepresented (Farinde & Lewis, 2012; Price, 2010).
We want to explore whether the effects captured in RQ1 and RQ2 vary by race. Our study will be the first to explore the effects of a girl-targeted CAL program on the pursuit of STEM degrees and on related non-cognitive traits, and no existing literature breaks down this relationship by race. Although we expect to find differentiated effects, the hypothesis is non-directional.

Data source: South Bend School District Long Term Data (SBSD-LTD)

The South Bend School District Long Term Data is a longitudinal dataset that follows four cohorts of students across critical stages of their lives. It is an initiative of the Kellog Institute at Notre Dame University. Its primary goal is to follow educational, social, health, and economic trends across two large generational pools of students in the city in order to better inform policy decisions. It was inspired by the Young Lives international study, which follows a similar premise in several countries across the world.

The dataset consists of four cohorts of students: 2000, 2001, 2010, and 2011. The cohort year is the first year in which data were collected for that cohort, which is the year its members were finishing 1st grade. Follow-ups were conducted when most of each cohort was expected to reach milestones in their educational paths: 4th grade, 6th grade, 8th grade, 12th grade, freshman year in college, and four years after expected graduation. Follow-ups were conducted even if members of the cohort did not reach the expected milestones (for example, students who repeated a grade or who did not attend college were still contacted, and information was still collected from them). Each cohort originally consisted of approximately 1,000 students drawn from the 30 schools in South Bend; attrition and dropout reduced this number with each round of data collection. The sample is statistically representative of that age group in South Bend.
Further detail on the sample and the number of participants is presented below.

The SBSD-LTD collects information on a variety of topics. For the school years, the dataset includes parent, teacher, and student questionnaires, each split into subject areas. The parent questionnaire has questions on socioeconomic variables, such as education level, degree (where applicable), nationality, parents’ ethnicity and race, household income, employment, house material,
household family members, time allocation, mother’s health, attitudes towards school, and attitudes towards technology. The student questionnaire contains sections on student identification, such as race, ethnicity, and date of birth; students’ attitudes in school, such as motivation towards school subjects, self-efficacy, self-esteem, agreeableness, openness to experience, conscientiousness, and growth mindset; and attitudes towards and relationships with their peers, with scales related to bullying, anxiety, and sense of belonging. Once students enter high school, the questionnaires begin to include questions on risk behaviors such as drug and alcohol consumption and sexual activity, initial labor activities, and possible degree preferences. When they enter college, the questionnaires cover college and degree preferences, acceptances, applications, decisions, employment, and family planning. The Kellog Institute coordinates with the School District to collect information on regular grades, standardized test scores in ELA and Math, class sizes, and participation in any state, district, or school program, and matches that information to the SBSD-LTD dataset. Furthermore, teacher questionnaires collect information on teacher experience, specialization, race, certification, growth mindset, attitudes towards technology, and scales related to their relationships with school personnel and their students.

Participants. For the purpose of this paper, we will focus on the girls in the 2000 and 2001 cohorts, who finished 6th grade in 2005 and 2006, respectively. We chose these cohorts because the “Coding for the Future” program began to be implemented with 6th-grade students in 2006. Table 1 presents the sample size for each study-relevant year. As Table 1 shows, the rounds we take into account correspond to the years in which the students were in 4th grade, 6th grade, and their freshman year of college.
The reasoning behind the choice of these years will be further explained in the analysis section of this proposal.
Table 1. Sample for study

Cohort   Sample at Round 2   Sample at Round 3   Sample at Round 6   Sample used
         (4th grade)         (6th grade)         (freshman year)     (common across rounds)
2000     515 (Year 2003)     510 (Year 2005)     503 (Year 2012)     502
2001     578 (Year 2004)     570 (Year 2006)     543 (Year 2013)     538

Our final sample comprises 502 girls from the 2000 cohort and 538 girls from the 2001 cohort, giving us a total of 1,040 girls. We have information from all three relevant rounds for every member of the sample. It is essential to point out that we restrict the sample to girls because the program in question targets girls; we aim to estimate the effects of the program within the girl-identifying population.

Outcome 1 – pursuit of a STEM degree. Our first relevant outcome will come from the Round 6 questionnaire. We will construct two dummy variables:
– Strong STEM choice: this will be a dummy variable that will take the value of 1 if the student is enrolled in a STEM major at the end of their corresponding freshman year in college and 0 otherwise. We call this a strong choice because the 0 condition groups girls who are pursuing STEM only as a minor together with girls who may not be enrolled in college at all.
– Weak STEM choice: this will be a dummy variable that will take the value of 1 if the student is enrolled in a STEM major or minor at the end of their corresponding freshman year in college and 0 otherwise. This is a more flexible definition.
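As an illustration of the two definitions above, a minimal Python sketch follows. The record fields (`enrolled_in_college`, `stem_major`, `stem_minor`) are hypothetical names chosen for this example and are not actual SBSD-LTD variables.

```python
from dataclasses import dataclass


@dataclass
class Round6Record:
    """Hypothetical Round 6 fields; real SBSD-LTD variable names may differ."""
    enrolled_in_college: bool
    stem_major: bool
    stem_minor: bool


def strong_stem_choice(r: Round6Record) -> int:
    # 1 only for a STEM major; STEM minors and non-enrolled girls are coded 0.
    return int(r.enrolled_in_college and r.stem_major)


def weak_stem_choice(r: Round6Record) -> int:
    # 1 for a STEM major or a STEM minor; 0 otherwise.
    return int(r.enrolled_in_college and (r.stem_major or r.stem_minor))


# Toy records, one per case discussed in the definitions.
records = [
    Round6Record(True, True, False),    # STEM major
    Round6Record(True, False, True),    # STEM minor only
    Round6Record(True, False, False),   # enrolled, no STEM
    Round6Record(False, False, False),  # not enrolled in college
]
strong = [strong_stem_choice(r) for r in records]
weak = [weak_stem_choice(r) for r in records]
print(strong, weak)
```

The two dummies differ only for the girl with a STEM minor, which is the case that separates the strong definition from the weak one.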
Outcome 2 – self-efficacy and self-esteem. The measures of self-efficacy and self-esteem will come from all rounds of the database, which is possible because the same scales were used across rounds. We will use the math-related self-efficacy scale, comprising 13 items, and the self-esteem scale, comprising 11 items. We will conduct a principal component analysis for each scale, predict the main factor, and construct an index for each scale. Each index will be continuous and range from 0 to 1.

“Coding for the Future.” The primary treatment variable is participation in the program “Coding for the Future.” This is a city-wide computer-assisted learning program targeted at girls in middle schools across the city. The program’s primary goals were to familiarize young girls with coding interfaces, motivate them toward STEM fields at a young age, and increase their self-efficacy and self-esteem. It was a one-year program meant to be delivered once a week during 6th grade. Implementation began in 2006 in all of the middle schools in South Bend. This means that girls who attended 6th grade in 2005 did not receive the program, and girls who
attended 6th grade in 2006 did receive the program. This piece of information is critical for our identification strategy.

Control variables. The SBSD-LTD is a very rich dataset that will allow us to use several control variables. However, we do not want to overload the analysis with irrelevant variables. The main controls we will use are the following. To account for the possibility that parents’ degree and education level influence their children’s degree choices, we will control for whether the parents hold a STEM degree and whether they hold a college degree in general. Furthermore, we want to make sure that income level is not driving our results, so we will control for that variable as well. Since parent involvement can also drive the outcomes we are studying, we will use the time allocation measures to control for how much time parents spend doing schoolwork with their children. Regarding students’ characteristics, we will control for race, student achievement, and peer relations. Given the longitudinal character of the dataset, we will control only for specific teacher characteristics at the time the program was implemented, such as race, whether the teacher’s race matches the student’s, and years of experience.

Analysis
The fact that “Coding for the Future” was implemented across all of the schools in 2006 is critical for our identification strategy. All of the 2001 cohort received the program, whereas the 2000 cohort did not. However, girls from different cohorts who attended the same schools should be reasonably similar. This is a strong assumption, but one that can be tested using earlier data from SBSD-LTD Rounds 1 and 2. The Kellog Institute considers Cohorts 1 and 2 a “super-cohort,” given their similarities in trends and overall characteristics. For this reason, Cohort 2 (2001, in 6th grade in 2006) will serve as our treatment group, and Cohort 1 (2000, in 6th grade in 2005) will serve as our control group.

Question 1: Is there an effect of the “Coding for the Future” computer-assisted learning program on the choice of pursuing a STEM degree in college for girls? For this question, we want to look at the differences in Outcome 1 (enrollment in a STEM degree at the end of freshman year) between girls in our treatment and control groups. Given that this outcome is measured on only one occasion, we will use a propensity score matching (PSM) strategy to answer this question. This is a non-parametric strategy that finds subjects who are as statistically close to each other as possible on every observable variable except the treatment variable and compares the outcome variable between them. The steps we will follow are outlined below:
i. We will estimate the propensity score using the control variables described in the previous section. We will use a Probit specification to estimate the propensity score for each sample member. The propensity score can be interpreted as “the probability of being selected into the treatment”. The Probit model is as follows: $PS_i = BX' + \epsilon$, where $PS_i$ is the propensity score for each participant $i$, which is predicted by a matrix $X$ of predictors that includes all of our control variables. We will not include participation in the program, the outcome variable, or age as predictors in this regression.
ii. Once we have the PS for each participant, we want to find the area of common support. This means that we will look for participants in Cohort 1 and participants in Cohort 2 who have similar propensity scores. The area of common support helps ensure that our groups are comparable, since they are statistically similar in almost everything except participation in the program.
iii. After we have common support, we will use Stata to non-parametrically estimate the average treatment effect with both nearest-neighbor matching and kernel matching. We will estimate both matching methods for robustness. The estimation will yield the difference between groups in our dummy variables for being enrolled in a STEM field at the end of freshman year. Since the outcome takes the value 0 or 1, we can interpret our results as how much participating in the program changes the probability of being enrolled in a STEM degree at the end of freshman year.
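Steps ii and iii can be sketched in Python as follows. This is an illustration under stated assumptions, not the Stata routine the analysis will use: propensity scores are taken as already estimated in step i, matching is one-to-one nearest neighbor with replacement, the estimate is the average effect for the treated group, kernel matching is not shown, and all values are toy numbers rather than SBSD-LTD data.

```python
from statistics import mean


def common_support(treated, control):
    """Keep units whose score lies in the overlap of both groups' score ranges."""
    lo = max(min(p for p, _ in treated), min(p for p, _ in control))
    hi = min(max(p for p, _ in treated), max(p for p, _ in control))

    def keep(units):
        return [(p, y) for p, y in units if lo <= p <= hi]

    return keep(treated), keep(control)


def nearest_neighbor_att(treated, control):
    """Match each treated unit to the control unit with the closest score
    (with replacement) and average the outcome differences."""
    diffs = []
    for p_t, y_t in treated:
        _, y_c = min(control, key=lambda unit: abs(unit[0] - p_t))
        diffs.append(y_t - y_c)
    return mean(diffs)


# (propensity score, STEM dummy) pairs; assumed toy values.
cohort_2001 = [(0.30, 1), (0.50, 1), (0.70, 0), (0.95, 1)]  # treated
cohort_2000 = [(0.10, 0), (0.32, 0), (0.48, 1), (0.72, 0)]  # control

treated, control = common_support(cohort_2001, cohort_2000)
att = nearest_neighbor_att(treated, control)
print(len(treated), len(control), att)
```

In this toy example one unit from each cohort falls outside the overlap and is dropped before matching. Because the outcome is a 0/1 dummy, the result reads as a difference in enrollment probabilities.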
To the extent that the sample size allows, we will try to limit our matches to students within the same schools. This will be done to control for school characteristics such as school climate, principal, and even neighborhood.

Question 2: Is there an effect of the “Coding for the Future” program on girls’ self-esteem and self-efficacy in middle school? For this question, we have several measurement occasions for our outcome variables, which correspond to Outcome 2. Although our treatment and control groups remain the same, we will employ a difference-in-differences approach. This is a stronger approach than PSM because it allows us to control for unobservable characteristics that are stable across time. Although we assume our cohorts are similar, even if there is a potential difference in unobservable characteristics that could drive differences in our outcome variable, we expect the trends to be similar across time. Therefore, a change in trends would be attributable to the treatment, in this case the “Coding for the Future” program. The equation we will estimate is the following:
$y_i = \beta_0 + \beta_1 t + \beta_2 \mathit{cal} + \beta_3 (t \times \mathit{cal}) + BX' + \epsilon_i$

where $y_i$ represents either the self-efficacy or the self-esteem outcome for each participant, $\beta_0$ is the intercept, $t$ is time, which takes the value of 0 if the measures come from 4th grade and 1 if the measures come from 6th grade, $\mathit{cal}$ is the treatment variable, a dummy that represents participation in the program, and our parameter of interest is $\beta_3$. $BX'$ is a matrix of the control variables mentioned in the previous section and $\epsilon_i$ is the error term. A positive, statistically significant value of $\beta_3$ would mean that the program had a positive effect on the measures of self-efficacy and self-esteem for participating girls in the year the program took place. It is important to note that although our control and treatment groups answer questionnaires in different years, we are not incorporating the year of response into the analysis. Rather, we are using the grade they were in as the common characteristic that makes them comparable.
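When the controls are left out, the OLS estimate of the interaction coefficient (β3) equals the treated cohort's change in mean outcome from 4th to 6th grade minus the control cohort's change. The Python sketch below shows only that special case. The tuple layout `(t, cal, y)` and the index values are assumptions made for illustration, and the full specification with controls would need a regression routine.

```python
from statistics import mean


def did_estimate(rows):
    """rows: iterable of (t, cal, y) with t and cal in {0, 1}.

    Returns the difference-in-differences of the four cell means, which is
    the interaction coefficient of the regression without covariates.
    """
    cells = {(t, cal): [] for t in (0, 1) for cal in (0, 1)}
    for t, cal, y in rows:
        cells[(t, cal)].append(y)
    m = {key: mean(values) for key, values in cells.items()}
    # (treated change from 4th to 6th grade) minus (control change)
    return (m[(1, 1)] - m[(0, 1)]) - (m[(1, 0)] - m[(0, 0)])


# Toy self-efficacy index values on a 0-1 scale, not SBSD-LTD data.
rows = [
    (0, 0, 0.40), (0, 0, 0.50),  # 2000 cohort, 4th grade
    (1, 0, 0.50), (1, 0, 0.60),  # 2000 cohort, 6th grade
    (0, 1, 0.45), (0, 1, 0.55),  # 2001 cohort, 4th grade
    (1, 1, 0.65), (1, 1, 0.75),  # 2001 cohort, 6th grade
]
beta3 = did_estimate(rows)
print(round(beta3, 2))
```

Here the treated cohort's mean rises by 0.20 and the control cohort's by 0.10, so the estimate is 0.10 index points. The treated cohort starts slightly higher in 4th grade, and that level difference cancels out of the estimate.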
Robustness check – fixed effects at the school level. In order to control for school characteristics such as school climate, principal, and even neighborhood, we will also estimate the same equation including a fixed-effects term $\lambda_s$ for schools:
$y_i = \beta_0 + \beta_1 t + \beta_2 \mathit{cal} + \beta_3 (t \times \mathit{cal}) + BX' + \lambda_s + \epsilon_i$

Question 3: To what extent does the relationship with self-esteem and self-efficacy persist across time? For this question, we will follow a strategy similar to the one used for the previous question. The only difference is the time variable. Instead of taking the value of 0 in 4th grade and 1 in 6th grade, it will take the value of 1 in the freshman year of college. This is because we want to look at the long-term effects of the program. We will also conduct the robustness check using fixed effects at the school level.

Question 4: How do these relationships vary by race?

For choice of STEM degree. Since this is a non-parametric estimation, we will conduct the same analysis within subsamples, specifically for White, Black, Hispanic, and Asian women. We will first filter the data to keep all girls who mainly identify with each race, and then conduct the PSM analysis for that group. This way, we will compare Black women who participated in the program with Black women who did not (and likewise for each racial group) without averaging out the effect.

For self-efficacy and self-esteem. Although we could also filter the sample and use the same equations, we can instead include interaction terms with race dummies in the estimation equations to capture racial differences. This would make the equations look like this:
$y_i = \beta_0 + \beta_1 t + \beta_2 \mathit{cal} + \beta_3 (t \times \mathit{cal}) + \beta_4 (\mathit{cal} \times \mathit{Black}) + \beta_5 (\mathit{cal} \times \mathit{Hispanic}) + BX' + \lambda_s + \epsilon_i$

The interaction of the treatment with the race dummies will allow us to interpret the race-specific effects on our non-cognitive outcomes, in both the short and the long term.
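A specification of this kind, with treatment-by-race interactions and school fixed effects entered as dummies, can be sketched in Python with NumPy. Everything here is assumed for illustration: the column names, the three toy schools, the race main effects standing in for the controls, and the coefficient values, which are made up to generate noise-free outcomes so that least squares recovers them exactly. The planned analysis itself will be run in Stata.

```python
import itertools

import numpy as np

SCHOOLS = (0, 1, 2)  # toy school identifiers
RACES = ("White", "Black", "Hispanic")
NAMES = ["const", "t", "cal", "t_x_cal", "cal_x_black", "cal_x_hispanic",
         "black", "hispanic", "school_1", "school_2"]


def design_row(t, cal, race, school):
    """Build one row of regressors for a girl observed in period t."""
    black = int(race == "Black")
    hispanic = int(race == "Hispanic")
    # School 0 is the omitted reference category for the fixed effects.
    school_dummies = [int(school == s) for s in SCHOOLS[1:]]
    return [1, t, cal, t * cal, cal * black, cal * hispanic,
            black, hispanic, *school_dummies]


# Balanced toy grid: every combination of period, cohort, race, and school.
grid = list(itertools.product((0, 1), (0, 1), RACES, SCHOOLS))
X = np.array([design_row(*g) for g in grid], dtype=float)

# Made-up coefficients, used only to generate noise-free outcomes.
beta_true = np.array([0.40, 0.05, 0.02, 0.10, 0.03, 0.06,
                      -0.02, -0.01, 0.01, -0.01])
y = X @ beta_true

# Ordinary least squares via NumPy's least-squares solver.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
estimates = dict(zip(NAMES, beta_hat))
print({name: round(float(value), 3) for name, value in estimates.items()})
```

White girls are the reference group in this sketch, so `cal_x_black` and `cal_x_hispanic` measure how the treated cohort's outcome level differs for Black and Hispanic girls relative to White girls.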