Nudges to Improve Learning and Gender Parity: Preliminary findings on supporting parent–child educational engagement during Covid-19 using mobile phones

In this study we evaluate a digital intervention to improve low-literate caregivers’ engagement with their children’s education and development in rural Ghana during the Covid-19 pandemic. The programme was a text-message-based behavioural change intervention for parents / caregivers that aimed to improve caregiver engagement in children’s educational activities, caregiver beliefs about returns to education, as well as children’s learning, enrollment, attendance, and gender parity in education. This household-randomised trial, conducted in the North East, Northern, Savannah, Upper East, and Upper West regions of Ghana, tested four variations of the intervention, varying both duration and a gender-parity focus. Households were randomised to one of five conditions: (i) regular behavioural nudges, 12 weeks; (ii) gender-boost behavioural nudges, 12 weeks; (iii) regular behavioural nudges, 24 weeks; (iv) gender-boost behavioural nudges, 24 weeks; or (v) control. The interventions were implemented from January to April 2021 (for the 12-week groups) and January to June 2021 (for the 24-week groups). We collected data at midline (April–June 2021) and endline (August–September 2021). Our preliminary results suggest that a short, light-touch, SMS-based intervention can change caregiver behaviours and child outcomes in a rural, low-literate sample. However, the results were complex and intervention effectiveness depended on the caregiver having minimum levels of schooling. For caregivers with no education (65% of the sample), the intervention increased only caregiver expectations that their children would reach the desired level of education, especially for girls, and reduced educational engagement and some measures of children’s school enrollment and attendance. Educational engagement among Ghanaian caregivers is low relative to peer countries (⇡Bornstein & Putnick, 2012).
The findings suggest that caregivers may need a base level of capital and resource (e.g., exposure to formal education) to enact the messages and increase their educational engagement with their children. Without this base level of capital, messages may increase caregivers’ aspirations for their children without providing enough support to change educational investments in positive ways.


List of tables
Table 1. Background information on the primary caregiver collected at baseline.
Table 2. Treatment impacts on primary outcomes at midline and endline.
Table 3. Impact heterogeneity by caregiver education.
Table 4. Heterogeneity by child gender.
Table 5. Impact heterogeneity by child age group.

SMS: Short Message Service (text message)

Introduction
In this section, we first provide background to the project, followed by details of the study and context, as well as a detailed presentation of the research questions and implications of the project for policy and practice.

Background to the study
The Covid-19 pandemic led to unprecedented extended school closures around the globe. Ghana's schools were closed from March 2020 through January 2021. In addition to evidence of growing inequalities in access to remote learning (⇡Innovations for Poverty Action, 2020), there is concern that as schools have re-opened, the most vulnerable children are the ones least likely to have returned to school.
Negative macroeconomic shocks, such as recessions, health crises, or droughts, increase the opportunity costs of schooling, inducing decreases in educational investments, especially for girls (⇡Björkman-Nyqvist, 2013). Further, emerging evidence shows that the increased caregiving responsibilities stemming from the Covid-19 crisis are disproportionately assumed by women and girls (⇡Nesbitt-Ahmed & Subrahmanian, 2020), putting at risk both progress towards gender parity and girls' chances of returning to school. The need for low-cost, gender-sensitive solutions to minimise disruptions to learning is urgent.
Caregiver involvement in children's education in Ghana is low (⇡Bornstein & Putnick, 2012; ⇡McCoy et al., 2018), particularly in the more disadvantaged northern regions. Engaging parents / guardians / caregivers (henceforth 'caregivers') is key to ensuring an equitable return to school for boys and girls and learning opportunities at home. Yet disadvantaged caregivers face informational barriers and social norms / expectations that hinder their support for learning (⇡Bergman, 2019). Many have differing perspectives about the returns to educational investment, especially for girls, or about what is socially expected regarding educational engagement. Providing timely, actionable information to poor and low-educated caregivers, including via text messages as a low-cost intervention, can attenuate these barriers and improve caregiver engagement across child age groups and gender (⇡Bergman, 2019). Whether such interventions work during a pandemic, when stressors are greater than under non-emergency circumstances, is unknown.

Purpose / aims of this evaluation
We evaluate an intervention that aims to improve caregiver engagement in education among low-literate parents in rural Ghana and provide the first evidence on heterogeneity by implementation duration. First, we provided timely, actionable information to caregivers via text messages ('nudges') to attenuate caregiver behavioural and informational barriers to learning. Second, for a subset of caregivers, messages were tailored to address differing perspectives and norms about girls' education, aiming to equalise educational opportunities, caregivers' investments, and time use between learning and care-work. Finally, we tested whether there are differences in impacts based on implementation duration.

Context of the study
Our study took place in five rural regions of Ghana. Ghana's Human Capital Index is 0.44, meaning that a child born today can only be expected to reach 44% of their potential. Further, 70% and 80% of children in Grades 2 and 4, respectively, cannot read a simple word or perform basic arithmetic operations (⇡World Bank, 2018). Gender inequalities in learning are wide, as shown in ⇡World Bank (2018). For instance, while national re-enrollment rates as schools re-opened are 97%, 60% of dropouts were girls (⇡World Bank et al., 2021).
Our sample includes households with school-aged children. Our implementing partner, Movva, sent twice-weekly nudges via text messages (SMS) to caregivers in simplified English, with reminders and encouragement messages targeted at habit formation. Messages included suggestions of activities that support social-emotional development and nurture education and that do not require curricular knowledge. Specific suggestions related to supporting children's return to school were included. The messages included reminders, encouragement, and activities addressing information gaps, differing perspectives behind gender inequalities in education, and broader development. Each core module was delivered over two weeks, with two messages per week, and structured as a specific sequence grounded in behavioural economic theory to induce behaviour change: a motivating fact (Message 1), a suggested activity (Message 2), an interactive message (Message 3), and a growth message (Message 4). Movva's programme has been implemented in several countries, such as Brazil (including during the Covid-19 crisis) and Cote d'Ivoire, for as little as one month to an entire school year (⇡Bettinger et al., 2021; ⇡Lichand & Wolf, 2021), but variation in the duration of exposure to messages inducing belief and behaviour change has never been tested.
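The module structure described above (two-week modules, two messages per week, four message types in a fixed order) can be laid out as a simple schedule. The sketch below is illustrative only: the function and class names are hypothetical, and it assumes that modules run back to back with no gaps, which the text does not state explicitly.

```python
from dataclasses import dataclass

# Message types in the order the text gives for each two-week core module.
MODULE_SEQUENCE = (
    "motivating fact",
    "suggested activity",
    "interactive message",
    "growth message",
)
MESSAGES_PER_WEEK = 2  # stated in the text
WEEKS_PER_MODULE = 2   # stated in the text


@dataclass(frozen=True)
class ScheduledMessage:
    week: int          # 1-based programme week
    slot: int          # 1 or 2: first or second message of that week
    module: int        # 1-based module number
    message_type: str


def build_schedule(duration_weeks: int) -> list[ScheduledMessage]:
    """Lay out messages for a programme of the given length (hypothetical helper).

    Assumes modules follow one another without gaps.
    """
    if duration_weeks <= 0 or duration_weeks % WEEKS_PER_MODULE != 0:
        raise ValueError("duration must be a positive whole number of two-week modules")
    schedule = []
    for week in range(1, duration_weeks + 1):
        module = (week - 1) // WEEKS_PER_MODULE + 1
        week_in_module = (week - 1) % WEEKS_PER_MODULE
        for slot in range(1, MESSAGES_PER_WEEK + 1):
            position = week_in_module * MESSAGES_PER_WEEK + (slot - 1)
            schedule.append(
                ScheduledMessage(week, slot, module, MODULE_SEQUENCE[position])
            )
    return schedule
```

Under these assumptions, a 12-week programme yields 24 messages across six modules and a 24-week programme yields 48 messages across twelve modules.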

What this study adds to the knowledge base
Our study makes several contributions to the knowledge base about nudge-based caregiver / parenting programmes. First, most studies testing SMS-nudge interventions for parents have been conducted in middle- or high-income countries, where caregivers likely have a base level of human capital and formal education; we conducted this study in a low-literacy and low-education rural sample, growing the evidence base on whether and when these types of intervention to support child education and caregiver engagement are effective. Second, we tested whether a focus on gender parity can make a larger difference for girls' education compared to behavioural nudges generally about parenting for all children. Third, we tested this strategy during a public health crisis, namely the Covid-19 pandemic, and assessed the impacts of an SMS-based nudge programme during a relatively stressful macroeconomic and health crisis. Fourth, we tested whether the duration of messages matters for behaviour change, a question that has never been tested before.
This field experiment tested the following primary hypotheses:
1. Do behavioural nudges to caregivers in the form of SMS messages increase caregiver engagement in educational activities?
Hypothesis: SMS nudges increase caregiver engagement in their child's education and school life.
2. Do messages change caregiver beliefs about returns to education and educational expectations and aspirations for each child of target age?
Hypothesis: SMS nudges increase caregivers' support for and investment in their child's education and aspirations for the future, and thus children's schooling and time devoted to educational activities should increase.
3. Do messages improve children's schooling outcomes (i.e., enrollment, attendance)? Are there changes in learning outcomes?
Hypothesis: SMS nudges improve schooling outcomes both in the short-term and medium-term and decrease school dropout rates.
We focus on schooling as primary outcomes (rather than learning), as it is not clear that children's learning outcomes will improve. Given low educational quality, attending school does not necessarily translate into learning.
Our design is a household-randomised trial. Randomisation was stratified within each of the five regions. Our final sample included 2,628 households randomly assigned to:
1. Nudges to caregivers supporting involvement with children's learning, child's social-emotional development, academic aspirations, and engagement (12 weeks).
2. A 'gender-parity boost' arm to caregivers of both boys and girls, in which some of the nudges include content promoting girls' education and addressing some common stereotypes around gender roles (12 weeks).
3. Treatment (1) implemented for 24 weeks, into the first term of the next academic year.
4. Treatment (2) implemented for 24 weeks, into the first term of the next academic year.
5. A comparison group that received no messages during the study period.

Implications for policy and practice
Our preliminary results suggest that a short, light-touch, SMS-based intervention can change caregiver behaviours. However, intervention effects vary widely by caregiver and child characteristics.
When we examined average results in the full sample, the interventions operated counter to our hypotheses: they decreased caregiver engagement, decreased self-reported school enrollment and attendance, worsened caregiver mental health, and decreased children's academic skills. These negative effects appear to be concentrated among less-advantaged caregivers and children, specifically caregivers with no formal education, girls, and younger children. On the other hand, caregivers who had some schooling increased their home and school engagement, consistent with our theory of change.
These findings suggest that caregivers may need a base level of resources (e.g., exposure to formal education) to translate the messages into positive behavioural changes for their children. Without this base level of human capital, the messages can backfire, as they may increase caregivers' aspirations for their children without providing enough support to change investments in positive ways. Indeed, in our study, the caregivers with no formal schooling reported a reduction in parental self-efficacy as a result of receiving these messages, while educated caregivers reported increases in self-efficacy. This finding is in line with recent evidence from El Salvador that the digital version of a programme previously found to be effective worsened caregiver mental health and increased stress during the pandemic, particularly among male caregivers (⇡Amaral et al., 2021).
Longer-term follow-ups would be important to understand how long these changes persisted after the programme was no longer being delivered to caregivers.
These results contribute to a small but growing evidence base about SMS-based nudge interventions to parents and caregivers (see ⇡Bergman, 2019 for a review). Importantly, the majority of studies that have found these types of programmes to improve parenting and child outcomes have been concentrated in middle- or high-income country contexts (⇡Bergman, 2019).
Our study is one of the first to test this type of programme in a rural, low-income, African setting, and during a public health and economic crisis.
Our findings suggest that careful consideration of the broader context of caregivers' and children's lives is needed to ensure programmes are tailored in ways that ultimately support caregiver investments in child education, caregiver-child relationships, and children's education. This attention to the context is especially pivotal in stressful times such as the current Covid-19 pandemic, which may be exacerbating existing poverty-related stressors.
We also note that our sample was predominantly Muslim, and that midline data collection occurred during Ramadan, when many caregivers and many older children were fasting. In addition, our endline data collection took place during the harvesting season, and many children were working in the fields to earn money for school fees to return to school. Thus, both midline and endline data collection occurred at particularly unusual or busy times of the year. Our results suggest that the macro-context in which families receive these messages may be key to understanding when they can be effective and when they may cause additional stress for caregivers. It is possible that the broader context was quite challenging for families at the time the programmes were implemented (e.g., fasting, economic hardship due to the pandemic) and that these messages related to parent and child investments caused additional stress for parents in ways that backfired.
We found that for caregivers with no formal schooling, educational expectations increased, making them more optimistic about the educational prospects for their children. In other words, the intervention may be successfully tackling an aspiration failure among the poor, which could otherwise lead to a self-sustaining poverty trap (⇡Ray, 2006). However, these caregivers do not seem to have the necessary human capital resources to act upon the nudges, e.g., by shifting their involvement in child education at home and school. By contrast, nudges seem to be effective in improving engagement among caregivers who have a minimum level of education. Schooling levels among caregivers are very low overall: as noted, only 35% of caregivers have some schooling, and among those caregivers who attended school at some point, around half have at most completed primary school. Thus, the programme seems to be effective even among caregivers with a very low level of education, provided that they have at least some education.

Structure
The rest of the report is organised as follows. In Section 2 we present related literature on the subject. Next, in Section 3 we discuss the methodology for the study. The results of the impact evaluation are then presented in Section 4. In Section 5 we highlight the policy implications. Finally, in Section 6 we draw conclusions from the findings.

Literature review
Ghana's Human Capital Index is 0.44, meaning that a child born today can only be expected to reach 44% of their potential. Further, 70% and 80% of children in Grades 2 and 4, respectively, cannot read a simple word or perform basic arithmetic operations (⇡World Bank, 2018). The need to find low-cost solutions to minimise disruptions to learning and schooling in the aftermath of school closures is urgent, especially in the most-disadvantaged northern regions. Caregiver engagement is a key input that supports children's school persistence and learning outcomes. Caregiver engagement may vary by child gender, due to different opportunity costs of schooling for girls and boys (e.g., larger involvement of girls in household or care-work, or greater time spent in work outside the household for boys), lower perceived returns to girls' education, and widespread gender bias in social norms and aspirations (⇡Alderman & King, 1998). Providing timely, actionable information to poor caregivers with a low level of education, including via text messages as a low-cost intervention, can attenuate these barriers and improve caregiver engagement across child age groups and gender (⇡Bergman, 2019). Whether such interventions work during and after a pandemic, when stressors are greater than under non-emergency circumstances, is unknown.
Research on remote learning programmes shows that caregiver participation is key to the learning of children, although this literature is mostly from high-income settings, leaving an important gap for low-income contexts. For instance, ⇡Myers et al. (2018) note that co-viewing remote classes provides essential support for children to respond to and learn from video chat interactions. Their findings suggest that children depend primarily on live social partners to make sense of their media experiences. Similarly, ⇡Troseth et al. (2006) suggest that learning deficits from televised-learning programmes experienced by children can be overcome through caregiver-child interaction. Additional research equally confirms the benefits of using text messages (SMS) to encourage caregiver-child engagement (⇡Doss et al., 2017), and a recent review of the literature found that such programmes can substantially improve children's learning outcomes at a low cost (⇡Bergman, 2019). For example, results from ⇡Doss et al. (2017) show that a low-cost, personalised literacy texting intervention to parents can have a substantial effect on student academic outcomes. Further, evidence from a school-based intervention in India shows that persuasive messaging around gender can positively address gender norms among parents and children (⇡Dhar et al., 2018).
Additionally, important evidence gaps exist when it comes to best practices in engaging caregivers through mobile messaging. Unanswered questions include the optimal length of exposure (i.e., how long messages need to be sent for to create long-term behaviour change), and the optimal degree of customisation, particularly to tackle caregivers' asymmetric beliefs for boys vs girls (i.e., what messages are most effective to change underlying beliefs about returns to girls' education, leading to long-term behaviour change). A further evidence gap relates to differences by child age in the effectiveness of mobile messaging interventions around caregiver engagement in child learning.

Methodology
Our design is a household-randomised controlled trial. Randomisation was stratified within each of the five regions. Households were assigned to receive either SMS nudges or no SMS nudges. Households in the comparison group did not receive any messages during the study period. Households were randomly assigned to one of five experimental groups:
1. Behavioural nudges (n = 513 households / primary caregivers): Primary caregivers received messages encouraging involvement with children's learning, their child's social-emotional development, academic aspirations, and ensuring children return to school after schools have reopened (12 weeks, 24 SMS).
2. Behavioural nudges with 'gender-parity nudges' (n = 527 households / primary caregivers): Primary caregivers received messages whose content built on that of the standard message treatment, with nudges increasing the salience of girls and including content promoting girls' education. Some messages addressed common stereotypes around gender roles (12 weeks, 24 SMS).
3. Behavioural nudges of longer duration (n = 540 households / primary caregivers): Primary caregivers received the same messages as Group 1, but the programme ran for 24 weeks, into the second term of the 2021 academic year (24 weeks, 48 SMS).
4. Behavioural nudges with 'gender-parity nudges' of longer duration (n = 518 households / primary caregivers): Primary caregivers of both boys and girls received messages in which some of the nudges included content promoting girls' education and addressing some common stereotypes around gender roles during the school closures. The programme ran for 24 weeks, into the second term of the 2021 academic year (24 weeks, 48 SMS).
5. Comparison group (n = 530 households / primary caregivers): No messages during the study period.
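To illustrate what stratifying the randomisation by region involves, the following is a minimal sketch rather than the procedure actually used in the study, which is not described in detail here. It assumes equal allocation probabilities across the five arms; the arm labels, function name, and seed are hypothetical.

```python
import random
from collections import Counter, defaultdict

# Hypothetical labels for the five experimental groups.
ARMS = (
    "nudges_12w",
    "gender_nudges_12w",
    "nudges_24w",
    "gender_nudges_24w",
    "comparison",
)


def stratified_assign(households, seed=0):
    """Assign (household_id, region) pairs to arms, balancing within region.

    Assumption: equal allocation across arms within each stratum.
    """
    rng = random.Random(seed)
    by_region = defaultdict(list)
    for household_id, region in households:
        by_region[region].append(household_id)

    assignment = {}
    for region in sorted(by_region):
        ids = sorted(by_region[region])
        rng.shuffle(ids)
        # Start each stratum at a random arm so that leftover households
        # (when the stratum size is not a multiple of five) do not always
        # fall into the same arms.
        offset = rng.randrange(len(ARMS))
        for i, household_id in enumerate(ids):
            assignment[household_id] = ARMS[(i + offset) % len(ARMS)]
    return assignment


def arm_counts(assignment):
    """Tabulate how many households ended up in each arm."""
    return Counter(assignment.values())
```

Because households are shuffled and dealt out within each region, arm sizes within any region differ by at most one, so region cannot be confounded with treatment status. Fixing the seed makes the assignment reproducible.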

Research questions
We test the following research questions:
1. Do nudges to parents in the form of SMS messages increase the rate of children returning to school and general engagement with education when schools reopen?
2. Do nudges change caregiver beliefs about returns to education and expectations and aspirations?
3. Do nudges improve children's learning and schooling outcomes (i.e., enrollment, attendance) in the medium-term?
4. Are impacts more equitable across girls and boys if messages focus on gender parity in education and in behaviours / attitudes towards girls?
5. Are impacts greater and do they persist for longer if exposure is longer (24 versus 12 weeks)?

Research framework / methodology
We conducted four main data collection activities for the study: the participant enrollment and caregiver survey (baseline), a mini-survey, the midline survey, and the endline survey. The participant enrollment and caregiver survey was administered by phone, whereas the midline and endline surveys were administered in person. All surveys and assessments were administered by trained enumerators. See timeline in Figure 1.
We conducted the participant enrollment and caregiver survey to screen households for eligibility (having a child in the target age group of 5-17 years), seek their consent, communicate project information, enrol them in the study, and obtain pertinent background information on the primary caregiver as detailed in Table 1 below. However, due to issues with some of the mobile phone numbers (i.e., inactive phone numbers, wrong numbers, phone numbers switched off, and connectivity issues in the catchment areas), we conducted in-person tracking of potential households in the study areas to update or identify over 150 new mobile phone numbers. Upon calling the potential respondents, trained enumerators screened them in line with the eligibility criteria. Once the respondent met the eligibility criteria, we sought the verbal consent of the primary caregiver to:
a. participate in the study;
b. allow the selected child(ren) to participate in the study;
c. be informed about the project, the intervention, and implementation modalities;
d. participate in the intervention (only those in the treatment group).
Both eligible primary caregivers and school-going children were enrolled during the participant enrollment and caregiver survey. However, no child assessment was conducted at this stage.
The midline survey was conducted to collect two main sets of information. First, to:
■ update the household and personal information of the primary caregiver and update the status of the eligible children selected at baseline;
■ measure changes in the primary caregivers' behaviour and attitudes as well as participation in the intervention.
Second, to:
■ conduct child direct assessment with eligible children on literacy, numeracy, socio-economic and other useful domains of child development.
The midline survey was conducted in person at the homes of the study participants. In each study household, we first interviewed the primary caregiver to obtain child-specific information on each selected child before assessing the eligible child(ren). We interviewed the caregiver first for two reasons: (1) to confirm the availability of the eligible selected child; and (2) to obtain caregiver-reported measures on each selected child in the caregiver survey.
The process was automated in SurveyCTO such that once the caregiver survey was completed and data synced to the SurveyCTO server, it published the eligible child's form to enable the assessment to be conducted on the particular child.
To sample children, we used the household roster and randomly selected one child from the 5-9-year-old range, and one child from the 10-17-year-old range. This allows us to examine direct impact on children of different ages in the household.
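The child-sampling rule can be sketched as follows. This is an illustration under stated assumptions, not the study's survey logic: the roster format, band labels, and function name are hypothetical, and the sketch assumes a household with no child in a given age range simply contributes no child for that range.

```python
import random

# Age ranges taken from the text; the band labels are hypothetical.
AGE_BANDS = {"younger": (5, 9), "older": (10, 17)}


def sample_focal_children(roster, rng):
    """Pick at most one child per age band from a household roster.

    roster: list of (child_id, age_in_years) tuples.
    rng: a random.Random instance, passed in so draws are reproducible.
    Returns a dict mapping band label to the chosen child_id; bands with
    no eligible child are omitted.
    """
    selected = {}
    for band, (low, high) in AGE_BANDS.items():
        eligible = sorted(cid for cid, age in roster if low <= age <= high)
        if eligible:
            selected[band] = rng.choice(eligible)
    return selected
```

For example, `sample_focal_children([("a", 6), ("b", 8), ("c", 13)], random.Random(1))` returns one of the two younger children together with child "c".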
The endline survey collected such information as household and personal information, the status of selected children, behaviour, and attitudes as well as participation in the intervention from primary caregivers. We also conducted child direct assessment with eligible children on literacy, numeracy, socio-economic, and other useful domains of child development. At endline, primary caregivers and children were interviewed in person based on their availability.
Of the 2,628 households interviewed at baseline, only 41 (1.56%) were not interviewed at midline. An additional 47 households were not interviewed at endline (88 total; 3.35% of the baseline sample).
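The attrition figures follow directly from the counts reported in this section, as the short check below shows. The variable names are illustrative; the numbers are those given in the text.

```python
# Group sizes as reported for the five experimental groups.
ARM_SIZES = {
    "behavioural nudges, 12 weeks": 513,
    "gender-parity nudges, 12 weeks": 527,
    "behavioural nudges, 24 weeks": 540,
    "gender-parity nudges, 24 weeks": 518,
    "comparison": 530,
}

BASELINE_HOUSEHOLDS = 2628        # interviewed at baseline
LOST_BY_MIDLINE = 41              # not interviewed at midline
ADDITIONAL_LOST_BY_ENDLINE = 47   # additionally not interviewed at endline


def attrition_rate(lost, baseline):
    """Share of the baseline sample not interviewed, in percent."""
    return 100 * lost / baseline


total_assigned = sum(ARM_SIZES.values())
midline_rate = attrition_rate(LOST_BY_MIDLINE, BASELINE_HOUSEHOLDS)
cumulative_lost = LOST_BY_MIDLINE + ADDITIONAL_LOST_BY_ENDLINE
endline_rate = attrition_rate(cumulative_lost, BASELINE_HOUSEHOLDS)

print(f"households across groups: {total_assigned}")                  # 2628
print(f"midline attrition: {midline_rate:.2f}%")                      # 1.56%
print(f"endline attrition: {cumulative_lost} ({endline_rate:.2f}%)")  # 88 (3.35%)
```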

Figure 1. Study timeline.
When the research team arrived at each community, they first met the chief or the community leader to conduct community entry protocols. As is customary, community entry involved paying a fee to inform the chief and his kinsmen about the presence of the team. Team members introduced themselves briefly, showed their IPA ID cards, and explained the activities to be carried out, their purpose, the mode of operation, and how many days the team would be in the community. Where possible, the chief then assigned someone to assist the enumerators in locating the households concerned. Upon reaching a household, the enumerator spoke with the household head and explained the purpose of the visit and what the entire process would entail, while the household head assisted in identifying the primary caregiver. During interviews, enumerators and respondents sat outside, where they were visible to everyone but in a quiet space free from external interruption. During the rainy season, when respondents could not provide spaces such as verandahs or corridors for interviews, the interview was rescheduled. Covid-19 protocols were strictly followed during all meetings. All participants provided consent in order to participate in the surveys and assessments.

Research instruments / tools
Our trial and pre-analysis plan were registered in the American Economic Association's Social Science Registry. Two main sets of outcome measures, primary and secondary, were used to assess the impacts of the intervention as specified in our pre-analysis plan. These outcome measures were collected using the same procedures for the intervention and the comparison groups.

Primary outcomes
We had five primary outcomes:
1. Caregiver engagement in education
2. Caregiver expectations and aspirations for their child's schooling
3. Caregiver expectations on returns on education
4. Caregiver beliefs about gender norms
5. Children's school enrollment and attendance (reported by both caregivers and children).

Caregiver engagement in education
At both midline and endline, caregivers reported on whether they engaged in a set of activities related to their child's education over the past three days, specific to their engagement with each of the two focal children sampled. Six activities were summed to create an index of the number of activities parents engaged in. Activities were slightly different for older and younger children to ensure developmental appropriateness. For home engagement for younger children (ages 5-9), we used the Multiple Indicators Cluster Survey (MICS) home stimulation index on six activities, which included reading or looking at books, telling stories, singing songs, taking the child outside the home, playing with the child, and naming / counting / drawing with the child. For older children (ages 10-17), we adapted these items based on the Young Lives surveys (⇡Barnett et al., 2013). Activities included working on a project together, playing sports / active games / exercise, discussing time management, talking about family / community history / heritage, discussing future education and career plans, and encouraging the child to listen to or watch remote teaching. For school engagement, we used a series of seven dummy variables that were the same across the age groups. These included whether, in the last month, the caregiver:
1. helped the child with homework
2. asked the child what they did at school
and whether, in the last academic year, they:
3. attended a PTA meeting
4. attended a scheduled meeting with the child's teacher
5. attended a school or class event
6. volunteered or served on a school committee
7. participated in fundraising for the child's school.
We also measured child-reported caregiver home engagement on their education, which is measured in the same way as the caregiver-reported measure. As the two scales are moderately correlated (r=0.40, p<0.01), we analysed treatment effects on both scales.
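As a sketch of how a summed engagement index of this kind can be constructed, the code below counts the activities a respondent reports. The item keys are hypothetical shorthand for the six MICS home stimulation activities for younger children; the survey's actual variable names are not given in the text.

```python
# Hypothetical keys for the six home activities listed for ages 5-9.
HOME_ITEMS_YOUNGER = (
    "read_or_looked_at_books",
    "told_stories",
    "sang_songs",
    "took_child_outside",
    "played_with_child",
    "named_counted_or_drew",
)


def activity_index(responses, items):
    """Count how many of the listed activities were reported.

    responses: dict mapping item key to True/False.
    Raises KeyError if any item is unanswered, so that missing data is
    not silently counted as 'no'.
    """
    missing = [item for item in items if item not in responses]
    if missing:
        raise KeyError(f"missing items: {missing}")
    return sum(1 for item in items if responses[item])
```

The same function applies to the child-reported version of the scale, which would allow the caregiver and child indices to be compared item set for item set.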

Caregiver expectations and aspirations for their child's schooling
At midline and endline, caregivers reported on their (i) educational aspirations and (ii) expectations for two focal children in the household: 'What is the highest level of education that you WISH [child] to achieve?', and 'What is the highest level of education that you EXPECT [child] to achieve?'. Overall, most caregivers had high aspirations for their child, with close to 80% aspiring for their child to achieve a post-secondary degree. We dichotomise both variables to indicate 'low aspirations' = 1 if the caregiver aspires for their child to reach a level below secondary high school, and 'low expectations' = 1 if the caregiver expects that the child will reach a level below secondary high school.
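The dichotomisation can be expressed as a simple threshold on an ordered set of education levels. The response categories below are an assumption made for illustration, since the survey's actual answer options are not listed in the text; only the threshold (below secondary high school) comes from the description above.

```python
# Hypothetical ordered response options, lowest to highest.
EDUCATION_LEVELS = (
    "none",
    "primary",
    "junior_high",
    "secondary_high",
    "post_secondary",
)
THRESHOLD = "secondary_high"  # levels below this count as 'low'


def low_indicator(level):
    """Return 1 if the reported level is below secondary high school, else 0."""
    rank = EDUCATION_LEVELS.index(level)  # ValueError for unknown levels
    return int(rank < EDUCATION_LEVELS.index(THRESHOLD))
```

The same function serves both variables: applied to the WISH question it gives the 'low aspirations' indicator, and applied to the EXPECT question it gives the 'low expectations' indicator.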

Caregiver expectations on returns on education
At endline only, caregivers reported on their perceived returns to education using a series of questions adapted from ⇡Attanasio & Kaufmann (2014).
We asked six questions about perceived returns to three levels of education.

Caregiver beliefs about gender norms
Gender bias was measured through two scales at baseline, midline, and endline. First, parents answered a single item: "Do you think education has a greater influence on your son's income than your daughter's income?" (Yes/No). Second, parents answered 14 items from the "Gender norms and attitudes scale" (⇡Waszak et al., 2001), which measures egalitarian beliefs about male and female gender norms. Specifically, the scale assesses whether parents agree or disagree with a series of statements about the promotion of equity for girls and women and maintaining the rights and privileges of men (14 items / 2 subscales). Example items include: "It is important that sons have more education than daughters", "Daughters should be sent to school only if they are not needed to help at home", and "Daughters should have just the same chance to work outside the homes as sons."

Children's school enrollment and attendance (reported by both caregivers and children)
School enrollment and attendance were assessed at midline and endline through parent and child reports. For attendance, respondents answered: "How many days of school has [NAME] missed in the past week of school (Monday-Friday)?" We dichotomise this variable to indicate 'low attendance' if the child never attends or attends infrequently (2 or 3 times per week), which corresponds to around 10% of the sample. We use child- and caregiver-reported scales for both school enrollment and attendance, as correlations between child and caregiver responses are moderate and low, respectively (school enrollment: r=0.65, p<0.01; low attendance: r=0.08, p<0.01).

Secondary outcomes
Our secondary outcomes are divided into those we hypothesise will be mediators to the primary outcomes and those that we consider distal outcomes to the treatment.

Child time use
Older children (10-17 years) were asked to report on the following: "On a typical weekday from Monday to Friday (not a weekend or a holiday, or a day in which a child was sick), how many hours did you spend on the following activities last week?" Children reported on the number of hours they spent on eight activities: sleep, caring for others (e.g., younger siblings or the elderly), household chores, working on the farm or other family business, working for pay, in school, studying, and engaging in leisure (e.g., playing). These items were adapted from Young Lives (⇡Barnett et al., 2013). Following Young Lives, children were given 24 counters and a piece of cardboard with eight circles representing the eight activity categories listed above. Fieldworkers asked the children to distribute the 24 counters according to the time spent on each of the eight activities, as shown in Figure 2.
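A response from the counter exercise could be checked as sketched below. The activity keys are hypothetical labels, and treating each counter as one hour of a 24-hour day is our reading of the task rather than something the text states explicitly.

```python
# Hypothetical keys for the eight activity categories.
ACTIVITIES = [
    "sleep", "caring_for_others", "household_chores", "family_farm_or_business",
    "paid_work", "school", "studying", "leisure",
]
TOTAL_COUNTERS = 24  # assumed: one counter per hour of the day

def hours_from_counters(counters: dict) -> dict:
    """Validate one child's allocation and return hours per activity."""
    missing = set(ACTIVITIES) - set(counters)
    if missing:
        raise ValueError(f"missing activities: {sorted(missing)}")
    total = sum(counters[a] for a in ACTIVITIES)
    if total != TOTAL_COUNTERS:
        raise ValueError(f"counters sum to {total}, expected {TOTAL_COUNTERS}")
    return {a: counters[a] for a in ACTIVITIES}

# Illustrative allocation, not study data.
example = {
    "sleep": 9, "caring_for_others": 1, "household_chores": 2,
    "family_farm_or_business": 1, "paid_work": 0, "school": 6,
    "studying": 2, "leisure": 3,
}
hours = hours_from_counters(example)
```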

Caregiver self-efficacy
At both midline and endline, caregiver self-efficacy was measured using one scale from Bandura's Parental Self-Efficacy Scale (⇡Bandura et al., 2001), specifically the eight-item subscale related to parental self-efficacy regarding their children's schooling and learning. The scale was scored from 1-5, with 1 = nothing, and 5 = a great deal. Example items include: "How much can you do to make your children see school as valuable?" and "How much can you do to help your children get good grades in school?"

Disciplinary practices
Disciplinary practices used by caregivers were self-reported for each focal child separately using the UNICEF MICS scale (⇡UNICEF, 2010), which asks caregivers whether they have used a series of disciplinary practices in the previous month. The scale includes three subscales that were combined into two: non-violent practices (three items) and violent practices (psychological aggression and physical violence, eight items). Caregivers were asked questions such as "Did you explain why [child]'s behaviour was wrong in the past month?", "Did you hit or slap [child] on the face, head, or ears in the past month?", and "Did you call [child] dumb, lazy, or another name in the past month?" Questions were adapted to the different child age groups.

Emotional supportiveness in the home
Caregiver emotional supportiveness was assessed at midline and endline using a 5-item scale from the Early Childhood Longitudinal Study, Kindergarten Cohort (⇡Chapman, 2010). Parents answered a variety of hypothetical questions specific to supporting their children's emotional needs, responding on a 4-point scale (1 = very often true, 2 = often true, 3 = sometimes true, and 4 = never true). Example items include: "Even if I am really busy, I make time to listen to [child]," and "I encourage [child] to talk about his/her troubles." Items were adapted to the specific child age groups.
Distal outcomes include (i) children's academic outcomes (literacy and numeracy), (ii) children's social-emotional skills, and (iii) caregiver mental health.

Children's academic skills
We measured literacy and numeracy skills for children. We covered similar subskills for all age groups (e.g., reading comprehension and oral vocabulary for reading; addition and subtraction for maths). Given the wide age range in our sample, we used three assessments to cover different skill levels (i.e., an assessment for 5-9-year-olds, an assessment for 10-14-year-olds, and an assessment for 15-17-year-olds). Using tasks from the International Development and Early Learning Assessment (IDELA), the Early Grade Reading Assessment (EGRA), and the Young Lives surveys, we measured the following skill domains for literacy: expressive language (oral vocabulary), non-word reading, spelling, oral reading and comprehension, and phonological awareness. Using tasks from the IDELA, the Early Grade Math Assessment (EGMA), and the Young Lives surveys, we measured the following skill domains for numeracy: number identification, number / quantity discrimination, missing number patterns, number sorting, word problems, and operations (addition / subtraction, and multiplication and division). To create a summary score, we calculated the total number of correct answers on each subtask within each of three age groups (5-9 years, 10-14 years, and 15-17 years), because the number and difficulty of questions varied by child age. We then summed the total number of correct answers within each of the three age groups and, finally, standardised the overall literacy and numeracy scores by age group and survey round (M = 0, SD = 1). Thus, our scores represent children's performance relative to peers in their age group.
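The standardisation step can be illustrated with a short sketch. The record layout and field names are hypothetical, the scores are toy values rather than study data, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from collections import defaultdict
from statistics import mean, stdev

def standardise(records):
    """Z-score total_correct within each (age_group, round) cell.

    records: list of dicts with hypothetical keys
    'age_group', 'round', and 'total_correct'.
    Returns z-scores in the same order as the input.
    """
    cells = defaultdict(list)
    for r in records:
        cells[(r["age_group"], r["round"])].append(r["total_correct"])
    # Sample SD is an assumption; each cell needs at least two scores
    # that are not all identical.
    stats = {key: (mean(vals), stdev(vals)) for key, vals in cells.items()}
    z_scores = []
    for r in records:
        m, sd = stats[(r["age_group"], r["round"])]
        z_scores.append((r["total_correct"] - m) / sd)
    return z_scores

# Toy data for illustration only.
toy = [
    {"age_group": "5-9", "round": "midline", "total_correct": 10},
    {"age_group": "5-9", "round": "midline", "total_correct": 14},
    {"age_group": "5-9", "round": "midline", "total_correct": 18},
    {"age_group": "10-14", "round": "midline", "total_correct": 30},
    {"age_group": "10-14", "round": "midline", "total_correct": 40},
    {"age_group": "10-14", "round": "midline", "total_correct": 50},
]
z = standardise(toy)
```

Because each cell is standardised separately, a child with a raw score of 18 in the youngest group and a child with 50 in the middle group receive the same standardised score in this toy example.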

Children's social-emotional skills
Social-emotional skills were only measured for older children (10-17 years). First, we measured self-esteem using the Rosenberg Self-Esteem Scale (⇡Rosenberg, 1965), which assesses the child's level of agreement with ten items. Examples are: "On the whole, I am satisfied with myself" and "At times, I think I am not good at all." Higher scores indicate higher self-esteem. Second, we measured school motivation using the Elementary School Motivation Scale (⇡Guay et al., 2005). Children are asked how much they identify with nine statements such as "I like to go to school" or "In life, it's important to go to school."

Caregiver mental health
Caregiver mental health was measured at endline only using the Kessler Psychological Distress Scale (⇡Kessler et al., 2002), a 10-item questionnaire used globally to measure general psychological distress based on questions about anxiety and depressive symptoms. Each item is scored from zero (none of the time) to four (all of the time). Items are added to create a total score, with higher scores indicating higher psychological distress and a higher likelihood of a mental health disorder.
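The scoring rule described above (ten items, each scored zero to four, summed) implies a total between 0 and 40. A minimal sketch, with a hypothetical function name and illustrative responses:

```python
def k10_total(items):
    """Sum ten Kessler items, each scored 0 (none of the time) to 4 (all of the time)."""
    if len(items) != 10:
        raise ValueError("expected exactly 10 item scores")
    if any(score not in range(5) for score in items):
        raise ValueError("each item must be scored 0, 1, 2, 3, or 4")
    return sum(items)

# Illustrative responses, not study data.
total = k10_total([1, 2, 0, 3, 4, 0, 1, 2, 1, 0])
print(total)  # prints: 14
```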

Research design
The randomised design allowed for the identification of causal effects of the interventions on parents and children by comparing mean outcomes between the randomised treatment arms. The analysis followed an intention-to-treat approach, using econometric analysis for all the relevant outcomes of the intervention.
For each caregiver outcome, we estimated the following ordinary least squares regression, indexed by caregiver p from household h and survey round s:
$$Y_{phs} = \beta_0 + \beta_1 NShort_h + \beta_2 GShort_h + \beta_3 NLong_h + \beta_4 GLong_h + \gamma Y_{ph0} + X_{ph}'\delta + \theta G_h + \lambda_r + \varepsilon_{phs}$$
Where:
■ $Y_{phs}$ is the outcome variable for caregiver p in household h and survey round s.
■ NShort, GShort, NLong, and GLong are indicator variables assuming the value of 1 if the household has been randomly assigned to the corresponding treatment arm (Arm 1: Nudges short duration; Arm 2: Gender boost short duration; Arm 3: Nudges long duration; and Arm 4: Gender boost long duration). We note that for the first caregiver assessment (the one conducted at the end of the 12-week implementation of Treatments 1 and 2), we pooled samples from Arms 1 and 3, and from Arms 2 and 4, to estimate effects. For the endline, the specification included four treatment dummies, one for each of Arms 1, 2, 3, and 4 separately.
■ $Y_{ph0}$ is the baseline outcome variable for caregiver p in household h (when available).
■ $X_{ph}$ is a vector of caregiver and household controls, included should there be a lack of balance in the randomisation.
■ $G_h$ is an indicator variable assuming the value of 1 for households belonging to the Ghana Panel Study sample, 0 otherwise.
■ $\lambda_r$ are region fixed effects.
■ $\varepsilon_{phs}$ is an individual error term.
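To illustrate what the treatment-arm coefficients capture, consider a deliberately simplified case with no baseline outcome, controls, panel-sample indicator, or region fixed effects. In that case the OLS coefficient on each arm dummy equals the arm's mean outcome minus the control mean. The sketch below computes these differences directly; the simplification, the function name, and the toy values are ours and are not the study's estimation code or data.

```python
from statistics import mean

ARMS = ("NShort", "GShort", "NLong", "GLong")

def itt_effects(rows):
    """Return each arm's mean outcome minus the control mean.

    rows: iterable of (arm, outcome) pairs, where arm is "control" or one of ARMS.
    Households are analysed by assigned arm (intention to treat),
    regardless of whether messages were actually received.
    """
    by_arm = {}
    for arm, outcome in rows:
        by_arm.setdefault(arm, []).append(outcome)
    control_mean = mean(by_arm["control"])
    return {arm: mean(by_arm[arm]) - control_mean for arm in ARMS if arm in by_arm}

# Toy values for illustration only; these are not study data.
toy_rows = [
    ("control", 2), ("control", 4),
    ("NShort", 3), ("NShort", 5),
    ("GShort", 1), ("GShort", 3),
    ("NLong", 4), ("NLong", 4),
    ("GLong", 3), ("GLong", 3),
]
effects = itt_effects(toy_rows)
print(effects)
```

Adding the baseline outcome, controls, and region fixed effects changes the point estimates and typically improves precision, which is why the full specification would be estimated with a regression package rather than by differencing means.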

For each child outcome, we estimated the following ordinary least squares regression, indexed by child c living in household h and survey round s:
$$Y_{chs} = \beta_0 + \beta_1 NShort_h + \beta_2 GShort_h + \beta_3 NLong_h + \beta_4 GLong_h + X_{h}'\delta + \theta G_h + \lambda_r + \varepsilon_{chs}$$
Where:
■ $Y_{chs}$ is the outcome variable for child c, living in household h and survey round s.
■ NShort, GShort, NLong, and GLong are indicator variables assuming the value of 1 if the household has been randomly assigned to the corresponding treatment arm (Arm 1: Nudges short duration; Arm 2: Gender boost short duration; Arm 3: Nudges long duration; and Arm 4: Gender boost long duration). As for the caregivers' outcomes, for the first child assessment (the one conducted at the end of the 12-week implementation of Treatments 1 and 2), we pooled samples from Arms 1 and 3, and from Arms 2 and 4, to estimate effects. For the endline, the specification included four treatment dummies, one for each of Arms 1, 2, 3, and 4 separately.
■ $X_{h}$ is a vector of caregiver and household controls, included should there be a lack of balance in the randomisation.
■ $G_h$ is an indicator variable assuming the value of 1 for households belonging to the GUP sample, 0 otherwise.
■ $\lambda_r$ are region fixed effects.
■ $\varepsilon_{chs}$ is an individual error term, clustered at the household level.
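The pooling of arms at the first assessment can be made concrete with a small helper that builds the treatment dummies for a household. The function and the pooled variable names ("nudges", "gender_boost") are hypothetical; only the pooling rule (Arms 1 and 3 together, Arms 2 and 4 together) comes from the text.

```python
ARMS = ("NShort", "GShort", "NLong", "GLong")

# At the first assessment, short and long arms of the same content are pooled.
POOLED = {
    "NShort": "nudges", "NLong": "nudges",
    "GShort": "gender_boost", "GLong": "gender_boost",
}

def treatment_dummies(arm: str, survey: str) -> dict:
    """Return the indicator variables for one household in one survey round.

    arm: "control" or one of ARMS; survey: "midline" or "endline".
    """
    if survey == "midline":
        pooled = POOLED.get(arm)  # None for control households
        return {
            "nudges": int(pooled == "nudges"),
            "gender_boost": int(pooled == "gender_boost"),
        }
    return {a: int(arm == a) for a in ARMS}

print(treatment_dummies("NLong", "midline"))
print(treatment_dummies("NLong", "endline"))
```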

Stakeholders
We have multiple stakeholders in Ghana, including the Ministry of Education (MoE), Ghana Education Services (GES), the World Bank, and a number of non-governmental organisations that work on education and gender in the northern regions (e.g., World Education Inc., NORSACC, and ActionAid). The SMS programme was based on Movva's EDUQ+ programme, which has been implemented and evaluated in Brazil and Côte d'Ivoire. We worked with the group of stakeholders to adapt the intervention to the Ghanaian and Covid-19 pandemic context, in collaboration with Movva. The programme shares weekly suggestions of activities for caregivers to do with their children. None of these are linked to curricular activities; rather, they aim to bring caregivers closer to their children's school life by having them ask about school, discuss future plans, and share how they dealt with similar conflicts when they were young. Nudges are structured around sequences in a format inspired by READY4K!, an eight-month-long text-messaging intervention for parents of preschoolers that targets the behavioural barriers to engaged parenting (⇡York et al., 2019).

Figure 3. Sample messages.
Note: The first sequence portrays a message from the general programme; the second sequence portrays the same sequence adapted to the gender boost programme.
The idea of conducting the proposed research emerged from conversations with employees of the MoE/GES, which highlighted low levels of caregiver engagement as a key barrier to achieving equitable learning outcomes in Ghana. This issue was exacerbated by the Covid-19-induced school closures.
The study therefore provides an opportunity to highlight the role of caregivers as key stakeholders in education and to better understand how best to engage caregivers. Caregiver engagement is also a key component of the Ghana Accountability for Learning Outcomes Project (GALOP), and the Ministry of Education team has expressed interest in testing different approaches for engaging caregivers as part of the GALOP research.

Ethical considerations
The study received ethical approval from the Institutional Review Board of Innovations for Poverty Action. A protocol was submitted to and reviewed by the ethics review board. In-depth training on interviewing and the ethics of working with children was administered prior to starting data collection. Children were interviewed in quiet places, and there was a strong emphasis on making sure that the assessments were not 'high-stakes' exams and that the information provided was confidential.
Notably, Ghana's government only requires ethical review / approval in health research studies, as per the directive from the Ghana Health Services. This study is not considered health research.
Before enrollment, participants were screened and their eligibility determined. Households had to have at least one school-aged child in the home. Eligible caregivers were then administered the informed consent, through which we sought their consent to enroll themselves and their children in the study. We also sought their consent to participate in the SMS intervention. Only caregivers who consented to all three (their own participation, their children's participation, and the SMS intervention) were enrolled. Consent was obtained verbally and recorded directly on the SurveyCTO form on a Samsung tablet. Caregivers who completed the surveys were given GH¢ 5 worth of airtime, while their children were given Note 1 exercise books and a pen / pencil.
Data collectors for both the caregiver survey and child outcome measures had prior experience working with children, were trained extensively in study protocols and research methodology, and spoke the local languages of the communities in the study. All assessments were conducted in the local languages of the participants (i.e., Dagbani, Buli, Dagaare, Gruni, Wali, Mampruli, Sisaasla). When data collectors arrived at a village, they first went to meet the chief or the community leader to conduct community entry protocols. The community entry was undertaken by paying a fee as a customary practice to inform the chief and his kinsmen about the presence of the team. The data collectors introduced themselves briefly and also showed their IPA identification cards. Then, the team explained the activities to be carried out to the chief, the purpose, the mode of operation, and how many days they would be in the community. If possible, the chief then assigned someone from the community to support the enumerators in locating the sampled households. Upon reaching a household, the enumerator identified the household heads and explained the purpose of the visit and what the entire process would entail. The household head in turn assisted in identifying the child's primary caregiver. During the interviews, both enumerators and respondents sat outside where they could be visible to everyone, though in a quiet space devoid of external interruptions. During the rainy season, when respondents could not provide spaces like verandahs or corridors for the start or continuations of interviews, the process was automatically rescheduled. All Covid-19 protocols were strictly followed.

Data collection challenges
We encountered the following challenges during the implementation of the data collection activities.

Contractual delays with the main funder
Our initial focus was to support caregiver engagement in remote learning as schools were closed. Unfortunately, we experienced substantial contractual delays that significantly delayed our start. We decided to start implementation in January 2021 to encourage caregivers to send their children back to schools while we were still finalising the contracts. Schools reopened on January 18, 2021. Thus, our focus shifted from supporting caregiver educational engagement during school closures to supporting caregivers in their engagement as schools re-opened.

Challenges in surveying participants by phone
In May 2020, we planned to conduct telephone surveys to collect information on caregiver and child outcome measures, as in-person data collection was not feasible due to the peak of the Covid-19 pandemic in Ghana. However, the participant enrollment and caregiver baseline survey conducted with study participants in December 2020, as well as lessons from conducting child phone surveys in the Quality Preschool for Ghana study (⇡Wolf, Aurino, Suntheimer et al., 2021), highlighted several challenges with telephone surveys, with implications for data quality. First, many primary caregivers had issues trusting the intentions of the research project, given that we had not been able to establish in-person contact with them before, and all contact was via the phone. Second, there were substantial network challenges that led to the need to conduct in-person tracking for hard-to-reach respondents located in many communities. Finally, phone-based assessments are not the optimal way to measure children's learning outcomes: in-person assessments provide far superior data. For these reasons, we shifted the follow-up surveys in April 2021 and August 2021 from phone surveys to in-person surveys, with strict adherence to Covid-19 protocols. By implementing in-person follow-up surveys, we were able to reach study participants more easily and with the assurance of higher-quality data. We conducted in-person follow-up surveys following IPA's policy on restarting face-to-face data collection and the Ghana Health Service Ethical Review Committee's protocol for conducting in-person data collection during the Covid-19 period. This included: (i) providing each field staff member with sanitiser and face masks; (ii) providing each field team with a thermometer gun for checking the daily temperature of team members before and after fieldwork; (iii) practising good personal hygiene through regular hand washing, use of the sanitisers, and wearing face masks; and (iv) adhering to the social distancing protocol by observing a distance of at least six feet between the study participant and the field staff.

Non-availability of resources for proper setup for conducting child direct assessments at home
Optimal administration of child assessments requires a setup with a table and two chairs (one each for the assessor and the child) so that both can effectively use the assessment materials, including stimulus cards. However, the lack of such resources in most households made it quite difficult to conduct the assessments effectively. In some cases, children were assessed while sitting on stones, on the ground, or on trees / wood. This did not provide a conducive environment for children to fully participate in the assessments and thus has implications for data quality. Given our experience with previous child direct assessments and our understanding of the study contexts, we trained our data collectors to improvise in such circumstances in a way that reduced the impact of the learning space on the children and on data quality, as shown in Figure 4.

Tracking of study participants
We designed the evaluation as a household-level intervention involving home-based data collection activities for both primary caregivers and school-going children. However, several natural and human factors affected the tracking of study participants at the endline and the planned data collection schedules. First, primary caregivers (and to some extent children) were often not available due to their participation in farming or business activities. Second, children were recruited across primary and secondary schools with varying school calendars, making it difficult to identify a common period for tracking and assessing children. Third, some primary caregivers and children migrated from the study regions to other study regions or the southern part of Ghana to engage in diverse economic activities. Finally, data collection activities coincided with the rainy season, making it difficult to access communities for data collection. These factors affected our tracking efforts and data collection schedules. To reduce the impact of these challenges, reduce attrition, and ensure data quality, we implemented the following measures: (i) interviewing study participants outside their homes, including at their farms, business venues, or community centres; (ii) staggering data collection activities depending on study participants (children were mostly interviewed after school hours, in the late afternoon), while targeting non-available or busy study participants on weekends; (iii) (re)scheduling appointments with primary caregivers ahead of a home visit to avoid missing the study participants and to ensure that they were available for interviews; and (iv) postponing data collection activities in inaccessible areas for the safety of data collectors.

Ramadan and harvesting season
The midline survey coincided with the Ramadan period, and some participants were not comfortable with participating in the data collection activity during the fasting period. Relatedly, we identified primary caregivers who were engaged in harvesting activities. For these caregivers and for those observing Ramadan, data collection activities were adjusted to a more convenient time for participants, including weekends.

Programme implementation challenges
The main challenges associated with the implementation of the intervention were as follows.

Difficulty in procuring a dedicated shortcode for deploying text messages
Delays in procuring a dedicated shortcode and configuring the different networks into the shortcode were one major cause of the delay in implementing the text messages intervention. To address this, the team relied on a shared shortcode from another project to kickstart the text message intervention.

Network connectivity issues
Due to the geographical location of the primary caregivers and network issues, some of the treatment households did not receive any of the text messages sent to them. SMS records from Movva show that the SMS messages were never delivered to 38 primary caregivers due to wrong numbers or non-working mobile phone numbers. One strategy used was to inform the primary caregivers of the time of day the messages would arrive so that they could be in a location that allowed them to receive the text messages.

Results
Our analytic approach is driven by the randomised design of the evaluation. We discuss the data and limitations with this approach, then present the impact evaluation findings for our primary and secondary outcomes.

Data
On average, caregivers were slightly above 40 years of age, and households were large, with around 10 members. On average, three household members were school-age children. The majority (60%) of the caregivers were female. Of the child sample, close to half (47%) were girls and the average age was 10 years. Based on baseline equivalency analysis (see ⇡Wolf et al., 2021), we conclude that the randomisation was successful in generating five equivalent groups.
The remainder of this section will outline the main limitations of this study, and later, the main results. We will start by documenting treatment effects on primary outcomes and then heterogeneity by caregiver characteristics. We will focus on caregiver schooling as the main axis of heterogeneity.

Limitations
As with any longitudinal research design, sample loss to follow-up can be a critical limitation to the internal validity of study findings. Fortunately, between baseline, midline, and endline, attrition was very low. Of the 2,628 interviewed at baseline, only 41 (1.56%) were not interviewed at midline. An additional 47 households were not interviewed at endline (88 total; 3.35% of the baseline sample). Attrition analysis for the endline sample highlights that participants withdrawing from the study were more likely to be from the Upper East and Upper West regions, more likely to be from the 24-week treatment groups, more likely to be male caregivers, and more likely to be from households with fewer children. These differences were small, ranging from 1-4 percentage points, so they should not substantially threaten the internal validity of our research design.
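The attrition percentages follow directly from the counts reported above, as this short check shows (the function name is ours):

```python
def attrition_rate(lost: int, baseline: int) -> float:
    """Share of the baseline sample not interviewed at follow-up."""
    return lost / baseline

baseline = 2628
lost_midline = 41
lost_endline_additional = 47
total_lost = lost_midline + lost_endline_additional  # 88

print(f"{attrition_rate(lost_midline, baseline):.2%}")  # prints: 1.56%
print(f"{attrition_rate(total_lost, baseline):.2%}")    # prints: 3.35%
```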
Further, school enrollment and attendance data were only reported by children and caregivers. We are waiting on permission from the Ghana Education Service to collect school enrollment and attendance data from school records. This process is taking much longer than anticipated and has significantly delayed this data collection. Future analysis will investigate correlation between survey-based data and administrative data, as well as the programme's treatment effects on administrative data.
Finally, due to the state of the Covid-19 pandemic at the start of the study, we were unable to collect baseline assessments of children and only collected a brief caregiver survey over the phone. Thus, we do not have baseline assessments of our outcomes. Notably, though, baseline equivalence was established across treatment arms, suggesting that comparisons at midline and endline between treatment and control groups are internally valid.

Main treatment effects for primary outcomes
We use intent-to-treat estimates in our analysis. Our treatment impacts are estimated based on parent and child reports. Given small to moderate correlations between parent and child reports of the same outcomes (e.g., school enrollment), ranging from 0.10-0.55, we interpret each source separately. We present the main impacts on outcomes at both midline and endline in Table 2 (Panels A and B, respectively). Results compare mean scores for treatment and control groups at each time point. We consider results statistically significant if the p-value for the treatment coefficient is below 0.05.
Table 2 notes: Caregiver = caregiver-report, child = child-report. Robust standard errors in brackets. *** p<0.01, ** p<0.05, * p<0.1.
First, we found no effects of the intervention on home and school engagement, as reported by both caregivers and children.
Second, for child schooling outcomes, as reported by both caregivers and children, we surprisingly found negative impacts of the 12-week gender boost on school enrollment and attendance at both midline and endline. These impacts were of similar magnitude, with a decrease of 3 percentage points (p.p.) in enrollment at midline (endline), and an increase in low attendance of 4 p.p. and 5 p.p. at midline and endline, respectively. For the 24-week behavioural nudges treatment, we also found a negative effect on parent-reported child school attendance at midline, with a 4 p.p. increase in the likelihood of low school attendance.
Third, regarding gender bias, there were no impacts on pro-boy bias at midline. However, at endline, the 12-week gender-boost arm showed a statistically significant increase in pro-boy bias relative to the control group.
Fourth, we examined impacts on the gap between caregiver aspirations and expectations for their child's schooling outcomes. There were no statistically significant treatment impacts on the aspirations-expectations gap at midline. At endline, however, there were statistically significant impacts. The analysis of impacts on aspirations and expectations separately shows that expectations increased.
Finally, we examined caregivers' perceived returns to education at endline; that is, how much parents believe attaining a particular level of education will impact the future salaries of their children. There were no treatment impacts on perceived returns to education.
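The reporting conventions used in this section (star notation in the table notes, and the 0.05 threshold for calling a result statistically significant) can be summarised in a small helper. The function names are hypothetical and the p-values in the example are illustrative, not estimates from the study.

```python
def stars(p: float) -> str:
    """Star notation from the table notes: *** p<0.01, ** p<0.05, * p<0.1."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.1:
        return "*"
    return ""

def is_significant(p: float) -> bool:
    """Threshold used in the text: p-value below 0.05."""
    return p < 0.05

# Illustrative p-values only.
for p in (0.004, 0.03, 0.08, 0.40):
    print(p, stars(p), is_significant(p))
```

Note that a coefficient carrying a single star (p<0.1) is flagged in the tables but does not meet the 0.05 threshold used in the text.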

Subgroup treatment effects for primary outcomes
We examined heterogeneity of treatment effects for three different sets of subgroups: caregiver education (whether or not caregivers had ever attended school; 35.2% of the sample had); child age (5-9-year-olds versus 10-17-year-olds); and child gender. We found significant differences for each subgroup, suggesting that the effectiveness of the treatment needs to be understood within the context of these subgroups.

Caregiver schooling
Caregiver schooling was an important moderator of treatment impacts, with the general pattern of results suggesting negative impacts on non-educated caregivers and some positive impacts on educated caregivers (Table 3). More specifically, except for the short standard treatment, all other treatment conditions increased caregiver school engagement for caregivers who have some schooling. These effects persisted at endline (although the level of statistical significance goes down to p<0.1), and also appear on the caregiver home engagement scales for the short gender boost and the long standard treatment arms. By contrast, for caregivers who never attended school, treatment effects on school engagement were negative at both midline and endline. On the other hand, only for those caregivers that never attended school, the two gender boost arms positively shifted educational expectations at endline.

Child gender
We also found interesting differences in results based on child gender, as shown in Table 4. We found impact heterogeneity by gender in how the intervention impacted both aspirations and returns to education. For aspirations, we found that caregivers' expectations for girls' education improve at both midline and endline for the 24-week gender boost arm.
Estimates from a model with a three-way interaction between treatment, caregiver schooling, and child gender (results available upon request) highlight that the positive effects of the nudges on aspirations for girls are concentrated among caregivers with no education. Thus, the programme seems to be tackling aspiration failure for the poorest girls.
Impacts on returns to education were also moderated by child gender, as they increase for boys and decrease for girls, although for the latter, coefficients are not significant.
At midline, based on caregivers' reports, the short-term gender boost treatment reduced enrollment and attendance for girls (by approximately 4 p.p.). However, for attendance, we found increases based on child-based reports. In this case, all treatment arms consistently decrease the probability of girls attending school infrequently by between 5-7 p.p. Without school record data, it is difficult to verify which source is accurate, and thus these impacts are inconclusive.
At endline, the negative effects of the short gender arm treatment on caregiver-reported enrollment and attendance persisted, while the positive effects based on the child-based attendance indicator faded out.
These opposite effects between caregiver- and child-reported attendance were not surprising, given the low correlation between these variables. We will further investigate the reliability of these indicators once we can measure enrollment and attendance based on administrative data.

Child age
Finally, with regard to child age (Table 5), we found highly heterogeneous effects between younger (5-9-year-olds) and older (10-17-year-olds) children in caregiver-reported engagement. The negative, although not significant, effects that were observed for the overall sample for caregiver school engagement were driven by the younger children. These were only statistically significant for the short-term gender boost treatment (p<0.05), but treatment effects across different arms are similar in size and direction, leading to a reduction of about a third of an additional item in the caregiver home engagement scale. These effects faded out at endline. Also, the improvement in educational expectations that was observed for the whole sample is concentrated among the younger age group, with a decrease in the caregivers' expected probability of not reaching the aspired level of education of 6 p.p. in the short gender boost sample at midline, and of 10 p.p. in the long gender boost treatment arm at endline. No other major differences are observed by child age groups.

Impacts on secondary outcomes
When examining impacts on secondary outcomes, we first report on what we categorise as distal outcomes (results not shown but available upon request). Overall, we found few effects on child outcomes. At the midline assessment there was a negative impact on children's academic outcomes for the 12-week gender-boost arm (b = -0.100, p < .05), but insignificant coefficients for all other treatment arms. These impacts faded out at endline. Further, there were no treatment effects on social-emotional outcomes at either time point.
Lastly, we found evidence that caregiver-reported psychological distress increased at endline for the 12-week behavioural nudges treatment arm, and no impacts on other treatment groups.
Regarding secondary outcomes that we categorise as mediators, overall, we did not see significant changes in child time use when looking across treatment arms at midline and endline. Regarding caregiver self-efficacy and emotional supportiveness in the home, we found no impacts at midline. At endline, however, we found reductions in caregiver self-efficacy for the 12-week behavioural nudges treatment arm (p<.05), and negative but non-significant coefficients for all other treatment arms. There were no impacts on emotional supportiveness in the home.

Policy Implications
This evaluation assessed the effectiveness of an SMS-based intervention on caregiver engagement in education, educational aspirations, and gender bias in rural, remote, and deprived communities in northern Ghana during the Covid-19 pandemic. We find that the programme contributed to improved caregiver engagement only for caregivers who have minimum levels of education. For those who do not have any schooling, the SMS intervention backfired and decreased engagement, especially with younger children. In addition, we did not find a clear difference between households that received the programme for 12 weeks and those that received it for 24 weeks. Questions remain about the optimal length of time (not too short but also not too long) during which these types of messages should be sent to participants to induce behaviour change in the short and long term. These could be addressed by future studies. We do not find impacts on children's learning outcomes.
We learnt several lessons that we hope will be of value to future researchers and policymakers aiming to increase caregiver engagement and children's schooling outcomes and use of SMS-based programmes to reach caregivers.
First, understanding the broader context in which messages are being sent is key to designing programmes in ways that ultimately remove barriers to caregiver engagement (as is the intended effect of nudge-based programmes) rather than creating additional stressors. Our evidence also suggests that tailoring messages based on caregivers' educational backgrounds may be key to supporting the most vulnerable families.
Second, through the pilot, we learnt that the programme implementation should be tailored in terms of (i) caregivers' preferred time of text message delivery and (ii) language. We adjusted the time of day that caregivers received the messages based on their stated preferences, which, based on our pilot, we believe increased their engagement with the messages. Further, contextualising the language of the messages to both the geographical location of the caregivers and their reading level was an important part of making the content more accessible to the clients. By the same token, we learnt that it is important to inform caregivers before programme implementation starts and to brand texts sent with the programme name. This is because mobile phone users often receive a lot of spam in their SMS inboxes and therefore delete unsolicited messages without reading them. Branding would help them selectively delete unsolicited messages while keeping specific messages on the intervention.
Additional questions for future research include the following.
■ Questions about the mechanisms through which programmes may cause additional stress to caregivers, particularly during stressful times such as the Covid-19 pandemic, and for which groups this additional stress may be greatest.
■ Whether phone-based interventions reach the intended participant (particularly in households where members share a mobile phone).
■ Whether programmes should be more focused around a single theme and not as broad as the EDU+ programme was, which aimed to improve relationships between caregivers and children, improve children's academic and social-emotional outcomes, improve positive disciplinary practices, and increase caregiver engagement in children's education and broader lives.
Future research may also consider how uni-directional text-based interventions compare with other interventions (e.g., face-to-face community dialogues with parents) in terms of impact and cost-effectiveness. In addition, it could examine the value added, if any, of text messages in comparison to in-person interventions.
Finally, future studies could consider involving household members beyond primary caregivers, such as older siblings, grandparents, etc. for supporting child education. This is because these other household members play critical roles in supporting children's education. Involving multiple household members with diverse skills and capabilities such as literacy and the ability to use technology when providing SMS-based nudges to caregivers may enhance the effectiveness of text message interventions. This is so for two reasons.
1. Different household members interact and support children's learning at different times of the day (e.g., older siblings, grandparents, etc.).
2. Household members who cannot read or do not use technology proficiently may rely on household members with the requisite skills for using technology.

Conclusion
Our preliminary results suggest that a short, light-touch SMS-based intervention can change caregiver behaviours. However, the story is complicated, and the impacts vary widely by caregiver and child characteristics. In many cases, the interventions operated counter to our hypotheses, and decreased caregiver engagement, decreased self-reported school enrollment and attendance, worsened caregiver mental health, and decreased children's numeracy skills. These negative effects appear to be concentrated among less-advantaged caregivers and children, specifically caregivers with no formal education, girls, and younger children.
The findings suggest that caregivers may need a base level of education to translate the messages into positive changes for their children. Without this base level of capital, text-based interventions can backfire. Indeed, the caregivers with no formal schooling in our study reported a reduction in caregiver self-efficacy as a result of receiving the messages, while educated caregivers reported increases in self-efficacy.
These results contribute to a small but growing evidence base about SMS-based nudge interventions to parents and caregivers. Importantly, the majority of studies that have found these types of programmes to improve parenting and child outcomes have been concentrated in middle- or high-income country contexts (see ⇡Bergman, 2019 for a review). Our study is one of the first to test this type of programme in a rural, low-income, African setting, and during a public health and economic crisis. Longer-term follow-ups would be important to understand how long the observed changes persisted after the programme was no longer being delivered to caregivers. Our findings suggest that careful consideration of the broader context of caregivers' and children's lives is needed to ensure programmes are tailored in ways that ultimately support caregiver investments, caregiver-child relationships, and children's education. For example, an SMS-based programme implemented and evaluated during the pandemic in El Salvador targeting caregiver mental health also found that the programme worsened caregiver mental health and increased stress, particularly among male caregivers (⇡Amaral et al., 2021).
We also note that our sample was majority Muslim, and that midline data collection occurred during Ramadan, when many caregivers and many older children were fasting. In addition, our endline data collection took place during the harvesting season, and many children were working in the fields to earn money for school fees to return to school. Thus, both midline and endline data collection occurred at unusual times of the year. Our results suggest that the macro-context in which families receive these messages may be key to determining when they can be effective and when they may cause additional stress for caregivers. It is possible that the broader context was quite challenging for families at the time the programmes were implemented (e.g., owing to fasting and economic hardship due to the Covid-19 pandemic) and that the messages related to parent and child investments caused additional stress for parents in ways that backfired.
By contrast, nudges seem to be effective in improving caregiver engagement with their children when caregivers have a minimum level of education. We recall that ours is a sample with very low education overall among caregivers. As noted, only 35% of caregivers in our sample have some schooling, and among those caregivers who attended school at some point, around half have at most completed primary school.
We have three planned next steps to continue to investigate caregivers' experiences with this programme. First, as soon as we have approval from the Ghanaian Ministry of Education and with funds from another grant, we will collect school administrative records to examine a third source of data on children's school enrollment and attendance, as well as assess impacts on these outcomes several months after endline.
Second, in January 2022, we conducted a qualitative study with 30 randomly selected treatment group participants. Using semi-structured interviews, we asked caregivers about their experiences with the programme and their perceptions of their role in supporting their child's education, both generally and in the context of the Covid-19 pandemic. We also examined how non-literate parents engaged with the messages. We will receive the transcripts soon and we hope these data will shed additional light on some of our counterintuitive findings.
Third, through a new grant from the LEGO Foundation, and with two Ghanaian colleagues (Richard Appiah at the University of Ghana and Esinam Avornyo at the University of Cape Coast), we will conduct a community-based participatory research project to inform a deeper understanding of caregiver attitudes towards engagement in their children's education generally, and girls' education specifically, as well as to inform the adaptation of the Parental Nudges Project (PNP) intervention to increase caregiver educational engagement and gender parity. We will work in eight diverse communities in Ghana to gain a deeper understanding of caregiver perceptions about investments in children's learning, their engagement in child education and the challenges they currently face, and to validate and improve the context-specificity of an existing behavioural caregiver / parenting intervention to support caregiver educational engagement. The communities were selected to understand perceptions of education and learning from a diverse group of stakeholders (2 rural communities in the northern region; 1 peri-urban community in the North; 1 cocoa-growing community; 1 peri-urban community in the Greater Accra Region; 1 southern rural community).