Nurse-Family Partnership

A nurse home visiting program for economically disadvantaged first-time pregnant mothers designed to improve prenatal and child rearing practices through the child's second birthday.

Program Outcomes

Child Maltreatment
Cognitive Development
Delinquency and Criminal Behavior
Mental Health - Other
Physical Health and Well-Being
Preschool Communication/ Language Development
Reciprocal Parent-Child Warmth

Program Type

Home Visitation
Parent Training

Program Setting

Home

Continuum of Intervention

Selective Prevention

Age

Infant (0-2)

Gender

Female

Race/Ethnicity

Endorsements

Blueprints: Model
Crime Solutions: Effective
OJJDP Model Programs: Effective
SAMHSA: 3.2-3.5
Social Programs that Work: Top Tier

Program Information Contact

Nurse-Family Partnership National Service Office
1900 Grant Street, Suite 400
Denver, Colorado 80203
Direct phone: 303-327-4240
Toll free: 866-864-5226
Fax: 303-327-4260
email: info@nursefamilypartnership.org
www.nursefamilypartnership.org/

Program Developer/Owner

David L. Olds, Ph.D.
University of Colorado Health Sciences Center

Brief Description of the Program

Nurse-Family Partnership begins as early in pregnancy as possible and continues through the child's second birthday. Nurses work with low-income pregnant mothers bearing their first child to improve the outcomes of pregnancy, improve infant health and development, and improve the mother's own personal life-course development through instruction and observation during home visits. These visits generally occur every other week and last 60-90 minutes.

Specific objectives include improving women's diets; helping women monitor their weight gain and eliminate the use of cigarettes, alcohol, and drugs; teaching parents to identify the signs of pregnancy complications; encouraging regular rest, appropriate exercise, and good personal hygiene related to obstetrical health; and preparing parents for labor, delivery, and early care of the newborn.

Mejdoubi et al. (2013) adapted the program for Dutch women and their health care system. The most important adaptations included placing more emphasis on home delivery, instructing women to stop smoking during pregnancy, and offering more information about the advantages of breastfeeding. Similarly, a German adaptation of the program (Sierau et al., 2016) used social workers or midwives rather than nurses as home visitors. It also used a German developmental screening instrument and provided information on well-child check-ups.

Robling et al. (2015) adapted the program for use in a publicly funded healthcare system in England. In this context, mothers have access to publicly funded health and social care, including universally offered screening, education, immunization, and support from birth to the child's second birthday. The adapted program, known there as the Family Nurse Partnership (FNP), also provides an assigned family nurse, who makes up to 64 home visits, while other mothers receive care as needed from a specialist community public health nurse.

See: Full Description

Nurse-Family Partnership sends nurses to the homes of pregnant women whose infants are at elevated risk of health and developmental problems (i.e., women at risk of preterm delivery and low-birth-weight children). The goal is to improve parent and child outcomes. Treatment begins during pregnancy, with 60- to 90-minute visits about once every other week, and continues to 24 months postpartum. Program content covered in the home visits includes (a) parent education about influences on fetal and infant development; (b) the involvement of family members and friends in the pregnancy, birth, early care of the child, and support of the mother; and (c) the linkage of family members with other formal health and human services.

In addition to working with the mothers directly, the nurses promote the goals of the program by engaging other family members and close friends in the program and by assisting families to use other formal health and social services.

Outcomes

Primary Evidence Base

Studies 1-3

The three studies of pregnant women and their children - Elmira, Memphis, and Denver - found intervention-group improvements relative to the control group in the following areas:

Mother (Elmira, Memphis, Denver)

Unintended subsequent pregnancies, and the interval between first and second births
Domestic violence among married or cohabiting women
Maternal employment and use of welfare and food stamps

Infants and Young Children (Elmira, Memphis, Denver)

Health-care visits and hospitalization for injuries and illnesses
Emotional vulnerability, particularly among children born to mothers with low psychological resources
Language and mental development, particularly among children born to mothers with low psychological resources
Child abuse and neglect, and behavioral problems caused by use of alcohol or drugs (seen in mothers at 15- and 19-year follow-up in Elmira)

6-to-12-Year Follow-up (Memphis)

Intellectual functioning and receptive language
Behavioral problems at age 6
Relationship quality of mothers with current partners
Children's use of substances and internalizing mental health problems at age 12

15-Year and 19-Year Follow-up (Elmira)

Among children, arrests and convictions

Significant Program Effects on Risk and Protective Factors

Prenatal health, such as hypertension and use of cigarettes
Responsive interactions with child
Parent social support (Elmira)

Brief Evaluation Methodology

Primary Evidence Base

Of the 15 studies Blueprints has reviewed, three (Studies 1, 2, and 3) meet Blueprints evidentiary standards (specificity, evaluation quality, impact, dissemination readiness). In addition, Studies 1, 2, and 3 were done by the developer.

Studies 1-3

Three major studies conducted in Elmira, New York (Eckenrode et al., 2010; Olds, Henderson, Chamberlin et al., 1986; Olds, Henderson, Tatelbaum et al., 1986; Olds et al., 1994, 1997, 1998), Memphis, Tennessee (Kitzman et al., 1997, 2010; Olds, Kitzman et al., 2004; Olds et al., 2007, 2010, 2014), and Denver, Colorado (Olds et al., 2002; Olds, Robinson et al., 2004) used similar designs. Each study recruited women who were pregnant for the first time and faced special risks such as low income, teen pregnancy, or single parenthood. Investigators randomly assigned the women to the Nurse-Family Partnership or control conditions. Follow-up assessments completed after the two-year program measured a variety of outcomes for the mothers and their children.

The Elmira study used data from 400 pregnant women who were recruited in 1978 from clinics in a rural Appalachian area of New York State with a largely white population. The study included 15-year and 19-year follow-ups, with about 310 adolescent children participating.

The Memphis study began in 1990 and obtained data from a sample of 743 pregnant, mostly black women. A variety of measures for the mother and child were obtained at 3 years, 9 years, and 12 years after the birth. At the 12-year follow-up, 613 first-born children of the 743 randomized women were studied.

The Denver study, which began in 1994, obtained data on a sample of 735 pregnant, mostly Hispanic and white women. Unlike the other studies, this one randomized women to two intervention conditions, one using nurses and one using paraprofessionals, and one control condition. Posttest and two-year follow-up assessments were done for 86% of randomized mothers and 82% of the children. Follow-ups when the child was age six and age nine included, respectively, 81% and 78% of the randomized sample.

Blueprints Certified Studies

Study 1

Olds, D. L., Eckenrode, J., Henderson, C. R., Kitzman, H., Powers, J., Cole, R., . . . Luckey, D. (1997). Long-term effects of home visitation on maternal life course and child abuse and neglect: 15-year follow-up of a randomized trial. Journal of the American Medical Association, 278(8), 637-643.

Olds, D. L., Henderson, C. R., Chamberlin, R., & Tatelbaum, R. (1986). Preventing child abuse and neglect: A randomized trial of nurse home visitation. Pediatrics, 78, 65-78.

Olds, D. L., Henderson, C. R., Cole, R., Eckenrode, J., Kitzman, H., Luckey, D., . . . Powers, J. (1998). Long-term effects of nurse home visitation on children's criminal and antisocial behavior: 15-year follow-up of a randomized controlled trial. Journal of the American Medical Association, 280(14), 1238-1244.

Study 2

Olds, D. L., Robinson, J., O'Brien, R., Luckey, D. W., Pettitt, L. M., Henderson, C. R., . . . Talmi, A. (2002). Home visiting by paraprofessionals and by nurses: A randomized, controlled trial. Pediatrics, 110, 486-496.

Study 3

Kitzman, H., Olds, D. L., Henderson, C. R., Hanks, C., Cole, R., Tatelbaum, R., . . . Barnard, K. (1997). Effect of prenatal and infancy home visitation by nurses on pregnancy outcomes, childhood injuries, and repeated childbearing. Journal of the American Medical Association, 278(8), 644-652.

Olds, D. L., Kitzman, H., Cole, R., Robinson, J., Sidora, K., Luckey, D. W., . . . Holmberg, J. (2004). Effects of nurse home visiting on maternal life course and child development: Age 6 follow-up results of a randomized trial. Pediatrics, 114, 1550-1559.

Risk and Protective Factors

Risk Factors

Individual: Stress

Family: Family conflict/violence*, Family history of problem behavior, Household adults involved in antisocial behavior*, Lack of prenatal care*, Low parental education, Low socioeconomic status*, Mother substance use during pregnancy*, Neglectful parenting*, Parental attitudes favorable to antisocial behavior, Parental attitudes favorable to drug use, Parental unemployment*, Parent history of mental health difficulties, Parent stress, Poor family management, Psychological aggression/discipline, Unplanned pregnancy*, Violent discipline

Protective Factors

Family: Attachment to parents*, Breastfeeding*, Nonviolent Discipline, Opportunities for prosocial involvement with parents, Parent social support*, Rewards for prosocial involvement with parents

* Risk/Protective Factor was significantly impacted by the program

Subgroup Analysis Details

Gender Specific Findings

Male
Female

Race/Ethnicity Specific Findings

African American

Subgroup Analysis Details

Subgroup differences in program effects by race, ethnicity, or gender (coded in binary terms as male/female) or program effects for a sample of a specific race, ethnic, or gender group:

Study 1 found subgroup effects by using a homogeneous sample with 75% or more of participating mothers being economically disadvantaged. In addition, Eckenrode et al. (2000, 2001) tested for subgroup differences in program benefits by gender and economic disadvantage and found equal benefits across subgroups. Other reports (Eckenrode et al., 2010; Olds et al., 1988, 1997, 1998) tested for within-group program effects by gender and economic disadvantage and found significant benefits for male children, female children, and children from low SES-unmarried families.
Study 2 found subgroup effects by using a homogeneous sample with 75% or more of participating mothers being economically disadvantaged. In addition, Olds, Robinson et al. (2004) tested for subgroup differences in program effects by gender and found equal benefits for male and female children.
Study 3 found subgroup effects by using a sample with 75% or more of participating mothers being African American and economically disadvantaged. In addition, Olds, Kitzman et al. (2004) and Sidora-Arcoleo et al. (2010) tested for subgroup differences in program effects by gender and found equal benefits for male and female children. Enoch et al. (2018) tested for within-subgroup program effects by race and found significant benefits for African American children. Heckman et al. (2017) tested for within-subgroup effects by gender and found significant benefits for males and females.

Sample demographics included race, ethnicity, and gender for Blueprints-certified studies:

Study 1 examined a largely low-income sample from the Appalachian region of New York.
Study 2 examined a low-income sample of primarily Hispanic and White women living in Denver.
Study 3 examined a sample of low-income African American women living in Memphis.

Training and Technical Assistance

Training of Staff

Training begins with an initial one-week session for the nurse home visitors and their supervisor, offered by the staff of the Prevention Research Center for Family and Child Health (PRC) in Denver, Colorado. This session is followed by three-day and two-day follow-up trainings offered on site at times that coincide with the nurses' need to begin using the infancy and then toddler protocols with families. In addition to the group training sessions, the PRC staff are available for technical assistance by phone as needed.

The first training session is offered prior to the initiation of the program. It covers:

the history of the program
the research evidence to support its efficacy
the theoretical and clinical foundations of the program
the principles of forming effective therapeutic relationships with family members
solution-focused therapies
understanding women's stages of readiness for change
issues related to ethnic and racial diversity
the prenatal content
safety issues related to home visiting
the program protocols
the record keeping system

The second and third training sessions reinforce the theories and clinical strategies introduced in the first session, cover the content of the infancy and toddler programs, train nurses in the P.I.P.E. program, and review selected cases that have been served in the program to date with the entire staff to ensure fidelity of program implementation.

Training Certification Process

Nurse-Family Partnership Supervisor Initial Education Units:

Supervisor Unit One: All new, expansion, and replacement supervisors are required to complete the five distance education lessons in this course prior to attending Supervisor Unit Two. Each lesson takes approximately 20 to 30 minutes to complete. The lessons are designed to orient a supervisor to her/his role and responsibilities in the Nurse-Family Partnership program and concentrate on program logistics, including agency setup, documentation, referrals, and hiring nursing staff. Supervisors access this course by logging in to the online Tracker system at http://training.nursefamilypartnership.org/Tracker3/. You will be asked to reset your password the first time you log in to Tracker. You may use the same password for both the NFP Community and Tracker.
Unit Two: see description under Unit 2 below.
Supervisor Unit Three: This distance education session focuses on Nurse-Family Partnership implementation issues, provides the supervisor with support in assessing the quality of nursing practice and implementation, and supports the professional development of nurse home visitors. A lesson is included to help supervisors learn how to connect with their community to sustain and grow their program. Supervisors access this course by logging in to the online Tracker system at http://training.nursefamilypartnership.org/Tracker3/. You will be asked to reset your password the first time you log in to Tracker. You may use the same password for both the NFP Community and Tracker.
Supervisor Unit Four: This face-to-face three-day session occurs approximately 4-6 months after completion of Unit Two. The session again focuses on the Nurse-Family Partnership model to promote supervisor skills around teambuilding and job stress and burnout. It also builds on reflection and motivational interviewing skills learned in earlier sessions. All new, expansion, and replacement supervisors are required to attend.

Nurse-Family Partnership Initial Education Units:

National Education Symposium:

The NFP National Education Symposium is for all Supervisors, Nurse Consultants, and Administrators. For more information, see http://community.nursefamilypartnership.org/Nursing-Education/National-Education-Symposium.

Benefits and Costs

Program Benefits (per individual): $21,379
Program Costs (per individual): $14,459
Net Present Value (Benefits minus Costs, per individual): $6,920
Measured Risk (odds of a positive Net Present Value): 65%

Source: Washington State Institute for Public Policy
All benefit-cost ratios are the most recent estimates published by The Washington State Institute for Public Policy for Blueprint programs implemented in Washington State. These ratios are based on a) meta-analysis estimates of effect size and b) monetized benefits and calculated costs for programs as delivered in the State of Washington. Caution is recommended in applying these estimates of the benefit-cost ratio to any other state or local area. They are provided as an illustration of the benefit-cost ratio found in one specific state. When feasible, local costs and monetized benefits should be used to calculate expected local benefit-cost ratios. The formula for this calculation can be found on the WSIPP website.
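As a rough illustration of how the per-individual figures above relate to one another, the following minimal sketch recomputes the net present value and a simple benefit-cost ratio from the reported benefits and costs. It is not the WSIPP formula, which involves meta-analytic effect sizes and monetization steps not shown here; the function names are hypothetical and chosen only for this example.

```python
# Per-individual figures as reported above (Washington State estimates, USD).
BENEFITS = 21_379
COSTS = 14_459


def net_present_value(benefits: float, costs: float) -> float:
    """Benefits minus costs, per individual."""
    return benefits - costs


def benefit_cost_ratio(benefits: float, costs: float) -> float:
    """Dollars of benefit per dollar of cost (illustrative helper)."""
    if costs <= 0:
        raise ValueError("costs must be positive")
    return benefits / costs


npv = net_present_value(BENEFITS, COSTS)
ratio = benefit_cost_ratio(BENEFITS, COSTS)

print(f"Net present value: ${npv:,.0f}")
print(f"Benefit-cost ratio: {ratio:.2f}")
```

To approximate a local estimate, the two constants could be replaced with locally derived benefits and costs, keeping in mind the caution above about applying Washington State figures elsewhere.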

Start-Up Costs

Initial Training and Technical Assistance

Nurse-Family Partnership is implemented by teams of eight nurse home visitors with one supervisor. The cost to prepare one team to begin offering the program is approximately $77,000, which includes training, salaries for one month while in training, and equipment to set up an office. One team can serve approximately 200 families over an average length of stay of 1.7 years.

Curriculum and Materials

Included in the cost of training.

Materials Available in Other Languages: NFP program guidelines have been translated for families into Spanish, Norwegian, and Bulgarian. Note: There is a formal process for international replication, which requires significant investment on the part of government to demonstrate its commitment to go through a process of formative adaptation, process evaluation, possibly an RCT, and careful replication.

Licensing

None.

Other Start-Up Costs

The cost of renovating or securing appropriate office space for program administration.

Intervention Implementation Costs

Ongoing Curriculum and Materials

None.

Staffing

Qualifications: Home visitors and supervisors must be registered nurses.

Ratios: Each participating family is assigned a visiting nurse. The NFP national office recommends maximum caseloads of 25 for full-time visiting nurses.

Time to Deliver Intervention: According to the model, families receive weekly or biweekly visits from pregnancy until their child turns two. The actual average time nurses in the program serve families is 1.7 years.

Estimate of annual salary and benefit costs for a team of eight nurses and one supervisor serving 200 families: $711,000, but costs will vary based on local salary levels.

Other Implementation Costs

Travel is a significant expense, estimated at $21,000 for a nursing team annually. Other expenses are those typically incurred in operating a program including overhead, office space, supplies, professional development, and data systems.

Implementation Support and Fidelity Monitoring Costs

Ongoing Training and Technical Assistance

Ongoing training estimated at $1,526 annually for a nursing team; and replacement training as a result of turnover is $7,750 per supervisor and $6,000 per nurse (which includes the cost of salaries and benefits during training period).

Fidelity Monitoring and Evaluation

Technical assistance to ensure quality implementation and monitor fidelity is estimated to cost $8,816 per nursing team.

Ongoing License Fees

None.

Other Implementation Support and Fidelity Monitoring Costs

No information is available.

Other Cost Considerations

Nurse-Family Partnership can be implemented with as few as four nurses and a supervisor, but economies of scale are lost.

Year One Cost Example

Below are estimated costs to set up and operate a Nurse-Family Partnership team of eight nurses and one supervisor, serving approximately 200 families, for one year:

Initial Start-up: training, salaries for 1 month, equipped office	$77,000.00
Staff Salaries for one year with fringe	$711,000.00
Travel	$21,000.00
T.A. and Fidelity Monitoring	$8,800.00
Overhead and Office @ 25% of staff	$197,000.00
Total One Year Cost	$1,014,800.00

With 8 nurses and a caseload of 25 families per nurse, 200 families would be served at a cost of $5,074 per family for one year of services.
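The arithmetic behind this example can be sketched as follows. The line items are taken as given from the table above; the dictionary keys and helper names are illustrative assumptions, and local salary and overhead figures would need to be substituted for any real budget.

```python
# Line items from the Year One Cost Example (USD), taken as given.
YEAR_ONE_COSTS = {
    "Initial start-up": 77_000,
    "Staff salaries with fringe": 711_000,
    "Travel": 21_000,
    "T.A. and fidelity monitoring": 8_800,
    "Overhead and office": 197_000,
}

NURSES_PER_TEAM = 8
FAMILIES_PER_NURSE = 25  # recommended maximum caseload per full-time nurse


def families_served(nurses: int, caseload: int) -> int:
    """Team capacity: number of nurses times caseload per nurse."""
    return nurses * caseload


def cost_per_family(costs: dict, families: int) -> float:
    """Total of all line items divided by the number of families served."""
    if families <= 0:
        raise ValueError("families must be positive")
    return sum(costs.values()) / families


total = sum(YEAR_ONE_COSTS.values())
families = families_served(NURSES_PER_TEAM, FAMILIES_PER_NURSE)
per_family = cost_per_family(YEAR_ONE_COSTS, families)

print(f"Total one-year cost: ${total:,.0f}")
print(f"Families served: {families}")
print(f"Cost per family: ${per_family:,.0f}")
```

Because start-up costs are included, this per-family figure describes the first year only; later years would omit the start-up line and add ongoing training costs.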

Funding Overview

The number and variety of approaches to funding Nurse-Family Partnership is very large, with 170 implementing agencies in 32 states. NFP can be supported by federal funding streams aimed at promoting healthy development of young children, including Medicaid, Title V, IDEA Part C, and Title IV-B Child Welfare Services. Many states have allocated general funds to support NFP based on the strong evidence of outcomes and cost/benefit achieved through the model. In addition, the Affordable Care Act made an historic investment in home visiting, allocating $1.5 billion to support states in implementing evidence-based home visiting programs.

Allocating State or Local General Funds

In addition to large commitments of state general fund dollars for Medicaid match and for direct allocations to Nurse-Family Partnership, states and localities have used a variety of state and local funding streams to support NFP:

Tobacco Restitution funding
State/Local Partnerships for Children
Local school system funding
State education funding
Dedicated state home visiting funds

Maximizing Federal Funds

Entitlements: Because Nurse-Family Partnership is a health promotion intervention, it is not surprising that it is often funded by Medicaid. However, because it is more a prevention program than a treatment intervention, NFP is not often funded by Medicaid as a service. Most billing is for some form of targeted case management. There are limitations to Medicaid funding based on allowable costs. Below are some of the Medicaid approaches used by states:

Targeted Case Management, both for child and mother
Negotiated rates with Medicaid funded Managed Care Organizations
State Medicaid "Public Health" program
State Medicaid "Perinatal Services" program

Formula Funds:

Maternal, Infant, and Early Childhood Home Visiting Grants - The Affordable Care Act allocated $1.5 billion over five years to support evidence-based home visiting programs. Funds flow to a state agency designated by the governor to administer the program, which then assesses needs and administers funds to local communities.
Title V Maternal and Child Health Block Grant, which funds public health activities aimed at supporting healthy pregnancy and early childhood.
Title IV-B Child Welfare Services grant, which can be used to fund child abuse prevention activities and services aimed at keeping children in their homes.
IDEA funds for Infants with Disabilities, which support early intervention services for infants with disabilities.
Child Care Development Block Grant, which is one of the major funding streams supporting child care and can be used for NFP when it is implemented as part of a comprehensive early care and education model.
Temporary Assistance for Needy Families, which is the core funding stream dedicated to providing income support for low-income families and can also be used fairly flexibly by states to support four key goals, including assisting needy families so children can be cared for in their own homes.

Discretionary Grants: There are many federal discretionary grants supporting early care and education that can potentially support NFP, including programs within SAMHSA, the Children's Bureau, and the Head Start Bureau within DHHS.

Foundation Grants and Public-Private Partnerships

Below is a sample of the many foundations that have supported NFP. Foundations have funded both start-up costs as well as ongoing expenses.

March of Dimes
United Way
The Duke Endowment
Blue Cross Blue Shield Foundations
Baptist Health Fund
Kellogg Foundation
Robin Hood Foundation

A promising strategy to promote home visiting is developing partnerships between states and managed care organizations (MCOs) providing Medicaid health services on a capitation basis. A partnership could be formed with the MCOs based upon the assumption that NFP would improve the health of infants and mothers served, lowering future health care costs for the MCOs. An investment in future savings could motivate MCOs to fund NFP.

Generating New Revenue

Many states have created dedicated revenue streams to fund NFP. Some of these are:

Gambling taxes
Children's Trust Funds (e.g., license plates, commemorative documents and tax form check-offs)
Ballot initiatives such as Proposition 10 in California
Property tax dedicated to social service issues

Data Sources

Survey and interview with the Nurse-Family Partnership National Service Office, the purveyor of NFP.

Program Developer/Owner

David L. Olds, Ph.D.
University of Colorado Health Sciences Center
Prevention Research Center for Family and Child Health
1825 Marion St.
Denver, CO 80220
olds.david@tchden.org
www.nursefamilypartnership.org/

Program Specifics

Program Goals

A nurse home visiting program for economically disadvantaged first-time pregnant mothers designed to improve prenatal and child rearing practices through the child's second birthday.

Population Demographics

The population consisted of women at risk of preterm delivery and low-birth-weight children, with a focus on low-income, unmarried, teenage women bearing their first child.

Target Population

Age

Infant (0-2)

Gender

Female

Other Risk and Protective Factors

Risk: low birth weight; prenatal exposure to drugs, alcohol, or tobacco; mother's psychological immaturity; dysfunctional caregiving; stressful conditions in the household; low household SES; household social risk factors (e.g., poor education, experience of violence or neglect).

Risk/Protective Factor Domain

Individual
Family

Theoretical Rationale

The home visitors were equipped with a theory-driven program and a visit-by-visit protocol that was designed to guide their efforts to help women improve their health-related behaviors, their care of their children, their planning of subsequent pregnancies, educational achievement, and participation in the workforce. These adaptive skills focus on both their own behavior and their ability to summon family and community support to improve the material and social contexts in which they live. Some of the theories that helped to inform the content of the program include human ecology theory, self-efficacy theory, and attachment theory.

Human ecology theory played an important role in identifying which families would participate in the program and when they would participate. The program typically focuses on women who have had no previous live births and thus are undergoing a major role change known as an ecological transition. The program began during pregnancy and continued through the early childhood years because, during pregnancy, women have not yet formally assumed the parental role. It was reasoned that, by providing support to young people before and while they were learning to be parents, the visitors would enhance their influence on parents during orientation to their roles as parents and providers. Human ecology theory also focused the home visitors' attention on the systematic evaluation and enhancement of the material and social environment of the family; the involvement of other family members, friends, and partners; the identification of family stressors and needed health and human services; and the linkage of families with formal community services.

Self-efficacy theory played a role in the design of the Elmira program (Eckenrode et al., 2010; Olds, Henderson, Chamberlin et al., 1986; Olds, Henderson, Tatelbaum et al., 1986; Olds et al., 1994, 1997, 1998) through an emphasis on helping women set small, achievable objectives for themselves that would strengthen their confidence in their capacity for behavioral change. However, it was not emphasized explicitly as a theoretical foundation in Elmira to the same degree as it was in Memphis. Self-efficacy theory also focused home visitors' attention on promoting mothers' healthy behavior, optimal caregiving, family planning, and economic self-sufficiency by identifying family strengths, reinforcing behaviors that are close to the goals of the program, and teaching the problem-solving method as a general approach to coping.

Attachment theory affected the design of the home visitation program in three fundamental ways: 1) the emphasis on visitors developing an empathic relationship with the mother; 2) the emphasis on helping mothers and other caregivers review their own childrearing histories; and 3) the explicit promotion of sensitive, responsive, and engaged caregiving in the early years of the child's life.

Theoretical Orientation

Self Efficacy
Attachment - Bonding

Brief Evaluation Methodology

Primary Evidence Base

Studies 1-3

Outcomes (Brief, over all studies)

Primary Evidence Base

Study 1

Elmira, New York: Women who received the Nurse-Family Partnership program demonstrated significant improvements in prenatal health, such as reductions in hypertensive disorders, fewer kidney infections, improved diet, and reductions in cigarette use. Children of nurse-visited women also experienced reductions in health-care visits for injuries. Nurse-visited mothers were rated as more involved with their children. At six months of age, nurse-visited infants were significantly less likely to exhibit emotional vulnerability in response to fear stimuli than were control group infants, and nurse-visited children of women with low psychological resources were significantly less likely to display low emotional vitality in response to joy and anger stimuli. At 21 months, nurse-visited children were significantly less likely to exhibit language delays than children in the control group (this effect was concentrated among children whose mothers had low psychological resources). Nurse-visited children born to women with low psychological resources also had superior average language and mental development in contrast to control-group counterparts. Fewer unintended subsequent pregnancies were reported by nurse-visited women and increases in the interval between first and second births were also observed.

15-year follow-up: Child abuse-neglect was significantly reduced. Children of nurse-visited women experienced significantly fewer arrests and convictions at age 15.

19-year follow-up: Children of nurse-visited mothers were less likely to have ever been arrested or convicted and had fewer lifetime arrests and convictions than their counterparts in the comparison condition. There were no program effects on youths' self-reported criminal behavior or their use of alcohol or illegal drugs, high school graduation, economic productivity, number of sexual partners, use of birth control, teen pregnancy or childbearing, and use of welfare, food stamps, or Medicaid.

Study 2

Denver Study. Nurse-visited smokers had significantly greater reductions in the chemical markers for nicotine from intake to the end of pregnancy than did the control group at 24 months. In addition, nurse-visited women had significantly longer intervals before their next conception than did women in the control group at both 24 and 48 months. Women visited by nurses also were employed longer during the second year after giving birth than were control women. Nurse-visited mother-infant pairs interacted with one another more responsively than did control pairs, and at six months of age, nurse-visited infants were significantly less likely to exhibit emotional vulnerability in response to fear stimuli than were control group infants. Nurse-visited children of women with low psychological resources were significantly less likely to display low emotional vitality in response to joy and anger stimuli and were less likely to exhibit language delays than children in the control group at 21 months. Nurse-visited children born to women with low psychological resources also had superior average language and mental development at 24 months. Nurse-visited women also reported less domestic violence from partners during the six-month interval before the four-year interview. Finally, nurse-visited children born to women with low psychological resources, compared with control group counterparts, had home environments more conducive to early learning, better language development, superior executive functioning, and better behavioral adaptation during testing.

Study 3

Memphis Study. Fewer nurse-visited women had pregnancy-induced hypertension as compared to the control group. In addition, children of nurse-visited women had fewer health care encounters and days of hospitalization for injury or ingestion at 24 months, and nurse-visited women had fewer second pregnancies at 24 months. These program effects were sustained at 54 months. Four years after the end of the program at child age 2 years, nurse-visited women had fewer subsequent pregnancies and births, less use of welfare, longer relationships with their partners, and greater enrollment of their children in some form of preschool or licensed daycare. Nurse-visited children demonstrated higher IQs and language scores and fewer behavioral problems in the borderline or clinical range.

During the 9-year period after the birth of the first child, among women with at least one subsequent child, there were longer intervals between the births of first and second children and fewer cumulative subsequent births per year among nurse-visited women. Averaging across the 6- and 9-year follow-up periods, nurse-visited mothers had longer relationships with their current partners and were associated with employed partners to a greater degree than were women in the control group. Through age 12, the program reduced children's use of substances and internalizing mental health problems.

Outcomes

Primary Evidence Base

Studies 1-3

The three studies of pregnant women and their children - Elmira, Memphis, and Denver - found intervention-group improvements relative to the control group in the following areas:

Mother (Elmira, Memphis, Denver)

Unintended subsequent pregnancies, and the interval between first and second births
Domestic violence among married or cohabiting women
Maternal employment and use of welfare and food stamps

Infants and Young Children (Elmira, Memphis, Denver)

Health-care visits and hospitalization for injuries and illnesses
Emotional vulnerability, particularly among children born to mothers with low psychological resources
Language and mental development, particularly among children born to mothers with low psychological resources
Child abuse and neglect, and behavioral problems caused by use of alcohol or drugs (seen in mothers at 15- and 19-year follow-up in Elmira)

6-to-12-Year Follow-up (Memphis)

Intellectual functioning and receptive language
Behavioral problems at age 6
Relationship quality of mothers with current partners
Children's use of substances and internalizing mental health problems at age 12

15-year and 19-Year Follow-up (Elmira)

Among children, arrests and convictions

Significant Program Effects on Risk and Protective Factors

Prenatal health, such as hypertension and use of cigarettes
Responsive interactions with child
Parent social support (Elmira)

Mediating Effects

In Study 1, Eckenrode et al. (2017) presented a formal mediation analysis, though for a subset of the Elmira sample with low to moderate levels of self-reported domestic violence (N = 251). Eckenrode et al. (2017) used this subsample to examine official reports of child maltreatment through the 15-year follow-up. The mediation analysis found that the program effect on child maltreatment was significantly mediated by both months on public assistance and subsequent births.

In Study 3, Sidora-Arcoleo et al. (2010) tested for mediation using the Memphis data, finding that the intervention had a significant indirect effect on verbal ability at age six via improved physical aggression at age two. Heckman et al. (2017) also tested for mediation using the Memphis data. They found that intervention effects on achievement test scores for boys were explained "by program-induced improvements in maternal traits and early-life family investments."

Effect Size

In Study 1, reported odds ratios were sometimes large (e.g., 3.89 for re-enrollment in or graduation from high school for women who had dropped out by the start of the Elmira study).

In Study 2, Olds, Robinson et al. (2004) reported weak effect sizes for the Denver sample. For mother outcomes, effect sizes ranged from near zero to .32; for child outcomes, effect sizes ranged from near zero to .47. Across all studies, effect sizes below .10 predominated.

In Study 3, Olds, Kitzman et al. (2004) reported generally weak effect sizes for the Memphis 6-year follow-up. For mother outcomes, effect sizes ranged from near zero to .24; for child outcomes, effect sizes ranged from near zero to .32. Olds et al. (2007) likewise reported generally weak effect sizes for the Memphis 9-year follow-up: For child academic performance, effect sizes ranged from near zero to .33.

Generalizability

Three studies meet Blueprints standards for high quality methods with strong evidence of program impact (i.e., "certified" by Blueprints): Study 1 (Eckenrode et al., 2010; Olds, Henderson, Chamberlin et al., 1986; Olds, Henderson, Tatelbaum et al., 1986; Olds et al., 1994, 1997, 1998), Study 2 (Olds et al., 2002; Olds, Robinson et al., 2004), and Study 3 (Kitzman et al., 1997, 2010; Olds, Kitzman et al., 2004; Olds et al., 2007, 2010, 2014). All three samples targeted low-income women and included diverse race and ethnic groups.

Study 1 examined a sample from Elmira, N.Y., and compared the treatment group to a services-as-usual control group.
Study 2 examined a sample from Denver, Colorado, and compared the treatment group to a services-as-usual control group.
Study 3 examined a sample from Memphis, Tennessee, and compared the treatment group to a services-as-usual control group.

Potential Limitations

Additional Studies (not certified by Blueprints)

Study 4 (Mejdoubi et al., 2013, 2014, 2015)

High rate of attrition and large differences in attrition rates for intervention and control groups.
No outcome measure at baseline (although related measures of risk for intimate partner violence were used).
Some outcomes rated by parents, who helped deliver the parenting program

Mejdoubi, J., van den Heijkant, S. C. C. M., van Leerdam, F. K. M., Crone, M., Crijnen, A., & Hirasing, R. A. (2014). Effects of nurse home visitation on cigarette smoking, pregnancy outcomes and breastfeeding: A randomized controlled trial. Midwifery, 30, 688-695.

Mejdoubi, J., van den Heijkant, S. C. C. M., van Leerdam, F. K. M., Heymans, M. W., Hirasing, R. A., & Crijnen, A. A. M. (2013). Effect of nurse home visits vs. usual care on reducing intimate partner violence in young high-risk pregnant women: A randomized controlled trial. PLOS One. doi:10.1371/journal.pone.007818

Mejdoubi, J., van den Heijkant, S. C., van Leerdam, F. J., Heymans, M. W., Crijnen, A., & Hirasing, R. A. (2015). The effect of VoorZorg, the Dutch nurse-family partnership, on child maltreatment and development: A randomized controlled trial. PLoS One, 10(4), e0120182.

Study 5 (Robling et al., 2015)

Some significant differences in attrition by condition
No significant effects on primary outcomes, but some small positive effects in secondary outcomes
Possible iatrogenic effect in higher rate of emergency room (ER) attendance or admission after birth among treatment group than control group

Robling, M., Bekkers, M.-J., Bell, K., Butler, C. C., Cannings-John, R., Channon, S., . . . Torgerson, D. (2015). Effectiveness of a nurse-led intensive home-visitation programme for first-time teenage mothers (Building Blocks): A pragmatic randomized controlled trial. The Lancet, published online 14 October 2015.

Study 6 (Sierau et al., 2016)

Some child measures were reported by mothers, who participated in the program
Possible problem with loss of subjects for intent-to-treat analysis
Evidence of differential attrition in that lower SES households (younger, lower income, and experienced foster care placement) were more likely to drop out of the study
No effect on main child development outcomes and only marginal effects on family environment and mother competencies

Sierau, S., Dähne, V., Brand, T., Kurtz, V., von Klitzing, K., & Jungmann, T. (2016). Effects of home visitation on maternal competencies, family environment, and child development: A randomized controlled trial. Prevention Science, 17, 40-51.

Study 7 (Carabin et al., 2005)

Non-random assignment with limited matching
Baseline outcome controls not possible for birth outcomes but many other controls
Several large baseline differences between conditions
Weak effects and possible iatrogenic effects

Carabin, H., Cowan, L. D., Beebe, L. A., Skaggs, V. J., Thompson, D., & Agbangla, C. (2005). Does participation in a nurse visitation programme reduce the frequency of adverse perinatal outcomes in first-time mothers? Paediatric and Perinatal Epidemiology, 19(3), 194-205.

Study 8 (Rubin et al., 2011; Matone et al., 2012a, 2012b; Yun et al., 2014)

Non-random assignment (but used propensity score matching)
Baseline outcome controls not possible but many controls used in matching
Incomplete tests for baseline equivalence
Few main effects and one possible iatrogenic effect

Rubin, D. M., O'Reilly, A. L. R., Luan, X., Dai, D., Localio, A. R., & Christian, C. W. (2011). Variation in pregnancy outcomes following statewide implementation of a prenatal home visitation program. Archives of Pediatrics & Adolescent Medicine, 165(3), 198-204.

Yun, K., Chesnokova, A., Matone, M., Luan, X., Localio, A. R., & Rubin, D. M. (2014). Effect of maternal-child home visitation on pregnancy spacing for first-time Latina mothers. American Journal of Public Health, 104, S152-S158.

Matone, M., O'Reilly, A. L., Luan, X., Localio, A., & Rubin, D. M. (2012a). Emergency department visits and hospitalizations for injuries among infants and children following statewide implementation of a home visitation model. Maternal & Child Health Journal, 16, 1754-1761.

Matone, M., O'Reilly, A. L., Luan, X., Localio, R., & Rubin, D. M. (2012b). Home visitation program effectiveness and the influence of community behavioral norms: A propensity score matched analysis of prenatal smoking cessation. BMC Public Health, 12(1), 1016.

Study 9 (Matone et al., 2018)

Non-random assignment (but used matching based on sociodemographics only)
Possible concerns with validity of claims data on child abuse
Not possible to control for baseline outcomes
Incomplete tests for baseline equivalence
No positive program effects
Iatrogenic effect on child abuse episodes (though rates were low in both conditions)

Matone, M., Kellom, K., Griffis, H., Quarshie, W., Faerber, J., Gierlach, P., . . . Cronholm, P. F. (2018). A mixed methods evaluation of early childhood abuse prevention within evidence-based home visiting programs. Maternal and Child Health Journal, 22(Suppl 1), S79-S91.

Study 10 (Nguyen et al., 2003)

Not possible to control for baseline outcomes plus no other controls
Several large differences between conditions at baseline
No tests for differential attrition
No formal significance tests for condition differences at follow-up, only informal comparisons of means

Nguyen, J. D., Carson, M. L., Parris, K. M., & Place, P. (2003). A comparison pilot study of public health field nursing home visitation program interventions for pregnant Hispanic adolescents. Public Health Nursing, 20(5), 412-418.

Study 11 (Segal et al., 2018)

Non-random assignment with no matching
No information on validity but some concern about measurement bias in treatment group
Not possible to control for baseline outcomes, and no other controls in main effect analysis
Many baseline differences
Few or no significant main effects
Narrow sample from one town in Australia

Segal, L., Nguyen, H., Gent, D., Hampton, C., & Boffa, J. (2018). Child protection outcomes of the Australian Nurse Family Partnership Program for Aboriginal infants and their mothers in Central Australia. PLoS ONE, 13(12), e0208764.

Study 12 (Holmes & Rutledge, 2016)

QED with limited matching
Not possible to control for baseline outcomes
Tests for baseline equivalence limited to sociodemographic measures
Weak effects for full sample

Holmes, M., & Rutledge, R. (2016). Evaluation of the Nurse Family Partnership in North Carolina. UNC Gillings School of Global Public Health. Available online: https://dukeendowment.org/sites/default/files//evalutaion-reports/Full%20Report%20-%20Nurse%20Family%20Partnership.pdf.

Study 13 (Thorland & Currie, 2017)

QED with limited matching
Not possible to control for baseline outcomes
No tests for differential attrition, although likely not a problem

Thorland, W., & Currie, D. (2017). Status of birth outcomes in clients of the Nurse-Family Partnership. Maternal and Child Health Journal, 21(5), 995-1001.

Study 14 (Lee et al., 2019)

An RCT, but all analyses except one combined Nurse-Family Partnership with another home-visiting program
Healthcare measures missed visits not paid for by Medicaid
Not possible to control for baseline outcomes
Tests for baseline equivalence used the combined intervention groups and found several differences
No posttest or long-term program effects

Lee, H., Crowne, S. S., Estarziau, M., Kranker, K., Michalopoulos, C., Warren, A., . . . Knox, V. (2019). The effects of home visiting on prenatal health, birth outcomes, and health care use in the first year of life: Final implementation and impact findings from the Mother and Infant Home Visiting Program Evaluation-Strong Start. OPRE Report 2019-08. Washington, DC: Office of Planning, Research, and Evaluation, Administration for Children and Families, U.S. Department of Health and Human Services.

Study 15 (Michalopoulos et al., 2019)

An RCT, but all analyses except one combined Nurse-Family Partnership with three other home-visiting programs
Some child measures came from parents
Healthcare measures missed visits not paid for by Medicaid
Not possible to control for baseline outcomes
Tests for baseline equivalence found few differences but used the combined intervention groups
Tests for differential attrition found few differences but used the combined intervention group
Only one posttest effect

Michalopoulos, C., Faucetta, K., Hill, C. J., Portilla, X. A., Burrell, L., Lee, H., . . . Knox, V. (2019). Impacts on family outcomes of evidence-based early childhood home visiting: Results from the Mother and Infant Home Visiting Program evaluation. OPRE Report 2019-07. Washington, DC: Office of Planning, Research, and Evaluation, Administration for Children and Families, U.S. Department of Health and Human Services.

Notes

A review of other pregnancy and infancy home visitation programs suggests that many do not work. The more successful programs focus their services on families at greater risk and use nurses who visit frequently, beginning during pregnancy (Olds and Kitzman, 1990; Olds and Kitzman, 1993). There is no convincing evidence that paraprofessionals can be used with significant success.

References

Olds, D. L., and Kitzman, H. (1993). Review of research on home visiting for pregnant women and parents of young children. The Future of Children: Home Visiting, 3, 53-92.

Olds, D. L., and Kitzman, H. (1990). Can home visitation improve the health of women and children at environmental risk? Pediatrics, 86, 108-116.

Peer Implementation Sites

For Information on Peer Sites, Contact:
Michelle Neal, MS, RN
Colorado Program Director, Nurse-Family Partnership
1775 Sherman Street, Suite 2075
Denver, CO 80203
303.839.1808 ext. 101
mneal@iik.org

References

Study 1

Eckenrode, J., Campa, M. I., Morris, P. A., Henderson, C. R., Bolger, K. E., Kitzman, H., & Olds, D. L. (2017). The prevention of child maltreatment through the Nurse Family Partnership Program: Mediating effects in a long-term follow-up study. Child Maltreatment, 22(2), 92-99.

Eckenrode, J., Campa, M., Luckey, D. W., Henderson, C. R., Cole, R., Kitzman, H., . . . Olds, D. (2010). Long-term effects of prenatal and infancy nurse home visitation on the life course of youths: 19-year follow-up of a randomized trial. Archives of Pediatrics & Adolescent Medicine, 164, 9-15.

Certified Olds, D. L., Eckenrode, J., Henderson, C. R., Kitzman, H., Powers, J., Cole, R., . . . Luckey, D. (1997). Long-term effects of home visitation on maternal life course and child abuse and neglect: 15-year follow-up of a randomized trial. Journal of the American Medical Association, 278(8), 637-643.

Olds, D. L., Henderson, C. R., & Kitzman, H. (1994). Does prenatal and infancy nurse home visitation have enduring effects on qualities of parental caregiving and child health at 25 to 50 months? Pediatrics, 93, 89-98.

Certified Olds, D. L., Henderson, C. R., Chamberlin, R., & Tatelbaum, R. (1986). Preventing child abuse and neglect: A randomized trial of nurse home visitation. Pediatrics, 78, 65-78.

Certified Olds, D. L., Henderson, C. R., Cole, R., Eckenrode, J., Kitzman, H., Luckey, D., . . . Powers, J. (1998). Long-term effects of nurse home visitation on children's criminal and antisocial behavior: 15-year follow-up of a randomized controlled trial. Journal of the American Medical Association, 280(14), 1238-1244.

Olds, D. L., Henderson, C. R., Tatelbaum, R., & Chamberlin, R. (1986). Improving the delivery of prenatal care and outcomes of pregnancy: A randomized trial of nurse home visitation. Pediatrics, 77, 16-28.

Eckenrode, J., Ganzel, B., Henderson, C. R., Jr., Smith, E., Olds, D. L., Powers, J., . . . Sidora, K. (2000). Preventing child abuse and neglect with a program of nurse home visitation: The limiting effects of domestic violence. JAMA, 284(11), 1385-1391.

Eckenrode, J., Zielinski, D., Smith, E., Marcynyszyn, L. A., Henderson, C. R., Kitzman, H., . . . Olds, D. (2001). Child maltreatment and the early onset of problem behaviors: Can a program of nurse home visitation break the link? Development and Psychopathology, 13(4), 873-890.

Olds, D. L., Henderson, C. R., Tatelbaum, R., & Chamberlin, R. (1988). Improving the life-course development of socially disadvantaged mothers: A randomized trial of nurse home visitation. American Journal of Public Health, 78(11), 1436-1443.

Olds, D. L., Henderson, C. R., Kitzman, H., & Cole, R. (1995). Effects of prenatal and infancy nurse home visitation on surveillance of child maltreatment. Pediatrics, 95(3), 365-372.

Zielinski, D. S., Eckenrode, J., & Olds, D. (2009). Nurse home visitation and the prevention of child maltreatment: Impact on the timing of official reports. Development and Psychopathology, 21, 441-453.

Study 2

Certified Olds, D. L., Robinson, J., O'Brien, R., Luckey, D. W., Pettitt, L. M., Henderson, C. R., . . . Talmi, A. (2002). Home visiting by paraprofessionals and by nurses: A randomized, controlled trial. Pediatrics, 110, 486-496.

Olds, D. L., Robinson, J., Pettitt, L., Luckey, D. W., Holmberg, J., Ng, R. K., . . . Henderson Jr., C. R. (2004). Effects of home visits by paraprofessionals and by nurses: Age 4 follow-up results of a randomized trial. Pediatrics, 114, 1560-1568.

Olds, D. L., Holmberg, J. R., Donelan-McCall, N., Luckey, D. W., Knudtson, M. D., & Robinson, J. (2014). Effects of home visits by paraprofessionals and by nurses on children: Follow-up of a randomized trial at ages 6 and 9 years. JAMA Pediatrics, 168(2), 114-121.

Study 3

Kitzman, H., Olds, D. L., Cole, R. E., Hanks, C. A., Anson, E. A., Arcoleo, K. J., . . . Holmberg, J. R. (2010). Enduring effects of prenatal and infancy home visiting by nurses on children: Follow-up of a randomized trial among children at age 12 years. Archives of Pediatrics & Adolescent Medicine, 164(5), 412-418.

Certified Kitzman, H., Olds, D. L., Henderson, C. R., Hanks, C., Cole, R., Tatelbaum, R., . . . Barnard, K. (1997). Effect of prenatal and infancy home visitation by nurses on pregnancy outcomes, childhood injuries, and repeated childbearing. Journal of the American Medical Association, 278(8), 644-652.

Olds, D. L., Kitzman, H., Cole, R., Hanks, C., Arcoleo, K., Anson, E., . . . Stevenson, A. (2010). Enduring effects of prenatal and infancy home visiting by nurses on maternal life course and government spending: Follow-up of a randomized trial among children at age 12 years. Archives of Pediatrics & Adolescent Medicine, 164(5), 419-424.

Certified Olds, D. L., Kitzman, H., Cole, R., Robinson, J., Sidora, K., Luckey, D. W., . . . Holmberg, J. (2004). Effects of nurse home visiting on maternal life course and child development: Age 6 follow-up results of a randomized trial. Pediatrics, 114, 1550-1559.

Olds, D. L., Kitzman, H., Hanks, C., Cole, R., Anson, E., Sidora-Arcoleo, K., . . . Bondy, J. (2007). Effects of nurse home visiting on maternal and child functioning: Age 9 follow-up of a randomized trial. Pediatrics, 120, 832-845.

Olds, D. L., Kitzman, H., Knudtson, M. D., & Anson, E. (2014). Effect of home visiting by nurses on maternal and child mortality: Results of a 2-decade follow-up of a randomized clinical trial. JAMA, 472, E1-E7. Published online July 7, 2014.

Enoch, M.-A., Kitzman, H., Smith, J. A., Anson, E., Hodgkinson, C. A., Goldman, D., & Olds, D. L. (2018). A prospective cohort study of influences on externalizing behaviors across childhood: Results from a nurse home visiting randomized controlled trial. Journal of the American Academy of Child & Adolescent Psychiatry, 55(5), 376-382.

Heckman, J. J., Holland, M. L., Makino, K. K., Pinto, R., & Rosales-Rueda, M. (2017). An analysis of the Memphis Nurse-Family Partnership program. Cambridge, Mass.: National Bureau of Economic Research.

Kitzman, H., Olds, D. L., Sidora, K., Henderson, C. R., Jr., Hanks, C., Cole, R., . . . Glazner, J. (2000). Enduring effects of nurse home visitation on maternal life course: A three-year follow-up of a randomized trial. Journal of the American Medical Association, 283(15), 1983-1989.

Sidora-Arcoleo, K. H., Anson, E. A., Lorber, M., Cole, R. E., Olds, D. L., & Kitzman, H. J. (2010). Differential effects of a Nurse Home-Visiting Intervention on physically aggressive behavior in children. Journal of Pediatric Nursing, 25, 35-45.

Study 4

Mejdoubi, J., van den Heijkant, S. C. C. M., van Leerdam, F. K. M., Heymans, M. W., Hirasing, R. A., & Crijnen, A. A. M. (2013). Effect of nurse home visits vs. usual care on reducing intimate partner violence in young high-risk pregnant women: A randomized controlled trial. PLOS One. doi:10.1371/journal.pone.007818

Mejdoubi, J., van den Heijkant, S. C., van Leerdam, F. J., Heymans, M. W., Crijnen, A., & Hirasing, R. A. (2015). The effect of VoorZorg, the Dutch nurse-family partnership, on child maltreatment and development: A randomized controlled trial. PLoS One, 10(4), e0120182.

Study 5

Study 6

Study 7

Study 8

Study 9

Study 10

Study 11

Study 12

Study 13

Thorland, W., & Currie, D. (2017). Status of birth outcomes in clients of the Nurse-Family Partnership. Maternal and Child Health Journal, 21(5), 995-1001.

Study 14

Study 15

Study 1

Summary

The Elmira study reported the following program effects:

Mother

Unintended subsequent pregnancies, and the interval between first and second births
Domestic violence among married or cohabiting women
Maternal employment and use of welfare and food stamps

Infants and Young Children

Health-care visits and hospitalization for injuries and illnesses
Emotional vulnerability, particularly among children born to mothers with low psychological resources
Language and mental development, particularly among children born to mothers with low psychological resources
Child abuse and neglect, and behavioral problems caused by use of alcohol or drugs (seen in mothers at 15- and 19-year follow-up in Elmira)

15-year and 19-Year Follow-up

Among children, arrests and convictions

Significant Program Effects on Risk and Protective Factors

Prenatal health, such as hypertension and use of cigarettes
Responsive interactions with child
Parent social support

Evaluation Methodology

Design:

Recruitment: A randomized clinical trial was conducted beginning in April 1978. Pregnant women were recruited from private obstetric offices and a free antepartum clinic in the Appalachian region of New York State. Five hundred women were invited to participate in the study, and 400 women (80%) were successfully recruited.

Assignment: The sample was stratified on the basis of marital status and race, and participants were randomly assigned to one of four treatment groups. Families in Treatment 1 (n = 94) received sensory and developmental screening for the child at 12 and 24 months of age. Based on these screenings, children were referred for further clinical evaluation and treatment when needed. Families in Treatment 2 (n = 90) were provided with the screening services offered to those in Treatment 1, plus free transportation for prenatal and well-child care through the child's second birthday. There were no differences between Treatments 1 and 2 in their use of prenatal and well-child care; therefore, these two groups were combined to form a single comparison group. Families in Treatment 3 (n = 100) received the screening and transportation services offered to Treatment 2, but in addition were provided a nurse who visited them at home during pregnancy. Families in Treatment 4 (n = 116) were provided the same services as those in Treatment 3, with continued visits until the infants were 18 to 24 months of age.
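The arm sizes reported above can be checked with a short calculation. The following is a minimal sketch in Python; the dictionary keys and variable names are illustrative labels rather than terms used by the study, and the counts are the ones stated in the Assignment paragraph.

```python
# Arm sizes as reported in the Assignment paragraph.
arm_sizes = {
    "treatment_1": 94,   # screening only
    "treatment_2": 90,   # screening plus transportation
    "treatment_3": 100,  # adds nurse visits during pregnancy
    "treatment_4": 116,  # adds continued nurse visits after birth
}

# Treatments 1 and 2 were combined into a single comparison group.
comparison_n = arm_sizes["treatment_1"] + arm_sizes["treatment_2"]

# Treatments 3 and 4 both received nurse home visits during pregnancy.
nurse_visited_prenatal_n = arm_sizes["treatment_3"] + arm_sizes["treatment_4"]

total_n = sum(arm_sizes.values())
print(comparison_n, nurse_visited_prenatal_n, total_n)  # prints: 184 216 400
```

The total matches the 400 women reported as recruited.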

Assessments/Attrition: The first follow-up assessment came at the 32nd week of pregnancy, and the last, long-term follow-ups came at 15 years and 19 years. At the 15-year follow-up, when the adolescents had reached age 15, 81% of the original 400 randomized cases completed the assessment. At the 19-year follow-up, assessments were completed on 310 youths (78% of the original sample).
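The recruitment and 19-year retention percentages follow directly from the reported counts. As a minimal sketch (the helper name `percent` and the variable names are illustrative, not from the study):

```python
def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

# Counts as reported in the Recruitment and Assessments/Attrition paragraphs.
invited, recruited = 500, 400
completed_19_year = 310

recruitment_rate = percent(recruited, invited)
retention_19_year = percent(completed_19_year, recruited)

print(f"{recruitment_rate:.1f}% recruited, {retention_19_year:.1f}% retained at 19 years")
# prints: 80.0% recruited, 77.5% retained at 19 years
```

The 77.5% figure is consistent with the 78% reported after rounding to a whole percent.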

Sample: Of the 400 women in the sample, 85% were either low-income, unmarried, or teenaged, and none had a previous live birth. Eighty-nine percent of the sample was white.

Measures: Interviews were carried out with participating women by project staff members at baseline and follow-up.

Dietary intake was measured using 24-hour diet records and 24-hour recalls. For 74% of the sample, dietary data were gathered for two consecutive 24-hour periods at each assessment period; for an additional 14% of the sample, data were available for a single 24-hour period. These data were aggregated into a nutrient-adequacy ratio which converted the intake of 12 nutrients into a summary of percentages of Recommended Dietary Allowances.

Serum cotinine assays were done to validate the women's reported level of smoking. Serum was derived from blood samples drawn routinely at the patients' registration in the clinic and at approximately the 36th week of pregnancy. Cotinine levels were determined by radioimmunoassay.

Estimates of the length of gestation gave priority to newborn physical and neurological examinations and to ultrasound readings taken before the 28th week of pregnancy. Reported last menstrual periods and measurements of uterine size made before 20 weeks were used when newborn examination and ultrasound data were not available. The gestational age of all low-birth-weight babies was estimated from the newborn physical examination findings.

At the 15-year and 19-year follow-ups, measures included women's use of welfare, the number of subsequent children, and arrests derived from self-reports. Verified reports of child abuse and neglect were abstracted from state records. The adolescents' arrests, convictions, and delinquent behavior were based on self-reports.

Analysis: For all analyses, a core statistical model was derived that consisted of a 2 X 2 X 2 X 2 factorial structure. This model was extended to include a repeated-measures structure for dependent variables measured both early and late in pregnancy. Treatments 1 and 2 were combined for purposes of analysis after it was determined that there were no differences between these two groups in their use of routine prenatal care. Treatments 3 and 4 were also combined for the prenatal analysis because they were identical during this phase of the research.
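To make the size of the core model concrete, a 2 X 2 X 2 X 2 factorial structure crosses four two-level factors, which yields 16 cells. The sketch below only enumerates those cells; the factor names are placeholders, because this summary does not name the four factors.

```python
from itertools import product

# Placeholder factor names: the summary does not identify the four factors.
factors = {
    "factor_a": (0, 1),
    "factor_b": (0, 1),
    "factor_c": (0, 1),
    "factor_d": (0, 1),
}

# Every combination of the four two-level factors is one cell of the design.
cells = list(product(*factors.values()))
print(len(cells))  # prints: 16
```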

Outcomes

Implementation Fidelity:

Five registered nurses were hired through a non-profit private agency expressly for this experimental program. Each nurse had a caseload of 20-25 families and received regular clinical supervision.

Baseline Equivalence:

The women in the four conditions were essentially equivalent on various measures after randomization. However, some evidence of baseline differences emerged between groups on sense of control among unmarried women and partner support, two variables then used as covariates in the models.

Differential Attrition:

At the 19-year follow-up, there were no differences across conditions by mother's race, marital status, age, education, or SES, or by child's sex. However, some evidence of differential attrition emerged from the 4-year assessment. Dropouts in the intervention group had a higher mean sense of control and higher mean education than dropouts in the control group. Because the intervention group began with a stronger sense of control, the sample of completers showed similarity on baseline measures.

Posttest:

Formal and Informal Support Systems: By the end of pregnancy, nurse-visited women were aware of more community services, attended childbirth education classes more frequently, received more WIC vouchers, reported that they talked more frequently with service providers and members of their informal networks about the stresses of pregnancy and family life, indicated that their babies' fathers showed a greater interest in their pregnancies, and were accompanied by someone to the labor room more often than women in the comparison group.

Maternal Obstetrical Conditions and Health Habits: Nurse-visited women had fewer kidney infections after enrollment than comparison women and showed greater improvements in the quality of their diets from registration to the 32nd week of pregnancy. Nurse-visited smokers made greater reductions in their smoking than smokers assigned to the comparison group.

Birth Weight and Length of Gestation: No significant treatment effects were noted for birth weight or length of gestation, but the nurse-visited women gave birth to babies who were an average of 395g heavier than those in the comparison group. There was a 75% reduction in the incidence of preterm delivery among smokers; however, nurse-visited older nonsmokers gave birth to infants of shorter gestations.

Child Abuse and Neglect: First Two Years of Life: During the first two years of the children's lives, the nurse-visited women at the highest risk (poor, unmarried teens) had abused or neglected their children less than the comparison group (4% vs. 19% respectively). Although the treatment effects were not statistically significant in other groups less at risk, these effects were in the same direction. Nurse-visited women reported that their babies had more positive moods, but more frequent episodes of resisting eating. They also reported a greater concern about infants' behavioral problems than did women in the control group. Within the group at greatest risk (poor, unmarried teens), the nurse-visited women punished and restricted their children less at 10 and 22 months of age and provided a larger number of play materials than the comparison group. Developmental quotients of children of the poor, unmarried nurse-visited teens were higher at 12 and 24 months of life than were those in the comparison group. Babies of nurse-visited women were seen less in emergency rooms during the first and second years of life and presented with fewer accidents and poisonings than their counterparts.

Enduring Effects at 25 to 50 Months: Homes of nurse-visited families had fewer hazards for children at 34 and 46 months; however, there were no program differences on the extent to which mothers kept poisonous substances out of children's reach and children rode in cars with child safety restraints. Nurse-visited children had 40% fewer notations of injuries and ingestions and 45% fewer notations of child behavioral/parental coping problems in physicians' records, and 35% fewer visits to the emergency room. No treatment differences were noted in interviewers' ratings of mothers' warmth or control, but at the 34th-month observation, nurse-visited women were rated as more involved with their children. Nurse-visited women punished their children more frequently than did comparison group women. During the first four years after delivery, nurse-visited women who had not completed high school returned to school more rapidly (although there were no treatment differences in educational attainment at 46 months postpartum); nurse-visited women who were poor and unmarried were employed 82% more of the time, had 43% fewer subsequent pregnancies, and delayed the birth of their second child an average of 12 months longer.

Olds et al. (1995) examined many of the same outcomes for a subsample of 56 families in which children had a state-verified report of child abuse or neglect during the first 4 years of the children's life. For this subsample, the intervention group had significantly fewer visits to physicians for injuries and fewer emergency room visits. Several risk and protective factors were also significantly improved for the intervention group compared to the control group: fewer observed safety hazards for children; more intellectually stimulating toys, games, and reading materials; and less controlling mothers.

Long-Term:

15-Year Follow-up:

Treatment contrasts were made between the combination of Treatments 1 and 2 (the comparison group) and Treatment 4 (the pregnancy and infancy nurse-visited group).

Rates of Subsequent Births, Use of AFDC, Substance Use Impairment, and Arrests: Nurse-visited women had 0.3 fewer subsequent children.

High-risk Sample: Among high-risk women (unmarried and from low socioeconomic status households), nurse-visited women had 0.7 fewer subsequent pregnancies, 0.5 fewer subsequent children, and the spacing between first and second births was 30 months longer than for women in the comparison group. They also reported using AFDC 30 fewer months than did women in the comparison group and reported 43% fewer instances of their being functionally impaired by alcohol or drugs. Additionally for the high-risk women, the nurse-visited women reported having been arrested 69% fewer times, having been convicted of crimes 74% fewer times and having spent 96% fewer days in jail than did comparison group women.

Rates of Child Abuse and Neglect: Nurse-visited mothers had 58% fewer substantiated reports (frequency) of child abuse or neglect than did mothers in the comparison group. Among the high-risk subsample, there was an 86% reduction in child abuse or neglect. When child abuse or neglect was expressed as a dichotomous variable, there was a treatment effect for the high-risk subsample (17% of the Treatment 4 nurse-visited children had been abused or neglected at least once vs. 37% of the comparison group children). Eckenrode et al. (2000) replicated these results but also showed that the presence of domestic violence moderated the intervention effect on child abuse and neglect. Specifically, a high level of domestic violence reduced the impact of the program on child maltreatment. Zielinski et al. (2009) used survival analysis and Cox regression models to further check for condition differences in child maltreatment. Overall, the intervention group did not differ significantly in the timing of child maltreatment from the control group. However, the results differed by time period. The study found that the survival functions remained largely identical until age 4, but after that age, the onset rate of child maltreatment was significantly lower for the intervention group.

In addition, Eckenrode et al. (2017) examined official reports of child maltreatment from Child Protective Services for a subsample of respondents who had low to moderate levels of self-reported domestic violence (N = 251). For this subsample, they found that the program effect on child maltreatment at the 15-year follow-up was significantly mediated by months on public assistance and the number of subsequent births.

Children's Rates of Arrests, Conviction, and Delinquency: There were 52% fewer arrests for the nurse-visited children born to women who were unmarried and from low SES households. Adolescents of nurse-visited mothers who were unmarried and from low SES households reported smoking cigarettes 21% fewer times during the six-month period prior to the 15-year interview.

Early-Onset Problem Behaviors: Eckenrode et al. (2001) found no main effect of the program on a measure that summed the number of problem behaviors (e.g., smoking, intercourse, drug use, arrests) occurring at unusually young ages (ages 12-15 or younger, depending on the behavior). However, tests for moderation showed that the effect of maltreatment on early-onset problem behaviors was much larger for the control group than the intervention group. Stated equivalently, the program had benefits for children with a history of maltreatment reports but not for those with no maltreatment reports.

19-Year Follow-up:

Nurse-visited girls were less likely to have ever been arrested or convicted and had fewer lifetime arrests and convictions than their counterparts in the comparison condition. The age at first arrest of nurse-visited girls was also younger than that of their counterparts. The treatment effect for rate of arrests was concentrated in mid-adolescence (i.e., control girls' rates declined in later adolescence and converged with those of the nurse-visited girls).

There were no program effects on youths' self-reported criminal behavior or their use of alcohol or illegal drugs, high school graduation, economic productivity, number of sexual partners, use of birth control, teen pregnancy or childbearing, and use of welfare, food stamps, or Medicaid.

Nurse-visited girls born to unmarried and low-income mothers had fewer children and less Medicaid use than their comparison group counterparts.

Study 2

Summary

The Denver study reported the following program effects:

Mother

Unintended subsequent pregnancies, and the interval between first and second births
Domestic violence among married or cohabiting women
Maternal employment and use of welfare and food stamps

Infants and Young Children

Health-care visits and hospitalization for injuries and illnesses
Emotional vulnerability, particularly among children born to mothers with low psychological resources
Language and mental development, particularly among children born to mothers with low psychological resources
Child abuse and neglect, and behavioral problems caused by use of alcohol or drugs (seen in mothers at 15- and 19-year follow-up in Elmira)

Significant Program Effects on Risk and Protective Factors

Prenatal health, such as hypertension and use of cigarettes
Responsive interactions with child

Evaluation Methodology

Design:

Recruitment: A three-armed randomized trial (control, paraprofessional home visits, and nurse home visits) was conducted. Women were recruited if they had no previous live births and either qualified for Medicaid or had no private health insurance. Women were allowed to enroll at any time prior to delivery. Of the 1,178 consecutive women from 21 antepartum clinics serving low-income women in Denver invited to participate in the study, 735 accepted.

Assignment: The 735 women were randomly assigned to either a nurse visitation group, a paraprofessional visitation group, or a control group. Women in the nurse group (n = 235) were provided developmental screening and referral services for their children at 6, 12, 15, 21, and 24 months of age plus nurse home visitation during pregnancy and infancy. Women assigned to the paraprofessional group (n = 245) received the screening and referral services plus paraprofessional home visitation during pregnancy and infancy. Women in the control group (n = 255) received the developmental screening and referral services.

Assessments/Attrition: Participants were interviewed at 12, 15, 21, 24, and 48 months postpartum and then at child ages six and nine years. At the 48-month follow-up, interviews were conducted with mothers in 86% of the cases randomized and 91% of those in which the child was alive and not adopted. At nine years, direct assessments of children were completed in 82% of the cases randomized and 87% of those in which the child was alive and not adopted.

Sample: The sample consisted of largely Hispanic and White women with low income.

Measures:

Follow-up assessments at child ages six and nine years (Olds et al., 2014) were carried out by interviewers and child evaluators masked to intervention condition and involved interviews, observations, and psychological tests with the children as well as mothers' and teachers' reports of children's behavior.

End-of-Pregnancy Assessments: Participants were interviewed at 36 weeks of gestation in the study office to assess their health-related behaviors, including use of psychoactive drugs and use of ancillary preventive and emergency services. Urine was again collected in order to test for the chemical markers for nicotine, marijuana, and cocaine. Change in tobacco use from intake to 36 weeks was measured by change in creatinine-adjusted cotinine among those designated as smokers at intake.

Maternal Life Course: Participants were interviewed at 12, 15, 21, 24, and 48 months postpartum to assess their number and timing of any subsequent pregnancies; and at 24 and 48 months to assess educational achievement, participation in the workforce, and use of welfare. At the 48-month in-home assessment, mothers also were asked whether they had been married or cohabitating, and, for women who lived with a partner during the two-year period before the interview, whether they had experienced physical violence during the two-year and six-month periods preceding the interview. Variables were constructed to reflect years of education completed and number of months women were in the workforce and used welfare during the 1- to 12-month and 13- to 24-month periods.

Mother-Infant Interaction and Quality of the Home Environment: Mother-infant interaction was videotaped at all postpartum assessments. Responsive Interaction was standardized at each assessment to a mean of 100 and a standard deviation of 10. Infants' home environments were rated at 12, 21, and 48 months.
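
The rescaling of Responsive Interaction to a mean of 100 and a standard deviation of 10 can be sketched as a linear transformation of z scores. This is a minimal sketch under stated assumptions: the function name and raw scores are hypothetical, and the use of the population standard deviation within each assessment wave is an assumption, since the source does not say which form was used.

```python
from statistics import mean, pstdev

def standardize(scores, target_mean=100.0, target_sd=10.0):
    """Linearly rescale scores to the requested mean and standard deviation."""
    m = mean(scores)
    sd = pstdev(scores)  # assumption: population SD within one assessment wave
    if sd == 0:
        raise ValueError("scores have no variance; cannot standardize")
    return [target_mean + target_sd * (x - m) / sd for x in scores]

# Hypothetical raw interaction ratings from a single assessment.
raw = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled = standardize(raw)
```

Standardizing separately at each assessment puts scores from different ages on a common scale, so a value of 110 always means one standard deviation above that wave's average.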

Child Emotional, Mental, and Behavioral Development: At six months of age, infants' emotional reactivity and looking at mother were videotaped in the laboratory and coded separately for their responses to stimuli designed to elicit fear, joy, and anger. Emotional vulnerability was defined as high distress reactions to fear stimuli coinciding with limited efforts by the infants to look at or seek assistance or comfort from their mothers. Emotional vitality was defined as the lively expression of joyful and angry affect that was shared with others.

Children's language development was tested at 21 months in their homes and their mental development was tested using the Mental Development Index (MDI) at 24 months in the laboratory. Children with language scores below 85 were classified as delayed and children with MDI scores below 77 were classified as developmentally delayed. At 48 months, children were assessed in their homes using the Preschool Language Scales and with a series of cognitive tasks focusing primarily on the children's capacity for sustained attention and inhibitory control. Mothers reported on children's irritability at six months and their externalizing behavior problems at 24 and 48 months.
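
The two delay classifications above are simple cutoffs, sketched below. The function and constant names are hypothetical; only the cutoff values (85 and 77) and the strict "below" comparison come from the text.

```python
LANGUAGE_DELAY_CUTOFF = 85  # language scores below this are classified as delayed
MDI_DELAY_CUTOFF = 77       # MDI scores below this are classified as developmentally delayed

def classify_child(language_score, mdi_score):
    """Return delay flags for one child; a score equal to the cutoff is not delayed."""
    return {
        "language_delay": language_score < LANGUAGE_DELAY_CUTOFF,
        "developmental_delay": mdi_score < MDI_DELAY_CUTOFF,
    }

# Hypothetical child: language score of 84 at 21 months, MDI of 77 at 24 months.
flags = classify_child(84, 77)
```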

The follow-ups at ages six and nine (Olds et al., 2014) measured primary outcomes in two domains. The behavioral problem domain included measures of norm-referenced internalizing, externalizing, and total behavioral problems based on teacher and parent reports. It also included children's scores on the Conners Continuous Performance Test that placed them in the dysfunctional attention/impulsive range. The cognitive domain included measures from norm-referenced tests of receptive language and intellectual functioning at age six years; sustained attention at ages six and nine years; reading and math achievement at ages six years and nine years; and executive cognitive functioning (visual attention/task switching and working memory) at age nine years.

Secondary outcomes measured at follow-up consisted of children's narrative responses to the MacArthur Story Stem Battery coded to characterize the degree of dysregulated aggression and incoherence revealed in their stories; evaluators' ratings of children's behavioral regulation during testing; and mothers' reports of their children's receipt of specific therapeutic services (for speech and language problems, cognitive delays, attention deficit and hyperactivity, and emotional problems, both prior to the interview at age six years and between ages six and nine years), whether their children had been retained in school, and whether they had received special education or remedial services in the first 3 years of elementary school.

Analysis: Data analyses were conducted on all cases for which outcome data were available. The primary statistical model consisted of treatments (three levels), maternal psychological resources (high vs. low), and the interaction between these two classification factors. In addition, five covariates were included to control for nonequivalence among the treatment groups at intake: maternal age, housing density, whether the mother registered in the study after 28 weeks of gestation, maternal conflict with her partner, and maternal conflict with her mother.

For the age six and nine follow-ups (Olds et al., 2014), the analyses used all cases randomized insofar as outcome data were available (intention to treat). The models included six baseline covariates to adjust for treatment nonequivalence plus two additional covariates (child age at assessment and sex). Continuous dependent variables were examined with the general linear model; dichotomous outcomes were examined with modified Poisson regression.
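
Modified Poisson regression fits a log-link Poisson model to a dichotomous outcome with robust standard errors, so that exponentiated coefficients can be read as risk ratios. As a minimal sketch of the quantity being estimated, the point estimate in the covariate-free case with a single treatment indicator is the ratio of two proportions. The function name and counts below are hypothetical, and the omission of the baseline covariates is a simplification; reproducing the adjusted models would require a regression package.

```python
import math

def risk_ratio_with_ci(events_treated, n_treated, events_control, n_control, z=1.96):
    """Risk ratio (treated vs. control) with a Wald interval on the log scale."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    rr = risk_treated / risk_control
    # Large-sample standard error of log(RR); assumes at least one event per group.
    se_log_rr = math.sqrt(
        1 / events_treated - 1 / n_treated + 1 / events_control - 1 / n_control
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical counts, not taken from the study.
rr, lower, upper = risk_ratio_with_ci(20, 100, 40, 100)
```

A risk ratio below 1 with an interval that excludes 1 would indicate a lower rate of the outcome in the visited group.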

Outcomes

Implementation Fidelity:

No specific information was provided as to the content, frequency, or duration of visits. Nurses assisting with the study were required to have BSN degrees and experience in community or maternal and child health nursing. Paraprofessionals were required to have a high school education but were excluded from participation in the study if they had college preparation in the helping professions or a bachelor's degree in any discipline. Each visitor managed an average caseload of 25 families.

Baseline Equivalence:

The treatment groups were similar at baseline, but there was some nonequivalence at intake on maternal age, housing density, whether the mother registered in the study after 28 weeks of gestation, maternal conflict with her partner, and maternal conflict with her mother.

Differential Attrition:

A total of 560 mothers completed the 24-month child assessments (Nurse n = 168; Paraprofessional n = 188; and Control n = 204). Nurse-visited women had lower rates of completed assessments than did women in the control group at each postpartum assessment period, although the pattern of baseline differences between nurse-visited and control group women on whom assessments were not conducted by child age two indicated that the nurse-visited women were higher functioning than their control group counterparts.

At the 48-month follow-up, rates of completed assessments were high and equivalent across treatment conditions.

For the follow-ups at age six and age nine (Olds et al., 2014), rates of retention were similar across conditions (see eTable 1 in the supplementary document). Tests for differential attrition (eTable 2 of the supplementary document) showed the intervention and control baseline means for the age six and age nine analysis samples. The tables do not present significance tests, but the text states (p. 117) "Participants in the 6- and 9-year follow-ups were similar on background characteristics across treatment conditions."

Posttest:

Paraprofessional Program: Paraprofessional-visited mother-child pairs in which the mother had low psychological resources interacted with one another significantly more responsively than did their control group counterparts. No other significant effects were found for the paraprofessional group, although there were trends toward mothers in the paraprofessional group having fewer subsequent pregnancies and births and delaying subsequent pregnancies.

Nurse Program: Nurse-visited smokers had significantly greater reductions in the chemical markers for nicotine from intake to the end of pregnancy than did their control group counterparts. In addition, nurse-visited women had significantly longer intervals before their next conception than did women in the control group. Women visited by nurses also were employed longer during the second year after giving birth than were control women.

Nurse-visited mother-infant pairs interacted with one another more responsively than did control pairs. At six months of age, nurse-visited infants were significantly less likely to exhibit emotional vulnerability in response to fear stimuli than were control group infants, and nurse-visited children of women with low psychological resources were significantly less likely to display low emotional vitality in response to joy and anger stimuli. At 21 months, nurse-visited children were significantly less likely to exhibit language delays than children in the control group (this effect was concentrated among children whose mothers had low psychological resources). Nurse-visited children born to women with low psychological resources also had superior average language and mental development in contrast to control-group counterparts.

For most of the outcomes, paraprofessional visitation effects were approximately half the size of those produced by nurse visitation, although the only significant difference between the two groups was that nurse-visited children born to mothers with low psychological resources demonstrated significantly better language development than did their paraprofessional-visited counterparts.

Long-term:

Paraprofessional Program: Two years after the end of the program, women visited by paraprofessionals were less likely to be married or live with the child's biological father than were women in the control group. Women visited by paraprofessionals worked significantly more between child age two and four and had a greater sense of mastery and better mental health than their control group counterparts. Although there were no significant effects on rates or timing of subsequent pregnancies and births, when a subsequent pregnancy did occur, paraprofessional-visited women were significantly less likely than control women to have a low birth weight newborn.

Paraprofessional-visited mother-child pairs displayed more sensitive and responsive interactions during free-play evaluations than did pairs in the control group. Families in which mothers had low psychological resources at registration had home environments significantly more supportive of early learning, compared with control group counterparts.

Nurse Program: Nurse-visited women, compared with control women, had significantly greater intervals between the births of their first and second children when a second birth occurred. Nurse-visited women also reported less domestic violence from partners during the six-month interval before the four-year interview. In addition, nurse-visited women reported enrolling their children less frequently in preschool, Head Start, or licensed daycare as compared with women in the control group.

Nurse-visited children born to women with low psychological resources, compared with control group counterparts, had home environments more conducive to early learning, better language development, superior executive functioning, and better behavioral adaptation during testing.

At the age six and age nine follow-ups (Olds et al., 2014), the results in Table 1 for behavioral problem (internalizing, externalizing, attention dysfunction) outcomes showed no significant differences at p < .05 between the nurse condition and the control condition or between the paraprofessional condition and the control condition (although three were significant at p < .10). The remaining tables focus only on women with low psychological resources, finding some evidence that, for this subsample, the paraprofessional and nurse interventions significantly improved several cognitive outcomes and significantly lowered the number of therapeutic treatment services received relative to the control group. The supplementary tables (eTables 3-4) present results for mothers with high psychological resources. Presumably, the effects on these outcomes for the full sample showed no significant program benefits.

Study 3

Summary

The Memphis study found the following program effects:

Mother

Unintended subsequent pregnancies, and the interval between first and second births
Domestic violence among married or cohabiting women
Maternal employment and use of welfare and food stamps

Infants and Young Children

Health-care visits and hospitalization for injuries and illnesses
Emotional vulnerability, particularly among children born to mothers with low psychological resources
Language and mental development, particularly among children born to mothers with low psychological resources
Child abuse and neglect, and behavioral problems caused by use of alcohol or drugs (seen in mothers at 15- and 19-year follow-up in Elmira)

6-to-12-Year Follow-up (Memphis)

Intellectual functioning and receptive language
Behavioral problems at age 6
Relationship quality of mothers with current partners
Children's use of substances and internalizing mental health problems at age 12

Significant Program Effects on Risk and Protective Factors

Prenatal health, such as hypertension and use of cigarettes
Responsive interactions with child

Evaluation Methodology

Design:

Recruitment: The Memphis trial was designed to determine if the effects of the Elmira program could be replicated through an existing health department with a large sample of low-income African American women, children, and their families living in a major urban area. The program was conducted through the Memphis/Shelby County Health Department. From June 1990 through August 1991, the study recruited 1,290 low-income women who were at less than 29 weeks of gestation from the obstetrical clinic at the Regional Medical Center in Memphis. A total of 1,139 women (88% of those recruited) enrolled in the study.

Assignment: Once women were accepted into the program, baseline interviews were performed and the women were randomly assigned to one of four treatment groups. Women assigned to Treatment 1 (n = 166) were provided with free round-trip taxicab transportation for scheduled prenatal care appointments; they did not receive any postpartum services or child developmental assessments/screening. Women assigned to Treatment 2 (n = 515) received the free transportation for scheduled prenatal care plus developmental screening and referral services for the child at 6, 12, and 24 months of age. Women assigned to Treatment 3 (n = 230) received the free transportation and screening offered to those in Treatment 2 plus intensive nurse home visitation services during pregnancy, one postpartum visit in the hospital before discharge, and one postpartum visit in the home. Women assigned to Treatment 4 (n = 228) received the same services as those in Treatment 3; in addition, they continued to be visited by nurses through the child's second birthday.

For the evaluation of the prenatal phase of the program, Treatments 1 and 2 were combined to form a single comparison group that was then contrasted with Treatments 3 and 4, a group that had nurse visitors during pregnancy. For the postnatal phase of the study, Treatment 2 was contrasted with Treatment 4.
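
The two contrasts can be written out as a small lookup to make the resulting arm sizes explicit. The group sizes come from the assignment description above; the dictionary layout and function name are hypothetical, and the sizes are counts at randomization, before any attrition.

```python
# Women randomized to each treatment group in the Memphis trial.
GROUP_SIZES = {1: 166, 2: 515, 3: 230, 4: 228}

# Which treatment groups form each arm of each contrast.
CONTRASTS = {
    "prenatal": {"comparison": (1, 2), "nurse_visited": (3, 4)},
    "postnatal": {"comparison": (2,), "nurse_visited": (4,)},
}

def arm_sizes(phase):
    """Number of women randomized to each arm of the named contrast."""
    return {
        arm: sum(GROUP_SIZES[t] for t in treatments)
        for arm, treatments in CONTRASTS[phase].items()
    }

prenatal = arm_sizes("prenatal")
postnatal = arm_sizes("postnatal")
```

Under these counts the prenatal contrast compares 681 women with 458, while the postnatal contrast compares 515 with 228.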

Assessments/Attrition: Interviews with participating women were carried out at the time of registration (prior to their assignment to treatments), at the 28th and 36th weeks of pregnancy, and at the 6th, 12th, 24th, and 54th months of the child's life, and again near the child's sixth and ninth birthdays. Medical and social-service records were abstracted and teachers' reports of children's classroom behavior (primarily first grade) were also obtained. Of mothers who were randomly assigned and had no fetal or child death, follow-up assessments at child age 9 were completed with 91% of the mothers, school records were abstracted for 88% of the children, teacher report forms were completed for 81% of the sample, and achievement-test scores were abstracted for 83%.

Enoch et al., 2018: When children were 18 years old, a total of 657 mothers and 669 children were eligible for follow-up. The attrition was predominantly due to maternal and child deaths. Of the eligible participants, 94% of mothers and children were interviewed.

Sample: Ninety-two percent of the women were African American, 98% were unmarried, 64% were aged 18 or younger at registration, 85% came from households with incomes at or below the federal poverty guidelines, and 22% smoked cigarettes at registration.

Measures:

Baseline Assessments: At registration, baseline interviews were conducted with all participants in order to determine their socioeconomic conditions, mental health, personality characteristics, obstetric histories, health-related behaviors (cigarette smoking, alcohol, and illegal drug use), and social support. Women also completed brief tests to estimate their levels of intellectual functioning. Women's pre-pregnancy weights and heights were also determined by self-report. The last weights recorded in the prenatal record prior to delivery were used to calculate pregnancy weight gains. Household per-annum discretionary income was calculated using subsistence standards for determining Medicaid eligibility in Tennessee, the number of individuals in the household, and reported household income. In addition, each participant was assigned a value that represented the percentage of poverty households in the census tract in which she resided.

A variable was created to index women's psychological resources measured at registration and based on the averaged z scores of their: 1) intelligence, 2) mental health, and 3) sense of mastery/self-efficacy. Self-efficacy was assessed with a measure developed for the current study to determine mothers' confidence in their ability to behave in accordance with the major behavioral objectives of the program.
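
The psychological resources index described above, an average of z scores across three measures, can be sketched as follows. The function names and data are hypothetical. The sketch assumes each measure is coded so that higher values indicate greater resources and that z scores use the population standard deviation; neither detail is stated in the source.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Convert raw values to z scores within the sample."""
    m, sd = mean(values), pstdev(values)
    if sd == 0:
        raise ValueError("values have no variance; z scores are undefined")
    return [(v - m) / sd for v in values]

def psychological_resources_index(intelligence, mental_health, mastery):
    """Average the three z scores for each woman (lists aligned by participant)."""
    columns = [z_scores(intelligence), z_scores(mental_health), z_scores(mastery)]
    return [mean(per_woman) for per_woman in zip(*columns)]

# Hypothetical scores for three women.
index = psychological_resources_index(
    intelligence=[90, 100, 110],
    mental_health=[40, 50, 60],
    mastery=[1, 2, 3],
)
```

Averaging z scores gives each of the three measures equal weight regardless of its original scale.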

Prenatal measures: Women were interviewed at the 28th week of gestation by phone to assess their health-related behaviors, social support, use of community services, and participation in school and work. Identical interviews were conducted at the 36th week. At the 36th week of gestation, women were also assessed to ascertain their mental health symptoms and their sense of mastery.

Obstetrical and newborn records were abstracted directly and verified against an on-line perinatal database from the University of Tennessee.

Urine screens for marijuana and cocaine were performed on 511 women who registered for prenatal care at the Regional Medical Center as part of their clinical assessment during the time that this trial was conducted.

Urinary tract infections (UTIs) were recorded if a culture produced a colony count for a single uropathogen >100,000/ml of clean-catch voided urine. Diagnoses of pyelonephritis were recorded from the medical record. Cultures for Neisseria gonorrhoeae and Chlamydia trachomatis were obtained at the first prenatal visit and were coded from the prenatal record, as were additional STDs, infections, or Pregnancy-Induced Hypertension (PIH).

Birth weight was recorded from the hospital record and gestational age was estimated from reported last menstrual period and ultrasound administered before the 26th week of gestation.

Postpartum measures: Participants were interviewed in the study offices at 6 months postpartum to measure breastfeeding practices and beliefs associated with child abuse and neglect.

Maternal Life Course: Participants were interviewed at 12, 15, 21, 24, and 54 months postpartum and near the child's sixth and ninth birthdays to assess the number and timing of any subsequent pregnancies, and again at 24 months and near the child's sixth and ninth birthdays to assess educational achievement, participation in the workforce, and use of welfare and Medicaid. At the interview conducted at 54 months, women were asked whether their children attended Head Start, preschool programs, licensed daycare, or early intervention programs. Near the child's sixth and ninth birthdays, women were also asked to report on their use of substances, any behavioral problems attributable to the use of substances, the number of times they had consumed three or more drinks of alcohol three or more times per month in the past year, use of marijuana, and use of cocaine since the last interview at child age 6, and the counts of maternal arrests and days jailed (age nine follow-up). Variables were constructed to reflect years of education completed and number of months women were in the workforce during the 1- to 12-month and 13- to 24-month periods. The 54-month and six-year interviews also assessed rates of marriage and cohabitation, duration of women's current partnered relationships, current partners' education, employment and social class (based on their occupational codes), domestic violence since the birth of the first child, and whether the current male partner was the biological father of the child. The age nine follow-up assessed the counts of subsequent miscarriages, abortions, and low birth weight newborns; reported participation in the workforce; depression; whether they had experienced physical violence from any of their partners since their first child was 6; and the portion of time their current partners were employed while they were together after the birth of the first child.

Child Emotional, Mental, and Behavioral Development: At the 24th-month office visit, the children were tested with the Bayley scales of infant development, and their mothers completed the Achenbach Child Behavior Checklist.

The children's medical records were reviewed with a focus on hospitalizations, emergency department visits, and outpatient encounters in which injuries and ingestions were detected. In addition, the dates and types of children's immunizations were recorded.

Data were also abstracted from Tennessee Department of Human Services records to ascertain women's and their first-born children's use of Aid to Families with Dependent Children during the period from the child's birth through second birthday.

At the age six follow-up, assessments were conducted after children had completed at least seven months of kindergarten (through March). The children's mothers completed the Achenbach Child Behavior Checklist (CBCL) to assess the severity of internalizing, externalizing, and total behavior problems. The children's teachers completed the Hightower Teacher-Child Rating Scale (HTCRS) to assess the degree to which children were engaged with school and children's classroom socioemotional adjustment. Teacher reports and school data were derived primarily from the children's first-grade teachers (n = 486), although a small number of reports came from kindergarten (n = 33), second grade (n = 42), and special education (n = 3) teachers. Children's responses to eight story beginnings (stems) from the MacArthur Story Stem Battery (MSSB) were videotaped and coded for a series of content themes, observable affective expressions, and coherence in completing the stories. Children's representations of dysregulated aggressive behavior, parental warmth/empathy themes in the stories, and whether each story completion was incoherent were categorized by constructs and a coding scheme specially designed for low-income African American children. Finally, children's cognitive and language skills were assessed with the Kaufman Assessment Battery for Children (KABC) and the Peabody Picture Vocabulary Test (PPVT-III).

At the age nine follow-up, children's school records in grades one to three were reviewed and teachers' reports of classroom behavior (primarily from third grade) were obtained. Children's GPAs in reading, math, and behavior were abstracted from school records. In addition, children's achievement-test scores (primarily the Tennessee Comprehensive Assessment Program Achievement Test) were abstracted. Teacher reports of antisocial behavior and maternal report of child disruptive behaviors and depressive and anxiety disorders for the past year were assessed using the Computerized Diagnostic Interview Schedule for Children. Because individual reported disorders occurred infrequently, two broad categories were created: (1) a count of depressive and anxiety disorders reported in the past year, with actual values ranging between 0 and 5, and (2) a count of disruptive behavior disorders reported in the past year, with actual values ranging between 0 and 2. The number of times children were retained in grades 1 to 3 was counted, and placement in special education was coded. Teachers' assessments of children's behavior in the classroom were obtained using items from the Social Competence Scale and Social Health Profile from the Fast Track trial and the Observation of Child Adjustment Revised. The items from these measures produced 3 scales: (1) antisocial behavior, a primary outcome, (2) academically focused behavior, and (3) peer affiliation. Finally, children's death was assessed by sending every case in which the child was born alive and on which a maternal assessment was not completed at age 9 to the National Death Index (NDI). The age of the child at death (in days) and the cause of death were then coded.

Mortality: Records were collected for mothers and children using the National Death Index (Olds et al., 2014). Maternal deaths were categorized as natural (such as stroke) or external (such as drug overdose or homicide). Children's deaths were classified as preventable (such as sudden infant death syndrome, unintentional injuries, and homicide) or as due to natural causes (such as chronic respiratory disease).

Analysis: Data analyses were conducted on all cases for which outcome data were available. Dependent variables for which a normal distribution was assumed were analyzed in the general linear model; dichotomous outcomes and low-frequency count data were analyzed in the log-linear model. The primary statistical model for postnatal outcomes focused on classification effects for Treatments (2 vs. 4) and maternal psychological resources (high vs. low), plus two covariates (household income and census-tract poverty level). Regressions of children's story coherence on their level of emotional expression were tested for homogeneity by treatments, with adjustment for the standard three covariates. Quantitative outcomes on which multiple assessments existed for each mother or each child were analyzed using mixed models that included, in addition to the variables from the core model, children (or mothers) as levels of a random factor, a fixed repeated measures classification factor for time of assessment, and all interactions of time with the other fixed classification factors. School performance outcomes were available for math and reading for each of 3 grades. For these outcomes, grade level was the repeated measure over time, and the model included a second fixed repeated measures factor for subject area. For maternal repeated outcomes in the age 9 follow-up, reported results were averaged over the entire period for which data were available as well as over the interval between 6 and 9 years of the first child's life. For all maternal low-frequency count outcomes except the rates of subsequent low birth weight newborns, only the treatment factor (with no covariates) was included. Low birth weight newborns were analyzed in a model that included treatment, psychological resources, the treatment x psychological resource interaction, and the household poverty covariate.
For all child low-frequency count outcomes except mortality, the model consisted of treatment, psychological resource interaction, and child gender; the child mortality dichotomous outcome was tested in a simple treatment model with no psychological resource factor or adjustments for covariates.

Outcomes

Baseline Equivalence:

In the follow-up for 18-year-old children (Enoch et al., 2018), there was no difference in baseline characteristics of mothers in the treatment and control groups (supplemental Table S1).

Differential Attrition:

The treatment groups were similar with respect to background characteristics for the participants completing the 6-year assessment, with the following exceptions: at intake, nurse-visited women (treatment 4) had higher scores for child-rearing attitudes associated with child maltreatment and lived in households with less discretionary income and higher housing densities than did women in the comparison group (treatment 2). These differences suggest that the nurse-visited group had a greater proportion of at-risk families at child age 6 years, although the proportions of families for whom assessments were conducted were large and nearly equivalent across treatment conditions.

In the follow-up for 18-year-old children (Enoch et al., 2018), mothers lost to the study did not differ from mothers retained in the study at 18 years in baseline mental health or Pearlin Mastery scores. In the control group only, mothers lost to the study had significantly higher baseline self-efficacy scores than mothers retained in the study.

Posttest:

Prenatal Findings: By the 36th week of pregnancy, nurse-visited women were more likely to use other community services than were women in the control group. They were also more likely to be working, particularly among women who were not in school when they were randomized into treatment conditions. Compared with women in the comparison group, nurse-visited women had fewer instances of PIH. Among women with PIH, those who received a nurse home visitor had mean arterial blood pressures during labor that were 3.5 points lower than those of women in the comparison group.

Dysfunctional Caregiving and Child Development: Children of nurse-visited mothers had fewer health-care encounters for injuries and ingestions, and they had fewer days of hospitalization for injuries or ingestions at 24 months than did children of mothers in the comparison group. These trends were greater for children born to women with few psychological resources.

Nurse-visited mothers reported that they at least attempted to breastfeed more frequently than did women in the comparison group. By the 24th month of the child's life, nurse-visited women held fewer beliefs about childrearing associated with child abuse and neglect than did women in the comparison group. In addition, the homes of nurse-visited women were rated as more conducive to children's development using the HOME scale.

Children born to nurse-visited mothers with limited psychological resources were observed to be more responsive to their mothers and to communicate their needs more clearly than were children born to low-resource mothers in the comparison group.

Maternal life course: Nurse-visited women reported 23% fewer second pregnancies at 24 months and 32% fewer subsequent live births (among nurse-visited women with high levels of psychological resources) than did women in the comparison group. Nurse-visited women and their children relied upon AFDC for fewer months during the second year of the child's life than did comparison group women and children. These results were sustained at 54 months (see also Kitzman et al., 2000). At 54 months, nurse-visited women had higher rates of living with a partner and living with the father of their child and were with partners who had been employed for longer durations as compared to the control group. The program was able to help those women with fewer mental health symptoms, higher IQs, and more active coping styles in becoming less dependent upon welfare, but was unable to do so with women with fewer psychological resources. Nurse-visited women sustained significantly fewer subsequent pregnancies and births than did women in the comparison group, and significantly longer intervals between the births of the first and second children at 54 months and at child age six. Nurse-visited women also had longer relationships with their current partners. Between children's 54th and 72nd months (6 years) of life, nurse-visited women had fewer months of using welfare and food stamps as compared to women in the control group. In addition, nurse-visited children were more likely to have been enrolled in formal out-of-home care (Head Start, preschool, licensed daycare, or early intervention) between 2 and 4.5 years of age as compared to children in the control group. 
There were no statistically significant program effects on women's mastery, mental health, education, employment, marriage, being in a partnered relationship, living with the father of the child, outcomes of subsequent pregnancies, current partner's education or socioeconomic status, use of marijuana, behavioral problems attributable to the use of alcohol or drugs, or domestic violence.

During the 9-year period after the birth of the first child, among women with at least 1 subsequent child, nurse-visited women had longer intervals between the births of first and second children and had fewer cumulative subsequent births per year than did their control group counterparts. The treatment main effect on the number of cumulative subsequent births was limited to women with initially high psychological resources averaging across the entire period after the birth of the first child. Averaging across the 6- and 9-year follow-up periods, nurse-visited mothers had longer relationships with their current partners than did mothers in the control group. The program effect was particularly pronounced at child age 9. In correspondence with their longer partnered relationships, nurse-visited women were associated with employed partners to a greater degree than were women in the control group. From birth through child age 9, nurse-visited women used welfare (AFDC/TANF) and food stamps for fewer months per year than did women in the control group. For the 6- to 9- year interval, the program effect on food stamps was significant, but the effect on AFDC/TANF was non-significant. When examined over the entire 9-year period, nurse-visited women expressed greater mastery over the challenges in their lives than did women in the control group, but by age 9 the treatment-control group difference was no longer significant. There were no statistically significant program effects on women's subsequent miscarriages, abortions, or stillbirths; arrests or being jailed; use of Medicaid; depression; employment; or marriage or being in a partnered relationship.

Child outcomes: At six years of age, nurse-visited children had higher scores on tests of intellectual functioning and receptive language and were reported by their mothers to have fewer problems in the borderline or clinical range of the Achenbach Child Behavior Checklist (CBCL) Total Problems scale, compared to children in the control group. In addition, nurse-visited children born to mothers with low psychological resources had higher arithmetic achievement test scores than their control group counterparts, and expressed less dysregulated aggression and told fewer incoherent stories than control group children. For both the entire sample and children born to mothers with low psychological resources, fitted regressions were significantly different by treatment, favoring the nurse-visited children. Children's story coherence disintegrated in the presence of high levels of emotional expression to a greater degree in the control group, compared with children visited by nurses. There were no other significant program effects.

At the age 9 follow-up, nurse-visited children who were born to mothers with low psychological resources, compared to their control group counterparts, had better GPAs averaged across reading and math and had better math and reading achievement-test scores in grades 1 to 3. There were no statistically significant program effects on placements in special education or mothers' reports of their children's disruptive behavior disorders or third-grade teachers' reports of children's behavioral or academic adaptation to the classroom.

Sidora-Arcoleo et al. (2010) examined verbal ability at two years and six years and examined physical aggression at two years, six years, and 12 years. The intervention significantly lowered physical aggression at age two but not at ages six or 12, and it had no significant effects (at p < .05) on the two measures of verbal ability. Moderation tests further indicated that the impact on physical aggression at age two was limited to girls and that there was a significant effect on physical aggression at age six for mothers with high psychological resources. Mediation tests showed that the intervention had a significant indirect effect on verbal ability at age six via reduced physical aggression at age two.

Infant and childhood death: Control group children were 4.46 times more likely to die in the age range of birth through child age 9 than were nurse-visited children.

At 21 years following randomization, women in the two nurse-visited groups were less likely to have died than women assigned to the two control groups. The comparison of control groups 1 and 2 with treatment group 3 was significant, but a similar comparison with treatment group 4 was not significant. By age 20 years, children whose mothers received home visits during pregnancy and through child age 2 years were significantly less likely to have died from preventable causes (but not all causes) compared with their counterparts in the control group (Olds et al., 2014). For the study of child mortality, inclusion of children in treatments 1 and 3 was not possible, so only the full NFP treatment was compared to one of the control conditions. For women, data were available for all 4 treatments.

Follow-up at 12 Years Old:

(Kitzman, Olds, et al., 2010): Maternal interviews at 12 years were completed by 82% of the nurse-visited mothers and 79% of the control group mothers. Child interviews were completed by 79% of nurse-visited children and 77% of control children. Completion rates for other records (teacher reports, school records, social service records) vary by the record. Nurse-visited children at 12 years of age, compared to controls, reported fewer days of having used cigarettes, alcohol, and marijuana during the 30-day period before the 12-year interview and were less likely to report having internalizing disorders that met the borderline or clinical threshold. The nurse-visited children born to women with low psychological resources, compared with their control group counterparts, scored higher on the Peabody test in reading and math and during the first six years of school scored higher on group-administered standardized tests of math and reading achievement. There were no significant effects for externalizing or total behavioral problems.

(Olds, Kitzman, Cole, Hanks, et al., 2010): The program also improved maternal life course by the time the child was age 12. Nurse-visited mothers compared with control mothers reported less role impairment owing to alcohol and other drug use, longer partner relationships, and greater sense of mastery. Government spending on food stamps, Medicaid, and AFDC/TANF was reduced. There were no differences in mothers' marriage, partnership with the child's biological father, intimate partner violence, alcohol and other drug use, arrests, incarceration, psychological distress, or reports of child foster care placements.

(Enoch et al., 2018): An analysis of a measure of externalizing disorder obtained from mothers completing the Achenbach Child Behavior Checklist included assessments at ages two, six, and 12. The results showed a positive effect of the treatment at age two but only for mothers with high baseline self-efficacy. The treatment had no effects on externalizing at older ages.

Follow-up at 18 Years Old:

(Enoch et al., 2018): The analysis showed no treatment effect on externalizing disorder, alcohol use disorder, drug use disorder, or smoking.

Reanalysis (Heckman et al., 2017)

This study examined the program impact on all outcomes across all assessed ages. Its contribution came from methodological improvements. The study used 1) permutation-based inference that did not require assumptions about the distributions of the measures and 2) multiple-hypothesis testing adjustments with a step-down procedure. The models controlled for maternal height, household income, grandmother support, maternal parenting attitudes, and whether the mother was currently enrolled in school. All tests were done separately for boys and girls rather than for the full sample.
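
The two ingredients named above can be sketched in a few lines. This is a hedged illustration under stated assumptions, not the procedure from Heckman et al. (2017): the permutation test below compares unadjusted group means with no covariates, and the Holm adjustment is a simpler, well-known step-down method swapped in as a stand-in, since the summary does not say which step-down procedure the authors used. All function names and data are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(treated, control, n_permutations=5000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random, keeping group sizes fixed.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)

def holm_step_down(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted

# Hypothetical outcome scores (illustrative only, not study data).
p_separated = permutation_p_value([10, 11, 12, 13, 14, 15], [1, 2, 3, 4, 5, 6])
p_identical = permutation_p_value([1, 2, 3], [1, 2, 3])
adjusted = holm_step_down([0.01, 0.04, 0.03])
```

The permutation test makes no distributional assumption because its reference distribution is built from the data themselves. The step-down adjustment shows why fewer effects survive once many outcomes are tested together: each p-value is inflated according to how many hypotheses remain.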

The article gave substantial attention to differential attrition in the supplementary materials. The tests showed no differences between conditions in rates of attrition and few differences across conditions in baseline measures for the analysis samples. Also, Appendix F shows that the findings held when correcting for attrition using inverse probability weighting and that the correction for attrition did not play a substantial role in the intervention evaluation.
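
Inverse probability weighting can be illustrated with a small sketch. The assumptions are loud ones: retention probabilities here are estimated as simple retention rates within a single hypothetical baseline stratum, whereas the model the authors actually used to predict attrition is not described in this summary. The records and field names are invented for illustration.

```python
from collections import defaultdict

def ipw_mean(records):
    """Outcome mean among retained cases, weighted by 1 / P(retained | stratum)."""
    enrolled = defaultdict(int)
    retained = defaultdict(int)
    for r in records:
        enrolled[r["stratum"]] += 1
        if r["outcome"] is not None:
            retained[r["stratum"]] += 1
    weighted_sum = 0.0
    weight_total = 0.0
    for r in records:
        if r["outcome"] is None:
            continue  # lost to follow-up: contributes only through the weights
        p_retained = retained[r["stratum"]] / enrolled[r["stratum"]]
        weight = 1.0 / p_retained
        weighted_sum += weight * r["outcome"]
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical data: half of the "low" stratum was lost to follow-up.
records = (
    [{"stratum": "low", "outcome": 10}] * 2
    + [{"stratum": "low", "outcome": None}] * 2
    + [{"stratum": "high", "outcome": 20}] * 4
)

observed = [r["outcome"] for r in records if r["outcome"] is not None]
naive_mean = sum(observed) / len(observed)
weighted_mean = ipw_mean(records)
```

In this toy example the retained "low" cases each stand in for two enrolled women, so the weighted mean moves toward what the full sample would have shown. A finding that holds with and without such weights, as reported above, suggests attrition did not drive the results.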

In brief summary, most of the findings from previous papers survived permutation testing, but fewer treatment effects survived corrections for multiple hypothesis testing. Nonetheless, some effects for boys were sustained until age 12.

Table 4 shows that, of 157 outcomes, 22.3% for males and 17.2% for females had p-values smaller than .05, a larger share than the 5% that would be expected to occur by chance. Table 5 summarized the large number of tests, which the authors interpreted as follows: "The NFP intervention significantly improved maternal mental health, home environment, and parenting skills. On average, treated boys were healthier at birth and experienced an increase in cognitive abilities by age 6. We find that the NFP intervention generated stronger effects on socio-emotional skills for girls and stronger effects on academic achievement for boys. By age 12, treated boys outperformed controls in math and reading achievement. Treated girls experienced an improvement in cognitive and socio-emotional skills at age 6. At age 12, treated girls had a lower body-mass index (BMI), but the estimated treatment effects are not significantly different from the control group in other measures."
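
The comparison with chance in Table 4 can be made concrete with simple arithmetic. The counts below are back-calculated from the reported percentages and are therefore approximate; the chance benchmark treats the 157 tests as if each had a 5% false-positive rate, which ignores correlation among outcomes.

```python
n_outcomes = 157
alpha = 0.05
share_significant = {"males": 0.223, "females": 0.172}

# Number of outcomes expected to reach p < .05 if there were no true effects.
expected_by_chance = n_outcomes * alpha

# Approximate counts implied by the reported percentages.
implied_counts = {
    sex: round(share * n_outcomes) for sex, share in share_significant.items()
}

for sex, count in implied_counts.items():
    print(f"{sex}: about {count} of {n_outcomes} outcomes significant, "
          f"versus about {expected_by_chance:.1f} expected by chance")
```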

Mediation analyses showed that the treatment effects were explained "by program-induced improvements in maternal traits and early-life family investments. At age 12, the treatment effects for males (but not for females) persist in the form of enhanced achievement test scores. Treatment effects are largely explained by enhanced cognitive skills at age 6."

Study 4

Summary

Mejdoubi et al. (2013; 2015) evaluated the program in a randomized controlled trial using a sample of 460 women in the Netherlands. Primary outcomes included intimate partner violence victimization through 24 months after birth and documented reports of child abuse through 36 months after birth. Secondary outcomes included measures of the home environment at 6, 18, and 24 months after birth as well as internalizing/externalizing behaviors at 24 months after birth.

Mejdoubi et al. (2013; 2014; 2015) reported that, compared to control group participants, intervention group participants had significant improvements in:

Cigarette use
Psychological aggression
Physical assault
Sexual coercion
Injury
Combined forms of intimate partner violence
Agency reports of child maltreatment (36 months)
Quality of the home environment (24 months)
Child internalizing behaviors (24 months)

Evaluation Methodology

Design:

Recruitment/Sample size: First, midwives, general practitioners, gynecologists, and others actively recruited women in 20 municipalities in the Netherlands who met several inclusion criteria: (1) maximum age of 25 years, (2) low educational level (pre-vocational secondary education), (3) maximum 28 weeks of gestation, (4) no previous live birth, and (5) some understanding of the Dutch language. Second, nurses interviewed women to ensure they had at least one additional risk factor (being single, a history or present situation of domestic violence, psychosocial symptoms, unwanted pregnancy, financial problems, housing difficulties, no employment and/or education, alcohol and/or drug use). In some cases, potential participants who did not meet all of the inclusion criteria but had multiple risk factors were still included (N = 77, 16.7%). The total number of recruited participants was 460.

Study type/Randomization/Intervention: All 460 eligible women were randomized into the control or intervention group after stratification by region and ethnicity. The 223 women in the control group received usual health care during pregnancy, while the 237 women in the intervention group were also offered 10 nurse home visits during pregnancy, 20 during the first year of the child's life, and 20 during the second year of the child's life.

Assessment/Attrition: Subjects were interviewed three times in their homes: pretest (16-28 weeks of pregnancy), interim (32 weeks of pregnancy), and posttest (2 years after the birth of the child). Of the 460 subjects at baseline, 111 (24.1%) were not interviewed at 32 weeks because they were lost to follow-up, declined, moved outside of the region, or could not be interviewed because of study start-up problems, and 194 (42.2%) were not interviewed at the 24-month follow-up for the same reasons. However, the study also states that the Conflict Tactics Scale was completed by still fewer subjects: 1) in the control group, by 110 (49%) at 32 weeks and 74 (33%) at 24 months, and 2) in the intervention group, by 156 (66%) at 32 weeks and 110 (46%) at 24 months.
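
The percentages in this paragraph follow directly from the reported counts and the group sizes given under randomization (223 control, 237 intervention). A short check, using only those figures:

```python
def pct(count, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)

total = 460
group_size = {"control": 223, "intervention": 237}

# Subjects not interviewed at each wave.
not_interviewed = {"32 weeks": 111, "24 months": 194}
overall_attrition = {wave: pct(n, total) for wave, n in not_interviewed.items()}

# Conflict Tactics Scale completers by group and wave.
cts_completers = {
    "control": {"32 weeks": 110, "24 months": 74},
    "intervention": {"32 weeks": 156, "24 months": 110},
}
cts_completion = {
    group: {wave: pct(n, group_size[group]) for wave, n in waves.items()}
    for group, waves in cts_completers.items()
}

# Non-completion at 24 months is the complement of completion.
cts_noncompletion_24m = {
    group: round(100 - cts_completion[group]["24 months"], 1)
    for group in cts_completion
}
```

The complement at 24 months (roughly 67% of the control group and 54% of the intervention group not completing the scale) is what matters for differential attrition on the key outcome.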

For child maltreatment reports at 36 months after birth (Mejdoubi et al., 2015), 2 out of 10 districts declined to provide data. As a result, data were missing for 26% of the control group and 24% of the treatment group. For analyses looking at home environment and child internalizing behaviors, attrition reached as high as 58% for the control group and 45% for the intervention group at 24 months.

Sample characteristics:

The female subjects averaged about 19 years of age and were primarily Dutch (49%), with representation of Surinamese/Antillean (26%), Turkish/Moroccan (6%), and other (19%) ethnicities. The subjects had low education, and few were married or living together (18%). About 18% had been a victim of physical abuse during the last year.

Measures:

The key outcome measure, annual prevalence of intimate partner violence, comes from the revised Conflict Tactics Scale. The scale includes four components - physical assault, psychological aggression, injury, and sexual coercion - plus a combined measure. It also takes into account the severity of violence by measuring less severe (level 1) and more severe (level 2) incidents over the past year, and both perpetration and victimization in the conflict. With the four components and a combined measure, two levels of violence, and measures of perpetration and victimization, the analysis examined a total of 20 outcomes.
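
The count of 20 outcomes is the product of the three dimensions just described. A minimal enumeration, with labels chosen here for illustration rather than taken from the study's tables:

```python
from itertools import product

measures = [
    "physical assault",
    "psychological aggression",
    "injury",
    "sexual coercion",
    "combined",
]
levels = ["level 1", "level 2"]
roles = ["victimization", "perpetration"]

# Every combination of measure, severity level, and role is one outcome.
outcomes = [
    f"{role}: {level} {measure}"
    for measure, level, role in product(measures, levels, roles)
]

per_role = {role: sum(o.startswith(role) for o in outcomes) for role in roles}
```

This gives 5 x 2 x 2 = 20 outcomes in total, or 10 each for victimization and perpetration.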

Interviewers did not administer the Conflict Tactics Scale at baseline because the scale "measures IPV [intimate partner violence] during a current or most recent relationship rather than relationships in the past." Instead, they used the Abuse Assessment Screen to measure physical and sexual violence in the past.

The study presents no information on the validity and reliability of the measures. The Conflict Tactics Scale has been used extensively but also has been subject to debate over its validity. Interviews were conducted in private given the sensitive nature of the measures.

Mejdoubi et al. (2015) also examined child maltreatment in the form of reports to the Dutch equivalent of Child Protective Services. The authors noted some limitations in the validity of this measure, including the fact that abuse would be missed for participants moving across districts. Secondary outcomes included home environment and child externalizing/internalizing behaviors. Home environment was assessed by interviewers blinded to allocation with the Home Observation Measurement of the Environment (IT-HOME), which sums the results of 45 yes/no items. The internalizing behavior and externalizing behavior sub-scores of the Child Behavior Checklist 1.5 - 5 Years (CBCL/1.5-5) were used to assess child behavioral problems. Parents, who helped deliver the parenting portion of the program, provided the ratings.

Analysis:

Multivariable logistic regression analyses were performed to compare differences in dichotomous outcomes between the control and intervention groups. Multivariable linear regression analyses were used to compare continuous outcomes. The analyses used multilevel models for the longitudinal relationship between the intervention and the outcomes. The intervention effects over time were modeled using interaction terms between the condition and the time variables.

The study used complete case analysis at 32 weeks but imputed the less complete data at 24 months after birth. Although the results were also checked using last observation carried forward, those reported in the tables at 24 months came from multiple imputation.

With the use of multiple imputation at 24 months, the analysis complied with the intent-to-treat principle.

Mejdoubi et al. (2015) used Poisson regression for the maltreatment reports and internalizing/externalizing outcomes as well as both linear regression and mixed models for the home environment outcome. Analyses included potential confounders such as region, age, ethnicity, and child gender as covariates. Multiple imputation was used to address missing data in the internalizing/externalizing outcomes.

Outcomes

Implementation fidelity:

Women in the program were offered 10 visits during pregnancy, and the majority of the subjects received 6-13 home visits.

Baseline Equivalence:

The study states "no significant differences in demographic characteristics or in the number of risk factors between the control and intervention groups were found at baseline."

Differential attrition:

The study states that participants who were lost to follow-up did not differ significantly from participants who remained in the study with regard to demographic characteristics and risk factors. However, attrition reached 42% at the 2-year posttest, and non-completion of the Conflict Tactics Scale at 24 months was greater in the control group (67%) than in the intervention group (54%). In Table 2, Mejdoubi et al. (2015) compared the means for completers and non-completers within both the intervention and control groups. They stated that the comparisons showed no significant differences.

Interim and Post-test: For victimization at 32 weeks, intervention subjects showed significantly lower outcomes on 5 of 10 measures: level 2 psychological aggression, level 1 physical assault, level 2 physical assault, level 1 sexual coercion, and 2 or more forms of intimate partner violence. For victimization at posttest (24 months post-birth), intervention subjects showed significantly lower odds ratios on 1 of 10 measures: level 1 physical assault. Odds ratios for significant effects ranged from .38 (medium-large) to .57 (small-medium). Averaged over all time points, the multilevel logistic regressions revealed significantly greater reductions in the intervention group for two outcomes: level 2 psychological aggression and level 1 physical assault.
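
For readers less familiar with odds ratios, the sketch below shows how an unadjusted odds ratio is computed from group counts. The counts are hypothetical and chosen only for easy arithmetic; the odds ratios reported in the study came from multivariable logistic regression and would not be reproduced by this simple calculation.

```python
def odds_ratio(events_a, total_a, events_b, total_b):
    """Unadjusted odds ratio of group A relative to group B."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b

# Hypothetical counts: 10 of 100 intervention women and 20 of 100 control
# women report an incident (illustrative only, not study data).
example_or = odds_ratio(10, 100, 20, 100)
```

Here the odds are 10/90 in the intervention group and 20/80 in the control group, so the odds ratio is about 0.44. Values below 1 indicate lower odds in the intervention group, and values farther from 1 indicate larger effects.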

For perpetration at 32 weeks, intervention subjects showed significantly lower outcomes on 5 of 10 measures: level 2 psychological aggression, level 1 physical assault, level 1 injury, combined forms of intimate partner violence, and 2 or more forms of intimate partner violence. Odds ratios for significant effects fell in the small-medium range (.53 to .57). For perpetration at posttest (24 months post-birth), intervention subjects showed significantly lower odds ratios on 2 of 10 measures: level 1 sexual coercion (OR = .10) and combined forms of intimate partner violence. Multilevel logistic regression analyses showed that only one outcome, level 1 physical assault, declined significantly more over all time points among the intervention group than the control group.

For women reporting both victimization and perpetration at 32 weeks, intervention subjects showed significantly lower level 2 psychological aggression and level 1 physical assault. At 24 months after birth, intervention subjects showed significantly lower level 1 physical assault.

Mejdoubi et al., 2014
Fewer women in the intervention group smoked during pregnancy and after the birth, and they smoked fewer cigarettes per day after the birth and fewer cigarettes in the presence of the baby. More intervention women breastfed their child at six months. There were no effects on pregnancy outcomes, such as birth weight, weeks of gestation, or adverse pregnancy outcomes (low birth weight, prematurity, and small for gestational age).

Mejdoubi et al., 2015
There were significantly fewer child maltreatment reports to CPS in the treatment group than the control group through three years after birth. As compared to the control group, an independent rating of home environment quality was significantly higher for the treatment group at two years after birth. Also at two years, a measure of parent-rated child internalizing behaviors was significantly lower for the treatment group.

Long-term effects:

The study did not collect long-term follow-up data and therefore was not able to demonstrate sustained effects.

Study 5

Summary

Robling et al. (2015) evaluated the program in a randomized controlled trial in England. A total of 1,645 first-time mothers aged 19 or younger were recruited from local maternity services and randomly assigned to conditions. Primary outcomes included third-trimester smoking cessation, time until second pregnancy (through 24 months after birth), child birth weight, and child NICU admission (within 24 hours of birth).

Robling et al. (2015) reported no beneficial program main effects.

Evaluation Methodology

Design:

Recruitment: The study recruited young, first-time mothers from 18 primary and secondary local National Health Service organizations and local authorities. Women were identified and approached via local maternity services and recruited usually at their homes by locally based researchers. A total of 3,251 women were screened for eligibility to obtain 1,645 subjects. The appendix noted that the intervention women were broadly representative of the population of women who receive the local maternity services.

Assignment: The study assigned women randomly to the treatment group or the usual-care control group, with randomization stratified by site and minimized by gestation (<16 weeks vs >16 weeks), smoking, and preferred language of data collection. Of the 1,645 women randomized, 823 were assigned to the treatment group and 822 were assigned to the control group. The treatment mothers received screening, education, immunization, and support from birth to the child's second birthday from an assigned family nurse, while the control mothers received usual care from a specialist community public health nurse.
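The minimization step mentioned above can be illustrated with a generic sketch. This is not the trial's allocation software: the function name, the running counts, and the purely random tie-break are assumptions made for illustration, and real systems often add a further random element. Within one site, the sketch sends a new participant to the arm that already holds fewer women who share her gestation, smoking, and language categories.

```python
import random

def minimization_arm(counts, participant, rng=random):
    """Pick the arm that minimizes marginal imbalance for a new participant.

    counts: {arm: {(factor, level): n_already_assigned}}
    participant: {factor: level}
    Ties are broken at random (an assumption of this sketch).
    """
    totals = {}
    for arm, arm_counts in counts.items():
        # Sum, over the participant's factor levels, how many similar
        # participants this arm already holds.
        totals[arm] = sum(arm_counts.get((factor, level), 0)
                          for factor, level in participant.items())
    best = min(totals.values())
    candidates = sorted(arm for arm, total in totals.items() if total == best)
    return rng.choice(candidates)

# Hypothetical running counts within a single site (not study data).
counts = {
    "treatment": {("gestation", "<16wk"): 5, ("smoker", "yes"): 4,
                  ("language", "English"): 9},
    "control": {("gestation", "<16wk"): 3, ("smoker", "yes"): 4,
                ("language", "English"): 8},
}
new_woman = {"gestation": "<16wk", "smoker": "yes", "language": "English"}
print(minimization_arm(counts, new_woman))
```

In this toy case the control arm holds fewer similar women, so the new participant is allocated there.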

Attrition: Assessments of primary outcomes occurred at baseline (<25 weeks gestation), shortly before birth (35-36 weeks gestation), after the birth of the child, and 24 months after the birth of the child. Of the 1,645 randomized women, 66% completed smoking measures before the birth of the child, 92% had data on child birth weight, 90% had data on emergency room visits after birth of the child, and 78% had data on another pregnancy 24 months after the birth of the child. Reasons for exclusion included missing assessments or withdrawal of consent.

Sample: The mean age of mothers in the sample was 17.3 years. The sample comprised 82.9% white, 4.9% black, 1.4% Asian, 5% mixed, and 1.2% other mothers (4.6% of mothers were missing data on race/ethnicity). The study did not provide information on SES of mothers, but the sample was intended to be nationally representative of teenage mothers.

Measures: The study used a combination of mother reports, medical records, and urinalysis. Smoking late in pregnancy was assessed with a combination of urinalysis and mother reports. Birth weight, second pregnancy within 24 months after birth, and emergency room visits and admissions were assessed using mother reports and medical records. Mother reports were obtained by researchers unaware of condition.

The study also used mother reports for 63 secondary outcomes, including child language and cognitive development.

Analysis: The study used mixed-effects three-level regression models to adjust for site as a stratification variable and to allow for clustering by the family nurse in the intervention group. In addition, the other pretest minimization variables (gestation, smoking, and preferred language) were used as covariates. Where appropriate, baseline measures were included as covariates, and repeated measures models with group-by-time terms were used for those assessments that were taken multiple times.

Intent-to-Treat: Analyses included all available data without imputation. The study attempted to follow all subjects (page 3, "Women who wanted to discontinue the intervention were offered the opportunity to still provide follow-up data").

Outcomes

Implementation Fidelity: The appendix contains detailed information on attainment of the fidelity goals of the program and suggests that the "intervention has been delivered as intended." The mean number of visits by community midwives was 10.4 for the treatment group and 10.68 for the control group, and for health visitors was 16.25 for the treatment group and 8.6 for the control group. The treatment group received 39.28 visits on average from their family nurse of a possible 64 maximum visits (the control group received on average .45 visits due to enrolment in error).

Baseline Equivalence: The study stated that sociodemographic and outcome measures were well balanced between groups (Table 1), but it did not report significance tests.

Differential Attrition: The appendix compared differential attrition between the two conditions. It summarized the findings by stating that "Overall this may suggest that women who are in a significant relationship and who are more vulnerable are more likely to disengage from the trial if they are allocated to FNP."

Posttest: The study found no significant effect of the program on late pregnancy smoking, second pregnancy within 24 months, or birth weight. Control group participants were significantly less likely to report ER visits or admissions (OR 1.32, p=.03).

The study reported small positive effects on 7 of 63 secondary outcomes: intention to breastfeed, maternally reported child cognitive development (only at 24 months), maternally reported language development (only at 24 months), language development measured with a standardized assessment (Early Language Milestone at 24 months), levels of social support, partner-relationship quality, and general self-efficacy.

Long-Term: The measures of emergency room visits and subsequent pregnancy may be considered long-term outcomes, but they did not improve significantly more for the intervention group.

Study 6

Summary

Sierau et al. (2016) evaluated the program in a randomized controlled trial of 755 low-SES mothers in Germany. Primary outcomes included family environment, parenting skills, and child development with assessments occurring at baseline (<28 weeks gestation) and at 6, 12, and 24 months after birth.

Sierau et al. (2016) reported no significant main effects.

Evaluation Methodology

Design:

Recruitment: The study recruited low-income, first-time mothers between their 12th and 28th week of pregnancy who reported at least one economic risk factor and at least one social risk factor. Participants were recruited from gynecologists, job centers, and youth welfare offices, in addition to self-referrals. A total of 755 women volunteered to participate.

Assignment: The study randomly assigned women to either the treatment or control groups after stratifying the sample by implementation site, age group, and maternal nationality. A total of 394 women were assigned to the treatment group and 361 to the control group. Both groups received information about existing health or social services, repayment for travel expenses to preventive medical check-ups, reimbursement for regular research attendance, and feedback about the children's developmental status. Only the treatment group received home visits from a social worker or midwife.

Attrition: Assessments of primary outcomes occurred at baseline (<28 weeks gestation), shortly before birth (35-36 weeks gestation), 6 months after the birth of the child, 12 months after the birth of the child, and 24 months after the birth of the child (at the end of the program). Of the 755 randomized women, approximately 70% completed the 36-week assessment, 67% completed the 6-month assessment, 57% completed the 12-month assessment, and 54% completed the 24-month assessment. Reasons for leaving the program included relocation, refusing services, or loss of contact.

Sample: The mean age of mothers in the sample was 21 years, and a majority were unmarried (85-89%) and born in Germany (84-89%). Approximately half had less than a high school diploma (49-54%), and a majority of the mothers were considered low income (80-82%).

Measures: The study included three main outcomes: family environment, maternal competencies, and child development. Child development was measured with seven outcomes. Mental development, psychomotor development, and child behavior were measured with the Bayley Scales of Infant Development-II (ICC=.62-.88), which appears to be measured with a researcher-administered scale. Language development was rated by mothers using parent questionnaire ELFRA1 and 2 (alpha=.91-.99) and by a standardized language development test for 2-year-old children, SETK-2 (alpha=.95). Internal and external socio-emotional development were measured with the child behavior checklist CBCL (alpha>.86). Most measures were taken at all time periods after birth, but language and socio-emotional development were only measured at 24 months.

The five family environment measures included maternal stress (alpha=.70-.71), partnership satisfaction (alpha=.88-.89), social support (alpha=.91-.93), birth of additional children, and educational achievement. Measures for maternal stress, partnership satisfaction, and social support were externally developed Likert scales, while additional children and educational achievement were dichotomous reports.

The eight measures of maternal competencies were parental self-efficacy (alpha=.90-.84), knowledge of child rearing (alpha=.77-.74), feelings of attachment (alpha=.73-.79), parenting style (.69-.71), mother-child affectivity (ICC=.69-.71), mother-child responsiveness (ICC=.62-.67), maternal empathy (alpha=.69-.71), and belief of control (alpha=.66-.72). Only social support, parental self-efficacy, knowledge of child rearing, and feelings of attachment were measured at baseline. Other measures were started after birth or later in the study. Most of the measures included are mother reports. Mother-child affectivity and responsiveness, in addition to child behavior, were coded by blind reviewers using videotapes.

Analysis: The study used generalized estimating equation models, which cluster data within persons for repeated measures. Whenever possible, the models accounted for baseline values of the outcome variables. The study reported within condition changes, but the results reported here focus on group-by-time interactions.

Intent-to-Treat: All available data were included in the analysis. The study stated that generalized estimating equations allowed for the use of subjects with partial data. However, the high attrition substantially reduced the size of the original sample.

Outcomes

Implementation Fidelity: The study detailed efforts to train for fidelity. The average number of visits per household was 32.7 and ranged from 0 to 94, of an expected 59 visits.

Baseline Equivalence: The study reported only one significant difference between conditions at baseline in either demographic or outcome measures. Control group mothers were significantly more likely to have a psychiatric disorder, which was controlled in the analysis.

Differential Attrition: Program dropouts were younger, on average, had lower income, and experienced foster care placement. The study did not analyze completion status with condition-by-outcome measures.

Posttest: The study found no significant effects on any of the seven child development outcomes. Of five measures of the family environment, only maternal stress was marginally significant (p=.074). Of eight measures of maternal competencies, only parental self-efficacy (p=.063) and feelings of attachment (p=.062) were marginally significant.

Long-Term: The study did not conduct long-term follow-up.

Study 7

Carabin et al. (2005): This study evaluated the Oklahoma Children First program, which was implemented using the Nurse-Family Partnership model. The authors noted that the program was delivered by a network of nurses and that the frequency of visits was consistent with the model.

Summary

Carabin et al. (2005) evaluated the program in a retrospective cohort quasi-experimental design (QED), gathering data on a total of 64,335 women in Oklahoma who delivered their first child between 1 January 1998 and 31 December 2001. Primary outcomes included gestational age, birth weight, and mortality within 12 months of birth.

Carabin et al. (2005) reported no significant beneficial main effects.

Evaluation Methodology

Design:

Recruitment: This QED study used a retrospective cohort design. The initial sample came from program enrollees in the state of Oklahoma beginning in 1997. The enrollment criteria for the program included, but were not restricted to, pregnant women who were at less than 28 weeks gestation, had not previously delivered a live-born infant, and had little financial or social support. Some late entries (about 10%), many with a high-risk pregnancy, started the program after 28 weeks gestation. Most program participants were referred by other public health programs and family planning clinics. There were 8,598 eligible program participants with linked birth certificate data.

Assignment: The retrospective design relied on non-random assignment to select a comparison group. With selection based on birth certificate data, the comparison group included any first-time mother residing in Oklahoma at the time of delivery who potentially could have been enrolled but did not enroll. Both the treatment and comparison groups were limited to mothers with a date of delivery between 1 January 1998 and 31 December 2001 and aged 35 or less. All women with available data were matched on the estimated date of conception so that the two conditions had the same distribution of gestational ages at enrollment. The selection process did not match on other characteristics such as maternal age or socio-economic status; rather, the analysis controlled for these differences. The final sample of 64,335 included 55,737 women in the comparison group along with 8,598 women in the intervention group.

Assessments/Attrition: The observation period lasted from birth to one year after birth - while the program was ongoing. With the retrospective selection of the sample, there was no attrition. Only one measure had substantial missing data - tobacco use during pregnancy (about 14%).

Sample:

The sample was largely young (under age 25) and about a quarter did not complete high school.

Measures:

Birth certificate data were used to measure four pregnancy outcomes: two for gestational age (preterm versus non-preterm and very preterm versus preterm and normal) and two for birth weight (low versus normal and very low versus low and normal). Linked death certificate data were used to measure infant mortality over the year after birth (no versus yes).

Analysis:

The analysis estimated Bayesian hierarchical logistic models, and the approach initially adjusted for potential geographical and temporal (calendar year) clustering. However, tests showed that adjusting for clustering did not improve the model fit. The results did not present main effects but examined the effects separately among groups defined by the combination of marital status and prior pregnancy loss/pregnancy risk.

The models controlled for participation in the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC), mother's age, mother's education level, mother's race, tobacco use during pregnancy, and a history of fetal death or the presence of pregnancy risk factors. The model for infant mortality also controlled for birthweight categories (normal birth weight, low birth weight, and very low birth weight).

Intent-to-Treat: Multiple imputation methods accounted for missing data.

Outcomes

Implementation Fidelity:

Not examined.

Baseline Equivalence:

Table 1 on baseline characteristics shows large differences between the conditions, though without significance tests. Most notable were differences in marital status (61% single in the intervention group, 41% in the comparison group), age (23% under 18 in the intervention group and 11% in the comparison group), education (60% with 12 years of school or more in the intervention group and 76% in the comparison group), and WIC participation (58% for the intervention group and 32% for the comparison group). Outcomes were not available for testing, but the presence of pregnancy risk factors was nearly identical for the two conditions.

Differential Attrition: No attrition and mostly low missing data.

Posttest:

The study did not present main effects. Table 2 lists program effects separately by marital status and pregnancy risk groups. Across 14 tests, there were three significant beneficial program effects (fewer very preterm births for non-married women, fewer low birthweight births for non-married women with no pregnancy risks, and fewer infant deaths for non-married women) and two significant iatrogenic effects (more preterm births for married women with pregnancy risks and more low birthweight births for married women with pregnancy risks). The authors concluded that the program was associated with reduced risks of adverse pregnancy outcomes for single women but not for married women.

Long-Term:

Not examined.

Study 8

Four QEDs examined outcomes of the full statewide program implementation in Pennsylvania. Rubin et al. (2011) and Matone et al. (2012a) examined the full sample of program participants, while Yun et al. (2014) examined the program for the subgroup of Latina women and Matone et al. (2012b) examined the subgroup of women who smoked during the first trimester of pregnancy.

Summary

Rubin et al. (2011) evaluated the program in a retrospective cohort QED with propensity score matching, collecting data on a total of 14,782 new mothers in Pennsylvania delivering their first child between 1 January 2000 and 31 December 2005. The primary outcome was time until the second pregnancy up to 18 months after the first birth. Matone et al. (2012a) added one site and changed the period of observation to between January 2003 and December 2009, for a total of 21,720 subjects. Primary outcomes included deaths and child injury episodes for 24 months after birth. Matone et al. (2012b) examined a tobacco-using subgroup (n = 6,429) with a primary outcome of smoking cessation in the last trimester of pregnancy. Yun et al. (2014) examined a subgroup of Latina women (n = 4,385) who gave birth to their first child between January 2003 and December 2007, with a primary outcome of time until the second pregnancy up to 18 months after the first birth.

Matone et al. (2012a) reported some iatrogenic effects on child emergency room visits, but Rubin et al. (2011), Matone et al. (2012b), and Yun et al. (2014) found that, compared to comparison group mothers, intervention group mothers showed significant improvements in:

Time to second pregnancy (later program participants only)

Smoking cessation (later program participants only)

Evaluation Methodology

Design:

Recruitment: Rubin et al. (2011) examined program participants from 23 disparate rural and urban sites across Pennsylvania during the period from January 1, 2000, to December 31, 2007. All had delivered a first-born singleton infant between January 1, 2000, and December 31, 2005, and received welfare assistance from the Commonwealth of Pennsylvania within 12 months prior to the infant's birth. Of the total of 5,731 program participants in the 23 sites over the study period, 4,468 (78%) had vital statistics records for their infants, were singleton first pregnancies, and were welfare eligible in the year preceding the infant's birth.

Matone et al. (2012a) examined one additional site, 24 in total, changed the program enrollment period from January 1, 2003, to December 31, 2009, and excluded mothers with first-born children at medical risk and without linked Medicaid data. Matone et al. (2012b) examined 24 sites and the 2003-2007 period but limited the sample to women who reported on the birth certificate that they smoked cigarettes in the first trimester of their pregnancy.

Yun et al. (2014) limited the sample to Latina mothers and 15 sites that served at least 15 Latina women. The period of program enrollment lasted from January 1, 2003, to December 31, 2007.

Assignment: The QED study used a retrospective cohort design and propensity-score matching. Intervention participants were matched to nonparticipating controls who also delivered a first-born infant during the same periods, were welfare enrolled within the 12 months prior to the infant's birth, and were from the same program service regions. Rubin et al. (2011) matched 3,844 in the intervention group to 10,938 in the comparison group (N = 14,782). Matone et al. (2012a) matched 5,016 in the intervention group to 16,704 in the comparison group (N = 21,720).

In the sample of Latina mothers, Yun et al. (2014) matched 1,000 in the intervention group to 3,385 in the comparison group (N = 4,385). In the sample of mothers who smoked during pregnancy, Matone et al. (2012b) matched 1,552 in the intervention group to 4,877 in the comparison group (N = 6,429).

The propensity-score matching used all or most of the following predictors: maternal education, maternal age, maternal race, marital status, history of smoking during pregnancy, type of welfare program, history of having received food stamps, history of pregnancy-induced hypertension, history of gestational diabetes, and whether the mother resided in an urban or rural location. A variable for residing in a high-density zip code was used to capture women in high-penetration neighborhoods. The matching was stratified within period categories. A nearest neighbor within a caliper of 0.05 who was also within the same age category as a program participant was considered a match (up to a maximum of four matched controls per client).
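The matching rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: it assumes propensity scores have already been estimated, that the caliper is expressed in raw score units, and that each control is used at most once. The function name and the toy records are hypothetical.

```python
def match_within_caliper(clients, controls, caliper=0.05, max_matches=4):
    """Greedy nearest-neighbor matching on a precomputed propensity score.

    clients, controls: lists of (id, propensity_score, age_category) tuples.
    Returns {client_id: [control_id, ...]}.
    Assumption: matching without replacement (each control used once).
    """
    available = list(controls)
    matches = {}
    for client_id, client_score, client_age in clients:
        # Candidates share the age category and fall within the caliper.
        candidates = [c for c in available
                      if c[2] == client_age
                      and abs(c[1] - client_score) <= caliper]
        # Closest scores first; keep at most max_matches controls.
        candidates.sort(key=lambda c: abs(c[1] - client_score))
        chosen = candidates[:max_matches]
        for c in chosen:
            available.remove(c)
        matches[client_id] = [c[0] for c in chosen]
    return matches

# Hypothetical toy data: (id, propensity score, age category).
clients = [("T1", 0.30, "<18"), ("T2", 0.62, "18-24")]
controls = [("C1", 0.31, "<18"), ("C2", 0.34, "<18"), ("C3", 0.40, "<18"),
            ("C4", 0.60, "18-24"), ("C5", 0.29, "18-24")]
print(match_within_caliper(clients, controls))
```

Here C3 falls outside the caliper for T1, and C5 is within the caliper of T1 but is excluded because it belongs to a different age category.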

Assessments/Attrition: Rubin et al. (2011) and Yun et al. (2014) followed study participants for 15-18 months after the first birth. Matone et al. (2012a) followed study participants for two years after the first birth. Matone et al. (2012b) followed participants through the end of the first pregnancy. All assessment periods fell within the two-year program. With the retrospective design that matched only those having complete data, there was no attrition.

Sample:

The majority of study subjects were white, young, from urban environments, or with less than high school education. These characteristics differed considerably from other women giving birth across Pennsylvania during the period, even compared with all welfare-eligible births.

The sample of Latina women was mostly of Puerto Rican origin (67%) but also included women of Mexican (13%) and other (20%) origins. About 53% were 18 and younger and about 60-61% had not completed high school. Only 11% were married, and nearly all lived in urban areas.

The sample of women who smoked was 91% unmarried, 80% urban, 11-25% black, 30-44% under age 18, and 45-50% with less than high school educational attainment.

Measures:

The primary outcome in Rubin et al. (2011) and Yun et al. (2014) was the length of time to conceiving a second pregnancy within 15-18 months post-partum of the first pregnancy. The time frame roughly translated into an interval that has been shown to increase the risk for adverse pregnancy outcomes. The variable was calculated from birth certificate data by subtracting the second infant's gestational age in weeks at delivery from his or her birth date and then calculating the remaining interval from the first infant's birth date.
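The interval calculation described above amounts to simple date arithmetic. The sketch below is an illustration only; the function name and the example dates are hypothetical and are not taken from the studies.

```python
from datetime import date, timedelta

def interpregnancy_interval_days(first_birth, second_birth,
                                 second_gestation_weeks):
    """Days from the first infant's birth to the estimated second conception."""
    # Estimated conception: second birth date minus gestational age at delivery.
    conception = second_birth - timedelta(weeks=second_gestation_weeks)
    # Remaining interval, measured from the first infant's birth date.
    return (conception - first_birth).days

# Hypothetical example dates (not study data).
days = interpregnancy_interval_days(date(2004, 3, 1), date(2006, 1, 10), 39)
print(days, "days, or about", round(days / 30.44, 1), "months")
```

The resulting number of days can then be compared against the 15-18 month follow-up window used as the outcome threshold.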

The primary outcome in Matone et al. (2012a) was a count of child injury episodes in the first two years of life. Birth certificate data were linked to Medicaid claims files with data on injury claims from emergency department visits and hospitalizations. Deaths due to injury were coded as a hospitalization under the assumption that hospitalization for the fatal injury would have occurred had the child survived.

The primary outcome in Matone et al. (2012b) was a binary measure of smoking cessation, or the self-reported use of zero cigarettes, during the last three months of pregnancy as recorded on birth certificates. There was no discussion of the validity of this measure.

Analysis:

The survival analysis for time to second pregnancy (Rubin et al., 2011; Yun et al., 2014) used Kaplan-Meier curves and Cox proportional hazards models for the time from the first infant's birth to a second pregnancy. Clustering by site did not change the results. Some analyses compared results across periods, testing the hypothesis that program effectiveness may be delayed for several years following the wide implementation of a community-based prevention program.
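For readers unfamiliar with the Kaplan-Meier step, the following sketch computes the estimated probability of not yet having conceived a second pregnancy at each observed event time. It is a plain-Python illustration with hypothetical toy data; it does not reproduce the studies' software and does not cover the Cox proportional hazards models.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times: follow-up time for each woman (e.g., months since first birth).
    events: 1 if a second pregnancy was observed at that time, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        # Count events and all observations sharing this time point.
        n_events = 0
        n_at_time = 0
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_at_time += 1
            i += 1
        if n_events > 0:
            # Multiply by the conditional probability of no event at time t.
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored cases both leave the risk set.
        at_risk -= n_at_time
    return curve

# Hypothetical toy data: months to second pregnancy or to end of follow-up.
times = [6, 9, 9, 12, 18, 18]
events = [1, 1, 0, 1, 0, 0]
for month, prob in kaplan_meier(times, events):
    print(month, round(prob, 3))
```

Women censored at 18 months contribute to the risk set up to that point but never lower the curve, which is how the method handles incomplete follow-up.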

The analysis of child injuries (Matone et al., 2012a) used generalized linear models with a Poisson distribution, robust variance estimates, and a control for non-injury emergency department visits to adjust for overall differences in health care utilization. Matone et al. (2012b) used multivariable logistic regression with a control for the baseline county smoking rate.

Intent-to-Treat: The analyses used all matched subjects.

Outcomes

Implementation Fidelity:

Not examined.

Baseline Equivalence:

Tables in Rubin et al. (2011) and Yun et al. (2014) list the means of the matched conditions for eight sociodemographic measures, none of which related directly to pregnancy or childbirth. Although the tables lack significance tests, the means appear quite similar. Yun et al. (2014) stated (p. S155) that the samples have "balance (< 5% weighted difference between groups) across covariates both within each NFP service area and in aggregate."

Table 1 in Matone et al. (2012a) includes eight baseline measures that appear similar across conditions, although the table lacks statistical tests and baseline outcomes. Table 1 in Matone et al. (2012b) includes eight baseline measures, likewise without statistical tests or baseline outcomes, and shows some differences across conditions. The intervention group was 25% black versus 11% in the comparison group, and 44% of the intervention group was under age 19 versus 30% of the comparison group.

Differential Attrition:

No attrition.

Posttest:

For the full sample (Rubin et al., 2011), the intervention group did not differ significantly from the comparison group in the early period (2000-2003) but had a significantly longer time to a second pregnancy (HR = .87) in the later period (2004-2005). For the early period, all eight tests for program effects within subgroups failed to reach statistical significance. For the later period, four of the eight subgroup program effects were significant (rural mothers under age 18 showed the largest effect, HR = .40).

Also for the full sample (Matone et al., 2012a), the intervention group had significantly more visits to emergency rooms for injuries and significantly more children with visits to emergency rooms for injuries than the comparison group. The same iatrogenic results emerged for emergency room visits for superficial injuries, non-injuries, injuries leading to hospitalization, and injuries from motor vehicle accidents. The authors noted that their results did not confirm previous studies that found reductions in emergency room utilization for injuries.

For the sample of smokers (Matone et al., 2012b), the intervention group had a significantly higher rate of smoking cessation in late pregnancy than the comparison group (OR = 1.26). When examined separately by implementation period, the program had a significant effect in the later period (2006-2007) but not the early period (2003-2005).

For the Latina sample (Yun et al., 2014), the intervention group had a significantly lower risk than the comparison group of a very short inter-pregnancy interval (HR = .60) and of a short inter-pregnancy interval (HR = .86). Subgroup analyses found significant effects on a very short interval for Mexican, Puerto Rican, adolescent, and adult mothers and significant effects on a short interval for adolescent mothers.

Long-Term:

Not examined.

Study 9

Matone et al. (2018) evaluated two other programs besides Nurse-Family Partnership that are not included here.

Summary

Matone et al. (2018) evaluated the program in a retrospective cohort QED with entropy balancing matching, collecting data on 173,769 women who gave birth between 2008 and 2014. The primary outcome was documentation of an abuse episode or high-risk injury episode for 24 months after birth.

Matone et al. (2018) reported no beneficial program main effects.

Evaluation Methodology

Design:

Recruitment: Intervention participants were enrolled in 22 Pennsylvania Nurse-Family Partnership programs from 2008 to 2014. Inclusion criteria for participants were as follows: (1) program enrollment was identifiable in birth certificate files and (2) enrollment in the state medical assistance program (Medicaid).

Assignment: The quasi-experimental retrospective design selected study participants after the study period. To define a comparison group, intervention participants were matched to local-area women not participating in the program (and not participating in either of the two other programs evaluated in the study). The comparison women were also enrolled in the State Medicaid program, had similarly aged children identified in birth certificate files, and resided in the same local implementing agency area (i.e., county or multi-county service area). The matched cohort included 8,736 women in the intervention group and 165,033 women in the comparison group (total N = 173,769).

The matching used entropy balancing rather than propensity-score matching because young mothers in rural areas would have been disproportionately dropped in attempts to find matches. Entropy balancing retained all cases by reweighting the comparison group to have the same covariate distribution as the intervention group. The weighting was done within local implementing agency areas to address possible confounding by geography (i.e., the outcomes might vary across sites at a community level beyond maternal-level characteristics). The balancing used maternal sociodemographic and clinical characteristics from birth certificate, welfare, and Medicaid data: mother's age at birth, race/ethnicity, maternal education, gestational age, smoking prior to pregnancy, receipt of welfare prior to or during the first trimester of pregnancy, Medicaid eligibility, maternal diagnosis of substance abuse, depression and/or bipolar disorder in the immediate preconception period or first trimester of pregnancy.
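The reweighting idea can be illustrated with a deliberately simplified sketch. It balances a single covariate mean, starts from uniform base weights, and ignores the within-area weighting the study used; under those assumptions the entropy-balancing weights are proportional to exp(lam * x), and lam can be found by bisection. The function name and toy ages are hypothetical, and this is not the study's implementation.

```python
import math

def entropy_balance_weights(control_x, target_mean, tol=1e-10):
    """Weights for comparison cases whose weighted mean of x equals target_mean.

    Simplifying assumptions: one covariate, uniform base weights.
    The solution has the form w_i proportional to exp(lam * x_i).
    """
    def weights(lam):
        # Subtract the largest exponent for numerical stability.
        shift = max(lam * x for x in control_x)
        raw = [math.exp(lam * x - shift) for x in control_x]
        total = sum(raw)
        return [r / total for r in raw]

    def weighted_mean(lam):
        return sum(w * x for w, x in zip(weights(lam), control_x))

    # The weighted mean increases with lam, so bisection applies.
    lo, hi = -50.0, 50.0  # search bracket, assumed wide enough for toy data
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if weighted_mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return weights((lo + hi) / 2)

# Hypothetical toy data: maternal ages in the comparison group, and the
# intervention-group mean age that the weights should reproduce.
comparison_ages = [17, 19, 22, 25, 30, 34]
w = entropy_balance_weights(comparison_ages, target_mean=20.0)
balanced_mean = sum(wi * a for wi, a in zip(w, comparison_ages))
print(round(balanced_mean, 4))
```

Every comparison case keeps a positive weight, which reflects the property noted above: no cases are dropped, and younger comparison mothers simply count for more.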

Assessments/Attrition: The women and their children were followed retrospectively for 24 months after the birth of the child, a period in which the program was ongoing. There was no attrition with the retrospective design.

Sample:

The majority of clients were non-Hispanic white. About 21% were under age 18, 90% were unmarried, 38% had less than a high school degree, and 49% received food stamps.

Measures:

Outcome measures were derived from child Medicaid claims and codes indicating abuse or a suspicious high-risk injury. The primary outcome was the presence of an abuse episode or high-risk injury episode, and the secondary outcome was the presence of any injury episode.

Analysis:

The study used a weighted logistic regression with a random intercept for county to control for the variability across counties. Other controls were already used in the balancing weights.

Intent-to-Treat: The analysis used all matched cases, with those lacking data excluded in the matching process.

Outcomes

Implementation Fidelity:

A qualitative component of the study explored possible reasons why the program was associated with more recorded instances of abuse.

Baseline Equivalence:

Although there were no significance tests and no outcomes available, weighted means by condition showed close correspondence for sociodemographic measures.

Differential Attrition:

No attrition.

Posttest:

Across all models, children of intervention group mothers were significantly more likely to experience an abuse episode than children of comparison mothers (OR = 1.32). The rates were low: 1.4% in the intervention group and 1.0% in the comparison group sustained an abuse injury within the first 24 months of life. Sensitivity analyses demonstrated a lack of confounding by Child Protective Services involvement (Online Appendix D) or maternal intimate partner violence exposure (Online Appendix E).
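To relate the two reported rates to an odds ratio, the following Python sketch computes a crude (unweighted, unadjusted) odds ratio from the rounded percentages. It is an illustration only: the reported OR of 1.32 comes from the weighted model, so the crude value from rounded rates (about 1.4) is not expected to reproduce it exactly.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

# Rounded rates as reported in the text above.
rate_intervention = 0.014
rate_comparison = 0.010

# Crude odds ratio and absolute risk difference from the rounded rates.
crude_or = odds(rate_intervention) / odds(rate_comparison)
risk_difference = rate_intervention - rate_comparison

print(round(crude_or, 2), round(risk_difference, 3))
```

Because the outcome is rare, the odds ratio is close to the ratio of the two rates, while the absolute difference is only 0.4 percentage points.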

Long-Term:

Not examined.

Study 10

Summary

Nguyen et al. (2003) evaluated the program in a randomized controlled trial of 225 Hispanic adolescents (<20 years) who were pregnant with their first child. Primary outcomes were maternal weight gain and child gestational age and birth weight, all measured only at the time of birth.

Nguyen et al. (2003) reported that, compared to the control group, intervention group participants had significantly:

Fewer preterm births
Higher birthweights

Evaluation Methodology

Design:

Recruitment: The participants included pregnant Hispanic adolescents in Orange County, California, who were referred to the study by physicians, community clinics, schools, social services agencies, probation departments, pregnancy testing clinics, juvenile health services, and the food program for women, infants, and children. To participate in the study, participants needed to be on or eligible for Medi-Cal, at less than 28 weeks gestation, younger than 20 years old, and pregnant with their first child. A total sample of 225 Hispanic adolescents enrolled in the study.

Assignment: The study randomly assigned participants to the intervention group (N = 104) or control group (N = 121). The control group received traditional Public Health Field Nursing services.

Assessments/Attrition: Assessment came at the time of the child's birth. Information on birth outcomes was available for 152 (67.6%) of the randomized sample.

Sample:

The median number of years of education completed for the sample was about 10, with over half (55%) still enrolled in school. Only 14.2% were employed part-time or full-time. Sixty-eight percent and 69.3% of adolescents in the study reported receiving Medi-Cal and WIC, respectively. The reported median total household income range was $12,001 to $15,000.

Measures:

The outcome measures included maternal weight gain and gestational age and birth weight of the newborn. The source of data, although not stated, was likely birth certificates.

Analysis:

The study compared condition means (without controls) at follow-up but presented no significance tests.

Intent-to-Treat: The study used all available data. The most prevalent reasons for missing data were declining further participation, moving out of state/program area, and missing excessive appointments.

Outcomes

Implementation Fidelity:

Not examined.

Baseline Equivalence:

Tests for nine sociodemographic measures in Table 1 showed one significant difference for place of birth, with 61% of the control group and 44% of the intervention group born outside the U.S. Other measures not significant in the tests nonetheless showed sizable differences. For example, 50% of the control participants versus 61% of the intervention participants were enrolled in school.

Differential Attrition:

Retention rates were similar across conditions (67.8% in the control group and 67.3% in the intervention group), implying attrition of about 32% in each, but the study presented no other information and no formal tests.
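Read as the share of each condition with birth-outcome data, the two percentages can be checked against the group sizes given in the Assignment section. A minimal Python sketch follows; the implied counts are back-calculated here from the rounded percentages and are an assumption, not figures stated by the study.

```python
# Group sizes at randomization and the reported per-condition percentages.
randomized = {"control": 121, "intervention": 104}
reported_share = {"control": 0.678, "intervention": 0.673}

# Back-calculate the number of cases with birth-outcome data in each group.
implied_counts = {g: round(randomized[g] * reported_share[g]) for g in randomized}

total_with_data = sum(implied_counts.values())
overall_share = total_with_data / sum(randomized.values())

# Attrition is the complement of the share with data.
attrition = {g: 1 - implied_counts[g] / randomized[g] for g in randomized}

print(implied_counts, total_with_data, round(overall_share * 100, 1))
```

The implied counts sum to the 152 cases with birth-outcome data, which supports reading the percentages as retention, with attrition of roughly 32% to 33% per condition.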

Posttest:

The means (listed without significance tests) showed that the intervention group had fewer premature births and a higher mean birth weight than the control group.

Long-Term:

Not examined.

Study 11

This study evaluated an adapted model of the program that included Aboriginal community workers as part of the home visiting team and did not limit the program to first-time mothers.

Summary

Segal et al. (2018) evaluated an adapted version of the program in a retrospective cohort QED, collecting data on all live-born children (n = 854) of Aboriginal heritage who were born between March 2009 and December 2015 in or around Alice Springs, Australia. Primary outcomes were documentation of child abuse or neglect (reported, investigated, or substantiated) and documented days of out-of-home placement from birth until the end of the study observation period (up to 8 years after birth).

Segal et al. (2018) reported no significant main program effects.

Evaluation Methodology

Design:

Recruitment: Participants included all children live-born between March 1, 2009 (program commencement), and December 31, 2015, to women in the Australian Health service database who met the following eligibility criteria: 1) pregnant with an Aboriginal baby; 2) located during pregnancy (13 to 28 weeks gestation) in Alice Springs, Australia, or within a 100 kilometer radius; 3) no early miscarriage, termination, or abortion (<13 weeks gestation); and 4) no previous enrollment in the program.

Assignment: Two study cohorts (N = 854) were created in the retrospective design: 1) the program group of all children live-born to women who had enrolled in the program (N = 291), and 2) the control group of children of eligible women who were not referred to and had never participated in the program (N = 563).

Assessments/Attrition: The study followed participants from the time of birth (March 1, 2009, to December 31, 2015) through the end of collection of child protection outcome data on December 31, 2016. That defined a period ranging from one year to just under eight years. At least part of the sample was likely participating in the program at the follow-up assessment. The retrospective design, which was based on the selection of birth records with complete data, meant there was no attrition.

Sample:

The sample of Aboriginal mothers had a mean age of about 25 and nearly half were having their third child or more. About 80% were not employed.

Measures:

Data for the outcomes came from child protection records. The four measures of involvement with the child protection system included annualized rates of reports, investigations, and substantiations of abuse or neglect, plus annualized days of out-of-home placements. The authors noted that the surveillance of program mothers might have been greater than for control mothers and led to more contact with the child protection system.

Analysis:

The analysis examined main effects with bivariate mean comparisons. Generalized linear models included controls for mothers' socio-demographic characteristics, but they were used only for subgroup analyses.

Intent-to-Treat: The analysis used the full sample, although women who were referred to the program but declined to participate were not selected for study.

Outcomes

Implementation Fidelity:

Not examined.

Baseline Equivalence:

Table 1 showed four significant differences in sociodemographic measures in six tests, with each showing a large difference in condition means.

Differential Attrition:

No attrition.

Posttest:

For the full sample, Table 2 showed no significant differences in four tests done without multivariate controls, and a test for the difference in the cumulative incident curve of out-of-home placement was not significant. However, the text says that the intervention group had significantly fewer annualized days of out-of-home placement. The multivariate results were presented only for subgroups, not for the full sample, and demonstrated more consistent significant evidence of program benefits for younger (<21), first-time mothers.

Long-Term:

Not examined.

Study 12

This short study has only four pages of text.

Summary

Holmes & Rutledge (2012) evaluated the program in a retrospective QED using inverse propensity weighting and collecting data on children of first-time mothers born in North Carolina between 2009 and 2013 (n = 61,993). Primary outcomes were birth weight, preterm birth status, NICU admission, and breastfeeding initiation, all recorded at the time of hospital discharge.

Holmes & Rutledge (2012) reported that compared to the comparison groups, intervention group participants had significantly:

Fewer very preterm births
Fewer extremely preterm births
Fewer NICU admissions
More breastfeeding initiations

Evaluation Methodology

Design:

Recruitment: This retrospective study began by selecting program participants in North Carolina from 2009-2013 who could be matched to birth certificate data. Of 1,027 clients, 908 had a valid delivery date and were matched. After dropping women who had higher-order births (any previous live birth), did not have Medicaid as the primary source of payment at the time of delivery, or gave birth from 2009‐2010 (before a change in birth certificate data), the sample of participants fell to 564.

Assignment: The study identified a control group of 61,429 women with birth certificate data who met the same criteria as the 564 women in the intervention group (total n = 61,993). Inverse propensity treatment weights using maternal age, race and ethnicity, maternal education, marital status, smoking behaviors, and obesity status were generated to compare the conditions. More specifically, the weighting was done for three control group comparisons: 1) all women in North Carolina (statewide sample), 2) women delivering in hospitals used by intervention participants that year (hospital sample), and 3) women delivering in counties also containing intervention participants that year (county sample).
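The weighting step can be illustrated with a small Python sketch. It is a sketch under stated assumptions, not the study's procedure: the records are hypothetical, the propensity score is estimated as the simple share of each covariate cell in the intervention group (the study presumably fit a model on several covariates), and the weights shown are the common 1/e and 1/(1 - e) form, which the write-up does not confirm.

```python
from collections import Counter

# Hypothetical records: (group, age_band). Not data from the study.
records = (
    [("intervention", "under20")] * 6 + [("intervention", "20plus")] * 2
    + [("comparison", "under20")] * 4 + [("comparison", "20plus")] * 8
)

cell_total = Counter(band for _, band in records)
cell_treated = Counter(band for group, band in records if group == "intervention")

# Propensity score: share of each covariate cell that is in the intervention group.
propensity = {band: cell_treated[band] / cell_total[band] for band in cell_total}

def iptw(group, band):
    """Inverse propensity treatment weight for one record."""
    e = propensity[band]
    return 1 / e if group == "intervention" else 1 / (1 - e)

def weighted_share_under20(group):
    """Weighted share of a group's records that fall in the under-20 band."""
    rows = [(iptw(g, b), b) for g, b in records if g == group]
    total = sum(w for w, _ in rows)
    return sum(w for w, b in rows if b == "under20") / total

raw_gap = 6 / 8 - 4 / 12
weighted_gap = weighted_share_under20("intervention") - weighted_share_under20("comparison")
print(round(raw_gap, 3), round(weighted_gap, 3))
```

Before weighting, the two hypothetical groups differ sharply in the share of mothers under 20; after weighting, the shares coincide, which is the sense in which the weights make the conditions comparable on measured covariates.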

Assessments/Attrition: Study participants were assessed retrospectively at the birth of their child and hospital discharge, while the program was ongoing. The retrospective design eliminated missing birth certificate data in the sample selection stage and therefore had no attrition.

Sample:

The women were on average 22 years old, with 33% under age 20. About 10% were Hispanic, 48% white, 39% African American, and 2% Native American. Only about 20% were married.

Measures:

Data from birth certificates and hospital records provided nine outcome measures: low birthweight (including any low birthweight, moderate low birthweight, and very low birthweight separately), preterm birth (including any preterm birth, moderate preterm birth, very preterm birth, and extreme preterm birth separately), neonatal intensive care unit (NICU) admission, and breastfeeding initiation recorded at hospital discharge.

Analysis:

The analysis used doubly‐robust logistic regression with clustered standard errors and controls for the same covariates used in the weighting. Clustered standard errors accounted for unobserved similarities in populations within hospitals and counties, and a bootstrapping technique with 100 repetitions of the model provided more precise standard errors. The covariates included a quadratic for maternal age, race/ethnicity, education, marital status, WIC participation, adequacy of prenatal care, and smoking behaviors before and during pregnancy.
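The bootstrapping step can be sketched in Python as follows. This is a simplified illustration, assuming a plain proportion as the statistic and resampling of individual births; the study bootstrapped its regression model and clustered by hospital and county, which this sketch does not attempt. The data and the function name `bootstrap_se` are hypothetical.

```python
import random
import statistics

def bootstrap_se(values, statistic, reps=100, seed=0):
    """Estimate a standard error as the standard deviation of the statistic
    across `reps` resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(values)
    estimates = []
    for _ in range(reps):
        resample = [values[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    return statistics.stdev(estimates)

# Hypothetical binary outcome (1 = preterm birth) for 200 births; not study data.
outcomes = [1] * 30 + [0] * 170

se = bootstrap_se(outcomes, statistics.mean)
print(round(se, 4))
```

With only 100 repetitions the estimate is itself somewhat noisy, so it should land near, but not exactly on, the analytic standard error of a proportion for the same data.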

Intent-to-Treat: Given the retrospective design, the analysis used the full sample selected for the study.

Outcomes

Implementation Fidelity:

Not examined.

Baseline Equivalence:

The weighted sample means for 11 baseline sociodemographic measures show the conditions to be closely matched (less than 10% difference), although no significance tests were presented.

Differential Attrition:

No attrition.

Posttest:

With nine outcomes and three samples, the 27 tests showed five significant program effects. For the match with the statewide sample, the intervention group had significantly fewer very preterm births and significantly more breastfeeding initiations than the control group. For the match with the hospital sample, the intervention group had significantly fewer very preterm births and admissions to the neonatal intensive care unit. For the match with the county sample, the intervention group had significantly fewer extremely preterm births. Tests for moderation showed that the program was most effective for African‐American mothers.
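As context for reading five significant results among 27 tests, the Python sketch below computes how many significant results would be expected by chance at a 0.05 threshold and the binomial probability of seeing at least five. It assumes independent tests and a 0.05 significance level, neither of which is stated in the write-up; the three samples overlap and the outcomes are related, so the independence assumption is optimistic and the figures are a rough guide only.

```python
from math import comb

def prob_at_least(k, n, alpha=0.05):
    """Binomial probability of k or more significant results in n independent
    tests when every null hypothesis is true."""
    return sum(comb(n, j) * alpha**j * (1 - alpha) ** (n - j) for j in range(k, n + 1))

n_tests, n_significant, alpha = 27, 5, 0.05

expected_by_chance = n_tests * alpha
tail_probability = prob_at_least(n_significant, n_tests, alpha)

print(round(expected_by_chance, 2), round(tail_probability, 3))
```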

Long-Term:

Not examined.

Study 13

Summary

Thorland & Currie (2017) evaluated the program in a retrospective cohort QED using propensity score matching for a total of 108,772 women who gave birth to a singleton first child between 2008 and 2010. Primary outcomes were preterm birth status and birth weight as recorded on birth certificates.

Thorland & Currie (2017) reported that compared to the matched comparison group, intervention group participants had significantly:

Fewer premature births
Better birthweights

Evaluation Methodology

Design:

Recruitment: Using a retrospective cohort study design and the Nurse Family Partnership data warehouse, the study began with a national sample of women who entered the program between July 1, 2007, and June 30, 2010, with the birth of their first child typically occurring during the years of 2008-2010. An initial sample of 39,207 clients was reduced by 1,903 enrollees (4.9%) with no home visit data and by 754 clients who delivered plural births. This report focused on the remaining 36,550 clients who had records of receiving one or more home visits before a singleton birth.

Assignment: The non-randomized comparison group came from national birth certificate data collected by the National Center for Health Statistics. The group included all live, singleton, and first-born births recorded between the years of 2008-2010, about 13 million cases. The study used propensity score matching to select an equivalent comparison group from the 13 million cases. The matching used the demographic covariates of maternal race-ethnicity, age group, marital status, education attainment, and smoking status during pregnancy. A 3:1 match based on the nearest-neighbor method with an exact match caliper was used.

For each intervention participant with full demographic data (92% of the full birth cohort), three women from the birth records database were selected, resulting in sample sizes of 27,195 for the intervention group and 81,577 for the comparison group (eight of the intervention women could only be matched to two controls). The total sample was 108,772. Presumably, those in the comparison group did not receive the program. The authors noted that, lacking information on individual identities from the national database, a few program participants could have been matched with themselves, but that such overlap would weaken program effects because the outcomes for the duplicates would be identical.
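The 3:1 selection can be sketched in Python. This is a simplification, not the study's algorithm: it matches exactly on a categorical covariate profile, which stands in for a zero-width caliper on an estimated propensity score, and it draws controls at random without replacement. The function name `exact_match`, the identifiers, and the profiles are hypothetical.

```python
import random
from collections import defaultdict

def exact_match(treated, pool, ratio=3, seed=0):
    """Give each treated case up to `ratio` unused controls that share its
    covariate profile exactly."""
    rng = random.Random(seed)
    by_profile = defaultdict(list)
    for control_id, profile in pool:
        by_profile[profile].append(control_id)
    for ids in by_profile.values():
        rng.shuffle(ids)  # random draw order within each profile

    matches = {}
    for treated_id, profile in treated:
        available = by_profile[profile]
        # Pop controls so that none is reused for a later treated case.
        matches[treated_id] = [available.pop() for _ in range(min(ratio, len(available)))]
    return matches

# Hypothetical profiles: (race-ethnicity, age group, marital status).
treated = [
    ("t1", ("white", "18-19", "unmarried")),
    ("t2", ("black", "20-24", "unmarried")),
]
pool = (
    [(f"c{i}", ("white", "18-19", "unmarried")) for i in range(4)]
    + [(f"c{i}", ("black", "20-24", "unmarried")) for i in range(4, 6)]
)

matches = exact_match(treated, pool)
print({k: len(v) for k, v in matches.items()})
```

The second hypothetical case finds only two eligible controls, mirroring how a handful of intervention women in the study were matched to two controls rather than three.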

Assessments/Attrition: The retrospective design examined women at the birth of their child, typically many months after the start of the program but while the program continued. Outside of the 8% excluded before the matching due to missing birth certificate data, there was no attrition.

Sample:

The matched sample included a mix of black (28%), Hispanic (31%), and white (35%) mothers with a modal age category of 18-19 (26%). Most (87%) were not married, and 51% had a high school degree or equivalent.

Measures:

Five outcome measures obtained from birth certificates included early preterm birth (<32 weeks gestation), preterm birth (<37 weeks), early-term birth (<39 weeks), very low birth weight, and low birth weight.

Analysis:

The analysis compared the mean outcomes for the two matched conditions without any covariate controls.

Intent-to-Treat: The analysis used all matched subjects, though about 5% of program participants with no recorded home visits were excluded at the start of the study.

Outcomes

Implementation Fidelity:

Of the clients receiving home visits during pregnancy, 80.6% remained active in the program up to or beyond the live birth of their child. For those mothers who delivered live births while enrolled, the average length of home visiting services during pregnancy was 21.3 weeks.

Baseline Equivalence:

Table 2 shows no significant differences between conditions for the five demographic measures used in the matching but presents no information on other measures.

Differential Attrition:

Given the retrospective design that matched only those with complete data, there was no formal attrition. However, no information was provided on the 5% of cases excluded before matching because of no home visit records and the 8% of cases excluded before matching because of missing data.

Posttest:

The intervention group had significantly better birth outcomes than the comparison group for four of the five outcomes measuring premature and low birth weight births.

Long-Term:

Not examined.

Study 14

This study was part of a federally sponsored evaluation of two home-visiting programs for pregnant women. For nearly all results, the study combined two programs - Nurse-Family Partnership and Healthy Families America - into a single treatment group. The evaluation specific to Nurse-Family Partnership came from one table in the extensive report.

Summary

Lee et al. (2019) evaluated the program with a randomized controlled trial in which pregnant women in 66 local sites were randomized into a local home-visiting program - either Nurse-Family Partnership or Healthy Families America - or a control group. Outcomes related to infant birth and health care were examined separately for the two intervention conditions in one subgroup analysis.

Lee et al. (2019) found no significant effects of the Nurse-Family Partnership program on birth or health-care outcomes.

Evaluation Methodology

Design:

Recruitment: The recruitment process sought local programs from 2012 to 2015 that had been in operation for two years, employed at least three home visitors, and had no implementation problems. A total of 66 local sites in 17 states that primarily served Medicaid beneficiaries joined the study. The sample included 2,900 women who were no more than 32 weeks pregnant and 15 years of age or older. The Nurse-Family Partnership women had to be low-income, first-time mothers and no more than 28 weeks into the pregnancy.

Assignment: The women were randomly assigned at a 60:40 ratio to a local home-visiting program - either Nurse-Family Partnership or Healthy Families America - or to a control group that received information on other appropriate community services. Four families were excluded because they came from one of three local programs in which all families were assigned to the same research group. For the final sample of 2,896 mothers, there were 1,569 in the program group and 1,327 in the control group.

Assessments/Attrition: Data came from a baseline survey, infant birth and death records, and Medicaid data. Given recruitment from 2012 to 2015, and the use of Medicaid data from January 2011 to May 2017, the follow-up period appears to have been 2-5 years. However, the study examined results for the full period without separating short-term from long-term results. The vital records were available for 2,609 mothers (90%) and 2,650 infants (91%). Medicaid records were available for 2,896 mothers (99%) and 2,790 infants (96%). A good part of the attrition came from miscarriages, stillbirths, and fetal deaths.

Sample:

On average, women entered the study at 17 weeks of pregnancy. Their average age was 22 years. The sample was diverse: 21% identified as white, 27% as black, and 43% as Hispanic. Also, 22% of the sample was foreign-born. Only about two-thirds of the women had a high school diploma or GED certificate.

Measures:

Confirmatory measures obtained from birth certificates included 1) smoking during the third trimester of pregnancy, 2) infant low birth weight (at less than 2,500 grams), 3) infant preterm birth (at less than 37 weeks into the pregnancy), 4) infant admittance to the neonatal intensive care unit, and 5) mother breastfeeding at discharge from the birth hospital.

Other confirmatory measures for the period after birth of the child included 1) the number of Medicaid-paid well-child visits, 2) having at least one Medicaid-paid emergency department visit, and 3) having at least one Medicaid-paid hospitalization. None of the measures included health care received outside of Medicaid.

Analysis:

The analyses used regression to compare condition means after adjusting for numerous characteristics of the mother, but it was not possible to control for baseline outcomes related to the newborn child. The authors stated (p. 103) that "an alternative regression model, wherein families were clustered by site, yielded consistent results."

Intent-to-Treat: The analysis included all families with data, even the 14% of those assigned to the intervention who never received a home visit.

Outcomes

Implementation Fidelity:

Local programs reported that they had systems in place to guide, monitor, and support home visitors in their work. Home visitors reported that the implementation systems at their local programs were strong and rated their own effectiveness levels as high. However, "most program group families received a lower dose of home visiting from the first visit until the child's first birthday than called for by the evidence-based models."

Baseline Equivalence:

Table 2.3 lists baseline means for the combined treatment group (both Nurse-Family Partnership and Healthy Families America) and the control group but does not present statistical significance or effect sizes. The authors said that (p. 42) "There were a few statistically significant differences between research groups, but the impact analyses control for many of these and other baseline characteristics."

Differential Attrition:

The study did not examine differential attrition, instead stating that attrition was minimal (p. 129) and came most commonly from miscarriages, stillbirths, and fetal deaths.

Posttest:

Table G.2 presents the only results for Nurse-Family Partnership that are separated from Healthy Families America. For the eight primary outcomes, none showed significant differences between the Nurse-Family Partnership group and the control group.

Long-Term:

The follow-up period for the three health care outcomes varied with the timing of program enrollment, making it difficult to isolate long-term effects.

Study 15

This study was part of a federally sponsored evaluation of four home-visiting programs for pregnant women and new mothers. For nearly all results, the study combined the four programs into a single intervention group. The evaluation specific to Nurse-Family Partnership came from one table in the extensive report.

Summary

Michalopoulos et al. (2019) evaluated the program with a randomized controlled trial in which pregnant women in 87 local sites were randomized into one of four local home-visiting programs - including 22 sites for Nurse-Family Partnership - or a control group. Outcomes, measured at 15 months after birth, related to infant maltreatment, home support, health care, behavior problems, and language skills.

Michalopoulos et al. (2019) found that, relative to the control group, the Nurse-Family Partnership group had significantly fewer:

Medicaid-paid emergency department visits

Evaluation Methodology

Design:

Recruitment: The study recruited women from 88 local programs in 12 states. The programs, almost all located in metropolitan and relatively large counties, had been operating for at least two years. The women had to be pregnant or with children under six months old, at least 15 years old, and able to speak English or Spanish. Other local program eligibility criteria applied; the Nurse-Family Partnership limited participants to first-time mothers. From October 2012 to October 2015, a total of 4,229 families entered the study. With one program failing to enroll anyone, the final sample included 87 local programs (22 were Nurse-Family Partnership programs).

Assignment: Women were randomly assigned within sites to a home-visiting intervention group or a control group that received information about other appropriate services in the community. The intervention group included four home-visiting programs: Nurse-Family Partnership, Healthy Families America, Early Head Start - Home, and Parents as Teachers. After the loss of 14 women who withdrew from the study or were belatedly found to not qualify, the randomized sample of 4,215 included 2,102 in the intervention group and 2,113 in the control group.

Assessments/Attrition: Data were collected around the time the child was 15 months old. Administrative data had little missing data, but response rates for other data sources ranged from 68% to 79%.

Sample:

For the full sample, about a third of the women in the study were Hispanic, a little more than a quarter were Black, and a little more than a quarter were White. Almost two-thirds of the women were less than 25 years old at the study start, and 35% were less than 21 years old. Almost 60% of women ages 18 to 20 had graduated from high school and more than three-quarters of all women had been employed during the previous three years. Nearly all were receiving some form of public assistance.

Measures:

The study team chose 12 confirmatory outcomes based on the evidence of effects from previous studies and on policy relevance. The outcomes came from a survey, video recordings of mother-child interactions, a preschool language scale, weight and height information, home observations, and administrative data. Parents rated child behavior problems, and the study did not state that coders were unaware of condition. The scales had good reliability and other measures had face validity, but the healthcare measures missed visits not paid by Medicaid.

New pregnancy after study entry (parent reported)
Mother receiving education or training (parent reported)
Quality of the home environment (home observations)
Parental supportiveness (home observations)
Frequency of minor physical assault toward the child during the past year (parent reported)
Frequency of psychological aggression toward the child during the past year (parent reported)
Health insurance coverage for the child (parent reported)
Number of Medicaid-paid well-child visits (Medicaid data)
Number of Medicaid-paid child emergency department visits (Medicaid data)
Any Medicaid-paid child health encounter for injury or ingestion (Medicaid data)
Child behavior problems (parent reported)
Child receptive language skills (home assessment)

The study also examined several dozen exploratory outcomes that were not based on evidence from previous research.

Analysis:

The main analyses used linear regression with adjustment for dozens of covariates (see pages 163-164 for a list). Given the prenatal start of the program, it was not possible to control for baseline child outcomes. The analyses of the separate influence of each of the four programs came from a model with fixed effects and random slopes for the 87 local programs and used maximum likelihood estimation.

Intent-to-Treat: The study used all available data and checked the results using the full sample with imputed values for missing data.

Outcomes

Implementation Fidelity:

For the four intervention programs combined, 83% of the families received at least one home visit, and the average for those receiving a visit was about 18 visits during the first year. Almost half of the families who had received at least one visit were still participating in home visiting at the child's first birthday. While lower than recommended, these participation rates were said to be typical.

Baseline Equivalence:

Appendix Table A.1 presents 77 tests for differences between the combined intervention group and the control group, with only three significant differences (p < .05). An omnibus test found no significant difference overall. The listed sample size of 4,215 matches the randomized sample size, although missing data on individual baseline measures could have reduced the sample for some tests.

Differential Attrition:

Appendix C.1 used the analysis sample of survey respondents (n = 3,315) to compare baseline measures across the combined intervention group and the control group. Tests for 44 measures showed only two significant differences. Appendix C.2 used the analysis sample of in-home assessment respondents (n = 2,976) to make the same comparisons. Tests for 44 measures showed only one significant difference.

Appendix Tables C.3 and C.4 compared respondents and non-respondents, finding numerous differences. That the conditions remained comparable in Tables C.1 and C.2 suggests that the differences affected the intervention and control groups similarly. The authors noted as well that the outcome results were similar when imputing missing data for non-respondents.

Posttest:

Table 5.1 lists results separately for the four programs and the 12 confirmatory outcomes. Significance tests examined differences in effects across the four programs rather than for each program compared to the control group. For the 12 tests, two showed a significant difference across the four programs. For these two outcomes, Nurse-Family Partnership had the weakest effect on the parenting measure of home environment quality (d = .05) and the largest effect on Medicaid-paid emergency department visits (d = -.50).

For the other 10 outcomes, the lack of intervention differences suggests that the overall effects differed little from the Nurse-Family Partnership effects. Tables 3.3 to 3.9 show no significant overall effects for the 10 outcomes.

Moderation tests for seven family characteristics and the 12 confirmatory outcomes found that differences across "subgroups of families are generally small and not statistically significant."

Robustness tests for the intervention effects in Table 5.1 revealed generally similar results but also some differences that suggested inconsistency in the findings (see Appendix F).

Long-Term:

Not examined.