To our knowledge, no prior studies have examined how frequently related policies co-occur, a necessary step to lay the foundation for rigorous analytic solutions. Researchers aiming to estimate individual policy effects need guidance on how to evaluate whether the impacts of policy co-occurrence on estimation are likely to undermine a study. In some cases, the challenge of co-occurring policies may require a modified analytic approach or even an altered research question. In this article, we address these gaps by proposing and applying an approach to assess the extent of policy co-occurrence and to quantify its impact on the precision of effect estimates for individual policies. Using 13 exemplar social policy databases covering diverse domains, we visually depicted and quantified the extent of policy co-occurrence in each database and used simulations to estimate impacts on precision. The method illustrated here can be used in applied research to determine when policy co-occurrence is so severe that alternative analytic approaches are needed.

We developed a systematic sample of social policy databases covering diverse health-related domains that capture measures of policy adoption or implementation across jurisdictions and time. To evaluate the extent and impacts of policy clustering, we applied 3 analyses to each database. First, we visualized the degree of policy co-occurrence in each database by plotting heat maps of pairwise correlations among the measured policies.
Second, building on the positivity literature, we quantified the overall degree of co-occurrence in each database as the amount of variability in each policy measure across jurisdictions and time that could be explained by the other policy measures in the same database. This step indicated how much independent variation remained with which to study the policy of interest. Finally, we used simulations to estimate the impacts of policy co-occurrence on precision by comparing the variance of estimated effects given the observed co-occurrence with the variance if all policies were adopted independently.

Because no registry of all available social policy databases exists, we identified an exemplar set by evaluating contemporary research on social policies and health and selecting domain-specific policy databases corresponding to those studies. We identified all studies of social policies published in 2019 in top medical, public health, and social science journals, emphasizing general-topic journals that publish research on the health effects of social policies. These journals were Journal of the American Medical Association, American Journal of Public Health, American Journal of Epidemiology, New England Journal of Medicine, Lancet, American Journal of Preventive Medicine, Social Science and Medicine, Health Affairs, Demography, and American Economic Review. After these journals were selected, we asked a convenience sample of 66 researchers from diverse disciplines to rank relevant journals. Responses confirmed that our selected journals reflect common perceptions of the most relevant venues for research on the health effects of social policies.
We identified original, empirical studies in which the authors aimed to estimate the causal effects of 1 or more social policies on health-related outcomes in any country, state, or locality. Although the definition of social policies varies across the literature, a priori we defined "social policy" to mean any non-medical, population-based or targeted policy that is adopted at a community or higher level and hypothesized to affect health or health inequalities via changes in social or behavioral determinants. A priori, we defined health-related outcomes broadly to include morbidity, death, health conditions, and factors such as smoking, homelessness, and sales of unhealthy products. Given our focus on social interventions, we excluded studies that pertained to health care, health insurance, interventions delivered in the clinical setting, medications, or medical devices, including studies of the Affordable Care Act or Medicaid expansion. For reproducibility, additional detail is presented in Web Appendix 2. An independent analyst reviewed a subset of candidate articles to confirm that our strategy to identify relevant papers was reproducible. Concordance between reviewers upon initial review was 90%.

For each social policy study, we identified any corresponding quantitative databases capturing the content, locations, and times of adoption of the index policy and related policies in the same domain. We searched the scientific literature; websites of domain-relevant research institutions, scientific centers, and organizations; and the internet to identify relevant, publicly available databases. We also asked the authors of each index social policy study for policy database recommendations.
When possible, we included databases provided on request from individual investigators. If more than 1 policy database was available, we selected the one that was most amenable to this analysis: first, the database requiring the least data cleaning or manipulation; then, among those remaining, the database with the greatest clarity of variable definitions, followed by the least missingness and the greatest comprehensiveness. We excluded domains for which we could not identify or access any corresponding database. Figure 1 presents information on the number of articles considered, studies and corresponding databases included in the final sample, and studies and databases excluded. Additional detail is presented in Web Appendix 4.

We formatted each database to have 1 row per jurisdiction and period and 1 column per policy measure. The types of policy information varied across databases. Some included exclusively binary indicators of policy adoption, whereas others provided information on benefit generosity, implementation, access, and/or scope. We included all available policy measures in the heat maps. For subsequent analyses, when multiple measures of the same policy were available, we selected the measure used in the publication from the original search that invoked the policy, if relevant, or the measure we judged to be the most representative. Some policies were subordinate to umbrella policies. For example, provisions regulating cannabis delivery services are applicable only in jurisdictions where recreational cannabis is legal. For jurisdictions and times in which the umbrella policy was not active, we retained the observations in the analysis and coded provisions conditional on that umbrella policy as 0. Additional details are provided in Web Appendix 5.

First, to visually depict policy co-occurrence in each database, we plotted heat maps of the matrix of Pearson correlations between each pair of policy measures.
Although numerous measures are appropriate, we selected the Pearson correlation because it is common, intuitive, and accommodates continuous-continuous, continuous-binary, and binary-binary variable comparisons. Although the distribution of the Pearson correlation between continuous and binary variables is constrained, this constraint is appropriate in this context.

Second, we assessed the degree of unique variation available to estimate individual policy effects when considering each policy while controlling for all others. To do this, we estimated an R² value from regression models of each policy regressed on the set of all other policies in the same database. We modeled continuous policy variables using linear regression and used R² adjusted for the number of predictor variables. We modeled binary policy variables using logistic regression and used the McFadden pseudo-R². For both types of regression, we included as main terms all other policy variables in the database. This step quantified the amount of variability in each policy across jurisdictions and times that could be explained by the other policy measures and resulted in a distribution of R² values, 1 for each policy in each database. This step is conceptually very similar to estimating propensity scores to assess positivity, except that it accommodates continuous exposure variables.

Third, we used simulations to estimate the impacts of policy co-occurrence on precision. For each policy measure in each policy database, we applied the following procedure:

Step A: Assign a simulated outcome of N observations, where N is the number of jurisdiction-time units in the policy database.
To simulate the outcome, we assumed 1) a normal distribution with a mean of 100 and a standard deviation of 5; 2) a null effect of the index policy on the outcome; and 3) that 10% of the variance of the outcome was explained by a randomly selected non-index policy. We incorporated this last component because the precision of the estimated effect of the index policy depends on the proportion of the variance in the outcome that is explained by the other variables in the model. Because large-scale social programs are recognized to have small individual-level effects, we considered 10% explained to be optimistic in the setting of the health effects of social policies. We assumed no other confounding was present.

Step B: Apply a linear regression, modeling the simulated outcome as a function of the index policy, the non-index policies, jurisdiction fixed effects, and time fixed effects. From this regression, record the variance of the regression coefficient corresponding to the effect estimate of the index policy (i.e., the squared standard error). This was the variance given the real-world, co-occurring data.

Step C: To estimate the variance if there were no co-occurrence, randomly redistribute the values of all policy measures across jurisdictions and time. This process preserves the overall mean and variance of each policy measure but eliminates systematic co-occurrence.

Step D: Apply the same regression model as in step B to the redistributed policy data and record the variance of the effect estimate of the index policy.

Step E: Take the ratio of the variance of the effect estimate of the index policy under the real-world policy regime to that under the randomly redistributed regime. This ratio is an estimate of the variance inflation due to policy co-occurrence.

We conducted steps A–E 1,000 times for each policy measure in each database, which resulted in a set of estimates of the variance inflation.
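Steps A–E can be sketched in code. The following is a minimal illustration in Python with NumPy, not the authors' R code (which is in Web Appendix 6). All function names and the toy panel are hypothetical. It also makes two assumptions that the text leaves open: each policy column is permuted independently in step C, and the outcome is re-simulated from the redistributed policies in step D so that the 10% explained share holds under both regimes.

```python
import numpy as np

def design_matrix(policies, juris, period):
    """Policy columns first, then intercept and jurisdiction/time dummies (one level dropped each)."""
    j_dummies = np.eye(juris.max() + 1)[juris][:, 1:]
    t_dummies = np.eye(period.max() + 1)[period][:, 1:]
    return np.column_stack([policies, np.ones(len(juris)), j_dummies, t_dummies])

def coef_variance(y, X, col):
    """Estimated OLS variance (squared standard error) of coefficient `col`."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    dof = X.shape[0] - np.linalg.matrix_rank(X)
    sigma2 = resid @ resid / dof
    return sigma2 * np.linalg.pinv(X.T @ X)[col, col]

def simulate_outcome(policies, index, rng, mean=100.0, sd=5.0, share=0.10):
    """Step A: null index-policy effect; one random non-index policy explains `share` of the variance."""
    others = [k for k in range(policies.shape[1]) if k != index]
    z = policies[:, rng.choice(others)]
    z = (z - z.mean()) / z.std()
    noise = rng.normal(0.0, sd * np.sqrt(1 - share), size=len(z))
    return mean + sd * np.sqrt(share) * z + noise

def variance_inflation(policies, juris, period, index, n_sims=1000, seed=0):
    """Repeat steps A-E and return one variance ratio per simulation."""
    rng = np.random.default_rng(seed)
    ratios = np.empty(n_sims)
    for s in range(n_sims):
        # Steps A-B: simulate the outcome and fit the model on the observed policies.
        y = simulate_outcome(policies, index, rng)
        v_obs = coef_variance(y, design_matrix(policies, juris, period), index)
        # Step C (assumption): permute each policy column independently across rows.
        shuffled = np.column_stack([rng.permutation(col) for col in policies.T])
        # Step D (assumption): re-simulate the outcome from the redistributed policies.
        y_ind = simulate_outcome(shuffled, index, rng)
        v_ind = coef_variance(y_ind, design_matrix(shuffled, juris, period), index)
        # Step E: variance under observed co-occurrence relative to independence.
        ratios[s] = v_obs / v_ind
    return ratios

def toy_panel(n_juris=20, n_periods=15, n_policies=4, seed=1):
    """Hypothetical panel in which policies are adopted in close succession within a jurisdiction."""
    rng = np.random.default_rng(seed)
    juris = np.repeat(np.arange(n_juris), n_periods)
    period = np.tile(np.arange(n_periods), n_juris)
    start = rng.integers(2, n_periods - 2, size=n_juris)    # shared adoption wave
    lag = rng.integers(0, 3, size=(n_juris, n_policies))    # policy-specific delay
    adopt = start[:, None] + lag
    policies = (period[:, None] >= adopt[juris]).astype(float)
    return policies, juris, period

policies, juris, period = toy_panel()
ratios = variance_inflation(policies, juris, period, index=0, n_sims=200)
print("median variance inflation:", np.median(ratios))
```

In this toy panel the policies are adopted within 2 periods of one another, so the ratio should exceed 1. With policies adopted independently it would be expected to fall near 1.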
We summarized the variance inflation due to policy co-occurrence for each database by stacking all the variance inflation estimates for all the policy measures in that database and plotting their distribution. We summarized the variance inflation due to policy co-occurrence overall by stacking all the variance inflation estimates for all policy measures in all databases and calculating their summary statistics. All analyses were conducted using R, version 3.6.2. The statistical code is provided in Web Appendix 6.

We identified 55 studies evaluating links between social policies and health that met our inclusion criteria, among which 36 unique policies or databases were invoked, and 13 social policy databases that could be identified and accessed. Studies included, for example, a panel data analysis of the impacts of changes in the level and duration of paid maternity leave on fertility, workforce participation, and infant mortality across 18 African and Asian countries and a synthetic control evaluation of the effect of raising state-level beer excise taxes on young adult firearm homicides. The sample of 13 identified social policy databases included 5 country-level databases, 6 state-level databases, and 2 local-level databases. Domains included poverty and social welfare; family and child welfare; worker welfare; pensions; unemployment; fertility; immigration; lesbian, gay, bisexual, and transgender rights; firearms; alcohol use; tobacco use; and recreational cannabis use. The number of unique policies per database ranged from 6 to 134. Some databases had multiple umbrella policies, whereas others focused exclusively on provisions relating to 1 umbrella policy.
For example, the Policy-Relevant Observational Studies for Population Health Equity and Responsible Development database includes overarching policies and specific provisions for breastfeeding breaks, child health leave, family leave, maternity leave, parental leave, paternity leave, and sick leave, whereas the recreational cannabis policy database focused exclusively on provisions for US states in which recreational cannabis is legal.

The degree of policy co-occurrence varied by database. Among the 13 databases, Figure 2 shows an example of an intermediate degree of co-occurrence among unemployment, sick leave, and pension benefits policies across 40 years in 22 countries. Figure 3 displays an example of high levels of co-occurrence among recreational cannabis policies across 108 months in 50 US states. Because the correlations are calculated on panel data at the level of the jurisdiction and time unit, higher correlations indicate that jurisdictions that adopt 1 policy are more likely to adopt the other and that the policies are likely to be adopted in closer temporal succession. State cannabis policies displayed the highest co-occurrence, whereas national lesbian, gay, bisexual, and transgender rights policies showed the lowest co-occurrence. For example, US states with restrictions on where recreational cannabis products can be sold at retail also tended to tax retail cannabis sales, whereas whether a country allowed same-sex marriage was relatively independent of whether it banned lesbian, gay, bisexual, and transgender-related employment discrimination.
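The pairwise correlations underlying heat maps such as Figures 2 and 3 could be computed along the following lines. This is a minimal sketch in Python with NumPy, not the authors' code. The policy names and values are invented for illustration. It assumes the layout described earlier (1 row per jurisdiction-period, 1 column per policy measure) and codes provisions as 0 wherever the umbrella policy is inactive.

```python
import numpy as np

# Hypothetical toy panel: each position is one jurisdiction-period.
names = ["legal_sales", "delivery_allowed", "excise_tax_rate"]
legal = np.array([0, 0, 1, 1, 1, 0, 1, 1], dtype=float)           # umbrella policy (binary)
delivery_raw = np.array([np.nan, np.nan, 0, 1, 1, np.nan, 0, 1])  # provision, undefined if not legal
tax_raw = np.array([np.nan, np.nan, 10, 15, 15, np.nan, 8, 12])   # continuous provision

# Code provisions as 0 where the umbrella policy is not active.
delivery = np.where(legal == 1, delivery_raw, 0.0)
tax = np.where(legal == 1, tax_raw, 0.0)

# Rows are jurisdiction-periods, columns are policy measures.
panel = np.column_stack([legal, delivery, tax])

# Pearson correlation for every pair of columns. For two binary columns this
# equals the phi coefficient; for binary-continuous pairs, the point-biserial.
corr = np.corrcoef(panel, rowvar=False)

for name, row in zip(names, corr):
    print(name, np.round(row, 2))
```

The resulting matrix can be passed to any plotting library to draw the heat map. Coding provisions as 0 under an inactive umbrella policy mechanically raises their correlation with the umbrella policy itself, which is worth bearing in mind when reading such plots.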