Integral to this approach is the development of an Indigenous Statistical Space, which comprises three essential components: 1) standpoint, which encompasses the situated and reflexive positionality of the researcher within the context of the empirical work; 2) theoretical framework, which provides a conceptual lens through which to interpret and analyze data from an Indigenous perspective; and 3) data analysis technique, which for this approach leverages the use of computer-based algorithms, models, or techniques to analyze, process, and interpret data. Figure 2 is a visual representation of the Indigenous Statistical Space constructed for this paper, culminating in the Indigenous Computational Approach guiding the empirical investigation. The sequential components iteratively combine to foster a research framework that aligns with a decolonial and Indigenous-centered approach relevant for suicide prevention research with Native American young people. First, the standpoint adopted for this paper draws from the author’s worldview as a Yoeme person. In particular, the cultural protocol of “en tui hiapsimake” translated as “with good heart” guides the author’s engagement with suicide prevention research, reflecting the assertion that advocating for the well-being and comprehensive development of Native American young people is fundamentally tied to the larger project of decolonialization.
Advocating for the continuity of Indigenous life, rather than actively upholding the forces fostering Indigenous death, challenges and aims to dismantle the persisting systems of colonial oppression and violence that continue to marginalize Native American communities. Second, Indigenous Wholistic Theory, as previously mentioned, forms the underpinning theoretical framework. This theoretical framework provides a comprehensive lens through which to situate and analyze data, acknowledging the interconnectedness of various factors from multiple levels impacting the lives of Native American high schoolers in California. Third, this paper leverages machine learning to structure a computational, algorithmic analysis of the collected data. The data analysis technique is further described in this section.
The California Healthy Kids Survey (CHKS) is a school-based, multi-modular survey administered annually to students across California. The survey examines youth health outcomes and school climate using a variety of Likert-scaled, multiple-choice questions. The CHKS comprises a “Core Module,” which is administered to all participants. School districts can opt in to various other modules tailored to measure additional aspects and features. Data were drawn from self-identified Native American high schoolers in grades nine through twelve from the 2019-2020 school year. For analysis, participants who completed all questions from the “Core Module” were included.
For the spiritual-historical domain, no factors were included, because there were no suitable questions included in CHKS. For the emotional-social domain, factors included foster care placement and school-based victimization. Additional factors for this domain included parent education level, internalization of positive substance use education, and access to alcohol and other drugs. For the mental-political domain, factors included depressive symptoms. For the physical-economic domain, factors included homelessness and breakfast consumption. The construction of the selected predictors was motivated by the Indigenous Computational Approach to construct an Indigenous statistical space better reflective of Native American conceptions of health and wellness. As previously mentioned, Indigenous Wholistic Theory guided this process. The Indigenous Computational Approach is built on the understanding that Native American health outcomes can be better understood with an alignment linking Indigenous-centered theories with Indigenous research methodologies to guide purposeful data analysis. See the supplemental materials for a more detailed description of how the selected predictors were constructed.
The data analysis was performed using Stata version 17.0. Descriptive statistics were reported as numbers and percentages to depict the sample characteristics and the distribution of categorical predictors. Continuous predictors were standardized to have a mean of zero and a standard deviation of one, and were summarized using medians and interquartile ranges.
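The standardization and summary steps described above can be sketched in a few lines. This is a minimal illustration in Python rather than the Stata code used for the analysis; the function names and the example scores are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which variant was applied.

```python
import statistics

def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [(v - mean) / sd for v in values]

def median_iqr(values):
    """Return the median and the (Q1, Q3) quartiles of the values."""
    q1, q2, q3 = statistics.quantiles(values, n=4)  # default 'exclusive' method
    return q2, (q1, q3)

# Hypothetical scores on one continuous predictor (not CHKS data).
scores = [1.0, 2.0, 2.0, 3.0, 4.0, 4.0, 5.0]
z_scores = standardize(scores)
median, (q1, q3) = median_iqr(z_scores)
```

Standardizing before penalized regression puts all continuous predictors on a common scale, so that a single penalty affects each of them comparably.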
To select the optimal predictors of suicidal ideation, least absolute shrinkage and selection operator (lasso) penalized regression was employed. Lasso regression shrinks the coefficients of the least influential variables to zero, effectively removing them from the model to identify the best subset of predictors. The tuning parameter lambda was selected using 10-fold cross-validation. This approach both determines the level of coefficient shrinkage and provides a reliable estimate of the predictive performance of the final model for new cases. Ten-fold cross-validation is preferred over split-sample validation to avoid overestimating the predictive performance in unseen cases. To evaluate the ability of the model to distinguish between the presence and absence of suicidal ideation, the area under the receiver operating characteristic curve (AUC) was estimated and interpreted against standard discrimination thresholds: 50% indicates an inability to discriminate between individuals with and without the outcome; 70% to 80%, acceptable discrimination; 80% to 90%, excellent discrimination; and greater than 90%, outstanding discrimination. Additionally, bootstrapped bias-corrected 95% confidence intervals for the AUC were calculated. The agreement between the observed rates of suicidal ideation and the model predictions was assessed using a GiViTI calibration belt plot. This allowed for the visual identification of deviations or miscalibrations in observed frequencies compared to expected probabilities at certain confidence levels. This included a calibration test to assess whether any deviations from the bisector were significant.
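How an L1 penalty sets small coefficients exactly to zero can be illustrated with the soft-thresholding operator, the update used inside coordinate-descent lasso solvers. The sketch below is a generic illustration, not the Stata routine used in the study: the predictor names, the unpenalized coefficients, and the penalty value are hypothetical. Applying the operator directly to unpenalized coefficients gives the exact lasso solution only in the idealized case of standardized, uncorrelated predictors in a linear model.

```python
def soft_threshold(coef, lam):
    """Shrink coef toward zero by lam; return exactly 0.0 when |coef| <= lam."""
    if coef > lam:
        return coef - lam
    if coef < -lam:
        return coef + lam
    return 0.0

# Hypothetical unpenalized coefficients for three illustrative predictors.
unpenalized = {"predictor_a": 0.90, "predictor_b": -0.40, "predictor_c": 0.05}
lam = 0.10  # illustrative penalty, not the value tuned in the study

# Every coefficient is pulled toward zero; those smaller than lam are removed.
penalized = {name: soft_threshold(b, lam) for name, b in unpenalized.items()}
selected = [name for name, b in penalized.items() if b != 0.0]
```

A larger lambda removes more predictors, which is why its value is chosen by cross-validation rather than fixed in advance.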
Finally, the overall performance of the model was assessed by estimating the Brier score, which is a measure of the discrepancy between the predicted and observed outcomes using the standard benchmark of 0 for no disagreement, 0.25 for predictions no better than chance, and 1 for complete disagreement.
Two sensitivity analyses were performed to evaluate the clustered nature of the data, as well as to assess the impact of incomplete data on the model results. First, the AUC was re-estimated using bootstrap resampling to account for clustering by schools. Second, the expectation-maximization algorithm was employed to impute missing values and the predictive model was refitted to the imputed data using the same lasso method. The performance of the model was then re-evaluated using AUC and calibration statistics as described above for the main model.
In the sample, 16.79% of Native American high school students reported experiencing suicidal ideation in the past 12 months. Using lasso regression with a mean lambda of 0.0078412, the model retained ten of the 17 factors as predictors of which Native American high schoolers would report experiencing suicidal ideation. These predictors included: depressive symptoms; school-based victimization; sexual and gender minority status; lifetime use of alcohol, vapes, and cannabis; breakfast consumption; access to alcohol and other drugs; and parent education level.
In the 10-fold cross-validation used for internal validation, the final model achieved a cross-validated mean AUC of 87.36%. This indicates that if one Native American high school student experiencing suicidal ideation and one not experiencing ideation were chosen at random, there would be an 87% chance that the model would assign the higher risk score to the student experiencing suicidal ideation. The model’s predicted risk scores ranged from 1.92% to 93.63%, with the median predicted risk score among the sample being 4.83%.
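The two performance measures described above, the AUC as a pairwise concordance probability and the Brier score as a mean squared difference, can be computed directly from their definitions. The following is a minimal sketch with hypothetical outcomes and predicted risks, not the study data or the Stata routines used; the function names are illustrative.

```python
def auc(y_true, y_prob):
    """Share of (positive, negative) pairs in which the positive case
    receives the higher predicted risk; ties count as one half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def brier(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

# Hypothetical outcomes (1 = ideation reported) and predicted risks.
y_obs = [0, 0, 0, 0, 1, 1]
p_hat = [0.05, 0.10, 0.40, 0.20, 0.35, 0.80]

model_auc = auc(y_obs, p_hat)
model_brier = brier(y_obs, p_hat)
chance_brier = brier(y_obs, [0.5] * len(y_obs))  # every case predicted at 0.5
```

Predicting 0.5 for every case yields a Brier score of exactly 0.25, which is the "no better than chance" benchmark quoted above.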
The Brier score was 0.10, which is close to the benchmark of 0 and indicates little disagreement between the predicted and observed outcomes. The calibration belt plot indicated that there were some ranges of miscalibration at the 95% and 99% confidence levels. In logistic regression, miscalibration refers to a situation where the predicted probabilities of a model do not correspond well with the observed outcomes. In other words, the model may predict a higher probability of an event occurring than the actual probability, or vice versa. This is a form of bias in the model and can lead to inaccurate predictions. For the 95% confidence level, miscalibration occurred under the bisector for predicted values ranging from 0.02 to 0.06, as seen with the inner belt. For the 99% confidence level, miscalibration also occurred under the bisector for predicted values ranging from 0.02 to 0.04, as seen with the outer belt.
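The idea of miscalibration under the bisector can be made concrete with a simple binned comparison of predicted and observed rates. This sketch is a deliberately simplified stand-in, not the GiViTI calibration belt, which fits a recalibration curve with confidence bands; the data, bin edges, and function name are hypothetical.

```python
def calibration_bins(y_true, y_prob, edges):
    """Within each risk bin, compare the mean predicted probability with
    the observed event rate. Bins are [lo, hi); the last bin includes hi."""
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        members = [
            (y, p) for y, p in zip(y_true, y_prob)
            if lo <= p < hi or (hi == edges[-1] and p == hi)
        ]
        if not members:
            continue  # skip empty bins
        mean_pred = sum(p for _, p in members) / len(members)
        observed = sum(y for y, _ in members) / len(members)
        rows.append((lo, hi, mean_pred, observed))
    return rows

# Hypothetical outcomes and predicted risks (not the study data).
y_obs = [0, 0, 0, 0, 0, 1, 1, 1]
p_hat = [0.04, 0.04, 0.06, 0.06, 0.5, 0.5, 0.7, 0.7]

rows = calibration_bins(y_obs, p_hat, edges=[0.0, 0.1, 1.0])
# Bins where the observed rate falls below the predicted rate lie under the
# bisector, i.e. the model overpredicts risk there.
below_bisector = [(lo, hi) for lo, hi, pred, obs in rows if obs < pred]
```

In this toy example only the low-risk bin lies under the bisector, the same direction of miscalibration reported for low predicted values above.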
For low-risk students, both belts are under the bisector, where the observed presence of suicidal ideation is lower than expected, indicating a potential for type 1 errors, or false positives. For the purposes of preventing suicide, false positives are preferred to the alternative, as students who are experiencing any degree of risk, but may not be experiencing suicidal ideation, can still be identified and connected to the appropriate services.
Overall, model discrimination changed little when AUCs were derived using bootstrap resampling to account for clustering by schools. This indicates that the clustered nature of the data did not have a substantial impact on the predictive accuracy of the model. In the model that imputed missing values, the AUC was consistent with the previous models; however, there was more evidence of miscalibration. In the sensitivity analysis accounting for missing data, the ten previously identified predictors were retained as important for predicting suicidal ideation among Native American high schoolers. However, in the sensitivity analysis that accounted for clustering by schools, only nine of the ten predictors were retained; in this case, lifetime cannabis use was omitted. This suggests that when controlling for the variance within individual schools, the importance of lifetime cannabis use as a predictor of suicidal ideation can be questioned. Nevertheless, for the final model, lifetime use of cannabis was retained as a predictor due to its strong predictive ability in the original model as indicated by the AUC.
Suicidal ideation among Native American youth is a complex and multifaceted issue that requires a comprehensive understanding of the risk factors that contribute to suicide-related behavior.
Unfortunately, traditional psycho-centric models of suicide have been inadequate in explaining the nuanced experiences of Native American youth, who often face unique cultural, historical, and social challenges that can exacerbate their risk for suicide. Therefore, this study represents an important preliminary step in identifying factors that are relevant in conceptualizing suicide-related behavior among Native American youth and moving from a psycho-centric model to a multi-level wholistic model. By identifying potentially relevant predictors and their relative predictive value, this study can help inform future research directions that seek to better understand the individualized risk of suicidal ideation across theoretical, conceptual, and practical areas. This research can eventually lead to better assessment of risk and more effective prevention. The study found that a combination of factors spanning the individual, emotional-social, mental-political, and physical-economic domains could be used to predict the individualized risk of experiencing suicidal ideation among Native American high schoolers in California. The multiplicity of these domains disrupts the domination of a psycho-centric conceptualization of suicide by substantiating a complex network of factors relevant for predicting suicidal ideation beyond a singular psychological conception. Despite some evidence of miscalibration, the final model had good overall performance, with the 95% CI indicating that the true discriminative ability, or AUC, of the model is acceptable. Therefore, the analysis offers a promising approach as a modeling tool to determine the individualized risk of suicidal ideation among Native American high schoolers in California.
While population-level analyses are important for understanding trends and differences among sub-groups, gaining a culturally relevant understanding of suicidal ideation among Native American high school students requires an examination of factors that are relevant to their lived experiences. By identifying adolescents who are at risk for suicide before a fatal outcome can occur, practitioners in the mental health, education, and social work fields can intervene early and prevent suicide. The resulting individualized risk score from the model can help practitioners identify those with the highest predicted risk and connect them with appropriate suicide prevention efforts. These efforts can be culturally relevant and may include programs such as Honoring the Children-Mending the Circle trauma-focused cognitive behavioral therapy, American Indian Life Skills, and Gathering of Native Americans.
This study aimed to examine a set of predictors for suicide-related behavior among Native American youth using available data. However, the findings of this study should be interpreted with caution given its limitations. One major limitation of this study is that it relied on non-validated self-report instruments, which may have limited construct validity and generalizability. Non-validated self-report instruments may not accurately measure the construct of interest, leading to inaccurate conclusions about the variables being studied. Moreover, results from studies that use non-validated self-report instruments may not be generalizable to other populations since the instruments have not been validated across different groups of people.