A study found that people were significantly more likely to disclose personal information to artificial intelligence when they believed it was computer- rather than human-monitored. Users can develop a strong therapeutic alliance in the absence of face-to-face contact, even with a nonhuman app. Digital environments can promote honest disclosure because thoughts are easier to process and the risk of embarrassment is lower. Finally, although conversational agents can present in different modalities, including text, verbal, and animation, preliminary research on modality for psychoeducation delivery specifically found that text-based presentation resulted in higher program adherence than verbal presentation. Evidence for conversational agent interventions for addressing mental health problems is growing quickly and appears promising with regard to acceptability and efficacy. Developed as a mental health digital app, Woebot is a text-based conversational agent available to check in with users whenever they have smartphone access. Using conversational tones, Woebot is designed to encourage mood tracking and to deliver general psychoeducation as well as tailored empathy, cognitive behavioral therapy (CBT)-based behavior change tools, and behavioral pattern insight. Among a sample of adults randomly assigned to Woebot or an information-only control group, Woebot users had statistically and clinically significant reductions in depressive symptoms after 2 weeks of use, whereas those in the control group did not. Engagement with the app was high. However, the efficacy of conversational agents for treating SUDs remains unknown.
Woebot’s app-based platform and user-centered design philosophy make it a promising modality for SUD treatment delivery; it offers immediate, evidence-based, tailored support in the peak moment of craving. An informal poll of Woebot users indicated that 63% had interest in content addressing SUDs; 22% of surveyed users reported having 5 or more alcoholic drinks in a row within a couple of hours, and 5% endorsed using nonprescription drugs. Although the efficacy of automated conversational agent digital therapeutics for SUDs is still untested, such products are commercially available, and few consumers are aware that the products lack evidence. This study aims to adapt the original Woebot for the treatment of SUDs (W-SUDs) and to test its feasibility, acceptability, and preliminary efficacy in a single-group pre-/posttreatment design.

In a single-group design, we examined within-subject changes in self-reported substance use behavior, cravings, confidence to resist urges to use substances, mood symptoms, and pain from pre- to posttreatment. Intervention engagement data were collected from the Woebot app during the 8-week treatment period. Acceptability ratings were collected within the app and within the posttreatment survey. The study procedures were approved by the Institutional Review Board of Stanford Medicine.

Participants were recruited via the Woebot app, social media, Craigslist, and Stanford staff and student wellness listservs. In addition, study flyers were posted in the San Francisco Bay Area, and email invitations were sent to participants from previous studies. Recruitment materials included the URL of a web page describing the study for people with substance use concerns. Informed consent was required to screen for eligibility.
Those who screened as eligible were asked to provide informed consent for participation in the study. Inclusion criteria were all genders, aged 18 years to 65 years, residing in the United States, screening positive on the 4-item Cut down, Annoyed, Guilty, Eye opener-Adapted to Include Drugs (CAGE-AID), owning a smartphone for accessing Woebot, available for the 8-week study, willing to provide an email address, and English literate. The CAGE-AID has demonstrated validity, with high internal consistency in screening for problematic drug and alcohol use; a cutoff point of 2+ on the CAGE-AID has a sensitivity of 70% and specificity of 85% for identifying individuals with SUDs. Study exclusion criteria were current pregnancy, history of severe alcohol or drug-related medical problems, opioid overdose requiring Narcan, current opioid misuse without medication-assisted treatment, or attempted suicide within the past year. For this study, the target sample size was 50 participants; however, due to a high level of response and efficiency, enrollment was more than double our recruitment goal. Between March 27, 2020, and May 6, 2020, 3597 individuals were screened for study participation, of whom 3422 were ineligible and 175 were eligible. Figure 1 shows the reasons for study exclusion, most frequently residing outside of the United States and endorsing fewer than 2 criteria on the CAGE-AID. Of the 175 eligible participants, 141 provided informed consent to participate in the study, of whom 128 completed the baseline survey. The analytic sample consisted of 101 participants who ultimately registered with W-SUDs and initiated use. Among the 101 participants enrolled, 11 reported previous use of the Woebot app. Described in detail previously, Woebot is an automated conversational agent that delivers CBT in the format of brief, daily text-based conversations. The Woebot program is deployed through its own native apps on both iPhone and Android smartphones or devices.
The app onboarding process introduces the automated conversational agent and explains the intended use of the device, how data are treated, and the limitations of the service. The user experience is centered around mood tracking and goal-oriented, tailored conversations that can, depending on user input and choice, focus on CBT psychoeducation, application of psychotherapeutic skills for change, mindfulness exercises, gratitude journaling, and/or reflecting upon patterns and lessons already covered. Each interaction begins with a general inquiry about context and mood to ascertain affect in the moment. Additional therapeutic process-oriented features of Woebot include delivery of empathic responses with tailoring to users’ stated mood, goal setting with regular check-ins for maintaining accountability, a focus on motivation and engagement, and individualized weekly reports to foster reflection. Users become familiar with Woebot, which is a friendly, helpful character that is explicitly not a human or a therapist but rather a guided self-help coach. Daily push notifications prompt users to check in. We adapted W-SUDs, drawing upon motivational interviewing principles, mindfulness training, dialectical behavior therapy, and CBT for relapse prevention. Sample screenshots from the W-SUDs app are shown in Figure 2. In total, the W-SUDs intervention was developed as an 8-week program with tracking of mood, substance use craving, and pain, with over 50 psychoeducational lessons and psychotherapeutic skills. CBT evidence-based, guided self-help treatments have ranged in length from 2 to 12 weeks, and the National Institutes for Clinical Excellence describes guided self-help as including 6 to 8 face-to-face sessions. Early responsiveness to SUD treatment is predictive of long-term outcomes, and brief addiction treatments are efficacious.
Brief intervention can minimize potential dropout, a problem common to SUD treatment; therefore, we designed W-SUDs as an 8-week treatment. Woebot is not designed to address active suicidal ideation or overdose, and this was stated in the study informed consent. In addition, Woebot conversationally informs first-time users that it is not a crisis service. Woebot also has safety net detection that uses natural language processing algorithms to detect and flag several hundred possible harm-to-self phrases with 98% accuracy. Woebot detects crisis language and asks the user to confirm it. If the user confirms, Woebot offers resources, carefully curated with expert consultation. Woebot data indicate that users do not use Woebot for crisis management; approximately 6.3% trigger the safety net protocol, with 27% of those confirming that it is indeed a crisis when Woebot asks to confirm.

Demographic items were assessed at pretreatment; substance use, mental health, and pain measures were administered at pre- and posttreatment; serious adverse events and W-SUDs feasibility and acceptability were assessed at posttreatment; and W-SUDs use data were collected via the Woebot app over the 8-week intervention. Demographic items included self-reported sex, race and ethnicity, age, marital status, employment status, residential zip code, and sheltering-in-place status given the COVID-19 pandemic. The Alcohol Use Disorders Identification Test-Concise (AUDIT-C), a widely used 3-item self-report measure based on the 10-item original AUDIT, assessed hazardous or harmful alcohol consumption in the past 3 months. A score of 4+ for men and 3+ for women indicated significant problems with alcohol consumption. The AUDIT-C has been found to be a valid screening test for heavy drinking and/or active alcohol abuse or dependence.
The Drug Abuse Screening Test-10 (DAST-10), a 10-item self-report measure adapted from the 28-item DAST, assessed consequences related to drug abuse, excluding alcohol and tobacco, in the past 3 months. The last item of the DAST-10, regarding medical problems resulting from drug use, was not reassessed because it was an exclusion criterion in the study screener; hence, the total possible range for the sample was 0-9, not 0-10. Total scores of 3+ indicated significant problems related to drug abuse. The DAST-10 has moderate test-retest reliability, sensitivity, and specificity. For the AUDIT-C and DAST-10 measures at posttreatment, the reference period was the past 2 months, to reflect the period of intervention.
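The cutoffs described above for the AUDIT-C and the 9-item version of the DAST-10 can be sketched as a small scoring routine. This is an illustration only, not the study's analysis code: the function names and the "male"/"female" keys are hypothetical, item scores are assumed to be already keyed (AUDIT-C items scored 0-4, DAST items scored 0 or 1), and the text does not say how participants reporting another sex were classified.

```python
# Cutoffs as stated in the text; everything else here is an illustrative assumption.
AUDIT_C_CUTOFFS = {"male": 4, "female": 3}
DAST_CUTOFF = 3
DAST_ITEMS_USED = 9  # the last DAST-10 item was not reassessed in this sample


def audit_c_total(item_scores):
    """Sum the 3 AUDIT-C items, each scored 0-4 (total range 0-12)."""
    if len(item_scores) != 3 or any(not 0 <= s <= 4 for s in item_scores):
        raise ValueError("AUDIT-C needs exactly 3 item scores in the range 0-4")
    return sum(item_scores)


def audit_c_positive(total, sex):
    """True if the total meets the sex-specific cutoff (4+ men, 3+ women)."""
    return total >= AUDIT_C_CUTOFFS[sex]


def dast_total(item_scores):
    """Sum the 9 retained DAST items, each already keyed 0/1 (range 0-9)."""
    if len(item_scores) != DAST_ITEMS_USED or any(s not in (0, 1) for s in item_scores):
        raise ValueError("expected 9 item scores, each 0 or 1")
    return sum(item_scores)


def dast_positive(total):
    """True if the total is 3 or higher."""
    return total >= DAST_CUTOFF
```

For example, `audit_c_positive(audit_c_total([1, 1, 1]), "female")` evaluates to `True`, whereas the same total of 3 falls below the cutoff for men.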
Craving was assessed with a single item asking, “In the past 7 days, how much were you bothered by cravings or urges to drink alcohol or use drugs?”, with response options of not at all, a little bit, moderately, quite a bit, and extremely. The Brief Situational Confidence Questionnaire, a state-dependent measure, assessed self-confidence to resist the urge “right now” to drink heavily or use drugs in different situations, reported on visual analog scales anchored from 0% “not at all confident” to 100% “totally confident.” The Patient Health Questionnaire-8 item (PHQ-8), an 8-item scale, assessed depressive symptoms, and the Generalized Anxiety Disorder-7 item (GAD-7), a 7-item scale, assessed symptoms of generalized anxiety disorder. Both the PHQ-8 and GAD-7 have good internal consistency and demonstrated convergent validity with measures of depression, stress, and anxiety. A total of 2 items assessed the history of therapy for mental health or substance use concerns. Lifetime psychiatric diagnoses were assessed using 10 items plus a write-in option for others.

The treatment feasibility and acceptability of W-SUDs were assessed posttreatment using the Usage Rating Profile-Intervention (URP-I) Feasibility and Acceptability scales, the 8-item Client Satisfaction Questionnaire (CSQ-8), and the 12-item Working Alliance Inventory-Short Revised (WAI-SR). The URP-I item response options ranged from strongly disagree to strongly agree; the items were summed for a total score within each scale, with one feasibility item reverse coded. The CSQ-8 items have 4-point rating scales with response descriptors that vary. Internal consistency exceeds 0.90, and the total sum score ranges from 8 to 32, with higher total scores indicating higher satisfaction. The WAI-SR has three 4-item subscales, with 5-point rating scales, that reflect development of an affective bond in treatment and level of agreement with treatment goals and treatment tasks.
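The summed-score rules above (a CSQ-8 total of 8 to 32, and URP-I scale totals with one reverse-coded item) can be sketched as follows. This is a hedged illustration, not the instruments' official scoring code: the function names are hypothetical, CSQ-8 item scores are assumed to be already oriented so that higher means more satisfied, and the 1-to-6 range in the usage note is an assumption, since the text gives only the URP-I anchors and not the number of response points.

```python
def csq8_total(item_scores):
    """Sum 8 CSQ-8 item scores, each 1-4, giving a total between 8 and 32."""
    if len(item_scores) != 8 or any(s not in (1, 2, 3, 4) for s in item_scores):
        raise ValueError("CSQ-8 needs exactly 8 item scores in the range 1-4")
    return sum(item_scores)


def reverse_code(score, scale_min, scale_max):
    """Flip a rating so that scale_min maps to scale_max and vice versa."""
    if not scale_min <= score <= scale_max:
        raise ValueError("score outside the rating scale")
    return scale_min + scale_max - score


def summed_scale(item_scores, reverse_indices, scale_min, scale_max):
    """Sum items into a scale total, reverse coding the listed positions first."""
    total = 0
    for index, score in enumerate(item_scores):
        if index in reverse_indices:
            score = reverse_code(score, scale_min, scale_max)
        total += score
    return total
```

As a usage example under the assumed 1-to-6 range, `summed_scale([6, 6, 1], {2}, 1, 6)` treats the third item as reverse coded, so a rating of 1 contributes 6 to the total.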
Serious adverse events occurring in the 8 weeks after the start of the study were assessed for hospitalization related to substance use, suicide attempt, alcohol or drug overdose, and severe withdrawal. Positive endorsements were followed up with questions about the timing, diagnosis, and resolution. If additional details were needed to determine whether the event was study related, a team member reached out to the participant. Serious adverse events were reported to the study’s Data Safety Monitoring Board within 72 hours of the team learning of the event. Participants’ W-SUDs app use, including days of app use, number of check-ins, and number of messages sent, was collected via the Woebot app, as were module completion rates, lesson acceptability ratings indicated on a binary scale, and mood impact after tool utilization. In addition, on a daily basis, the W-SUDs app assessed mood, cravings or urges to use, and pain. In-the-moment emotional state was reported through emoji selection with a default menu of 19 total moods, including options for negative, positive, and average mood, with an additional ability to type in free-text emotion words and/or self-selected emoji expressions. Cravings were assessed as not at all, a little bit, moderately, quite a bit, or extremely. Physical pain was rated on a scale of 0 to 10. Descriptive statistics were used to describe the sample and examine the ratings of program feasibility and acceptability. Paired-samples t tests and McNemar nonparametric tests examined within-subject changes from pre- to posttreatment on measures of substance use, confidence, cravings, mood, and pain. Change scores were calculated, and bivariate correlations were used to examine associations between changes in AUDIT-C and DAST-10 scores and changes in use occasions, confidence, and depression and anxiety scores.
We also conducted t tests to examine changes from pre- to posttreatment in substance use, confidence, mood, and pain by whether participants were currently in therapy or taking psychiatric medications.
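The two within-subject tests named above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the study's analysis code: the function names are hypothetical, the paired test returns only the t statistic and degrees of freedom (a p value would come from a t distribution, for example via `scipy.stats.ttest_rel`), and the McNemar test is shown in its exact binomial form, although the text does not say whether the exact or chi-square version was used.

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """Paired-samples t test: return (t, df) for the post-minus-pre differences."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [after - before for before, after in zip(pre, post)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("t is undefined when all differences are identical")
    n = len(diffs)
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


def mcnemar_exact(pre_flags, post_flags):
    """Two-sided exact McNemar p value for paired yes/no outcomes."""
    if len(pre_flags) != len(post_flags):
        raise ValueError("need equal-length paired outcomes")
    # Only discordant pairs (status changed between assessments) carry information.
    yes_to_no = sum(1 for a, b in zip(pre_flags, post_flags) if a and not b)
    no_to_yes = sum(1 for a, b in zip(pre_flags, post_flags) if b and not a)
    n = yes_to_no + no_to_yes
    if n == 0:
        return 1.0
    k = min(yes_to_no, no_to_yes)
    # Under the null hypothesis, discordant pairs split as Binomial(n, 0.5).
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

With made-up scores `pre = [10, 12, 9, 14, 11]` and `post = [8, 9, 9, 10, 10]` (not study data), `paired_t(pre, post)` gives a t statistic of about -2.83 on 4 degrees of freedom, where the negative sign indicates lower scores at posttreatment.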