I Struggled with PTSD, Now I Help to Address It Through AI


Read the brave story of Anam from Pakistan, who struggled with Post-Traumatic Stress Disorder (PTSD) after her dad fell into a critical health condition. She had to prepare for entrance exams while taking care of her siblings for several months.



It is truly amazing how many inspiring individuals have applied to our Collaborative AI Projects. We are very honored to share the story of Anam today, from whom we learned a lot just by speaking with and listening to her. Anam has been part of our AI challenge on building a machine learning model for PTSD assessment.


Anam’s Story

I am a Computer Science student, and before that, I was actually a pre-medical student. I switched majors a lot. The thing in Pakistan is, after studying biology, you can either become a doctor or a dentist. I wanted to do research, but there weren't many options. That is why I decided to switch to computer science, and for that, we have to study mathematics before college. I took a gap year to study math so that I was eligible to apply to an engineering university.

It was very hard to convince my parents to let me study maths at first because they were convinced that being a doctor would be a better choice for me. They finally agreed, and I started studying maths; I had to complete two years of the syllabus, but I had only one year. Right after I started studying, my dad got appendicitis, and we went to get his appendix removed, but it ended up being more than that.

His intestines stopped working, and he was in the hospital for a few months after that. We were just hoping his intestines would start working so we could go home. Then the surgical wound from where they opened him up developed an infection. In order to get support, we had to move to Lahore, where the rest of my relatives live. When we moved to Lahore, they cleaned his wound, and it got infected again. He was on bed rest for about two months. His movements were minimal, which led to a pulmonary embolism (blood clots had lodged in his lungs). One day, he was going to the bathroom, when all of a sudden he passed out and nobody knew what was happening.

Everybody was at the hospital and nobody could figure out what happened. The doctors thought maybe he had a heart attack. He was taken to the ICU. The doctors started giving him CPR. I think he was gone for a minute or two, but the doctors were successful at bringing him back. They put him on life support for a couple of days and that's when we really lost all hope.

A couple of days later he finally woke up and we found out that he had had a pulmonary embolism.

I know a lot of people go through a lot of things, and this is nothing compared to most of them. When my parents were in the hospital, I was looking after my siblings. I had to tell them what was happening.

At the same time, I also had to focus on my studies. Even though there were a lot of people who did support us, at times like these you really find out who is actually there for you and who isn't. And a lot of people backed out. My friends would tell me, you should be with your dad instead of even worrying about your studies. To avoid talks like these, I would hide while I studied. So after four months of being in the hospital and staying with my relatives, we could finally come home.

We were ecstatic.

I remember my mom telling me that she wanted to go out in the streets and shout for joy.


We came back home, it was all fine, and I took my math exams after covering two years' worth of syllabus in about four months, and that too under extreme stress.

I came back after my last exam, ready to prepare for my college entrance tests, and something odd happened. I fell sick out of the blue. I had nausea 24/7. I couldn’t eat or drink. I would vomit if I tried, and I started to lose weight.

My parents took me to multiple doctors thinking that my stomach was upset. Months went by but we couldn’t figure out what was wrong. Later, I was diagnosed with severe anxiety which stemmed from the incident with my dad.

I remember my mom telling me, “When your dad was in the ICU, I would sit outside it all night and every day there was a new body being taken out of the ward. So, every time I saw the doors open, I hoped it wasn’t your dad’s body.” I could understand her feelings because that was exactly how I felt every time my mom called me from the hospital. I felt my heart drop every time my phone rang.

The fear had gotten stronger, and now I had severe anxiety accompanied by recurring panic attacks. The fear that I might lose my parents kept me up all night. Whenever one of them left the house, I would call them repeatedly to check up on them. I never turned my phone on silent while in class, because I was always fearing a call with bad news.

I started medication, and my anxiety slowly started getting better. Throughout the recovery, my mother was always by my side. She distracted me when I had terrible thoughts. I felt safe only in her company.

My entrance test results finally came. I was accepted into one of the top CS universities in Pakistan.

My recovery still continues, but today I feel great because I am sharing my story publicly for the first time. I have always been told to keep it quiet, as if talking about mental health problems is some sort of taboo. From my personal experience, I have realized that talking about it is what helps us get better. I hope I encourage people to speak up and share their stories.


Building a Risk Classifier for a PTSD Assessment Chatbot



Using MLFlow to structure a Machine Learning project and support the backend of a PTSD risk classifier chatbot.



The Problem: Classification of Text for a PTSD Assessment Chatbot

The input

A text transcript similar to:


therapist and client conversation snapshot


The output

Low Risk -> 0, High Risk -> 1

One of the requirements of this project was to have a productionized Machine Learning model for Text Classification regarding PTSD that could communicate with a frontend, for example a chatbot.

As part of the solution to this problem, we decided to explore the MLFlow framework.



MLflow is an open-source platform to manage the Machine Learning lifecycle, including experimentation, reproducibility, and deployment. It currently offers three components:

MLFlow Tracking: Allows you to track experiments and projects.

MLFlow Models: Provides a model and framework to persist, version, and serialize models in multiple platform formats.

MLFlow Projects: Provides a convention-based approach to set up your ML project, so you benefit from the maximum of the work put into the platform by the developer community.

The main benefits identified in my initial research were the following:

  • Works with any ML library and language
  • Runs the same way anywhere
  • Designed for small and large organizations
  • Provides a best-practices approach for your ML project
  • Serving layers (REST + batch) come almost for free if you follow the conventions



The Solution


The focus of this article is to show the baseline ML models and how MLFlow was used to aid in Text Classification, training-run experiment tracking, and productionization of the model.


Installing MLFlow

pip install mlflow


Model development tracking

The snippet below shows our cleaned and tidy data after data munging:


snapshot of a table containing transcript_id, text, and label as the column headings


The gist below describes our baseline (dummy) logistic regression pipeline:


from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

train, test = train_test_split(final_dataset,
                               random_state=42, test_size=0.33, shuffle=True)
X_train = train.text
X_test = test.text

LogReg_pipeline = Pipeline([
    ('tfidf', TfidfVectorizer(sublinear_tf=True, min_df=5,
                              norm='l2', encoding='latin-1',
                              ngram_range=(1, 2), stop_words='english')),
    ('clf', LogisticRegression()),
])

The link to this code is given here.

One of the first useful things you can do with MLFlow during Text Classification model development is to log a model training run. You would log, for instance, an accuracy metric, and the generated model will also be associated with this run.


import mlflow
import mlflow.sklearn
from sklearn.metrics import accuracy_score

with mlflow.start_run():
    LogReg_pipeline.fit(X_train, train["label"])
    # compute the testing accuracy
    prediction = LogReg_pipeline.predict(X_test)
    accuracy = accuracy_score(test["label"], prediction)
    mlflow.log_metric("model_accuracy", accuracy)
    mlflow.sklearn.log_model(LogReg_pipeline, "LogisticRegressionPipeline")


The link to the code above is given here.


At this point, the model above is saved and reproducible if needed at any point in time.

You can spin up the MLFlow tracker UI so you can look at the different experiments:


$ mlflow ui -p 60000
[2019-09-01 16:02:19 +0200] [5491] [INFO] Starting gunicorn 19.7.1
[2019-09-01 16:02:19 +0200] [5491] [INFO] Listening at: http://127.0.0.1:60000 (5491)
[2019-09-01 16:02:19 +0200] [5491] [INFO] Using worker: sync
[2019-09-01 16:02:19 +0200] [5494] [INFO] Booting worker with pid: 5494


The backend of the tracker can be either the local filesystem or cloud storage (S3, Google Cloud Storage, etc.). It can be used locally by one team member, or distributed so experiments are shared and reproducible across a team.
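As an illustration of the distributed setup (the database file and bucket name below are placeholders, not from the project), a shared tracker backed by remote storage could be started like this:

```shell
# Hypothetical setup: serve the tracker with a database-backed store and an
# S3 bucket for artifacts (bucket name is a placeholder).
mlflow server \
  --backend-store-uri sqlite:///mlflow.db \
  --default-artifact-root s3://my-mlflow-artifacts/ \
  --host 0.0.0.0 --port 60000
```

Team members would then point their clients at this server instead of the local filesystem.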

The image below shows a couple of model training runs, in conjunction with the metrics and model artifacts collected:


Experiment Tracker in MLFlow screenshot

Sample of experiment tracker in MLFlow for Text Classification


Once your models are stored, you can always go back to a previous version of the model and re-run it based on the id of the artifact. The logs and metrics can also be committed to GitHub to be stored in the context of a team, so everyone has access to the different experiments and their resulting metrics.


MLFlow experiment tracker


Now that our initial model is stored and versioned, we can access the artifact and the project at any point in the future. The integration with sklearn is particularly good because the model is automatically pickled in an sklearn-compatible format and a Conda file is generated. You could also have logged a reference to a URI and checksum of the data used to generate the model, or the data itself if within reasonable limits (preferably if the information is stored in the cloud).


Setting up a training job

Whenever you are done with your model development you will need to organize your project in a productionizable way.

The most basic component is the MLproject file. There are multiple options to package your project: Docker, Conda, or bespoke. We will use Conda for its simplicity in this context.


name: OmdenaPTSD

conda_env: conda.yaml

entry_points:
  main:
    command: "python train.py"


The entry point declares the command that should be run when the project is executed, in this case a training file.

The conda file contains a name and the dependencies to be used in the project:


name: omdenaptsd-backend
channels:
  - defaults
  - anaconda
dependencies:
  - python=3.6
  - scikit-learn=0.19.1
  - pip:
    - mlflow>=1.1


At this point you just need to run the project with the mlflow run command.


Setting up the REST API classifier backend

To set up a REST classifier backend you don't need any job setup. You can use a persisted model from a Jupyter notebook.

To run a model you just need to run the models serve command with the URI of the saved artifact:


$ mlflow models serve -m runs:/104dea9ea3d14dd08c9f886f31dd07db/LogisticRegressionPipeline
2019/09/01 18:16:49 INFO mlflow.models.cli: Selected backend for flavor 'python_function'
2019/09/01 18:16:52 INFO mlflow.pyfunc.backend: === Running command 'source activate
mlflow-483ff163345a1c89dcd10599b1396df919493fb2 1>&2 && gunicorn --timeout 60 -b 127.0.0.1:5000 -w 1 mlflow.pyfunc.scoring_server.wsgi:app'
[2019-09-01 18:16:52 +0200] [7460] [INFO] Starting gunicorn 19.9.0
[2019-09-01 18:16:52 +0200] [7460] [INFO] Listening at: http://127.0.0.1:5000 (7460)
[2019-09-01 18:16:52 +0200] [7460] [INFO] Using worker: sync
[2019-09-01 18:16:52 +0200] [7466] [INFO] Booting worker with pid: 7466


And a scalable backend server (running gunicorn) is ready without any code apart from your model training and logging the artifact in the MLFlow packaging strategy. It basically frees Machine Learning engineering teams that want to iterate fast from the initial cumbersome infrastructure work of setting up a repetitive and uninteresting boilerplate prediction API.

You can immediately start sending prediction requests to your server:


curl http://127.0.0.1:5000/invocations -H 'Content-Type: application/json' -d \
'{"columns":["text"],"data":[["concatenated text of the transcript"]]}'


The smart thing here is that the MLFlow scoring module uses the sklearn model input (pandas schema) as a spec for the REST API. sklearn was the example used here, but there are bindings for H2O, Spark, Keras, TensorFlow, ONNX, PyTorch, etc. It basically infers the input from the model packaging format and offloads the data to the scoring function. It's a very neat software engineering approach to a problem faced every day by machine learning teams, freeing engineers and scientists to innovate instead of working on repetitive boilerplate code.
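To make the payload shape concrete, here is a minimal sketch (standard library only) of building the pandas-oriented JSON body that the scoring server expects; the transcript string is a placeholder:

```python
import json

# Build the request body in the "columns"/"data" layout the scoring server
# expects; the transcript string is a placeholder.
payload = {
    "columns": ["text"],
    "data": [["concatenated text of the transcript"]],
}
body = json.dumps(payload)

# Round-trip to confirm the body is valid JSON with the expected shape.
decoded = json.loads(body)
```

The server answers with a JSON list of predictions, one entry per row of "data".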

Going back to the Omdena challenge, this backend is available for the frontend team to connect the chatbot app, at its most convenient point, to the risk classifier backend (most likely after a critical mass of open-ended questions).



More About Omdena

Omdena is an innovation platform for building AI solutions to real-world problems through the power of bottom-up collaboration.

Neural Transfer Learning in NLP for Post-Traumatic-Stress-Disorder Assessment


The main goal of the project was to research and prototype technology and techniques suitable to create an intelligent chatbot to mitigate/assess PTSD in low resource settings.


The Problem Statement

“The challenge is to build a chatbot where a user can answer some questions and the system will guide the person with a number of therapy and advice options.”

We were allocated to the ML modeling team of the challenge. Our initial scope was narrowing the problem down to the most relevant specific use case. After some iterations and consultations within the team, we decided, among multiple possible avenues (e.g. conversational natural language algorithms, expert systems, etc.), to tackle the problem with a binary risk-assessment classifier based on labeled DSM-5 criteria. The working hypothesis was that the classifier could be used as the backend of a chatbot on a low-resource device that could detect the risk and refer the user to more specialized information, or as a screening mechanism (in a refugee camp, in a resource-depleted health facility, etc.).

The frontend of the system would be a chatbot (potentially conversational, mixed with open-ended questions), and one of the classifiers would be a risk assessment based on the conversation.

The tool is strictly informational/educational, and under no circumstances is it intended to replace health practitioners.

Our team psychologist guided the annotation process. After a couple of iterations, we ended up with a streamlined process that allowed us to classify ~50 transcripts (each containing the full conversation of a therapy session).


The Baseline

Baseline algorithm implementations by different team members demonstrated that, without further data preprocessing, traditional ML methods reached an accuracy rate of around 75%. Given the fact that we had a serious category imbalance issue, accuracy alone is definitely not a metric to rely on. An article is in the works with the details of the baseline infrastructure and the traditional ML techniques applied to text classification problems.


The Data

The annotation team ended up having access to 1,700 transcripts of sessions. After careful inspection, the team realized that only around 48 transcripts concerned actual PTSD issues.

Training examples: 48 PTSD transcripts, each with an average of 2,000+ lines

Example of an excerpt of a transcript available in [3]:



Target definition: No Risk Detected -> 0, Risk Detected -> 1



From an NLP/ML problem taxonomy perspective, the amount of data is extremely limited, so this would be classified as a few-shot classification problem [4].

Prior art on using these techniques when data is limited prompted the team to explore the transfer learning avenue in NLP, which has shown recent encouraging results in few-shot training, together with data augmentation through back-translation techniques.

The picture below shows a pandas DataFrame that resulted from an intense data munging process and target calculations (based on DSM-5 manual recommendations), and the amazing work of our annotation team:




The Solution




The ULMFiT algorithm was one of the initial techniques to provide effective neural transfer learning with success on state-of-the-art NLP benchmarks [1].

The algorithm and the paper introduce a myriad of techniques to improve the efficiency of RNN training. We will delve into the most fundamental ones below.

The basic assumption in modern transfer learning for NLP problems is that all text inputs will be transformed into numeric values based on word embeddings [8]. In that way, we ensure a semantic representation and, at the same time, numeric inputs to feed the neural network architecture at hand.
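A toy sketch of the idea, in plain Python (the vocabulary and vector values are invented for illustration, not real embeddings):

```python
# Toy illustration (not a real embedding matrix): each token in the input
# text is mapped to a fixed-length numeric vector before entering the network.
embeddings = {
    "flashback": [0.21, -0.73, 0.05],
    "nightmare": [0.19, -0.68, 0.11],
    "weather":   [-0.54, 0.33, 0.87],
}

def embed(tokens):
    """Replace each known token with its vector; unknown tokens get zeros."""
    dim = 3
    return [embeddings.get(t, [0.0] * dim) for t in tokens]

vectors = embed(["flashback", "weather", "unknown_word"])
```

In a real embedding space, semantically close words (like "flashback" and "nightmare" above) end up with similar vectors.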

From a context perspective: traditional ML relies solely on the data that you have for the learning task, while transfer learning trains on top of the weights of neural networks pre-trained on a large corpus (examples: Wikipedia, public-domain books). Successes for transfer learning in NLP and Computer Vision have been widespread in the last decade.


Copied from [5]



Transfer learning is a good candidate when you have few training examples and can leverage existing pre-trained powerful networks.

ULMFiT works as shown in the diagram below:


Copied from [5]



  • A language model is pre-trained (for example with Wikipedia data)
  • The model is fine-tuned on your own corpus (not annotated)
  • A classifier layer is added to the end of the network

A simple narrative for our case is the following: the model learns the general mechanics of the English language from the Wikipedia corpus. We then specialize the model with the available transcripts, both annotated and not annotated, and in the end we turn it into a classifier by replacing the final sequence layer with a regular softmax-based classifier.
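A minimal sketch of that final classification "head" in plain Python (the feature values and weights below are invented for illustration; in practice this layer is part of the fastai network):

```python
import math

# Sketch of the classifier head described above: the encoder's final feature
# vector passes through a linear layer and a softmax, giving probabilities
# for the two classes (0 = low risk, 1 = high risk). All numbers are made up.

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

features = [0.4, -1.2, 0.9]                      # toy encoder output
weights = [[0.1, 0.2, -0.3], [-0.2, 0.5, 0.7]]   # one row per class
biases = [0.0, 0.1]

logits = [sum(w * f for w, f in zip(row, features)) + b
          for row, b in zip(weights, biases)]
probs = softmax(logits)
predicted_class = probs.index(max(probs))
```

The training steps that follow learn exactly these weights and biases, on top of the frozen (and later unfrozen) encoder.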


LSTM & AWD Regularization Technique

At the core of the ULMFiT implementation are bidirectional LSTMs and a regularization technique called AWD (Averaged Stochastic Gradient Descent, Weight-Dropped).

LSTM (Long Short-Term Memory) networks are the basic block of state-of-the-art deep learning approaches to transfer learning for sequence prediction problems in NLP. A sequence prediction problem consists of predicting the next word given the previous text:


Copied from [6]



LSTMs are ideal for language modeling and sequence prediction (and are increasingly being used in time series forecasting as well) because they maintain a memory of previous input elements. Each X element in our particular situation would be a token that generates an output (sequence) and is sent to the next block, so it is considered during the ht output calculation. Optimal weights are backpropagated through the network, driven by the appropriate loss function.

One component of this regularization technique (WD, Weight-Dropped) involves introducing dropout on the weights of the hidden-to-hidden state connections, which is quite unique compared with standard dropout techniques.


Copied from [7]



The other component of the regularization is Averaged Stochastic Gradient Descent, which, instead of using only the current step's weights, also takes previous steps into consideration and returns an average [7]. More details about the implementation can be found here.
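A toy numeric illustration of the averaging idea (the weight values below are invented for illustration):

```python
# Toy illustration of the averaging idea in ASGD: instead of keeping only
# the latest weight value, we return the average of the weights seen over
# the last few steps. All numbers are invented.

def averaged_sgd_weight(history, k):
    """Average the last k weight values instead of taking only the newest."""
    window = history[-k:]
    return sum(window) / len(window)

weight_history = [0.90, 0.80, 0.78, 0.76]  # weight after each SGD step
plain_sgd = weight_history[-1]             # standard SGD keeps 0.76
averaged = averaged_sgd_weight(weight_history, 3)
```

Averaging smooths out the noise of individual gradient steps, which is part of why it helps regularize the recurrent weights.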

A more detailed ULMFiT diagram can be seen below, where the LSTM components are described along with the different steps of the implementation of the algorithm:


Copied from [5]



General-Domain LM (Language Model) Pretraining

This is the initial phase of the algorithm, where a language model is pre-trained on powerful machines with a public corpus. The language model problem is very simple: given a phrase, predict the probabilities of the next word (probably one of the most ubiquitous uses of Deep Learning in our daily lives):



For this problem in particular, we will use the available fastai implementation of ULMFiT to elucidate the process in practical terms:

In order to choose the ULMFiT implementation in fastai, you have to specify the AWD_LSTM language model, as mentioned before.


language_model_learner(data_lm, AWD_LSTM, drop_mult=0.5)


The code above does a lot in the good style of high-level libraries: sklearn is used to produce a train and validation set, and fastai is used to instantiate a ULMFiT language model learner.


Target task LM Fine-Tuning

In the code presented in the language model section, we basically instantiate a pre-trained ULMFiT language model with the right configuration of the algorithm (there are other options for language models, such as Transformer-XL and QRNNs):


from fastai.text import language_model_learner, TextLMDataBunch, AWD_LSTM
from sklearn.model_selection import train_test_split

# split data into training and validation set
df_trn, df_val = train_test_split(final_dataset,
                                  stratify=final_dataset['label'],
                                  test_size=0.3, random_state=12)
data_lm = TextLMDataBunch.from_df(train_df=df_trn,
                                  valid_df=df_val, path="")
learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.5)


The (pseudo)code above basically retrieves our stratified training and validation datasets and creates a language_model_learner based on our own data. The important detail of the language model is that it doesn't need annotated data (perfect for our situation, with limited annotated data but a bigger corpus of non-annotated transcripts). Basically, we are creating a language model for our own specialized domain on top of the huge, general, Wikipedia-scale language model.


learn.fit_one_cycle(1, 1e-2)


The code above fine-tunes the pre-trained language model, executing one cycle of training on top of the new data with the specified learning rate.

Part of the process of ULMFiT is applying discriminative learning rates through the different cycles of learning:


For a neural language model, an accuracy of around 30% is considered acceptable given the size of the corpus and the possibilities [1].


After this point we are able to generate text from our very specific language model:


Excerpt from text generated from our language model.


At this point, we have a reasonable text generator for our specific context. The ultimate value of ULMFiT is in the ability to transform a language model into a relatively powerful text classifier.


The fine-tuned language model is then saved for further reuse.


Target Task Classifier

The last step of the ULMFiT algorithm is to replace the last component of the language model with a classifier softmax "head" and train it on the labeled data specific to our project, i.e. the PTSD-annotated transcripts.


classifier = text_classifier_learner(data_clas,
                                     AWD_LSTM, drop_mult=0.5).to_fp16()
classifier.fit_one_cycle(1, 1e-2)
# Unfreeze and train a bit more
classifier.unfreeze()
classifier.fit_one_cycle(3, slice(1e-4, 1e-2))


The same technique of discriminative learning rates was used above for the classifier, with much better accuracy rates. Classifier results specifically were not the main goal of this article; a subsequent article will delve into fine-tuning ULMFiT, a comparison and addition of classifier-specific metrics, and the use of data augmentation techniques such as back-translation and different re-sampling techniques.


Initial results of the ULMFiT-based classifier.




A Faster Way to Annotate Transcript Data in PTSD Therapy Sessions


The Problem


This project was done with Christoph von Toggenburg, CEO of World Vision Switzerland, who experienced Post-Traumatic Stress Disorder after an armed ambush in Africa. PTSD can be triggered when someone experiences a severe traumatic event and, instead of the trauma leveling off, it becomes a mental health condition.

Symptoms include panic attacks, anxiety, uncontrollable thoughts, and more, which can be triggered whenever the person is reminded of the event.

“The difference between trauma and PTSD is that switch in your brain, and it becomes a part of your life. It is something you cannot reverse, but you can deal with the symptoms, and if treated properly, you can get much better” — Christoph

Christoph started BEATrauma, an initiative to help victims with PTSD therapy all around the world. His vision is to create a mobile risk-assessment chatbot app that converses with users and determines a PTSD risk assessment using Cognitive Behavioral Therapy (CBT), and that would implement machine learning. That's where we come in!


The Data Problems — Not Annotated, Not Enough

Data is not always easy to find, especially when dealing with sensitive user information like therapy sessions. Through our community network, though, we were able to get around 1,700 transcripts of therapy sessions, only about 50 of which were for PTSD.


The Solution

From a traditional treatment standpoint, we discovered that CBT (Cognitive Behavioral Therapy) was the best fit for PTSD therapy with a risk-assessment chatbot. In CBT, a therapist talks with the patient about their experiences and "exposes" them to the trauma gradually until they finally become comfortable with it. Knowing that we could implement a conversational NLP agent for this purpose, we set our sights on preparing training data for the risk-assessment chatbot.

We split into two groups. One was in charge of risk assessment, creating a rule-based algorithm in Rasa with sentiment analysis to converse with the user, along with a backend classification model trained on transcript data to determine if the user had PTSD. The other focused on CBT, training a seq-to-seq chatbot for therapy!

This article describes the data annotation part. Since the transcripts came completely unlabelled, we had to give them a score between 0 and 1 so that the model could learn which patients had PTSD and which didn't. One of our project collaborators had experience with statistics and psychology and guided the team of seven through reading the transcripts and scoring them!


The Annotation Process

  • Understand each of the 6 criteria for PTSD. E.g., exposure to actual or threatened death, serious injury, or sexual violence; persistent avoidance of stimuli associated with the traumatic event(s); and more!
  • Keeping the criteria in mind, read an entire transcript (which can take from 45 minutes to 1 hour).
  • Score each of the 6 criteria with either a 0, 0.5, or 1, where 0 means not displaying the symptom at all, 0.5 means somewhat displaying it, and 1 represents a clear expression of that symptom.
  • Follow a formula that takes in all 6 numbers and spits out a number between 0 and 1 as the PTSD risk assessment.
  • Rinse and repeat for the other 49.
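As an illustrative sketch of the scoring step above (assuming a simple mean as the aggregation; the formula the team actually used may have weighted the criteria differently):

```python
# Hypothetical sketch: each of the 6 criteria is scored 0, 0.5, or 1, and a
# simple aggregation (here, the mean) maps them to a single 0-1 risk score.
# The team's exact formula may have differed.

def risk_score(criteria_scores):
    """Combine six per-criterion scores (each 0, 0.5, or 1) into one 0-1 value."""
    assert len(criteria_scores) == 6
    assert all(s in (0, 0.5, 1) for s in criteria_scores)
    return sum(criteria_scores) / len(criteria_scores)

score = risk_score([1, 0.5, 1, 0, 0.5, 1])
```

A transcript displaying every symptom clearly would score 1.0, and one displaying none would score 0.0.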


Points of Criterion A explained (CAPS-5)

Criterion A’s description


We faced two problems in our annotation process. The first was that it took far too long to annotate all the data; through complications and busy schedules, it took around two weeks of hard work to finish. The second was that the transcripts were often a bit unclear and difficult to understand.

We brainstormed several solutions to the annotation problem:

  • Determine a bag of words and their embeddings for each criterion and run LDA (Latent Dirichlet Allocation) on top of them to classify each criterion, completely automating the process
  • Use USE (Universal Sentence Encoder) to determine the cosine similarity of each sentence, matching sentences of the same criterion
  • Use GPT-2 to summarize each transcript to get the main idea, speeding up the annotations


Creating the Risk Assessment Chatbot

From there, we had to create a classification model that takes in user conversations and determines if the user has PTSD. Another task group had a breakthrough with ULMFiT's transfer learning technique, which resulted in 80% accuracy, a very good start that is currently being further improved through data augmentation methods.


Ready to run the advanced models soon!



