How to Build a Chatbot Using Rasa: Use Case of an AI Driving Assistant
February 17, 2022
A step-by-step tutorial on how to build a chatbot using Rasa (including a video explanation at the end of the article).
The project partner: The use case stems from Consenz, which hosted an Omdena Challenge as part of Omdena’s AI Incubator for impact startups.
Introduction
Chatbots, or conversational agents, have become very popular in the past years. We come across them constantly in our everyday life, often without even noticing (remember the chat box in the app of your bank, mobile provider, or on your favorite online shopping website?).
Chatbots are there to help us with many tasks, such as setting up alarms, scheduling meetings, or finding an answer to a question on a company website. Why are chatbots important? They step in when there aren’t enough human assistants to handle all the requests, making assistance available to a large number of users.
Chatbots use technology such as Natural Language Understanding (NLU) and Machine Learning (ML) to understand users’ requests, produce adequate responses, and perform correct actions. Chatbot development is an exciting and fast-growing domain of ML, with a lot of open- or closed-source tools, courses, and tutorials available to help you build your own chatbot.
That’s exactly what we set out to do in a recent AI challenge organized by Omdena and Consenz: build a chatbot to be the core of Enzo – a hands-free driving assistant, whose main goal is reducing the number of traffic accidents. To that end, the chatbot should be able to understand the driver’s commands regarding navigation, music, calls, etc, and respond to them adequately.
We have all heard about chatbots and all were interested in developing one ourselves. We were aware of tools such as Dialogflow or Rasa and had previously followed some courses or tutorials on chatbot development. We were eager to apply in practice what we had learnt.
However, even with all available resources, it turned out not to be so easy to get started with building a chatbot. The abundance of information makes it hard to find a good starting point, and many technical details and features are tricky to master and can be confusing for a beginner. In our development process, we were learning a lot every day and had a lot of “I wish I’d known this before!” moments. That’s why we decided to write this article – not intending to repeat any courses or tutorials, but willing to share our experience and learnings to help others get started in chatbot development easier.
Read on to learn about the key concepts of chatbot development, an overview of the development, our thoughts regarding the choice of a chatbot tool, an introduction to Rasa conversational agents (with code snippets and examples of data and configurations), and a hands-on session allowing you to get started with creating a chatbot quickly.
Key chatbot concepts
Before we dive into the details of how we were building our driver’s assistant agent, it’s useful to define the key concepts of chatbot development:
- Utterance is anything the user says to the bot, for example, “Get me to 221b Baker St”, “Drive from my current location to Sherlock Holmes museum avoiding the center of the city”, “Start navigation”.
- Intent is the user’s goal in saying an utterance. In the examples above, the user wants to be navigated to a place, so we could call the intent “navigate_to”.
- Entities can be seen as “parameters” of an intent, making it specific and allowing the bot to understand the exact action the user wants to be performed. Entities typically have types assigned to them. In the examples above, “221b Baker St”, ”my current location”, “Sherlock Holmes museum” and “the center of the city” are all entities. Their types could be “destination” for “221b Baker St” and “Sherlock Holmes museum”; “origin” for ”my current location”; and “avoid_point” for “the center of the city”.
- Response is anything the bot says in reaction to the user’s utterances, e.g. “you are welcome” in response to “thank you”.
- Action is something that the bot needs to do to adequately fulfill the goal of the user. For many intents, a simple response is not enough: the bot should not just say “starting navigation” in response to “navigate_to” intent, it should also memorise the destination, origin, points to be included or avoided, and actually connect to a maps API, calculate the route according to the given parameters, and start navigating.
- Conversation flow is the way a conversation develops over a few conversational turns. We want the interaction of the user and the chatbot to be smooth and natural, which would indicate that we designed the conversation flow well.
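To make these concepts concrete, here is one of the utterances above annotated with them (the intent, entity, and action descriptions are purely illustrative):

```yaml
utterance: "Drive from my current location to Sherlock Holmes museum avoiding the center of the city"
intent: navigate_to
entities:
  - value: "my current location"
    type: origin
  - value: "Sherlock Holmes museum"
    type: destination
  - value: "the center of the city"
    type: avoid_point
response: "Starting navigation"
action: "calculate the route via a maps API and start navigating"
```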
The scheme below shows an example of steps involved in a user/bot interaction:
How do you develop a chatbot?
One reason why developing a chatbot is a demanding task for a beginner is that it consists of two large, equally complex components.
First, there is the conversation design part, which includes answering the following questions:
- What is the user expected to say/request?
- Which variants of a request are the same request and which should be treated differently?
- How is the bot expected to reply and act?
- What are the possible ways a conversation can go?
and more. This part is largely independent of the chatbot tool. Working on the conversation design of your bot is an iterative process: there will always be unexpected requests or turns of conversation when your users interact with the bot. So start with the conversation design, but don’t expect to finish it before you proceed to implementation.
Second, there is the implementation part, which deals with the following questions:
- How do we formalise the behaviour we want?
- What data do we need, in which formats? How do we obtain the data?
- Which packages do we need?
- Which model do we train?
- How do we test?
- What code do we need to write?
This part depends on the tool you choose, and it will also feed back into the conversation design as you identify incorrect conversation flows during testing.
Choice of a chatbot tool
There are many tools for chatbot development: Google’s Dialogflow, Amazon Lex, Mycroft, Rasa, and more. All tools have their own advantages and disadvantages with regards to being open- or closed-source, price, necessity (or absence thereof) to write any code, complexity of installation and getting started, and so on.
No one would like to realise that the tool they have been using for months does not have the desired functionality, so we all want to get it right from the beginning. But when you are just starting your chatbot journey and still defining what exactly you want from your bot, it can be daunting to define which tool would suit your needs best. After a good deal of research, we chose Rasa because:
- its core component is open-source and free
- it allows you to incorporate many external state-of-the-art NLP models
- it is highly customizable and flexible (more on that later)
- it uses Python which is our language of choice
- it has an easy-to-use command-line interface for common tasks such as training, testing, or running models
- it has great documentation, written and video tutorials (check out “Conversational AI with Rasa” series on Youtube).
Creating a Rasa bot
Installing Rasa
Rasa Open Source packages are available in the Python Package Index (PyPI). The default installation can be performed with the pip install rasa command and requires a Python version >=3.7, <3.9. However, some of the most useful functionality, such as the state-of-the-art transformer-based NLP models, is not included in the default installation, so we’d recommend using pip install rasa[full] instead.
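Once installed, you can run rasa --version on the command line, or do a quick sanity check from Python (this is just a check that the package imports, not part of the Rasa workflow):

```python
import rasa

# Print the installed Rasa Open Source version string
print(rasa.__version__)
```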
Rasa models
It’s very important to understand that Rasa agents consist of two models: an NLU model and a core model. The NLU model is responsible for recognising the intents of the user’s utterances and for extracting the entities present in them. The core model is responsible for managing the conversation flow: memorising the entities, producing the correct response and/or action, understanding when it’s time to wait for the next user’s utterance, and so on. The two models are trained on different data. It is possible to train NLU and core models separately with Rasa, but to get a complete agent capable of interacting with a user, we need to have both components.
Although the core model decides what is the correct action to perform, it does not actually perform it. There is a separate action server that is in charge of running actions.
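For reference, the action server is registered in the endpoints.yml file of your project; the entry generated by rasa init (commented out until you actually need custom actions) looks roughly like this:

```yaml
# endpoints.yml
action_endpoint:
  url: "http://localhost:5055/webhook"
```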
Creating a project
To train a Rasa model, we need data for the NLU component and for the core component, and we also need to define the responses and actions the bot is expected to perform. Hence, the file structure needed to train a Rasa bot is quite complex. Luckily, you do not need to set it up manually: you can run rasa init in a terminal to create your first project. Although this gives you the correct folder structure, there are a lot of sub-folders and files, and figuring out what to do next can be overwhelming. So let’s take a closer look at what’s in there.
What does a Rasa bot consist of?
The folder structure of our newly created agent is as follows:
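A freshly initialised Rasa 2.x project looks roughly like this (the comments are ours):

```
.
├── actions/
│   ├── __init__.py
│   └── actions.py        # custom actions (see below)
├── config.yml            # NLU pipeline and core policies
├── credentials.yml       # credentials for external channels
├── data/
│   ├── nlu.yml           # NLU training data
│   ├── rules.yml         # core training data: rules
│   └── stories.yml       # core training data: stories
├── domain.yml            # intents, entities, slots, responses, actions
├── endpoints.yml         # e.g. the action server endpoint
├── models/               # trained models are stored here
└── tests/
    └── test_stories.yml  # test stories
```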
You’d typically start creating a bot from some amount of NLU training data, then core data and actions. But we are going to talk about actions first because we’ll need some concepts for explaining training data later on.
Custom actions
The actions.py file defines the custom actions that the bot can perform in reply to the user’s commands. This is one of Rasa’s most powerful customisation capabilities: you can really do a lot with custom actions.
Two key components used in actions are the dispatcher, which can generate responses, and the tracker, which stores the bot’s memory. The bot’s memory contains information about what the user said: the text of utterances, intents, and entities. A separate part of a Rasa bot’s memory is slots – variables where you can store important information. Slots can be of various types such as string, list, bool, etc. You don’t always need custom actions to work with slots: in Rasa 2.x, a slot is filled automatically with an entity of the same name (in Rasa 3.x, you need to specify which entity should fill which slot), e.g. a “destination” slot will be filled when a “destination” entity is extracted from the input utterance. However, in some cases, you might want to fill slots in a different way. In this case, you can use slot names that differ from entity names, and fill them through actions.
For example, when a user gives the bot a destination, we might want to check whether the “main_destination” slot is already filled: if it is, we add the new destination to the “include-point” list, instead of rewriting the main destination:
```python
class ActionAddDestination(Action):
    def name(self) -> Text:
        return "action_add_destination"

    def run(self, dispatcher: CollectingDispatcher,
            tracker: Tracker,
            domain: Dict[Text, Any]) -> List[Dict[Text, Any]]:
        # Latest "location" entity with the role "to" (the destination the user just gave)
        entity_to = next(
            tracker.get_latest_entity_values(entity_type="location", entity_role="to"),
            None,
        )
        slot_main_destination = tracker.get_slot('main_destination')
        slot_include_point = tracker.get_slot('include-point')

        if slot_main_destination is None:
            # No main destination yet: store the new destination there
            return [SlotSet("main_destination", entity_to)]
        # Otherwise, add the new destination to the list of points to include
        if slot_include_point is not None:
            slot_include_point.append(entity_to)
        else:
            slot_include_point = [entity_to]
        return [SlotSet("include-point", slot_include_point)]
```
A few points to note:
- To fill several slots at a time, we would use return [SlotSet("main_destination", entity_to), SlotSet("include-point", slot_include_point), …].
- To add a new item to a list slot (include-point), we first read the slot value with slot_include_point = tracker.get_slot('include-point'), then append the new item to slot_include_point, and finally return [SlotSet("include-point", slot_include_point)].
- The name() function defines the name we use to refer to the action in the core training data (action_add_destination in this example).
Another cool feature that can be implemented using custom actions is using a database to store user-specific information, e.g. a home address. We can write an action saving items to a database:
```python
class ActionAddToDatabase(Action):
    def name(self) -> Text:
        return "action_add_to_database"

    def run(self, dispatcher: CollectingDispatcher,
            tracker: Tracker,
            domain: Dict[Text, Any]) -> List[Dict[Text, Any]]:
        # The location mentioned by the user
        entity_location = next(
            tracker.get_latest_entity_values(entity_type="location"), None
        )
        # The entity with the role "to_database" – the name to store the location under
        entity_db_item_name = next(
            tracker.get_latest_entity_values(entity_type="location", entity_role="to_database"),
            None,
        )
        with open('utils/database.json', 'r', encoding='utf-8') as db:
            database = json.load(db)
        database.update({entity_db_item_name: entity_location})
        with open('utils/database.json', 'w', encoding='utf-8') as db:
            json.dump(database, db)
        return []
```
When a user triggers the corresponding intent, the database file gets updated:
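For illustration, after the user says something like “Save 6 London drive as home”, the database.json file used in this sketch would contain an entry of the form:

```json
{
  "home": "6 London drive"
}
```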
After that, we just need to add a few more lines to our ActionAddDestination to read that information back:
```python
with open('utils/database.json', 'r', encoding='utf-8') as db:
    database = json.load(db)
# If the user said a saved name (e.g. "home"), replace it with the stored address
if entity_to in database:
    entity_to = database[entity_to]
```
The user can now ask to go home, instead of providing the full address each time:
We can further use a database to store the user’s navigation history, e.g.:
```python
database = {
    "destinations": [
        ["5 main street", "17.50 23.11.21"],
        ["12 royal court drive", "17.45 16.11.21"],
        ["12 royal court drive", "17.51 09.11.21"],
        ["5 main street", "17.48 02.11.21"],
        ["5 main street", "17.49 26.10.21"],
    ]
}
```
We can see that our user tends to drive to the same places at similar times. If they trigger the navigate_to intent at one of those times without giving a destination, it makes sense to suggest the place where they usually go at that time. We add a function to do this:
```python
def get_frequent_destination(
    database,
    weeks_limit=5,
    weekly_events_min_count=3,
    daily_events_min_count=10,
):
    time_now = datetime.now()
    weekday_now = time_now.weekday()
    hour_now = time_now.hour
    # Only consider trips made within the last `weeks_limit` weeks
    time_limit = time_now - timedelta(weeks=weeks_limit)

    destinations_by_day_and_time = Counter()  # same weekday and hour as now
    destinations_by_time = Counter()          # same hour as now, any weekday

    if "destinations" in database:
        for destination in database["destinations"]:
            address, trip_date = destination
            address = address.lower()
            # date_string_pattern is defined elsewhere in actions.py, e.g. "%H.%M %d.%m.%y"
            trip_date = datetime.strptime(trip_date, date_string_pattern)
            weekday = trip_date.weekday()
            hour = trip_date.hour
            if trip_date >= time_limit:
                if weekday == weekday_now and hour == hour_now:
                    destinations_by_day_and_time[address] += 1
                if hour == hour_now:
                    destinations_by_time[address] += 1

    # Prefer a destination that matches both the weekday and the hour;
    # fall back to one that matches only the hour
    if destinations_by_day_and_time:
        top_destination, count = destinations_by_day_and_time.most_common(1)[0]
        if count >= weekly_events_min_count:
            return top_destination
    if destinations_by_time:
        top_destination, count = destinations_by_time.most_common(1)[0]
        if count >= daily_events_min_count:
            return top_destination
    return None
```
Now we see the following behaviour – just what we wanted:
Finally, one more fun feature we created using actions and slots is customising the bot’s response based on parameters of the user calculated from their previous utterance(s). A very simple implementation of this feature is to store the average length of every intent, mark the user as “talkative” if their utterance is longer than average and “not talkative” otherwise, and return a longer or shorter response accordingly. We take each entity’s length to be 1: “Drive to 221B Baker Street” and “Drive to London” should be considered the same length. We use a custom action to access the text of the user’s last message and the entities in it, do some length calculations (tokenising simply by whitespace), and fill the is_talkative slot:
```python
class ActionSetUserFeats(Action):
    def name(self) -> Text:
        return "action_set_user_feats"

    def run(self, dispatcher: CollectingDispatcher,
            tracker: Tracker,
            domain: Dict[Text, Any]) -> List[Dict[Text, Any]]:
        user_message = tracker.latest_message['text']
        entities = tracker.latest_message['entities']
        entity_length = sum(len(e['value'].split()) for e in entities)
        intent = tracker.latest_message['intent'].get('name')
        # Count each entity as one word
        length = len(user_message.split()) - entity_length + len(entities)
        # avg_lengths holds pre-computed average utterance lengths per intent
        is_talkative = length > avg_lengths[intent]
        return [SlotSet("is_talkative", is_talkative)]
```
To choose a longer response if the user is talkative, and a default response otherwise, we use a conditional response (responses are specified in domain.yml – see below):
```yaml
utter_navigation_started:
  - condition:
      - type: slot
        name: is_talkative
        value: true
    text: Hey, sure. Starting navigation from {from} to {main_destination}
  - text: Navigating from {from} to {main_destination}
```
Now the bot will reply differently depending on the user’s input:
Now you know how to write custom actions, but one thing is missing: how does the bot know which actions to perform? Read on to the next sections to learn that!
Training data
The data folder contains data for both NLU and core model training.
NLU training data
The data for training the NLU model is contained in nlu.yml. Its main part consists of intent names and example utterances with annotated entities:
```yaml
- intent: Navigate
  examples: |
    - navigate me to [london](to)
    - navigate to [Barrgatan 7](to) from [Moskosel](from)
    - navigate to [Birmingham](to) from [London](from)
```
Note: entity text is enclosed in square brackets, and entity type in parentheses.
After testing, we realised that “to” and “from” entity types often get mixed up: e.g. in “Navigate from Birmingham to London”, both cities would be recognised as the entity type “to”. It makes sense: without context, entities “Birmingham” and “London” are very similar. Entity type “to” is much more frequent in navigation than entity type “from”, which would typically be assumed to be the user’s current location. Hence, “to” is predicted more frequently. Rasa has a solution for this problem: we can assign the same type, but different roles, to the origin and destination, which resolves the issue:
```yaml
- intent: Navigate
  examples: |
    - navigate me to [london]{"entity": "location", "role": "to"}
    - navigate to [Barrgatan 7]{"entity": "location", "role": "to"} from [Moskosel]{"entity": "location", "role": "from"}
    - navigate to [Birmingham]{"entity": "location", "role": "to"} from [London]{"entity": "location", "role": "from"}
```
Some more options for extracting entities include:
- synonyms: variants of an entity that are all mapped to a standard variant. In the navigation bot, we had an intent that saves an address to a database, and we’d like versions like “My address is 6 London drive”, “Save 6 London drive as home”, “I live at 6 London drive”, etc. all to be saved under “home”. Hence we used a synonym mapping:
```yaml
- synonym: home
  examples: |
    - I live at
    - my home address
    - my address
    - my home
```
Note that you do not need to include the standard variant (“home”) in the examples list; it only needs to be mentioned in the first line: - synonym: home. Also note that all variants need to be present in the training examples for the intents. Finally, make sure to include the following in your config.yml (more about the config file later):
```yaml
- name: EntitySynonymMapper
```
- regular expressions: you can define a regex that will be used for rule-based entity extraction:
```yaml
- intent: call_number
  examples: |
    - call [07869613248](phone_number)
    - dial [7869613248](phone_number)

- regex: account_number
  examples: |
    - \d{10,12}
```
- lookup tables: you can define a list of words and phrases belonging to an entity:
```yaml
- lookup: city
  examples: |
    - London
    - Birmingham
    - ...
```
Note that this would only work well if an entity contains a closed list of words. Otherwise you need to train an NER (Named Entity Recognition) model or use an existing one (see Configuration section).
To use regexes and lookup tables, you need to add RegexEntityExtractor and RegexFeaturizer to your config.yml.
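For example, these components would be added to the pipeline section of config.yml roughly like this (the rest of the pipeline is omitted here):

```yaml
pipeline:
  # ... other components ...
  - name: RegexFeaturizer
  - name: RegexEntityExtractor
```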
Core training data
Core training data is vital for the core component of a Rasa model – the one that handles the conversation flow. A core model consists of rules and an ML model. The corresponding data is stored in rules.yml and stories.yml, respectively.
Rules define short dialogue fragments that are fixed: typically one intent and a response and/or action that should always follow it. For example, if the user asks what the bot can do (the what_scope intent), the bot always explains the scope of its capabilities (the utter_scope response):
```yaml
- rule: what_scope
  steps:
    - intent: what_scope
    - action: utter_scope
```
Stories are used to train the ML part of the core model. They describe longer, more complex dialogue fragments that will help the bot to learn how to handle more varied conversations and generalise to unseen data.
We described above the action_set_user_feats action, which compares the length of the user’s input to the average length of the corresponding intent and sets the is_talkative slot. We also defined a conditional response utter_navigation_started that returns a longer reply if the is_talkative slot is set to true. To teach the bot to run this action after the user triggers the navigate_to intent and before returning the utter_navigation_started response, we write the following story:
```yaml
- story: Navigate
  steps:
    - intent: navigate_to
      entities:
        - location: "221B Baker street London"
          role: to
    - action: action_add_destination
    - slot_was_set:
        - main_destination: "221B Baker street London"
    - action: action_set_user_feats
    - slot_was_set:
        - is_talkative: true
    - action: action_start_navigation
    - action: utter_navigation_started
    - action: action_listen
```
Note that, unlike rules, in stories we provide examples of entities and reflect slot filling.
If the user triggers the navigate_to intent but does not provide a destination, the main_destination slot will not be filled after the first message. The bot then needs to additionally ask the user to provide a destination. This scenario needs to be described by a separate story:
```yaml
- story: Open_gps
  steps:
    - intent: navigate_to
      entities:
        - location: null
    - action: action_add_destination
    - slot_was_set:
        - main_destination: null
    - action: action_set_user_feats
    - slot_was_set:
        - is_talkative: true
    - action: utter_ask_for_destination
    - action: action_listen
    - intent: navigate_to
      entities:
        - location: "221B Baker street London"
          role: to
    - action: action_add_destination
    - slot_was_set:
        - main_destination: "221B Baker street London"
    - action: action_start_navigation
    - action: utter_navigation_started
    - action: action_listen
```
In this case, the value of the main_destination slot (empty vs. filled) impacts the conversation. When specifying slots in the domain file (see below), we need to set the influence_conversation flag to True for slots of this kind.
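In our case, the corresponding slot definition in domain.yml would look roughly like this (a sketch based on the slot names used above):

```yaml
slots:
  main_destination:
    type: text
    influence_conversation: true
```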
Creating stories manually can be a pain, particularly when you are just getting familiar with Rasa. An extremely useful tool for creating stories (and for understanding them better) is the rasa interactive command-line option: it launches an interactive interface where you can type messages as a user, check whether the intent and entities are recognised correctly, fix them if not, track the slot filling, and check and correct the next action:
Be careful: rasa interactive will retrain your models every time you launch it if the data has changed! Using rasa interactive to write just one story is not time-efficient; it’s best to create several stories at a time and retrain after that. To start a new story, run /restart in rasa interactive.
As conversation flow management in Rasa is mainly ML-based, its behaviour can change every time you add new data and retrain your model. This means that something that used to work perfectly can stop working at some iteration. What does this mean? We need testing! Luckily, Rasa supports automated testing using test stories. These are very similar to the training stories, except that you need to spell out the whole user input (instead of just the entities):
```yaml
- story: Navigate to location
  steps:
    - user: |
        hello
      intent: hello
    - action: utter_hello
    - user: |
        navigate to [7 Victor Hugo drive]{"entity": "location", "role": "to"} please
      intent: navigate_to
    - action: action_add_destination
    - slot_was_set:
        - main_destination: 7 Victor Hugo drive
    - action: action_set_user_feats
    - slot_was_set:
        - is_talkative: false
    - action: action_start_navigation
    - action: utter_navigation_started
```
To run the test, use rasa test in the command line.
Domain
Finally, the domain file lists all the information about training data and actions: intents, entities, slots (with their types, default values, and the influence_conversation flag), forms, responses, and actions:
```yaml
version: '2.0'

session_config:
  session_expiration_time: 60
  carry_over_slots_to_new_session: true

intents:
  - ...

entities:
  - ...

slots:
  <slot name>:
    type: text
    influence_conversation: true
  <slot name>:
    type: list
    influence_conversation: false

responses:
  utter_<response name>:
    - text: ...

actions:
  - action_<action name>
```
Forms in Rasa provide a way to collect required information from the user, for example the destination for the “navigate_to” intent. Forms are defined in the domain file as follows:
```yaml
forms:
  navigate_form:
    required_slots:
      - main_destination
```
You can use forms in stories and rules by referring to the form’s name (navigate_form in our example):
```yaml
- rule: Activate form
  steps:
    - intent: navigate_to
    - action: navigate_form
    - active_loop: navigate_form
```
The active_loop step states that the form should be activated at this point of the dialogue. The form is deactivated automatically when all required slots are filled.
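A typical companion rule defines what should happen once the form has been completed. A sketch reusing the action and response names from earlier in this article:

```yaml
- rule: Submit navigate form
  condition:
    # This rule only applies while the form is active
    - active_loop: navigate_form
  steps:
    - action: navigate_form
    # All required slots are filled, so the form deactivates itself
    - active_loop: null
    - action: action_start_navigation
    - action: utter_navigation_started
```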
Configuration
Now it’s time to look at the configuration file used for training models: config.yml. It includes two parts: the configuration for the NLU model and the configuration for the core model. As we have already mentioned, Rasa models are highly customisable, and here we can see why: you can choose between a number of pre-processing options, featurisers, and ML models for intent classification and named entity recognition (NER). Pre-trained NER models from spaCy and Duckling are available. You can use classic ML or state-of-the-art deep learning models, including transformers (Rasa’s DIETClassifier). Lookup dictionaries and regular expressions can also be used for NER. The full list of pipeline components can be found at https://rasa.com/docs/rasa/components/.
It’s good to remember that, though transformers are incredibly powerful, training a transformer model takes a while, and you need a decent amount of data to get the desired results. It is worth trying out the spaCy and/or Duckling NER models before training your own: they might already have everything you need.
In the case of our navigation bot, we needed a module for address extraction. spaCy recognises locations such as cities and countries, but not full addresses such as “221B Baker street London”, so we had to collect some examples of addresses and train our own NER model. We started with the following config.yml, which showed good performance:
Configuration for the NLU model:

```yaml
# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: "en"

pipeline:
  - name: "HFTransformersNLP"
    model_weights: "distilbert-base-uncased"
    model_name: "distilbert"
  - name: "LanguageModelTokenizer"
  - name: "LanguageModelFeaturizer"
  - name: "DIETClassifier"
    random_seed: 42
    intent_classification: True
    entity_recognition: True
    use_masked_language_model: True
    epochs: 5
    number_of_transformer_layers: 4
    transformer_size: 256
    drop_rate: 0.2
    batch_size: 32
    embedding_dimension: 50
  # other components
  - name: FallbackClassifier
    threshold: 0.5
  - name: EntitySynonymMapper
```

Configuration for the core model:

```yaml
policies:
  # See https://rasa.com/docs/rasa/policies for more information.
  - name: MemoizationPolicy
  - name: RulePolicy
  - name: UnexpecTEDIntentPolicy
    max_history: 5
    epochs: 100
  - name: TEDPolicy
    max_history: 5
    epochs: 100
    constrain_similarities: true
```

Table 1. Example of config.yml: the NLU model configuration (top) and the core model configuration (bottom).
Note that you can run into memory issues while training a transformer. We found that experimenting with the following parameters can help solve the problem: number_of_transformer_layers, transformer_size, batch_size, and embedding_dimension.
After we started adding intents for calling/messaging, we realised we needed entities such as names and phone numbers. Names can be very diverse, and phone numbers in different countries follow different patterns (and we can never be sure of the exact formatting of the input), so it proved more reasonable to use spaCy’s PERSON entity and a few of Duckling’s entities, including distance, duration, and phone-number. To incorporate these models, we changed the NLU configuration as follows (note how we add our own preprocessor):
```yaml
# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: en

pipeline:
  - name: ourpreprocessor.OurPreprocessor
  # language model
  - name: WhitespaceTokenizer
  - name: LanguageModelFeaturizer
    model_weights: distilbert-base-uncased
    model_name: distilbert
  # Regex for phone numbers
  - name: RegexFeaturizer
  # dual intent and entity
  - name: DIETClassifier
    random_seed: 42
    batch_size: 32
    intent_classification: True
    entity_recognition: True
    use_masked_language_model: False
    constrain_similarities: True
    epochs: 35
    evaluate_on_number_of_examples: 200
    evaluate_every_number_of_epochs: 1
    tensorboard_log_directory: "./tbdiet"
    tensorboard_log_level: "epoch"
  # pretrained spacy NER for PERSON
  - name: SpacyNLP
    model: en_core_web_md
  - name: SpacyEntityExtractor
    dimensions: [PERSON]
  - name: DucklingEntityExtractor
    url: "http://localhost:8000"
    locale: "en_GB"
    timezone: "UTC"
    dimensions: [distance, duration, number, ordinal, phone-number, time, temperature]
  # other components
  - name: FallbackClassifier
    threshold: 0.4
  - name: EntitySynonymMapper
```
An important part of the configuration is the handling of fallbacks, which can be of two types:
- NLU fallback: the bot does not reliably recognise the user’s intent. This happens when the confidence of the predicted intent is below the threshold set by:
```yaml
- name: FallbackClassifier
  threshold: 0.4
```
In this case, by default, action_default_fallback will be performed, sending utter_default to the user. But we did not want our bot to keep repeating the same default fallback response when it repeatedly fails to understand the user – that’s quite unnatural. We wanted to have several reprompt options that would be selected randomly every time, making sure they do not repeat. We achieved this by writing a custom action:
```python
class ActionReprompt(Action):
    """Executes the fallback action and goes back to the previous state of the dialogue."""

    def name(self) -> Text:
        return 'action_reprompt'

    async def run(
        self,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> List[Dict[Text, Any]]:
        reprompts = [
            "I'm sorry, I didn't quite understand that. Could you rephrase?",
            "Sorry, I didn't catch that, can you rephrase?",
            "Apologies, I didn't understand. Could you please rephrase it?",
        ]
        # Avoid repeating the reprompt that was used the last time
        last_reprompt = tracker.get_slot('last_reprompt')
        if last_reprompt in reprompts:
            reprompts.remove(last_reprompt)
        reprompt = random.choice(reprompts)
        dispatcher.utter_message(text=reprompt)
        return [SlotSet("last_reprompt", reprompt)]
```
To use this action instead of the default one, we introduce a new rule:
```yaml
- rule: Implementation of Fallback
  steps:
    - intent: nlu_fallback
    - action: action_reprompt
```
- Core fallback: the bot cannot reliably predict the next action. action_default_fallback will be performed by default. The threshold for this fallback is set in the core model configuration:
- name: "RulePolicy" core_fallback_threshold: 0.3 core_fallback_action_name: action_default_fallback enable_fallback_prediction: true restrict_rules: true check_for_contradictions: true
How to train and run Rasa models?
You can train and run models locally via the command line:
- rasa train to train an NLU and a core model
- rasa train nlu to train an NLU model
- rasa train core to train a core model
To launch your Rasa agent in the command line, use rasa shell. If you have custom actions, you first need to launch rasa run actions in a separate terminal window to start the Rasa action server (keep it open while running rasa shell).
Hands-on: your first agent
That was a long read! Do you see now why it’s hard to get started with Rasa?
There is a little trick that can be used as a first learning project with Rasa: take an existing chatbot from the list of Dialogflow pre-built agents (under the ‘Pre-built agents’ tab in the left sidebar menu) and convert it to Rasa. This lets you get familiar with the development process without thinking too much about the conversation design part. You can create a free Dialogflow account, create a new agent, and import one of the pre-built agents into it. Then, in the Settings, you can export a ZIP archive with the agent’s data:
Unzip the downloaded archive to a directory. Create a new folder where you want to create your Rasa bot. To convert Dialogflow NLU data to Rasa NLU data, use:
rasa data convert nlu -f yaml --data <Dialogflow agent folder> --out <Rasa agent folder>
Now you have your NLU data! You can train an NLU model using rasa train nlu. What’s missing is rules and stories: Dialogflow has a different system of dialogue management that can’t be converted directly to the Rasa format.
Don’t worry: you only need a couple of short stories to train your first core model. Choose the simplest conversations that your agent should handle and try to write a couple of stories with 1-2 conversation turns. You can check out some examples of stories for various use cases (booking an appointment, reporting that something is broken, FAQs, etc.) here: rasa-demo/data/stories at main · RasaHQ/rasa-demo · GitHub. You can train a full model now! From here, you can work iteratively: test your NLU and core components, see which intents and entities are not recognised correctly, and add more examples. Write more rules and stories for the core model (remember that rasa interactive is of great help here). Define your custom actions and add them to stories to make sure the model predicts them at the right time.
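As a starting point, a minimal stories.yml entry can be as short as this (the intent and response names here are illustrative and should match whatever your converted NLU data and domain actually contain):

```yaml
- story: greet and explain scope
  steps:
    - intent: hello
    - action: utter_hello
    - intent: what_scope
    - action: utter_scope
```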
Conclusions
In this blog, we wanted to provide an introduction to chatbots in general and to show how to create Rasa agents, using the example of an AI driving assistant we developed. We hope it helps you get through the important steps of creating a Rasa chatbot: installation, creating a project, preparing data, choosing a config, training, and adding custom actions. Enjoy developing your first Rasa bot!
Demo
—
This article was written by Anna Koroleva, Simon Mackenzie, and Saheen Ahamed.
Ready to test your skills?
If you’re interested in collaborating, apply to join an Omdena project at: https://www.omdena.com/projects