To demonstrate this concept, let’s say we want to test whether the chatbot understands the concept of machine insurance. To confirm that the chatbot recognizes language about machine insurance without confusing it with other language entered as learning data, we need to write tests (in the form of phrases) that contain features typical of that language, and define reports with appropriate measures for assessing the chatbot’s precision. Quality measures for the chatbot can be defined in different ways, but overall you must answer one very important question: what do we mean when we say the chatbot learns to improve its classification of phrases?
The answer is not so simple. Let’s assume we have the following categories defined for the chatbot:
- Machine insurance
- Machine technology
- Type of the machine
- Cost of credit
If the user types the sentence ‘I want to pay insurance for my new machine’, the chatbot will not necessarily classify it into only one category. Ideally, the classifier assigns the sentence a very high score for a single category, but the same sentence can also receive lower scores for other categories, e.g.:
| Classification Score | Category Name |
|---|---|
| 30% | Type of the machine |
| 17% | Cost of credit |
The sentence ‘I want to pay insurance for my new machine’ has been classified as Machine insurance with a score of 91%, while the second-highest score matches this sentence to the cost of machine insurance with a value of 41%. The other two categories match the entered sentence with even lower scores. Let us assume that the score for assigning a phrase to a category lies in the interval (0, 1), so that 91% corresponds to 0.91.
Looking at these results, we can conclude that the chatbot is confident in classifying this phrase, because the difference between the highest and the second-highest classification scores is 50 percentage points.
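The confidence check described here can be sketched in a few lines of Python. This is only an illustration: the function name `top_two_margin` is hypothetical, and because the text does not give the exact category name for the 41% score, a placeholder label is used for it.

```python
# Scores from the example, expressed on the 0..1 scale.
# "Second-ranked category" is a placeholder label (an assumption),
# since the exact name of the 41% category is not given in the text.
scores = {
    "Machine insurance": 0.91,
    "Second-ranked category": 0.41,
    "Type of the machine": 0.30,
    "Cost of credit": 0.17,
}


def top_two_margin(scores):
    """Return the best category and the gap between the two highest scores."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best_name, best_score), (_, second_score) = ranked[0], ranked[1]
    return best_name, best_score - second_score


best, margin = top_two_margin(scores)
print(f"{best}: margin = {margin:.2f}")  # Machine insurance: margin = 0.50
```

The larger the margin, the more clearly the classifier separates its first choice from the runner-up.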
Other issues that can cause problems when classifying a phrase include:
- too small a difference between the scores of the first two categories
- the score for the correct category being too low
- a uniform distribution of scores across the categories, indicating that the chatbot is unsure how to classify the phrase
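As a rough sketch, the three problems listed above could be detected automatically for each test phrase. The function name `diagnose` and all three thresholds (`min_margin`, `min_score`, `min_spread`) are assumptions chosen for illustration, not values from the article; a real test suite would tune them to its own classifier.

```python
from statistics import pstdev


def diagnose(scores, expected, min_margin=0.2, min_score=0.6, min_spread=0.1):
    """List the problems found in one classification result.

    scores   -- dict mapping category name to a score between 0 and 1
    expected -- the category the test phrase should be assigned to
    The threshold defaults are hypothetical, illustrative values.
    """
    problems = []
    ranked = sorted(scores.values(), reverse=True)
    # Problem 1: the two highest scores are too close together.
    if len(ranked) > 1 and ranked[0] - ranked[1] < min_margin:
        problems.append("margin between the top two categories is too small")
    # Problem 2: the expected (correct) category scored too low.
    if scores.get(expected, 0.0) < min_score:
        problems.append("score of the expected category is too low")
    # Problem 3: scores barely differ, so the distribution is close to uniform.
    if pstdev(list(scores.values())) < min_spread:
        problems.append("scores are nearly uniform across categories")
    return problems


confident = {"Machine insurance": 0.91, "Other": 0.41,
             "Type of the machine": 0.30, "Cost of credit": 0.17}
unsure = {"Machine insurance": 0.26, "Machine technology": 0.25,
          "Type of the machine": 0.25, "Cost of credit": 0.24}

print(diagnose(confident, "Machine insurance"))
print(diagnose(unsure, "Machine insurance"))
```

With these thresholds the first result reports no problems, while the second trips all three checks, which is the kind of phrase a tester would flag for additional training data.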
Testing a chatbot does more than train it and increase its level of comprehension. It also establishes a systematic approach to handling new language, which results in a chatbot that performs at a more advanced level, with better comprehension and communication skills.
Checking the accuracy of the chatbot’s phrase classification is a crucial aspect of developing its proficiency and, just as in teaching children, it enables the chatbot to learn on its own and build on its knowledge base.
Read other articles in the series: Technically Speaking:
- Part II: Chatbots: The Cutting-Edge Synergy of Human Genius and Imagination with Technological Precision and Efficiency
- Part III: Reflections on Testing Natural Language, Natural Language Processing and Dealing with Language Peculiarities
- Part IV: The Education of a Chatbot: Developing a Chatbot’s Comprehension Skills and Testing the Correct Classification of Categories