Localization using CLU multilingual capabilities
By integrating Conversational Language Understanding (CLU) with Teneo, you can combine the conversational strengths of both tools. One significant advantage of this approach is that a Teneo bot can use CLU's multilingual capabilities, giving you a multilingual bot based on a single CLU model. Combining CLU with the Teneo Linguistic Modeling Language (TLML) adds further control and accuracy.
A conversation with the final multilingual bot, held in Spanish, English, Swedish, Turkish, and German, could look like this:
- Spanish - Hola! (Hello)
- English - My name is John
- Swedish - Minns du vad jag heter? (Do you remember my name?)
- Turkish - İyi misin? (Are you well?)
- German - Auf Wiedersehen (Goodbye)
On this page, we will cover the concepts behind this approach. For a detailed walkthrough of how to implement all of the following, check out our guide on the topic.
With Conversational Language Understanding (CLU), you can train a model in one language and use it to predict intents and entities from utterances in other languages. This can save a great deal of time and effort: instead of building a separate project for every language, you can handle multilingual datasets in one project. Your dataset doesn't have to be entirely in one language, but you should enable the multilingual option in the project settings.
These instructions assume you have access to a Teneo Sandbox as well as access to CLU.
To train a CLU model, you can import intents from the relevant Teneo solution in JSON format. This gives the CLU project all the relevant intents for your project, and you can also include intents from the Teneo Dialogue Resources for a more robust solution.
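As a rough illustration of what such an import file could contain, the Groovy sketch below turns a map of intents and example utterances into a CLU-style project structure. The `buildCluImport` helper and the input map are hypothetical stand-ins for the actual Teneo export, and the field names (`projectFileVersion`, `metadata.multilingual`, `assets.utterances`, and so on) reflect our understanding of the CLU import format; please check them against the current Azure documentation before relying on them.

```groovy
import groovy.json.JsonOutput

// Hypothetical converter: the input map stands in for intents exported from Teneo.
// Field names are assumptions based on the CLU import format; verify before use.
Map buildCluImport(String projectName, String language, Map<String, List<String>> intents) {
    List utterances = []
    intents.each { String intent, List<String> examples ->
        examples.each { String text ->
            // One labelled utterance per training example
            utterances << [text: text, language: language, intent: intent, entities: []]
        }
    }
    return [
        projectFileVersion: '2022-05-01',
        stringIndexType   : 'Utf16CodeUnit',
        metadata          : [
            projectKind : 'Conversation',
            projectName : projectName,
            multilingual: true,        // the multilingual option discussed above
            language    : language
        ],
        assets            : [
            projectKind: 'Conversation',
            intents    : intents.keySet().collect { [category: it] },
            entities   : [],
            utterances : utterances
        ]
    ]
}

// Example usage with made-up intents
Map project = buildCluImport('TeneoIntents', 'en-us', [
    GREETING: ['Hello', 'Hi there'],
    GOODBYE : ['Goodbye']
])
println JsonOutput.prettyPrint(JsonOutput.toJson(project))
```

The sketch only builds the file content; importing it into CLU is still done through the CLU project tooling.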
Once the CLU project has all the required training data, you can train and deploy a new model. You can also test the model once it has been deployed and start linking this new CLU model with your Teneo project.
The CLU project and Teneo solution can be connected using a few pieces of data from the CLU project. You will need the following information:
- Primary key (Subscription key)
- Region
- Project name
- Deployment name
These can be added to the Teneo solution as a single global variable in the following format. Populate it with the relevant values from the CLU project.
    new CluPredict(
        'subscriptionKey',
        'region',
        'projectName',
        'deploymentName'
    )
Finally, to set up the connection, add a simple Groovy script to the Teneo solution's resources. This script can then be invoked from the global Pre-Matching script.
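The connection script itself is not reproduced on this page. As a minimal sketch of what a script like it would need to send to CLU, the example below builds the JSON body of a prediction request. The `buildCluRequestBody` helper is hypothetical, and the body layout (`kind`, `analysisInput.conversationItem`, `parameters`) is an assumption based on the CLU runtime REST API, so verify it against the Azure reference before use.

```groovy
import groovy.json.JsonOutput

// Hypothetical helper: builds the JSON body for a CLU prediction request.
// The field names are assumptions based on the CLU runtime REST API.
String buildCluRequestBody(String projectName, String deploymentName, String userInput) {
    Map body = [
        kind         : 'Conversation',
        analysisInput: [
            conversationItem: [
                id           : '1',
                participantId: 'user',
                text         : userInput      // the raw (or normalized) user input
            ]
        ],
        parameters   : [
            projectName    : projectName,
            deploymentName : deploymentName,
            stringIndexType: 'TextElement_V8'
        ]
    ]
    return JsonOutput.toJson(body)
}

// Example usage with placeholder project values
String json = buildCluRequestBody('projectName', 'deploymentName', 'Hola!')
println JsonOutput.prettyPrint(json)
```

In a real call, the subscription key would typically travel in the `Ocp-Apim-Subscription-Key` request header rather than in the body, and the region or resource name determines the endpoint URL.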
Once the CLU model has been trained, deployed, and connected to your Teneo solution, your Teneo solution will bypass Teneo's native classifier and classes in the Class Manager and instead use the CLU native intent classifier. You can now localize bots to other languages and make use of the same CLU intent classifier.
To localize a bot, you must first include relevant content for branching, including flows, entities, variables, and any other required solution content. If you forget to include something before branching, it can be added after the localized solution has been created.
After including content, you can create the localized solution via branching and translate flow components like names and outputs into the desired language. You can read more about Master and Local solutions here.
We recommend choosing one common language when creating the main Master solution. In large companies where the common language is English, this may mean that the best approach is to build the Master solution in English even if the bot was originally developed in a different language. This is because the Local solution owners must be comfortable with the content inside the Master solution to be able to create accurate Local solutions.
We recommend that the developer have a deep understanding of how the Master bot works before adding everything to the Local solution. Consider the following recommendations when choosing what content is relevant:
- Cultural aspect - for countries that share a similar language but have different cultures, it is important not to create culture-specific flows inside the Master solution. Create these directly inside the Local solution instead.
- Lingual aspect - paths inside flows might differ depending on the language of the solution. The solution owner should think of use cases that work for specific languages and create these in the Local solutions.
- Market-specific cases - something that is relevant for one market might not be relevant across all markets. For example, there may be a promotion going on in Europe that is not relevant to the Asian side of the business.
Here are some best practices when working with CLU together with Teneo:
- If a specific language is underperforming, add additional training data in that language.
- Make use of the Teneo Linguistic Modeling Language (TLML):
  - Language Objects can store many variants of a single phrase or word
  - Part of Speech and Named Entity Recognizer tags
- Normalize the inputs before sending them to CLU by:
  - Removing punctuation
  - Lowercasing all inputs
- Remember to disable the Language Detector flow, located under 'Dialogue' > 'Connecting Phrases' > 'Support' if you want your bot to work in more languages than the one specified for your Teneo solution.
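The input normalization recommended in the list above could be sketched in Groovy as follows. The `normalizeForClu` helper is a hypothetical name, not part of Teneo or CLU, and the choice to strip every Unicode punctuation character and collapse whitespace is an assumption about what "removing punctuation" should cover.

```groovy
// Hypothetical helper, not part of the Teneo or CLU APIs.
String normalizeForClu(String userInput) {
    // Remove Unicode punctuation such as ! ? , . and their non-ASCII counterparts
    String withoutPunctuation = userInput.replaceAll(/\p{P}/, '')
    // Locale-independent lowercasing, so results do not depend on the server locale
    String lowercased = withoutPunctuation.toLowerCase(Locale.ROOT)
    // Collapse runs of whitespace and trim the ends
    return lowercased.replaceAll(/\s+/, ' ').trim()
}

println normalizeForClu('Hola!')                    // hola
println normalizeForClu('Minns du vad jag heter?')  // minns du vad jag heter
```

Note that locale-independent lowercasing can treat some characters differently from language-specific rules (the Turkish dotted capital İ is one example), so it is worth testing the normalization with sample inputs from each language your bot supports.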