The NLU Generator is a functionality in the Teneo Platform that allows users to automatically draft condition suggestions for Intent Triggers and conditional Transitions with Condition Match Requirements (MRs).
The drafted condition suggestion is based on the positive and negative examples available in the Intent Trigger or Transition and on Language Objects (LOBs) and Entities; it also takes into account information from Part-of-Speech (POS) taggers and Named Entity Recognizers (NERs), when available.
Condition suggestions are created within a Condition Match Requirement by selecting the Draft Condition option available under Advanced Options. If the user updates the available examples, clicking Draft Condition again produces a new condition suggestion.
Based on both positive and negative examples, the NLU Generator:
- Chooses the best condition at the very end of the process, based on a wide range of criteria, minimizing the risk that the optimal alternative gets discarded early in the process only because it doesn't seem like the best one in a very local context.
- Uses the Different Match (&^) operator, enabling better example coverage without having to ignore the Language Objects that stretch across words they do not match. Thanks to this engine operator, the NLU Generator does not have to choose between more reliable Language Objects (because they match more words in an example) and good example coverage (where all relevant words are used in the condition, without any long matches stretching over them); it can do both.
- Generates many alternative conditions for each example and waits until the very end before selecting the final condition. It then judges the intended scope of words and phrases in the examples, given by the resulting Language Object and Entity selection, zeroing in on the best result.
To take full advantage of the NLU Generator, it is recommended to add more than one positive example and a maximum of 15, as tests have shown that using more than 15 positive examples affects performance heavily without improving the condition quality much.
Also, the positive examples should not be more than 35 words/tokens long, as performance is affected when examples are lengthy and many.
Note that an Intent Trigger or Transition can of course contain more than 15 positive examples, as this might be useful for running Auto-test, Suggest ordering or for manual rendering of the condition.
By using more than one positive example, the NLU Generator can suggest better conditions as it will also look for synonyms and better phrases.
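The recommendations above can be checked before drafting with a small helper. This is only a sketch: the function name and the warning texts are hypothetical, and whitespace splitting is a rough stand-in for the tokenization the platform actually performs.

```python
MAX_POSITIVE_EXAMPLES = 15   # recommended upper bound from the text
MAX_TOKENS_PER_EXAMPLE = 35  # recommended maximum example length from the text


def check_positive_examples(examples):
    """Return a list of warnings for a set of positive examples (hypothetical helper)."""
    warnings = []
    if len(examples) < 2:
        warnings.append("add more than one positive example")
    if len(examples) > MAX_POSITIVE_EXAMPLES:
        warnings.append(
            f"{len(examples)} positive examples; more than "
            f"{MAX_POSITIVE_EXAMPLES} slows drafting"
        )
    for text in examples:
        # Assumption: splitting on whitespace approximates the token count.
        n_tokens = len(text.split())
        if n_tokens > MAX_TOKENS_PER_EXAMPLE:
            warnings.append(
                f"example with {n_tokens} tokens exceeds {MAX_TOKENS_PER_EXAMPLE}"
            )
    return warnings
```

An empty list means the example set stays within the recommended limits; it says nothing about the quality of the examples themselves.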
The NLU Generator generates a large set of condition suggestions for each of the positive examples and will, at the very end of the process, select the condition that covers all meaningful words (non-stopwords) in the examples by choosing the longest (covering most words, usually phrase level), the most common (shared by most examples), and the most exact (as narrow as possible) Language Objects and Entities.
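The late selection described above can be pictured as ranking complete candidates only once all of them exist. The sketch below is an illustration, not the actual algorithm: the `Candidate` fields, the numeric scores, and the strict lexicographic ordering (longest, then most common, then most exact) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    condition: str        # placeholder label for a drafted condition
    words_covered: int    # meaningful words spanned by its objects ("longest")
    examples_shared: int  # positive examples it is shared by ("most common")
    specificity: int      # higher means narrower objects ("most exact")


def select_final(candidates):
    """Pick one candidate after all alternatives have been generated."""
    # Assumption: the three criteria are compared in this fixed order.
    return max(
        candidates,
        key=lambda c: (c.words_covered, c.examples_shared, c.specificity),
    )


candidates = [
    Candidate("A", words_covered=3, examples_shared=2, specificity=1),
    Candidate("B", words_covered=3, examples_shared=2, specificity=2),
    Candidate("C", words_covered=2, examples_shared=3, specificity=5),
]
best = select_final(candidates)
```

Here `best` is candidate "B": it ties with "A" on coverage and sharing but is narrower, while "C" loses on coverage despite its higher specificity.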
The NLU Generator also makes use of the negative examples provided in an Intent Trigger or Transition. When negative examples are provided, the NLU Generator will either discard conditions that match negative examples (if there are alternative conditions that don't) or expand the condition with negations.
The negative examples only have influence on the condition generation if they match the condition generated based on the positive examples. If a negative example doesn't match the condition, this means that everything is as it should be (the Auto-test would not fail for that Intent Trigger, for example) and the condition would remain the same as it was before the negative example was added.
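The two behaviors for negative examples can be sketched with a toy model. Everything here is an assumption for illustration: a condition is reduced to a set of required words plus a set of excluded words, and the negation step simply excludes the extra words of the matching negative example. The real generator works with Language Objects and Entities, not bare word sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    required: frozenset                # words that must all be present
    excluded: frozenset = frozenset()  # words that must be absent (negations)

    def matches(self, text):
        words = set(text.lower().split())
        return self.required <= words and not (self.excluded & words)


def apply_negative_examples(candidates, negatives):
    """candidates are ordered best-first; returns the condition to keep."""
    # 1. Prefer the best candidate that matches no negative example.
    #    If the top candidate already matches none, it is returned unchanged.
    for cand in candidates:
        if not any(cand.matches(n) for n in negatives):
            return cand
    # 2. No clean alternative exists: expand the best candidate with negations.
    best = candidates[0]
    extra = set()
    for n in negatives:
        if best.matches(n):
            extra |= set(n.lower().split()) - best.required
    return Condition(best.required, best.excluded | frozenset(extra))
```

For example, with a single candidate requiring "book" and "flight" and the negative example "do not book a flight", the result keeps the required words and excludes "do", "not" and "a". A negative example such as "cancel my order" never matched in the first place, so the candidate comes back unchanged.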
Selected LOBs and Entities
When drafting conditions, the NLU Generator's algorithm uses Language Objects / Entities from lexical resources assigned to the solution, such as the Teneo NLU Ontology and Semantic Networks, and any project-specific Language Objects / Entities located in the solution if these follow the naming conventions of the Teneo NLU Ontology and Semantic Networks.
The Lexical Resources of the Teneo NLU Ontology and Semantic Network contain different types of hierarchically structured Language Objects. The Teneo Engine has no way of discriminating between those types (for the Engine, they are all just "Language Objects"), but the NLU Generator is designed to depend on this structure.
The NLU Generator's algorithm relies on all the pieces of information contained in the Language Objects' names to select the most appropriate Language Object in each context. Only Language Objects of the types LEX, MIX, MUL, SYN and PHR are used in the generated condition.
Entities, on the other hand, are always preferred by the NLU Generator over any Language Object, except PHR, which is preferred over any Entity in the selection of objects for the drafted condition.
If no fitting Language Object or Entity is found, the NLU Generator will use the bare word itself in the condition.
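The preference order described above (PHR first, then Entities, then the remaining Language Object types, with the bare word as fallback) can be sketched as a simple ranking. The object names below are hypothetical, and treating LEX, MIX, MUL and SYN as equally ranked is an assumption, since the text does not state an order among them.

```python
# Lower number means higher preference. The tie among LEX/MIX/MUL/SYN is an
# assumption; the text only says PHR beats Entities and Entities beat the rest.
PRIORITY = {"PHR": 0, "ENTITY": 1, "LEX": 2, "MIX": 2, "MUL": 2, "SYN": 2}


def pick_object(word, matching_objects):
    """matching_objects is a list of (name, kind) pairs that match the word."""
    usable = [(name, kind) for name, kind in matching_objects if kind in PRIORITY]
    if not usable:
        # No fitting Language Object or Entity: fall back to the bare word.
        return word
    return min(usable, key=lambda obj: PRIORITY[obj[1]])[0]


choice = pick_object(
    "paris",
    [("PARIS.NN.LEX", "LEX"), ("CITY.ENTITY", "ENTITY")],  # hypothetical names
)
```

In this example `choice` is the Entity, because no PHR object matches; with an empty or unusable list of matches the function returns the word itself.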
Further descriptions of each of the different types of Language Objects and Entities are available in this section.
When creating project-specific Language Objects, it is advised to follow the naming conventions and also to add a project prefix to the Language Object's name. This makes the project-specific Language Objects easily distinguishable from the objects available in the Teneo Lexical Resources.
Overriding TLR LOBs
Sometimes a project-specific Language Object is preferred over a Language Object coming from a lexical resource. It is not possible to force the NLU Generator to use a project-specific Language Object, but when the steps below are followed, the NLU Generator normally selects the project-specific object over the one in the TLR.
- Create a local Language Object, for example,
- Include the object from the lexical resource in this local object, e.g. include
- Add project-specific variations for the word/phrases in the local Language Object which are not represented in the object of the lexical resource
- Create a few positive examples that use the project-specific words/phrases that are in the local object, alongside the examples that use words/phrases known to the object of the lexical resource.
Overriding TLR LOBs by using the same name
If a project-specific Language Object has the exact same name as a Language Object in a referred lexical resource (e.g. the Teneo Lexical Resource), the NLU Generator, just as Teneo Studio in general, uses the local, project-specific, version.
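Same-name overriding behaves like a layered lookup in which the local layer is consulted first. The sketch below models this with Python's `collections.ChainMap`; the object names and definitions are hypothetical, and this is only an analogy for the behavior described, not how Teneo Studio stores Language Objects.

```python
from collections import ChainMap

# Hypothetical contents of a referred lexical resource and of the local solution.
resource_lobs = {
    "BOOK.VB.LEX": "definition from the lexical resource",
    "FLIGHT.NN.LEX": "definition from the lexical resource",
}
local_lobs = {
    "BOOK.VB.LEX": "project-specific definition",
}

# ChainMap searches its mappings left to right, so local names win.
resolved = ChainMap(local_lobs, resource_lobs)

overridden = resolved["BOOK.VB.LEX"]   # comes from the local solution
inherited = resolved["FLIGHT.NN.LEX"]  # only exists in the lexical resource
```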
The NLU Generator makes use of Part-of-Speech (POS) tags in several languages. This means that the NLU Generator is capable of recognizing relevant POS_XX.ANNOT Language Objects as well as retrieving and storing POS information.
Furthermore, it means that the NLU Generator can choose the correct Language Object in situations of disambiguation, for example, choosing a VB.LEX Language Object over a NN.LEX Language Object or the other way around.
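As a rough illustration of POS-based disambiguation, the sketch below picks the candidate whose name carries the same POS tag as the word in the example. It assumes names of the form WORD.POS.TYPE (such as a hypothetical BOOK.VB.LEX) and a tag supplied by some tagger; both the function and the fallback to the first candidate are assumptions, not the generator's documented behavior.

```python
def choose_by_pos(candidates, pos_tag):
    """Return the candidate name whose POS segment equals pos_tag.

    candidates: names assumed to follow WORD.POS.TYPE, e.g. "BOOK.VB.LEX".
    pos_tag: tag assigned to the word in the example, e.g. "VB" or "NN".
    """
    for name in candidates:
        parts = name.split(".")
        # The POS segment is assumed to be the second-to-last part of the name.
        if len(parts) >= 3 and parts[-2] == pos_tag:
            return name
    # Assumption: without a POS match, keep the first candidate (or nothing).
    return candidates[0] if candidates else None


options = ["BOOK.NN.LEX", "BOOK.VB.LEX"]  # hypothetical object names
as_verb = choose_by_pos(options, "VB")
as_noun = choose_by_pos(options, "NN")
```

With the tag "VB" the verb object is chosen, and with "NN" the noun object, mirroring the "or the other way around" case in the text.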
For more information related to the Part-of-Speech Taggers and Morphological Analyzers, please see the Input Processors section.