Teneo Discovery mines unstructured text, such as documents, live chat logs, call transcriptions, exports of user inputs from Teneo Log Data, or collections of search queries from a webpage, to automatically discover frequent word combinations, in the form of concepts and associations, that appear to be statistically representative of the text.
Discovery can group the findings automatically into categories, either natively, using unsupervised machine learning, or in a supervised way, by applying a custom classification model.
Teneo Discovery creates an analysis from the uploaded data and presents the results in different visualizations. The user can explore and organize the results interactively by filtering, searching, and sorting, and can drill down to the original text input.
Teneo Discovery can be used to:
- get an overview of what a text or document is all about
- identify what customers are commonly asking and how they express themselves
- find the typical domain vocabulary and identify required Language Objects or Entities for Teneo Studio
- decide what knowledge areas to focus on when building a conversational AI application
- scope a conversational AI application (Teneo solution) and import the content into Teneo Studio
- identify patterns that could be turned into Teneo Studio Intent Triggers
- analyze user inputs from Log Data to identify missing knowledge areas
- analyze, classify, or categorize text data and export it as training data for machine learning models
- analyze the performance of the Classes of a Teneo solution.
Teneo Discovery has three main views:
Analyses, Visualizations, and Inputs / Categories / Lists.
When first opening Teneo Discovery, the user can access the main views from the top left corner of the window, by clicking the Analyses, Visualizations or Inputs buttons.
The Analyses view lists all the analyses in the same account; here the user can run new analyses or select an existing one to open and view.
The Visualizations view is where the user can graphically view one or several analyses in different visualizations and navigate the discoveries.
The Inputs / Categories / Lists view is a multi-purpose view: here the user can see the actual inputs that make up a concept or an association, view and work on categories, or view any of the configuration lists used in the analysis.
Teneo Discovery automatically mines the text data to discover concepts and associations and optionally groups them into categories.
Inputs from the text data are first processed by applying, depending on language and settings, tokenization, spelling correction, stemming, sentiment analysis, Entity detection, and Part-of-Speech (POS) tagging. These steps clean up the data, identify word tokens, group together word forms found in the data, and annotate part of speech and entity type where applicable.
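As a rough illustration of two of these steps, tokenization and stemming, the following Python sketch lowercases an input, splits it into word tokens, and strips a few common English suffixes so that different word forms are grouped together. It is not Teneo Discovery's implementation: the function names, the suffix list, and the stemming rule are all assumptions made for this example, and a real stemmer is considerably more careful.

```python
import re

# Hypothetical suffix list for a toy stemmer (an assumption for this sketch).
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text: str) -> list[str]:
    """Lowercase the input and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def naive_stem(token: str) -> str:
    """Strip one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> list[str]:
    """Tokenize an input and reduce each token to its naive stem."""
    return [naive_stem(token) for token in tokenize(text)]


# Three word forms collapse to the same stem.
stems = preprocess("Working, worked, works")
```

Here `stems` contains the same stem three times, which is the effect the text describes as grouping together word forms found in the data.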
The processed inputs are passed to a statistical algorithm called the ConceptMiner to discover concepts in the text data without any predefined rules or dictionaries.
The ConceptMiner uses lists of stopwords, part-of-speech information, sentiment analysis, and named-entity recognition to create concepts that are as informative as possible. Concepts are generally single words or sequences of consecutive (immediately adjacent) words that appear frequently in the data.
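The idea of frequent single words and consecutive word sequences can be sketched in Python as a simple n-gram count. This is only an approximation for illustration, not the ConceptMiner algorithm: the stopword list, the `mine_concepts` function, and its `max_len` and `min_count` parameters are assumptions, and part-of-speech, sentiment, and entity information are left out entirely.

```python
from collections import Counter

# Illustrative stopword list (an assumption, not a Teneo Discovery list).
STOPWORDS = {"the", "a", "is", "to", "my", "i", "in", "do"}


def mine_concepts(inputs, max_len=2, min_count=2):
    """Count word sequences of up to max_len consecutive tokens.

    Candidates that start or end with a stopword are skipped, and only
    candidates seen at least min_count times are returned.
    """
    counts = Counter()
    for tokens in inputs:
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                counts[" ".join(gram)] += 1
    return {concept: k for concept, k in counts.items() if k >= min_count}


tokenized_inputs = [
    ["internet", "banking", "is", "down"],
    ["reset", "my", "internet", "banking", "password"],
    ["mobile", "internet", "is", "slow"],
]
concepts = mine_concepts(tokenized_inputs)
```

With these three inputs, `concepts` holds `internet` (3 occurrences), `banking` (2), and `internet banking` (2); every other candidate appears only once and is dropped.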
Teneo Discovery creates a hierarchical structure of the concepts. The hierarchy can be visualized with the ConceptBurst view, as illustrated in the image below, where internet banking, internet usage, and mobile internet are children of the parent concept internet.
Teneo Discovery sizes each item according to how frequent its concept is and uses colors to show relations and morphological information.
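One simple way to picture such a parent and child structure is to attach every multi-word concept to each single-word concept it contains. The Python sketch below does exactly that. The source does not describe how Teneo Discovery derives its hierarchy, so this grouping rule and the `build_hierarchy` function are assumptions used purely for illustration.

```python
from collections import defaultdict


def build_hierarchy(concepts):
    """Map each single-word concept to the multi-word concepts containing it."""
    single_words = {c for c in concepts if " " not in c}
    tree = defaultdict(list)
    for concept in concepts:
        words = concept.split()
        if len(words) > 1:
            for word in words:
                if word in single_words:
                    tree[word].append(concept)
    return dict(tree)


hierarchy = build_hierarchy(
    ["internet", "internet banking", "internet usage", "mobile internet", "password"]
)
```

For this input, `hierarchy` maps `internet` to `internet banking`, `internet usage`, and `mobile internet`, mirroring the example above; `password` has no children and therefore does not appear as a parent.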
The AssociationMiner discovers concepts that frequently co-occur in the same context: so-called associations.
In the ConceptBurst sibling mode, illustrated in the image below, Teneo Discovery has automatically found that the concept how many most frequently appears together with the concept work, but that it also frequently appears with the concepts year, talk, day, and the merged concepts of countries and cities.
Teneo Discovery allows users to drill down to the original sentences of the text in which how many and work co-occurred. It automatically finds common patterns, for example users asking in different ways how many employees work in the company (see the image below), and creates an association.
Associations can be looked at as entities and intents, or objects and actions. If an entity and an intent frequently co-occur in a sentence, they will together create an association; or if two or three different entities co-occur often together in a sentence, they will create an association.
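Co-occurrence of concepts within the same input can be sketched in Python by counting concept pairs per sentence. This is a simplified stand-in rather than the AssociationMiner itself: the `mine_associations` function, its `min_count` threshold, and the whole-word matching rule are assumptions, and only pairs (not triples) are counted.

```python
from collections import Counter
from itertools import combinations


def mine_associations(inputs, concepts, min_count=2):
    """Count pairs of concepts that appear together in the same input.

    Inputs are expected to be lowercase strings without punctuation;
    concepts are matched as whole words or whole word sequences.
    """
    counts = Counter()
    for text in inputs:
        padded = f" {text} "
        present = sorted(c for c in concepts if f" {c} " in padded)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return {pair: k for pair, k in counts.items() if k >= min_count}


sentences = [
    "how many employees work here",
    "how many people work at the company",
    "how many days in a year",
]
associations = mine_associations(sentences, ["how many", "work", "year"])
```

In this example, `how many` and `work` co-occur in two sentences and form an association, while `how many` and `year` co-occur only once and fall below the threshold.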
It is possible to classify, group, categorize, or otherwise organize the discoveries (concepts and associations) in Teneo Discovery according to custom needs by using categorization.
This can be done manually, or Teneo Discovery can perform it automatically: either natively, using unsupervised machine learning in the form of topic modelling, or, if the user uploads a customized classification model or a Teneo solution with Classes, in a supervised way by using the uploaded model to categorize the discoveries.
The categories are represented visually as a folder structure and each category folder contains the concepts and associations related to it.
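The folder-like result of supervised categorization can be pictured as a mapping from category name to the discoveries assigned to it. In the Python sketch below, `categorize` groups discoveries by whatever label a classifier returns. The `keyword_classifier` function is a hypothetical toy stand-in for an uploaded classification model, and its rules and category names are invented for this example.

```python
from collections import defaultdict
from typing import Callable, Iterable


def categorize(
    discoveries: Iterable[str], classify: Callable[[str], str]
) -> dict[str, list[str]]:
    """Group discoveries into category folders using a classifier."""
    folders = defaultdict(list)
    for discovery in discoveries:
        folders[classify(discovery)].append(discovery)
    return dict(folders)


def keyword_classifier(discovery: str) -> str:
    """Toy stand-in for an uploaded model (hypothetical rules)."""
    if "internet" in discovery:
        return "Internet"
    if "work" in discovery:
        return "Company"
    return "Uncategorized"


folders = categorize(
    ["internet banking", "mobile internet", "how many + work", "year"],
    keyword_classifier,
)
```

Each key of `folders` plays the role of a category folder, and its list holds the concepts and associations placed in that folder.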
Teneo Discovery supports the following languages:
Chinese, Danish, Dutch, English, French, German, Italian, Japanese, Norwegian, Portuguese, Russian, Spanish, and Swedish.
Teneo Discovery performs POS-tagging for the following languages:
Chinese, Danish, Dutch, English, French, German, Italian, Japanese, Spanish, and Swedish.
Teneo Discovery performs Named Entity Recognition for the following languages:
English, French, German, Italian, Spanish, and Swedish.
Teneo Discovery performs Sentiment Analysis for English.
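The language lists above can be summarized as a small lookup, shown here as a Python sketch. The language names are taken from the lists above; the `capabilities` function and the key names in its result are assumptions for illustration and do not correspond to any Teneo Discovery API.

```python
SUPPORTED = {
    "Chinese", "Danish", "Dutch", "English", "French", "German", "Italian",
    "Japanese", "Norwegian", "Portuguese", "Russian", "Spanish", "Swedish",
}
POS_TAGGING = {
    "Chinese", "Danish", "Dutch", "English", "French", "German", "Italian",
    "Japanese", "Spanish", "Swedish",
}
NAMED_ENTITY_RECOGNITION = {
    "English", "French", "German", "Italian", "Spanish", "Swedish",
}
SENTIMENT_ANALYSIS = {"English"}


def capabilities(language: str) -> dict[str, bool]:
    """Return which processing steps are listed for a supported language."""
    if language not in SUPPORTED:
        raise ValueError(f"Unsupported language: {language}")
    return {
        "pos_tagging": language in POS_TAGGING,
        "named_entity_recognition": language in NAMED_ENTITY_RECOGNITION,
        "sentiment_analysis": language in SENTIMENT_ANALYSIS,
    }


english = capabilities("English")
norwegian = capabilities("Norwegian")
```

According to the lists, English is the only language with all three steps, while Norwegian, Portuguese, and Russian are supported without POS tagging, Named Entity Recognition, or Sentiment Analysis.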