Text Matching with Cosine Similarity

Cosine similarity can be used to find the items in a list that most closely match a user input. The CosineSimilarity class is especially useful for lists that are populated dynamically, which might rule out the use of custom entities.

Installation

Add the file CosineSimilarity.groovy to the Resources in your solution and set the path to /script_lib.

For more details on managing files in your solution, see Resource File Manager.

Usage

You can call the CosineSimilarity class from any script in Teneo Studio, for example in script nodes, in listeners, or as a script condition in transitions. The method signature is as follows:

CosineSimilarity.mostSimilar(String pattern, List candidates, double threshold)

Arguments

The mostSimilar method has the following arguments:

Argument     Description
pattern      The input string
candidates   The possible matches
threshold    The matching threshold, a value between 0 and 1

Results

A list of matching candidates, ordered with the closest match first.
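
Because the closest match comes first, a script will often only need the first element. The following is a minimal sketch of that pattern. The literal list stands in for a real return value, the variable name bestMatch is hypothetical, and the empty-list check assumes that an empty list is returned when no candidate matches.

```groovy
// Stand-in for the value returned by CosineSimilarity.mostSimilar(...);
// the literal list is illustrative only.
def matchingItems = ["Pete's Deli", "Delicious Seafood"]

// The closest match comes first. Fall back to null when the list is empty
// (assumed to be the result when no candidate matches).
def bestMatch = matchingItems ? matchingItems.first() : null

println bestMatch
```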

Example

Suppose we want to allow someone to use natural language to choose a restaurant from a list of nearby restaurants. Let's say the list of nearby restaurants is retrieved using an API and stored in a flow variable 'restaurantNames'. To check if an input contains a restaurant name that is in the list, we can use the following code:

def matchingItems = CosineSimilarity.mostSimilar(_.userInputText, restaurantNames, 0.40)

If the value of 'restaurantNames' was ["Happy Thai", "Delicious Seafood", "Pete's Deli"] and the user input text was "Deli", the value of 'matchingItems' would be:

["Pete's Deli", "Delicious Seafood"]
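
To give a feel for how a similarity score between 0 and 1 can arise, here is a minimal sketch of cosine similarity computed over character bigrams. It is an illustration under assumptions, not the code in CosineSimilarity.groovy: the use of bigrams, the lower-casing, and the helper names bigramCounts and cosineScore are all hypothetical, and the actual class may tokenize and score differently.

```groovy
// Hypothetical helper: count the overlapping character bigrams
// of a lower-cased string, e.g. "deli" -> [de:1, el:1, li:1].
def bigramCounts(String text) {
    def counts = [:]
    def s = text.toLowerCase()
    for (int i = 0; i < s.length() - 1; i++) {
        def gram = s.substring(i, i + 2)
        counts[gram] = (counts[gram] ?: 0) + 1
    }
    counts
}

// Hypothetical helper: cosine of the angle between two bigram count vectors.
// Returns 0 when either string is too short to contain a bigram.
double cosineScore(String a, String b) {
    def va = bigramCounts(a)
    def vb = bigramCounts(b)
    if (!va || !vb) return 0.0d

    // Dot product over the bigrams of the first string
    double dot = 0
    va.each { gram, count -> dot += count * (vb[gram] ?: 0) }

    // Euclidean length of each count vector
    double normA = Math.sqrt(va.values().sum { it * it } as double)
    double normB = Math.sqrt(vb.values().sum { it * it } as double)

    dot / (normA * normB)
}

["Happy Thai", "Delicious Seafood", "Pete's Deli"].each {
    println "${it}: ${cosineScore('Deli', it)}"
}
```

With this sketch, "Deli" scores roughly 0.55 against "Pete's Deli", roughly 0.43 against "Delicious Seafood", and 0 against "Happy Thai", since it shares no bigram with the latter. Under these assumptions, a threshold of 0.40 would keep the first two names and drop the third.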

Credits

The CosineSimilarity class was written by Burt Beckwith. The source can be found in Grails core. For more details on the cosine similarity algorithm, see Fuzzy Matching with Cosine Similarity.
