Machine learning system for digital assistants

US12067006B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12067006-B2
Application number: US-202117350294-A
Country: US
Kind code: B2
Filing date: Jun 17, 2021
Priority date: Jun 23, 2020
Publication date: Aug 20, 2024
Grant date: Aug 20, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A machine learning system for a digital assistant is described, together with a method of training such a system. The machine learning system is based on an encoder-decoder sequence-to-sequence neural network architecture trained to map input sequence data to output sequence data, where the input sequence data relates to an initial query and the output sequence data represents canonical data representation for the query. The method of training involves generating a training dataset for the machine learning system. The method involves clustering vector representations of the query data samples to generate canonical-query original-query pairs in training the machine learning system.
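
The pair-generation idea in the abstract can be illustrated with a small sketch. This is not the patented implementation: the toy 2-D vectors, the distance threshold, the centroid-based merging rule, and the names `cluster` and `make_pairs` are all assumptions chosen for illustration. The sketch groups queries whose vectors lie close together, picks the most frequent query in each group as the canonical query, and emits (original query, canonical query) pairs of the kind that could train a sequence-to-sequence model.

```python
from collections import Counter
from math import dist


def cluster(vectors, threshold):
    """Toy agglomerative clustering (assumed, not from the patent).

    Repeatedly merges the two clusters with the closest centroids until
    no pair of centroids is within `threshold` of each other.
    """
    clusters = [[i] for i in range(len(vectors))]
    dims = len(vectors[0])

    def centroid(members):
        return [sum(vectors[i][d] for i in members) / len(members)
                for d in range(dims)]

    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist(centroid(clusters[a]), centroid(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > threshold:
            break
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def make_pairs(queries, vectors, threshold):
    """Return (original, canonical) pairs, one canonical query per cluster."""
    pairs = []
    for members in cluster(vectors, threshold):
        counts = Counter(queries[i] for i in members)
        canonical = counts.most_common(1)[0][0]  # most frequent query wins
        for query in sorted({queries[i] for i in members}):
            if query != canonical:
                pairs.append((query, canonical))
    return pairs


# Hypothetical data: hand-made vectors standing in for learned embeddings.
queries = ["what's the weather", "weather", "what's the weather",
           "tell me the weather please", "play some music", "music",
           "play some music"]
vectors = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.1), (0.2, 0.1),
           (5.0, 5.0), (5.1, 4.9), (5.0, 5.0)]

pairs = make_pairs(queries, vectors, threshold=1.0)
for original, canonical in pairs:
    print(f"{original!r} -> {canonical!r}")
```

In a real system the vectors would come from an embedding model, and the threshold would be tuned rather than fixed by hand.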

First claim

Opening claim text (preview).

What is claimed is:

1. A method of training a machine learning system for use with a digital assistant, the method comprising: obtaining training data comprising query data samples; obtaining vector representations of the query data samples; clustering the vector representations; determining canonical queries and corresponding query groups based on the clustered vector representations, wherein corresponding query groups correspond to determined canonical queries; performing named entity recognition on the query data samples and canonical queries; replacing a text data for tagged named entities with a named entity type tag; generating paired data samples based on determined canonical queries and selections from the corresponding query groups; and training an encoder-decoder neural network architecture using the paired data samples, wherein the selections from the corresponding query groups are supplied as input sequence data and the determined canonical queries are supplied as output sequence data, wherein the digital assistant is configured to map data representing an initial query to data representing a revised query associated with one of the canonical queries, via the encoder-decoder neural network architecture, the data representing the revised query being further processed to provide a response to the initial query, wherein generating paired data samples comprises filtering generated paired data samples, and wherein filtering comprises: removing paired data samples with a canonical query whose named entity tags do not match the named entity tags in the corresponding selection from the query group.

2. The method of claim 1, wherein obtaining training data comprises obtaining pairs of text data representing queries and responses, and wherein obtaining vector representations of the query data samples comprises converting the pairs of text data to corresponding vector representations.

3. The method of claim 2, wherein a first portion of the text data represents an output of speech-to-text processing that is performed on audio data for a voice query and a second portion of the text data comprises data for use in providing the response to the voice query.

4. The method of claim 1, wherein a given vector representation comprises at least a response vector representation, the response vector representation being a vector representation of the data representing the response to query, the response vector representation being paired with data representing a corresponding query, wherein clustering the vector representations comprises: clustering the response representations based on distances between the response vector representations within vector space, and wherein a canonical query is determined for a given cluster within the clustered vector representations based on a frequency of the query data paired with response vector representations within the given cluster.

5. The method of claim 4, wherein clustering the vector representations comprises: applying a hierarchical clustering method to iteratively combine separate clusters.

6. The method of any one of claim 1, wherein obtaining vector representations comprises: obtaining text representations of at least the responses to the queries; generating embedding vectors for the text representations; and projecting the embedding vectors to a lower dimensionality vector space to output the vector representations.

7. The method of claim 6, wherein generating embedding vectors for the text representations comprises: tokenizing the text representations; and applying a transformer neural network architecture to the tokenized text representations to generate the embedding vectors.

8. The method of claim 6, wherein projecting the embedding vectors to a lower dimensionality vector space comprises selecting a subset of principal components, the principal components being determined following principal component analysis of query data samples.

9. The method of claim 1, wherein clustering the vector representations comprises: performing a first stage of clustering based on vector representations of responses to queries; and performing a second stage of clustering based on vector representations of the queries preceding the responses.

10. The method of claim 1, comprising: filtering the clustered vector representations prior to generating the paired data samples.

11. The method of claim 10, wherein said filtering comprises, for a given cluster: determining a centroid for the given cluster in the clustered vector representations; and unassigning vector representations of queries for the given cluster that are more than a predefined distance from the centroid.

12. The method of claim 10, wherein said filtering comprises, for a given cluster: obtaining a size of the given cluster; and unassigning vector representations of queries for the given cluster responsive to the size being below a predefined threshold.

13. The method of claim 12, wherein said filtering further comprises: reassigning unassigned vector representations of queries to a closest cluster.

14. The method of claim 1, comprising optimizing one or more of the following metrics: one or more clustering distance thresholds; one or more cluster size thresholds; and one or more frequency thresholds for the selection of canonical queries.

15. A non-transitory computer-readable storage medium storing instructions which, when executed by at least one processor, cause the at least one processor to: obtain training data comprising query data samples; obtain vector representations of the query data samples; cluster the vector representations; determine canonical queries and corresponding query groups based on the clustered vector representations, wherein corresponding query groups correspond to determined canonical queries; perform named entity recognition on the query data samples and canonical queries; replace a text data for tagged named entities with a named entity type tag; generate paired data samples based on determined canonical queries and selections from the corresponding query groups; and train an encoder-decoder neural network architecture using the paired data samples, wherein the selections from the corresponding query groups are supplied as input sequence data and the determined canonical queries are supplied as output sequence data, wherein a query interface is configured to map data representing an initial query to data representing a revised query associated with one of the canonical queries, via the encoder-decoder neural network architecture, the data representing the revised query being further processed to provide a response to the initial query, wherein generating paired data samples comprises filtering generated paired data samples, and wherein filtering comprises: removing paired data samples with a canonical query whose named entity tags do not match the named entity tags in the corresponding selection from the query group.

16. The non-transitory computer-readable storage medium of claim 15, wherein obtaining training data comprises obtaining pairs of text data representing queries and responses, and wherein obtaining vector representations of the query data samples comprises converting the pairs of text data to corresponding vector representations.

17. The non-transitory computer-readable storage medium of claim 15, wherein obtaining vector representations of the query data samples comprises: obtaining text representations of at least the responses to the queries; generating
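
The named-entity steps recited in claims 1 and 15 (replace entity text with a type tag, then drop pairs whose tags disagree) can be sketched as follows. This is an illustrative sketch only: the dictionary `GAZETTEER` is a hypothetical stand-in for a real named entity recognizer, the `<TYPE>` tag format is assumed, and "match" is interpreted here as "the same multiset of entity type tags", which the claim text does not spell out.

```python
import re

# Hypothetical lookup table standing in for a trained NER model.
GAZETTEER = {"paris": "CITY", "london": "CITY", "adele": "ARTIST"}


def tag_entities(text):
    """Replace known entities with their type tag.

    Returns the tagged text and the sorted list of tags that were applied.
    """
    tags = []

    def substitute(match):
        word = match.group(0)
        entity_type = GAZETTEER.get(word.lower())
        if entity_type is None:
            return word  # not an entity: leave the word unchanged
        tags.append(entity_type)
        return f"<{entity_type}>"

    tagged = re.sub(r"[A-Za-z]+", substitute, text)
    return tagged, sorted(tags)


def filter_pairs(pairs):
    """Keep only pairs whose original and canonical queries carry the same tags."""
    kept = []
    for original, canonical in pairs:
        tagged_original, original_tags = tag_entities(original)
        tagged_canonical, canonical_tags = tag_entities(canonical)
        if original_tags == canonical_tags:
            kept.append((tagged_original, tagged_canonical))
    return kept


# Hypothetical candidate pairs: (selection from query group, canonical query).
candidates = [
    ("weather in Paris", "what is the weather in London"),
    ("play Adele", "what is the weather in London"),
    ("weather", "what is the weather in London"),
]

training_pairs = filter_pairs(candidates)
print(training_pairs)
```

Only the first candidate survives, because both sides reduce to a single `CITY` tag; the other two are removed because the tags on the two sides differ.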

Assignees

Soundhound Inc, Soundhound Ai Ip Llc

Classifications

  • Supervised learning

  • Transfer learning

  • Hyperparameter optimisation; Meta-learning; Learning-to-learn

  • Auto-encoder networks; Encoder-decoder networks

  • Combinations of networks

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12067006B2 cover?
A machine learning system for a digital assistant is described, together with a method of training such a system. The machine learning system is based on an encoder-decoder sequence-to-sequence neural network architecture trained to map input sequence data to output sequence data, where the input sequence data relates to an initial query and the output sequence data represents canonical data representation for the query.
Who is the assignee on this patent?
Soundhound Inc, Soundhound Ai Ip Llc
What technology area does this patent fall under?
Primary CPC classification G06F16/2425. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 20, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).