What technology area does this patent fall under?

Primary CPC classification G06F40/20. Mapped technology areas include Physics.

When was this patent published?

Publication date: Feb 2, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Spoken language understanding using dynamic vocabulary

US10909972B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10909972-B2
Application number	US-201715805452-A
Country	US
Kind code	B2
Filing date	Nov 7, 2017
Priority date	Nov 7, 2017
Publication date	Feb 2, 2021
Grant date	Feb 2, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An example apparatus for detecting intent in voiced audio includes a receiver to receive one or more word sequence hypotheses related to a voiced audio and a dynamic vocabulary. The apparatus also includes a natural language understander (NLU) to detect an intent and recognize a property related to the intent based on the word sequence hypothesis and the dynamic vocabulary. The apparatus further includes a transmitter to transmit the detected intent and recognized associated property to an application.
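
As a rough illustration of how a dynamic vocabulary could help recognize a property in a word sequence hypothesis, the sketch below tags each word with a semantic class by looking up phrases in a small dictionary. Everything named here (`DYNAMIC_VOCAB`, `tag_dynamic_phrases`, the example phrases and classes, the `"O"` tag for unmatched words) is hypothetical and not taken from the patent. Trying longer phrases before shorter ones follows the ordering mentioned in claim 17; the lookup itself is only a simple stand-in for the trained property recognizer the claims describe.

```python
from typing import Dict, List, Tuple

# Hypothetical dynamic vocabulary: word sequences mapped to semantic classes.
DYNAMIC_VOCAB: Dict[Tuple[str, ...], str] = {
    ("john",): "contact",
    ("john", "smith"): "contact",
    ("abbey", "road"): "album",
}


def tag_dynamic_phrases(
    words: List[str], vocab: Dict[Tuple[str, ...], str]
) -> List[str]:
    """Return one tag per word; "O" marks words outside the dynamic vocabulary."""
    tags = ["O"] * len(words)
    max_len = max((len(phrase) for phrase in vocab), default=0)
    i = 0
    while i < len(words):
        matched = False
        # Try the longest candidate phrase first, then progressively shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = tuple(words[i:i + n])
            if phrase in vocab:
                tags[i:i + n] = [vocab[phrase]] * n
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


hypothesis = "call john smith now".split()
print(tag_dynamic_phrases(hypothesis, DYNAMIC_VOCAB))
# prints ['O', 'contact', 'contact', 'O']
```

Because the two-word entry is tried first, "john smith" is tagged as a single contact rather than matching "john" alone and leaving "smith" untagged.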

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus for detecting intent in voice audio, the apparatus comprising: a receiver to receive a common vocabulary including a list of static words and a parameter value from an application, the parameter value including a dynamic word to be added to a dynamic vocabulary including a set of relations between word sequences and semantic classes and a list of parameters that are to be used to detect dynamic vocabulary phrases; an automatic speech recognizer (ASR) to receive voiced audio and generate a word sequence hypothesis based on a language model including word probabilities derived from the dynamic vocabulary based on the semantic classes; a natural language understander (NLU) including: a feature front-end to generate a bag of features vector including a first sub vector including bag of words feature vector of distinguishing words derived from the common vocabulary based on weighted word counts in the word sequence hypothesis and a second sub vector including a feature vector of dynamic vocabulary detected in the word sequence hypothesis; an intent detector to detect an intent based on the bag of features vector; a property recognizer to compute a semantic tag for each word in the word sequence hypothesis based on the bag of features; and a transmitter to transmit the detected intent and a canonical representation generated based on the semantic tags to the application.

2. The apparatus of claim 1, wherein the NLU includes a trained neural network to detect the intent based on the bag of features generated from the word sequence hypothesis.

3. The apparatus of claim 1, wherein the feature front-end is to generate a set of continuous features based on a received common vocabulary and a set of discrete features based on the dynamic vocabulary, and generate the bag of features to be used to compute the semantic tags.

4. The apparatus of claim 1, wherein the NLU includes a type caster to generate the canonical representation based on one or more words in the word sequence hypothesis with the semantic tags.

5. The apparatus of claim 1, wherein the dynamic vocabulary is generated based on user data received from the application.

6. The apparatus of claim 1, wherein the language model is trained using representative dynamic training data and updated with the parameter value from the application.

7. The apparatus of claim 1, further including a semantic model communicatively coupled to the NLU, wherein the semantic model is trained using the dynamic vocabulary and updated with the parameter value from the application.

8. The apparatus of claim 1, wherein the NLU includes a classifier trained to generate the intent based on the bag of features vector using a model trained by considering a subset of representative training data as dynamic.

9. The apparatus of claim 1, wherein the property recognizer includes a model including a condition random field, a hidden Markov model or a recurrent neuronal network trained considering a sub-set of training data vocabulary as being dynamic.

10. The apparatus of claim 1, wherein the intent detector includes a trained recurrent neural network or deep neural network trained considering a sub-set of training data vocabulary as being dynamic.

11. A method for detecting intent in voiced audio, the method comprising: receiving, via a processor, a common vocabulary including a list of static words and a parameter value from an application, the parameter value including a dynamic word to be added to a dynamic vocabulary including a set of relations between word sequences and semantic classes and a list of parameters used to detect dynamic vocabulary phrases; receiving, via the processor, voiced audio and generating a word sequence hypothesis based on a language model including word probabilities derived from the dynamic vocabulary based on the semantic classes; generating, via the processor, a bag of features vector including a first sub vector including bag of words feature vector of distinguishing words derived from the common vocabulary based on weighted word counts in the word sequence hypothesis and a second sub vector including a feature vector of dynamic vocabulary detected in the word sequence hypothesis; detecting, via the processor, an intent based on the bag of features vector; computing, via the processor, a semantic tag for each word in the word sequence hypothesis based on the bag of features; and sending, via the processor, the detected intent and a canonical representation generated based on the semantic tags to the application.

12. The method of claim 11, wherein detecting the intent includes processing the bag of features using a model trained using representative dynamic training data.

13. The method of claim 11, wherein generating the bag of features includes generating a set of continuous features based on the received common vocabulary, generating a set of discrete features based on the dynamic vocabulary, and generating the bag of features to be used to compute the semantic tags.

14. The method of claim 11, wherein computing the semantic tags includes semantically tagging a word in the word sequence hypothesis based on a generated bag of features.

15. The method of claim 11, further including generating a canonical representation based on one or more words in the word sequence hypothesis with the semantic tags.

16. The method of claim 11, further including training a model to detect the intent, wherein training the model includes: receiving, via the processor, training data; randomly sampling, via the processor, the training data to generate common training data and representative dynamic training data; calculating, via the processor, the common vocabulary based on the common training data and the dynamic vocabulary based on the representative dynamic training data; and training, via the processor, the model based on the common training data, the representative dynamic training data, the common vocabulary, and the dynamic vocabulary.

17. The method of claim 11, wherein detecting the intent and computing the semantic tags includes detecting a longer dynamic vocabulary before a shorter dynamic vocabulary.

18. The method of claim 11, further including: receiving, via the processor, user data from the application: and generating the dynamic vocabulary based on the user data.

19. At least one non-transitory computer readable medium for detecting intent in voiced audio comprising instructions stored therein that, in response to being executed on a computing device, cause the computing device to at least: receive a common vocabulary including a list of static words and a parameter value from an application, the parameter value including a dynamic word to be added to a dynamic vocabulary including a set of relations between word sequences and semantic classes and a list of parameters that are to be used to detect dynamic vocabulary phrases; receive voiced audio and generating a word sequence hypothesis based on a language model including word probabilities derived from the dynamic vocabulary based on the semantic classes; generate a bag of features vector including a first sub vector including bag of words feature vector of distinguishing words derived from the common vocabulary based on weighted word counts in the word sequence hypothesis and a second sub vector including a feature vector of dynamic vocabulary detected in the word sequence hypothesis; detect an intent based on the bag of features vector; compute a semantic tag for ea
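
The two-part bag of features vector in the independent claims can be pictured with a small sketch. The claims state only that the first sub-vector is based on weighted word counts over the common vocabulary and that the second reflects dynamic vocabulary detected in the hypothesis; the specific weights, the one-slot-per-semantic-class layout, and the 0/1 indicator values below are assumptions made for illustration, as are all names (`COMMON_VOCAB`, `WORD_WEIGHTS`, `SEMANTIC_CLASSES`, `bag_of_features`).

```python
from collections import Counter
from typing import List

# Hypothetical static (common) vocabulary and per-word weights.
COMMON_VOCAB = ["call", "play", "send"]
WORD_WEIGHTS = {"call": 1.0, "play": 1.0, "send": 0.5}

# Hypothetical semantic classes used by the dynamic vocabulary.
SEMANTIC_CLASSES = ["contact", "album"]


def bag_of_features(words: List[str], tags: List[str]) -> List[float]:
    """Concatenate a weighted bag-of-words sub-vector with a
    dynamic-vocabulary indicator sub-vector.

    `tags` holds one semantic class per word, or "O" when the word was not
    detected as part of a dynamic vocabulary phrase.
    """
    counts = Counter(words)
    # First sub-vector: weighted count of each common-vocabulary word.
    first = [WORD_WEIGHTS[w] * counts[w] for w in COMMON_VOCAB]
    # Second sub-vector: 1.0 if any word carries the class tag, else 0.0.
    present = set(tags)
    second = [1.0 if c in present else 0.0 for c in SEMANTIC_CLASSES]
    return first + second


words = ["call", "john", "smith", "now"]
tags = ["O", "contact", "contact", "O"]
print(bag_of_features(words, tags))
# prints [1.0, 0.0, 0.0, 1.0, 0.0]
```

In this layout the vector length depends on the number of semantic classes rather than on the individual dynamic words, so adding a new contact name would not change the input size seen by a downstream intent classifier. That is one plausible reading of why the claims separate the two sub-vectors, not a statement of the patented implementation.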

Assignees

Intel Corp

Inventors

Classifications

G06F40/20 · Primary
Natural language analysis (semantic analysis of natural language G06F40/30) · CPC title
G10L15/1815 · Primary
Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title
G10L15/063
Training · CPC title
G10L2015/0635
updating or merging of old and new templates; Mean values; Weighting · CPC title
G10L15/193
Formal grammars, e.g. finite state automata, context free grammars or word networks · CPC title

Patent family

Related publications grouped by family.

View patent family 65023108

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10909972B2 cover?: An example apparatus for detecting intent in voiced audio includes a receiver to receive one or more word sequence hypotheses related to a voiced audio and a dynamic vocabulary. The apparatus also includes a natural language understander (NLU) to detect an intent and recognize a property related to the intent based on the word sequence hypothesis and the dynamic vocabulary. The apparatus furthe…
Who is the assignee on this patent?: Intel Corp