Hierarchical speech recognition resolution

US10614811B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10614811-B2
Application number	US-201715858763-A
Country	US
Kind code	B2
Filing date	Dec 29, 2017
Priority date	Dec 29, 2017
Publication date	Apr 7, 2020
Grant date	Apr 7, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system, method, apparatus and computer readable medium for hierarchical speech recognition resolution. The method of hierarchical speech recognition resolution on a platform includes receiving a speech stream from a microphone. The speech stream is resolved using a lowest possible level automatic speech recognition (ASR) engine of multi-level ASR engines. The selection of the lowest possible level ASR engine is based on policies defined for the platform. If resolution of the speech stream is rated less than a predetermined confidence level, the resolution of the speech stream is pushed to a next higher-level ASR engine of the multi-level ASR engines until the resolution of the speech stream meets the predetermined confidence level without violating one or more policies.
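
The escalation loop the abstract describes can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Engine` type, the `resolve` function, the stub recognizers, and the numeric confidence scale are all hypothetical. The sketch also assumes that a tier forbidden by policy is skipped, and that `None` is returned when no permitted tier reaches the threshold; the abstract does not specify either behavior.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical signature: a recognizer returns (text, confidence in [0, 1]).
Recognizer = Callable[[bytes], Tuple[str, float]]


@dataclass
class Engine:
    name: str
    level: int              # 0 is the lowest tier in the hierarchy
    recognize: Recognizer


def resolve(
    speech: bytes,
    engines: List[Engine],
    threshold: float,
    allowed: Callable[[Engine], bool],
) -> Optional[Tuple[str, str, float]]:
    """Try engines from lowest to highest level.

    Returns (engine name, text, confidence) for the first permitted engine
    whose confidence meets the threshold, or None if none does.
    """
    for engine in sorted(engines, key=lambda e: e.level):
        if not allowed(engine):
            continue  # assumption: a tier forbidden by policy is skipped
        text, confidence = engine.recognize(speech)
        if confidence >= threshold:
            return engine.name, text, confidence
        # Below threshold: push resolution to the next higher-level engine.
    return None


# Stub engines with made-up outputs, standing in for the four tiers.
engines = [
    Engine("voice-trigger", 0, lambda s: ("", 0.20)),
    Engine("dsp", 1, lambda s: ("play music", 0.55)),
    Engine("local-cpu", 2, lambda s: ("play some music", 0.91)),
    Engine("cloud", 3, lambda s: ("play some music", 0.99)),
]

print(resolve(b"audio", engines, threshold=0.8, allowed=lambda e: True))
# ('local-cpu', 'play some music', 0.91)
```

With these stub values the two lowest tiers fall short of the 0.8 threshold, so the request stops at the local processor tier and never reaches the cloud.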

First claim

Opening claim text (preview).

What is claimed is:
1. A platform having hierarchical speech resolution, comprising: network interface circuitry to receive a speech stream from a microphone; a processor coupled to the network interface circuitry; one or more memory devices coupled to the processor, the one or more memory devices including instructions, which when executed by the processor, cause the platform to: resolve the speech stream using a lowest possible level automatic speech recognition (ASR) engine of multi-level ASR engines, wherein the multi-level ASR engines include a voice trigger based ASR engine for limited keywords and a vocabulary set of 30-40 words, an audio digital signal processor (DSP) based ASR engine having a vocabulary of a few hundred words, a local processor based ASR engine having a large vocabulary, and a cloud based ASR engine having a very large vocabulary with unlimited processing and memory and wherein selection of the lowest possible level ASR engine is based on policies defined for the platform; and when resolution of speech is less than a predetermined confidence rating, push resolution of the speech stream to a next higher-level ASR engine of the multi-level ASR engines until the resolution of the speech stream meets the predetermined confidence rating without violating one or more policies.
2. The platform of claim 1, wherein the multi-level ASR engines comprise a hierarchical structure to provide more compute power and word recognition with each higher-level ASR engine.
3. The platform of claim 2, wherein the hierarchical structure of the multi-level ASR engines comprises additional processing power and a larger vocabulary for each higher-level ASR engine of the multi-level ASR engines.
4. The platform of claim 1, wherein the confidence rating indicates how well an ASR engine resolved the speech stream, wherein if accuracy of the resolved speech stream is below a predefined level, the instructions, when executed, are to push resolution of the speech stream to the next higher-level ASR engine, wherein the next higher-level ASR engine includes more compute power and a larger vocabulary subsystem.
5. The platform of claim 4, wherein if the accuracy of the resolved speech stream is equal to or exceeds the predefined level, the instructions, when executed, are to accept the resolution of the speech stream without pushing resolution of the speech stream to the next higher-level ASR engine.
6. The platform of claim 1, wherein the one or more policies include the confidence rating, a privacy setting, user identity, system connection states, time of day, response time, and other indicators requiring resolution of the speech stream at lower-level or higher-level ASR engines.
7. The platform of claim 6, wherein the privacy setting prevents specific speech from going to a cloud based ASR engine, wherein all data remains local to the platform.
8. An apparatus having hierarchical speech resolution on a platform comprising: one or more substrates; and logic coupled to the one or more substrates, wherein the logic includes one or more of configurable logic or fixed-functionality hardware logic, the logic coupled to the one or more substrates to: receive a speech stream from a microphone; resolve the speech stream using a lowest possible level automatic speech recognition (ASR) engine of multi-level ASR engines, wherein the multi-level ASR engines include a voice trigger based ASR engine for limited keywords and a vocabulary set of 30-40 words, an audio digital signal processor (DSP) based ASR engine having a vocabulary of a few hundred words, a local processor based ASR engine having a large vocabulary, and a cloud based ASR engine having a very large vocabulary with unlimited processing and memory and wherein selection of the lowest possible level ASR engine is based on policies defined for the platform; and when resolution of speech is less than a predetermined confidence rating, push resolution of the speech stream to a next higher-level ASR engine of the multi-level ASR engines until the resolution of the speech stream meets the predetermined confidence rating without violating one or more policies.
9. The apparatus of claim 8, wherein the multi-level ASR engines comprise a hierarchical structure to provide more compute power and word recognition with each higher-level ASR engine.
10. The apparatus of claim 9, wherein the hierarchical structure of the multi-level ASR engines comprises additional processing power and a larger vocabulary for each higher-level ASR engine of the multi-level ASR engines.
11. The apparatus of claim 8, wherein the one or more policies include the confidence rating, a privacy setting, user identity, system connection states, time of day, response time, and other indicators requiring resolution of the speech stream at lower-level or higher-level ASR engines.
12. The apparatus of claim 11, wherein the privacy setting prevents specific speech from going to a cloud based ASR engine, wherein all data remains local to the platform.
13. A method of hierarchical speech resolution on a platform comprising: receiving a speech stream from a microphone; resolving the speech stream using a lowest possible level automatic speech recognition (ASR) engine of multi-level ASR engines, wherein the multi-level ASR engines include a voice trigger based ASR engine for limited keywords and a vocabulary set of 30-40 words, an audio digital signal processor (DSP) based ASR engine having a vocabulary of a few hundred words, a local processor based ASR engine having a large vocabulary, and a cloud based ASR engine having a very large vocabulary with unlimited processing and memory and wherein selection of the lowest possible level ASR engine is based on policies defined for the platform; and when resolution of speech is less than a predetermined confidence rating, pushing resolution of the speech stream to a next higher-level ASR engine of the multi-level ASR engines until the resolution of the speech stream meets the predetermined confidence rating without violating one or more policies.
14. The method of claim 13, wherein the multi-level ASR engines comprise a hierarchical structure to provide more compute power and word recognition with each higher-level ASR engine.
15. The method of claim 14, wherein the hierarchical structure of the multi-level ASR engines comprises additional processing power and a larger vocabulary for each higher-level ASR engine of the multi-level ASR engines.
16. The method of claim 13 wherein the one or more policies include the confidence rating, a privacy setting, user identity, system connection states, time of day, response time, and other indicators requiring resolution of the speech stream at lower-level or higher-level ASR engines.
17. The method of claim 16, wherein the privacy setting prevents specific speech from going to a cloud based ASR engine, wherein all data remains local to the platform.
18. At least one non-transitory computer readable medium, comprising a set of instructions, which when executed by a computing device, cause the computing device to: receive a speech stream from a microphone; resolve the speech stream using a lowest possible level automatic speech recognition (ASR) engine of multi-level ASR engines, wherein the multi-level ASR engines include a voice trigger based ASR engine for limited keywords and a vocabulary set of 30-40 words, an audio digital signal processor (DSP) based ASR engine having a vocabulary of a few hundred words, a local processor based ASR engine having a large vocabu
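
As a rough illustration of how the policies in claims 6 and 7 might restrict which of the four claimed tiers are available, consider the sketch below. The `Tier`, `Policy`, and `permitted_tiers` names are hypothetical and do not come from the patent. Only two of the listed policy inputs are modeled (the privacy setting and the connection state), and the privacy setting is simplified to apply to all speech, whereas claim 7 speaks of "specific speech".

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Tier(IntEnum):
    """The four engine tiers named in the independent claims, lowest first."""
    VOICE_TRIGGER = 0     # limited keywords, vocabulary of 30-40 words
    AUDIO_DSP = 1         # vocabulary of a few hundred words
    LOCAL_PROCESSOR = 2   # large vocabulary
    CLOUD = 3             # very large vocabulary


@dataclass
class Policy:
    # Hypothetical fields modeling two of the policy inputs listed in claim 6.
    keep_local: bool = False   # privacy setting: no speech leaves the platform
    online: bool = True        # system connection state


def permitted_tiers(policy: Policy) -> List[Tier]:
    """Return the tiers that may be used under the policy, lowest first."""
    tiers = list(Tier)
    if policy.keep_local or not policy.online:
        # Assumption: both conditions simply rule out the cloud tier.
        tiers.remove(Tier.CLOUD)
    return tiers


print([t.name for t in permitted_tiers(Policy(keep_local=True))])
# ['VOICE_TRIGGER', 'AUDIO_DSP', 'LOCAL_PROCESSOR']
```

Under this simplification, a privacy setting caps escalation at the local processor tier, so a low-confidence result there cannot be pushed to the cloud.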

Assignees

Intel Corp

Inventors

Classifications

G10L2015/228 · of application context
G10L15/32 · Primary · Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
G10L2015/088 · Word spotting
G10L15/22 · Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L15/30 · Distributed recognition, e.g. in client-server systems, for mobile phones or network applications

Patent family

Related publications grouped by family.

View patent family 65038931

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10614811B2 cover?: A system, method, apparatus and computer readable medium for hierarchical speech recognition resolution. The method of hierarchical speech recognition resolution on a platform includes receiving a speech stream from a microphone. The speech stream is resolved using a lowest possible level automatic speech recognition (ASR) engine of multi-level ASR engines. The selection of the lowest possible …
Who is the assignee on this patent?: Intel Corp
What technology area does this patent fall under?: Primary CPC classification G10L15/32. Mapped technology areas include Physics.
When was this patent published?: Publication date Apr 7, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).