US11087749B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11087749-B2 |
| Application number | US-201816227996-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 20, 2018 |
| Priority date | Dec 20, 2018 |
| Publication date | Aug 10, 2021 |
| Grant date | Aug 10, 2021 |
Official abstract text for this publication.
Systems, methods, and devices for human-machine interfaces for improving machine understanding and fulfillment of utterance-based requests provided via the interfaces. Multiple candidate understandings from multiple stages of a natural language processing flow are preserved for arbitration and choosing by an arbitrator that applies arbitration rules to the plurality of candidates and chooses a single candidate for initiation of a corresponding service. In an embodiment, the arbitrator uses a media content taste profile to choose a candidate understanding for initiation of a corresponding service.
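The taste-profile embodiment described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Candidate` fields, the `arbitrate` function, the affinity scores, and the example utterances are all hypothetical, and the single arbitration rule shown (pick the candidate whose slot values best match the account's taste profile) is only one rule an arbitrator might apply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate understanding, preserved across pipeline stages (hypothetical shape)."""
    transcription: str  # text hypothesis from speech recognition
    intent: str         # intent assigned by language understanding
    slots: tuple        # slots as (key, value) pairs
    service: str        # service this candidate would start


def arbitrate(candidates, taste_profile):
    """Choose a single candidate using taste-profile affinity (assumed rule).

    taste_profile maps a slot value (e.g. an artist name) to an affinity score.
    Candidates with no matching slot value score 0.0; ties go to the earliest.
    """
    def affinity(candidate):
        scores = (taste_profile.get(value, 0.0) for _, value in candidate.slots)
        return max(scores, default=0.0)

    return max(candidates, key=affinity)


# Several understandings of one utterance are kept instead of committing early.
candidates = [
    Candidate("play the cure", "play", (("artist", "the cure"),), "play_media"),
    Candidate("play the chore", "play", (("track", "the chore"),), "play_media"),
    Candidate("play secure", "search", (("query", "secure"),), "search_media"),
]

# Made-up affinities for the account associated with the utterance.
taste_profile = {"the cure": 0.9, "secure": 0.1}

chosen = arbitrate(candidates, taste_profile)
print(chosen.transcription, "->", chosen.service)
```

Keeping all three candidates until the final step is what lets account-level information override a transcription that an earlier stage might otherwise have discarded.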
Opening claim text (preview).
The invention claimed is:
1. A natural language processing system, comprising: an automated speech recognizer configured to generate a plurality of text transcriptions from an utterance; a natural language understanding subsystem configured to receive the plurality of text transcriptions and provide a plurality of slot-intent models as output, wherein each slot-intent model includes an intent and one or more slots having key-value pairs; and a fulfillment manager configured to receive the plurality of slot-intent models and start a service based thereon, wherein the fulfillment manager includes: a fulfillment strategy data store that stores a plurality of fulfillment strategies, wherein each fulfillment strategy of the plurality of fulfillment strategies describes rules for starting at least one of a plurality of services; a strategy selector that selects, for each of the plurality of slot-intent models, one or more selected fulfillment strategies from the plurality of fulfillment strategies based on a given slot-intent model wherein the selected fulfillment strategies are each paired with a corresponding one of the plurality of slot-intent models to generate a plurality of pairings; and an arbitrator configured to receive the pairings, choose a chosen pairing of the pairings, and initiate one of the services associated with the chosen pairing, wherein the arbitrator is configured to choose the chosen pairing based on a mode analysis performed on the plurality of pairings, the mode analysis including identifying a number of incidences of components of each of the pairings, a first identified number of incidences of a first of the components being less than a second identified number of incidences of a second of the components, the mode analysis further including, based on the first number and the second number, eliminating a first of the pairings corresponding to a first of the components and choosing a second of the pairings corresponding to the second of the components as the chosen pairing.
2. The system of claim 1, wherein the arbitrator is configured to choose the chosen fulfillment strategy based on the taste profile of the account associated with the utterance.
3. The system of claim 1, wherein the arbitrator is configured to choose the chosen fulfillment strategy based on a first set of confidence scores provided by the automated speech recognizer, a second set of confidence scores provided by the natural language understanding system, and a third set of confidence scores provided by the strategy selector.
4. The system of claim 1, wherein the automated speech recognizer, or the natural language understanding subsystem, or the fulfillment manager are configured to apply one or more elimination rules to eliminate one or more of the plurality of text transcriptions, or one or more of the plurality of slot-intent models, or one or more of the plurality of selected fulfillment strategies before receipt of the selected fulfillment strategies by the arbitrator.
5. The system of claim 1, wherein the plurality of fulfillment strategies include a play media content strategy, a recommend media content strategy, and a search media content strategy.
6. The system of claim 1, wherein every one of the plurality of text transcriptions corresponds to one of the selected fulfillment strategies received by the arbitrator.
7. A method, comprising: generating, using an automated speech recognizer, a plurality of text transcriptions from an utterance; providing, using a natural language understanding system, a plurality of slot-intent models as output, wherein each slot-intent model includes an intent and one or more slots having key-value pairs; and starting a service, using a fulfillment manager and based on the plurality of slot-intent models, wherein the using the fulfillment manager includes: selecting, for each of the plurality of slot-intent models, using a strategy selector, one or more selected fulfillment strategies from a plurality of fulfillment strategies based on a given slot-intent model, wherein the selected fulfillment strategies are each paired with a corresponding one of the plurality of slot-intent models to generate a plurality of pairings; choosing, using an arbitrator, a chosen pairing of the pairings; and initiating one of a plurality of services, the one of a plurality of services being associated with the chosen pairing, wherein the arbitrator is configured to choose the chosen pairing based on a mode analysis performed on the plurality of pairings, the mode analysis including identifying a number of incidences of components of each of the pairings, a first identified number of incidences of a first of the components being less than a second identified number of incidences of a second of the components, the mode analysis further including, based on the first number and the second number, eliminating a first of the pairings corresponding to a first of the components and choosing a second of the pairings corresponding to the second of the components as the chosen pairing.
8. The method of claim 7, wherein the choosing is based on the taste profile of the account associated with the utterance.
9. The method of claim 7, wherein the choosing is based on a first set of confidence scores provided by the automated speech recognizer, a second set of confidence scores provided by the natural language understanding system, and a third set of confidence scores provided by the strategy selector.
10. The method of claim 7, further comprising applying, by the automated speech recognizer, or the natural language understanding subsystem, or the fulfillment manager, one or more elimination rules to eliminate one or more of the plurality of text transcriptions, or one or more of the plurality of slot-intent models, or one or more of the plurality of selected fulfillment strategies before receipt of the selected fulfillment strategies by the arbitrator.
11. The method of claim 7, wherein the plurality of fulfillment strategies include a play media content strategy, a recommend media content strategy, and a search media content strategy.
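The mode analysis recited in claims 1 and 7 can be sketched as a frequency count over a component of each pairing. This is an illustrative reading rather than the claimed implementation. The claims do not say which component is counted or how ties are resolved, so counting the fulfillment strategy and keeping the first surviving pairing are assumptions, as are the `SlotIntentModel`, `Pairing`, and `mode_analysis` names and the example data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotIntentModel:
    intent: str
    slots: tuple  # slots as (key, value) pairs


@dataclass(frozen=True)
class Pairing:
    """A slot-intent model paired with a selected fulfillment strategy."""
    model: SlotIntentModel
    strategy: str


def mode_analysis(pairings, component=lambda p: p.strategy):
    """Return (chosen pairing, incidence counts) for the given component.

    Pairings whose component occurs less often than the most frequent one are
    eliminated. Tie-breaking by input order is an assumption of this sketch.
    """
    counts = Counter(component(p) for p in pairings)
    highest = max(counts.values())
    survivors = [p for p in pairings if counts[component(p)] == highest]
    return survivors[0], counts


pairings = [
    Pairing(SlotIntentModel("play", (("artist", "the cure"),)), "play_media"),
    Pairing(SlotIntentModel("play", (("track", "the chore"),)), "play_media"),
    Pairing(SlotIntentModel("search", (("query", "secure"),)), "search_media"),
]

chosen, counts = mode_analysis(pairings)
print(dict(counts))     # incidences per strategy
print(chosen.strategy)  # strategy of the chosen pairing
```

Passing a different `component` function, for example `lambda p: p.model.intent`, counts incidences of intents instead of strategies without changing the elimination step.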
Semantic analysis · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Execution procedure of a spoken command · CPC title
Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title