Techniques for wake-up word recognition and related systems and methods

US11600269B2 · US · B2

Patent metadata
Publication number: US-11600269-B2
Application number: US-201616308849-A
Country: US
Kind code: B2
Filing date: Jun 15, 2016
Priority date: Jun 15, 2016
Publication date: Mar 7, 2023
Grant date: Mar 7, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system for detection of at least one designated wake-up word for at least one speech-enabled application. The system comprises at least one microphone; and at least one computer hardware processor configured to perform: receiving an acoustic signal generated by the at least one microphone at least in part as a result of receiving an utterance spoken by a speaker; obtaining information indicative of the speaker's identity; interpreting the acoustic signal at least in part by determining, using the information indicative of the speaker's identity and automated speech recognition, whether the utterance spoken by the speaker includes the at least one designated wake-up word; and interacting with the speaker based, at least in part, on results of the interpreting.
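The flow in the abstract can be illustrated with a minimal Python sketch. It assumes the speaker has already been identified and that automated speech recognition has already produced a transcript. The registry contents, the `WakeWord` type, and the `detect_wake_word` function are hypothetical names chosen for illustration and do not come from the patent.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class WakeWord:
    phrase: str       # designated wake-up word or phrase
    application: str  # speech-enabled application it wakes


# Hypothetical registry: speaker identity -> wake-up words designated for that speaker.
WAKE_WORDS_BY_SPEAKER = {
    "driver": [WakeWord("hello car", "navigation")],
    "rear_passenger": [WakeWord("hey media", "entertainment")],
}


def _tokens(text: str) -> List[str]:
    """Lower-case the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def _contains_phrase(utterance: List[str], phrase: List[str]) -> bool:
    """True if the phrase tokens appear as a contiguous run in the utterance tokens."""
    n = len(phrase)
    if n == 0:
        return False
    return any(utterance[i:i + n] == phrase for i in range(len(utterance) - n + 1))


def detect_wake_word(speaker_id: str, transcript: str) -> Optional[WakeWord]:
    """Check the transcript only against the wake-up words of the identified speaker."""
    utterance = _tokens(transcript)
    for wake_word in WAKE_WORDS_BY_SPEAKER.get(speaker_id, []):
        if _contains_phrase(utterance, _tokens(wake_word.phrase)):
            return wake_word
    return None
```

In this sketch, `detect_wake_word("driver", "Hello car, take me home")` returns the driver's navigation wake-up word, while the same phrase spoken by `"rear_passenger"` returns `None` because it is not designated for that speaker.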

First claim

Opening claim text (preview).

What is claimed is:
1. A system for detecting at least one designated wake-up word for at least one speech-enabled application, the system comprising: at least one computer hardware processor configured to perform: receiving a first acoustic signal generated by at least one first microphone at least in part as a result of receiving an utterance spoken by a first speaker; obtaining information indicative of a first speaker's identity by processing, at least in part, the first acoustic signal; using the information indicative of the first speaker's identity to determine whether the utterance spoken by the first speaker includes at least one or more wake-up words associated with the first speaker's identity; in response to determining that the utterance spoken by the first speaker includes the at least one designated wake-up word, interacting with the speaker, wherein the at least one designated wake-up word includes a first designated wake-up word for a first speech-enabled application of the at least one speech-enabled application, and wherein the first designated wake-up word is specific to the first speaker such that no other speaker can use the first designated wake-up word, receiving a second acoustic signal generated by at least one second microphone at least in part as a result of receiving, concurrently with the first microphone, a second utterance spoken by a second speaker; obtaining information indicative of the second speaker's identity; interpreting the second acoustic signal at least in part by determining, using the information indicative of the second speaker's identity and automated speech recognition, whether the second utterance spoken by the second speaker includes a second designated wake-up word for a second speech-enabled application specific to the speaker's identity; and interacting with the second speaker based, at least in part, on results of the interpreting.
2. The system of claim 1, wherein interacting with the speaker comprises allowing the speaker to control the at least one speech-enabled application.
3. The system of claim 1, wherein the at least one computer hardware processor is configured to use the information indicative of the speaker's identity to determine whether the speaker is authorized to control the at least one speech-enabled application, and to allow the speaker to control the at least one speech-enabled application if it is determined that the speaker is authorized to control the at least one speech-enabled application, and not allow the speaker to control the at least one speech-enabled application if it is determined that the speaker is not authorized to control the at least one speech-enabled application.
4. The system of claim 1, wherein obtaining the speaker's identity comprises: obtaining speech characteristics from the first acoustic signal; comparing the obtained speech characteristics against stored speech characteristics for each of multiple speakers registered with the system.
5. The system of claim 1, wherein determining whether the utterance spoken by the speaker includes the at least one designated wake-up word comprises: using automated speech recognition to determine whether the utterance spoken by the speaker includes a wake-up word in the one or more wake-up words, wherein the automated speech recognition is performed using the one or more wake-up words associated with the speaker identity.
6. The system of claim 1, wherein obtaining information indicative of the speaker's identity comprises determining a position of the speaker in an environment.
7. The system of claim 6, wherein the at least one computer hardware processor is configured to determine, using the position of the speaker in the environment, whether the speaker is authorized to control the at least one speech-enabled application, and to allow the speaker to control the at least one speech-enabled application if it is determined that the speaker is authorized to control the at least one speech-enabled application, and not allow the speaker to control the at least one speech-enabled application if it is determined that the speaker is not authorized to control the at least one speech-enabled application.
8. The system of claim 6, wherein the at least one computer hardware processor is configured to determine the position of the speaker inside a vehicle based, at least in part, on information gathered by at least one sensor in the vehicle.
9. The system of claim 6, wherein the at least one computer hardware processor receives the first and second acoustic signals from a plurality of microphones, and wherein the position of the speaker is determined using the acoustic signals received from the plurality of microphones.
10. The system of claim 1, wherein the at least one microphone comprises a plurality of microphones installed in a respective plurality of acoustic zones inside of a vehicle, wherein each of the plurality of acoustic zones comprises a seating area for a passenger in the vehicle.
11. The system of claim 1, wherein interacting with the speaker comprises inferring, based at least in part on the information indicative of the speaker's identity, at least one action to take when interacting with the speaker.
12. The system of claim 1, wherein obtaining information indicative of the speaker's identity comprises obtaining the speaker's identity; and wherein determining whether the utterance spoken by the speaker includes the at least one designated wake-up word comprises: accessing a list of wake-up words associated with the speaker's identity; and determining whether the utterance includes any wake-up word in the list of wake-up words associated with the speaker's identity.
13. The system of claim 1, wherein determining whether the utterance spoken by the speaker includes the at least one designated wake-up word comprises: compensating for interference received by the at least one microphone by using the information associated with the speaker's identity.
14. The system of claim 1, wherein the at least one computer hardware processor is further configured to store the information about the speaker's identity in at least one data store.
15. The system of claim 14, wherein the at least one data store comprises a plurality of data records including a first data record, the first data record comprising information selected from the group consisting of an identity of a particular speaker, a position of the particular speaker in an environment, a list of one or more wake-up words associated with the particular speaker, a list of one or more speech-enabled applications that the particular speaker is allowed to control, a list of one or more speech-enabled applications that the particular speaker is not allowed to control, and information obtained from one or more sensors.
16. A method for detecting at least one designated wake-up word for at least one speech-enabled application, the method comprising: using at least one computer hardware processor to perform: receiving a first acoustic signal generated by at least one microphone at least in part as a result of receiving an utterance spoken by a first speaker; obtaining information indicative of the first speaker's identity; using the information indicative of the first speaker's identity to determine whether the utterance spoken by the first speaker includes the at least one designated wake-up word associated with the speaker's identity; in response to determining that the utterance spoken by the first speaker includes the at least one designated wake-up word, interacting with the first speaker; r
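Claims 3, 4, and 15 describe comparing speech characteristics against stored ones, keeping a per-speaker data record, and checking authorization. The following Python sketch shows one way these pieces could fit together. The claims do not specify a comparison method, so the cosine similarity, the 0.8 threshold, and the `SpeakerRecord`, `identify_speaker`, and `may_control` names are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Set


@dataclass
class SpeakerRecord:
    """Hypothetical data record, loosely following the fields listed in claim 15."""
    identity: str
    voiceprint: Sequence[float]                # stored speech characteristics (assumed feature vector)
    position: Optional[str] = None             # e.g. a seat or acoustic zone
    wake_words: List[str] = field(default_factory=list)
    allowed_apps: Set[str] = field(default_factory=set)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_speaker(
    features: Sequence[float],
    records: Sequence[SpeakerRecord],
    threshold: float = 0.8,  # assumed acceptance threshold, not from the patent
) -> Optional[SpeakerRecord]:
    """Return the registered speaker whose stored characteristics match best, if close enough."""
    best: Optional[SpeakerRecord] = None
    best_score = threshold
    for record in records:
        score = cosine_similarity(features, record.voiceprint)
        if score >= best_score:
            best, best_score = record, score
    return best


def may_control(record: Optional[SpeakerRecord], application: str) -> bool:
    """An unidentified speaker, or one without permission, may not control the application."""
    return record is not None and application in record.allowed_apps
```

A real system would derive the feature vector from the acoustic signal with a speaker-recognition model; here it is simply passed in, so the sketch covers only the comparison and authorization steps.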

Assignees

Cerence Operating Co

Inventors

Classifications

  • Word spotting · CPC title

  • Interactive procedures; Man-machine interfaces · CPC title

  • G10L15/22 · Primary

    Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • Speech classification or search · CPC title

  • Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11600269B2 cover?
A system for detection of at least one designated wake-up word for at least one speech-enabled application. The system comprises at least one microphone; and at least one computer hardware processor configured to perform: receiving an acoustic signal generated by the at least one microphone at least in part as a result of receiving an utterance spoken by a speaker; obtaining information indicat…
Who is the assignee on this patent?
Cerence Operating Co
What technology area does this patent fall under?
Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 7, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).