What technology area does this patent fall under?

Primary CPC classification: G06F3/16 (Sound input; Sound output). Mapped technology area: Physics (CPC section G).

When was this patent published?

Publication date: May 15, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Dynamic threshold for speaker verification

US9972323B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9972323-B2
Application number	US-201715599578-A
Country	US
Kind code	B2
Filing date	May 19, 2017
Priority date	Jun 24, 2014
Publication date	May 15, 2018
Grant date	May 15, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a dynamic threshold for speaker verification are disclosed. In one aspect, a method includes the actions of receiving, for each of multiple utterances of a hotword, a data set including at least a speaker verification confidence score, and environmental context data. The actions further include selecting from among the data sets, a subset of the data sets that are associated with a particular environmental context. The actions further include selecting a particular data set from among the subset of data sets based on one or more selection criteria. The actions further include selecting, as a speaker verification threshold for the particular environmental context, the speaker verification confidence score. The actions further include providing the speaker verification threshold for use in performing speaker verification of utterances that are associated with the particular environmental context.
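The selection steps in the abstract can be sketched in Python. This is a minimal illustration, not the patented implementation: the record fields, the string context labels, and the quantile-based selection criterion (`reject_fraction`) are all assumptions, since the abstract only says "one or more selection criteria".

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class UtteranceRecord:
    """One data set per hotword utterance (field names are hypothetical)."""
    confidence: float  # speaker verification confidence score
    context: str       # environmental context label, e.g. "quiet" or "noisy"


def select_threshold(records: Iterable[UtteranceRecord],
                     context: str,
                     reject_fraction: float = 0.1) -> Optional[float]:
    """Return a verification threshold for `context`, or None if no data.

    Assumed selection criterion: pick the record whose score sits at the
    `reject_fraction` quantile of past scores for this context.
    """
    # Step 1: keep only the data sets for the requested context.
    subset = sorted(r.confidence for r in records if r.context == context)
    if not subset:
        return None
    # Step 2: select one data set by the assumed criterion.
    index = min(int(reject_fraction * len(subset)), len(subset) - 1)
    # Step 3: that data set's confidence score becomes the threshold.
    return subset[index]


def verify(score: float, threshold: float) -> bool:
    """Accept the speaker when the score meets the context's threshold."""
    return score >= threshold


records = [
    UtteranceRecord(0.92, "quiet"), UtteranceRecord(0.88, "quiet"),
    UtteranceRecord(0.95, "quiet"), UtteranceRecord(0.61, "noisy"),
    UtteranceRecord(0.70, "noisy"), UtteranceRecord(0.66, "noisy"),
]
quiet_threshold = select_threshold(records, "quiet")
noisy_threshold = select_threshold(records, "noisy")
```

With this sample data, a score of 0.65 would pass verification in the "noisy" context but fail in the "quiet" one, which is the point of keeping a separate threshold per environmental context.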

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method comprising: receiving, by a computing device that uses voice-based speaker identification, data identifying an utterance previously received by the computing device and data indicating that a user likely did speak the utterance; prompting the user to confirm that the user did speak the utterance; receiving, from the user, data indicating that the user has confirmed that the user did speak the utterance; and in response to receiving the data indicating that the user has confirmed that the user did speak the utterance, using audio data corresponding to the utterance previously received by the computing device to perform voice-based speaker identification on a subsequently received utterance that has a shared characteristic with the utterance previously received, wherein the shared characteristic is (i) an amount of background noise within a same background noise range, (ii) an amount of loudness within a same loudness range, or (iii) a signal-to-noise ratio within a same signal-to-noise ratio range.
2. The method of claim 1, comprising: recognizing an identity of the user using a technique other than voice-based speaker identification.
3. The method of claim 2, wherein recognizing the identity of the user using the technique other than voice-based speaker identification comprises prompting the user for a passcode.
4. The method of claim 1, wherein the utterance previously received by the computing device and the subsequently received utterance each include a predefined hotword.
5. The method of claim 1, wherein the amount of background noise is measured prior to receipt of the previously received utterance and the subsequently received utterance.
6. The method of claim 1, wherein prompting the user to confirm that the user did speak the utterance comprises: providing, for display, data indicating a date and time when the utterance was received.
7. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: receiving, by a computing device that uses voice-based speaker identification, data identifying an utterance previously received by the computing device and data indicating that a user likely did speak the utterance; prompting the user to confirm that the user did speak the utterance; receiving, from the user, data indicating that the user has confirmed that the user did speak the utterance; and in response to receiving the data indicating that the user has confirmed that the user did speak the utterance, using audio data corresponding to the utterance previously received by the computing device to perform voice-based speaker identification on a subsequently received utterance that has a shared characteristic with the utterance previously received, wherein the shared characteristic is (i) an amount of background noise within a same background noise range, (ii) an amount of loudness within a same loudness range, or (iii) a signal-to-noise ratio within a same signal-to-noise ratio range.
8. The system of claim 7, wherein the operations further comprise: recognizing an identity of the user using a technique other than voice-based speaker identification.
9. The system of claim 8, wherein recognizing the identity of the user using the technique other than voice-based speaker identification comprises prompting the user for a passcode.
10. The system of claim 7, wherein the utterance previously received by the computing device and the subsequently received utterance each include a predefined hotword.
11. The system of claim 7, wherein the amount of background noise is measured prior to receipt of the previously received utterance and the subsequently received utterance.
12. The system of claim 7, wherein prompting the user to confirm that the user did speak the utterance comprises: providing, for display, data indicating a date and time when the utterance was received.
13. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising: receiving, by a computing device that uses voice-based speaker identification, data identifying an utterance previously received by the computing device and data indicating that a user likely did speak the utterance; prompting the user to confirm that the user did speak the utterance; receiving, from the user, data indicating that the user has confirmed that the user did speak the utterance; and in response to receiving the data indicating that the user has confirmed that the user did speak the utterance, using audio data corresponding to the utterance previously received by the computing device to perform voice-based speaker identification on a subsequently received utterance that has a shared characteristic with the utterance previously received, wherein the shared characteristic is (i) an amount of background noise within a same background noise range, (ii) an amount of loudness within a same loudness range, or (iii) a signal-to-noise ratio within a same signal-to-noise ratio range.
14. The medium of claim 13, wherein the utterance previously received by the computing device and the subsequently received utterance each include a predefined hotword.
15. The medium of claim 13, wherein the operations further comprise: recognizing an identity of the user using a technique other than voice-based speaker identification.
16. The medium of claim 15, wherein recognizing the identity of the user using the technique other than voice-based speaker identification comprises prompting the user for a passcode.
17. The medium of claim 13, wherein the amount of background noise is measured prior to receipt of the previously received utterance and the subsequently received utterance.
18. The medium of claim 13, wherein prompting the user to confirm that the user did speak the utterance comprises: providing, for display, data indicating a date and time when the utterance was received.
19. A computer-implemented method comprising: receiving, by a computing device that uses voice-based speaker identification, data identifying an utterance previously received by the computing device and data indicating that a user likely did speak the utterance; providing, for display, a prompt for the user to confirm that the user did speak the utterance, the prompt indicating a date and time that the utterance was received; receiving, from the user, data indicating that the user has confirmed that the user did speak the utterance; and in response to receiving the data indicating that the user has confirmed that the user did speak the utterance, using audio data corresponding to the utterance previously received by the computing device to perform voice-based speaker identification on a subsequently received utterance.
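
The "shared characteristic" limitation in the independent claims can be read as a bucketing step: a confirmed utterance is reused only for later utterances that fall in the same range. The Python sketch below illustrates that reading for the signal-to-noise option only. The class name, method names, and the fixed 10 dB range width are hypothetical choices made for illustration; the claims do not specify how ranges are defined.

```python
from collections import defaultdict
from typing import DefaultDict, List

SNR_BUCKET_DB = 10.0  # assumed width of one signal-to-noise range, in dB


def snr_bucket(snr_db: float) -> int:
    """Map an SNR value to the index of its range (fixed-width, assumed)."""
    return int(snr_db // SNR_BUCKET_DB)


class ConfirmedUtteranceStore:
    """Holds audio the user has confirmed speaking, grouped by SNR range."""

    def __init__(self) -> None:
        self._by_bucket: DefaultDict[int, List[bytes]] = defaultdict(list)

    def add_if_confirmed(self, audio: bytes, snr_db: float,
                         user_confirmed: bool) -> bool:
        """Store the utterance only after the user confirms speaking it."""
        if not user_confirmed:
            return False
        self._by_bucket[snr_bucket(snr_db)].append(audio)
        return True

    def references_for(self, snr_db: float) -> List[bytes]:
        """Return confirmed audio whose SNR range matches a new utterance."""
        return list(self._by_bucket.get(snr_bucket(snr_db), []))


store = ConfirmedUtteranceStore()
store.add_if_confirmed(b"utt-1", 23.0, True)
store.add_if_confirmed(b"utt-2", 27.5, True)
store.add_if_confirmed(b"utt-3", 8.0, False)   # not confirmed, so not stored
store.add_if_confirmed(b"utt-4", 41.0, True)
refs = store.references_for(25.0)  # same 20-30 dB range as utt-1 and utt-2
```

The same structure would apply to the background-noise or loudness options by swapping the bucketing function; a real speaker-identification model would then compare the new utterance against the returned references.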

Assignees

Google LLC

Inventors

Classifications

G06F3/16 · Primary
Sound input; Sound output (speech processing G10L) · CPC title
G10L17/24
the user being prompted to utter a password or a predefined phrase · CPC title
G06F3/167
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
G10L17/08
Use of distortion metrics or a particular distance between probe pattern and reference templates · CPC title
G10L17/06
Decision making techniques; Pattern matching strategies · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 54870212

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9972323B2 cover?: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a dynamic threshold for speaker verification are disclosed. In one aspect, a method includes the actions of receiving, for each of multiple utterances of a hotword, a data set including at least a speaker verification confidence score, and environmental context data. The actions further includ…
Who is the assignee on this patent?: Google LLC

Related patents

Performing biometrics in uncontrolled environments

Multiple speech locale-specific hotword classifiers for selection of a speech locale

Controller for voice-controlled device and associated method
