Securely Executing Voice Actions Using Contextual Signals

US2017358317A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2017358317-A1
Application number	US-201615178895-A
Country	US
Kind code	A1
Filing date	Jun 10, 2016
Priority date	Jun 10, 2016
Publication date	Dec 14, 2017
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In some implementations, (i) audio data representing a voice command spoken by a speaker and (ii) a speaker identification result indicating that the voice command was spoken by the speaker are obtained. A voice action is selected based at least on a transcription of the audio data. A service provider corresponding to the selected voice action is selected from among a plurality of different service providers. One or more input data types that the selected service provider uses to perform authentication for the selected voice action are identified. A request to perform the selected voice action and (i) one or more values that correspond to the identified one or more input data types are provided to the service provider.
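
The flow in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation disclosed in the publication: the action registry, the provider names (`example_bank`, `example_music`), the signal names, and the `ProviderRequest` type are all hypothetical, and speaker identification is assumed to have happened already.

```python
from dataclasses import dataclass

# Hypothetical registry: trigger terms and the provider for each voice action.
VOICE_ACTIONS = {
    "transfer_money": {"terms": {"transfer", "send"}, "provider": "example_bank"},
    "play_music": {"terms": {"play"}, "provider": "example_music"},
}

# Hypothetical input data types each provider uses for authentication.
PROVIDER_AUTH_INPUTS = {
    "example_bank": ["on_body_since_unlock", "in_geofence"],
    "example_music": [],
}


@dataclass
class ProviderRequest:
    provider: str
    action: str
    auth_values: dict


def select_voice_action(transcription):
    """Pick the first action whose trigger terms appear in the transcription."""
    words = set(transcription.lower().split())
    for name, spec in VOICE_ACTIONS.items():
        if words & spec["terms"]:
            return name
    return None


def build_request(transcription, signals):
    """Select action and provider, then attach only the values the provider uses."""
    action = select_voice_action(transcription)
    if action is None:
        return None
    provider = VOICE_ACTIONS[action]["provider"]
    needed = PROVIDER_AUTH_INPUTS[provider]
    values = {name: signals[name] for name in needed if name in signals}
    return ProviderRequest(provider, action, values)


if __name__ == "__main__":
    signals = {"on_body_since_unlock": True, "in_geofence": False, "face_in_view": True}
    print(build_request("Transfer ten dollars to mom", signals))
```

In this sketch the `face_in_view` signal is left out of the request because the hypothetical bank provider does not list it as an input data type.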

First claim

Opening claim text (preview).

1. A method performed by one or more computers of a server system, the method comprising: obtaining, by the one or more computers of the server system, (i) audio data representing a voice command spoken by a speaker and (ii) a speaker identifier for the speaker that spoke the voice command; receiving, by the one or more computers of the server system, context data from a client device of the speaker, the context data indicating multiple context data items each indicating a different aspect of a current context of the client device, wherein at least one of the context data items is indicative of a location of the client device; selecting, by the one or more computers of the server system, a voice action based at least on a transcription of the audio data; selecting, by the one or more computers of the server system, a service provider corresponding to the selected voice action from among a plurality of different service providers; identifying, by the one or more computers of the server system, one or more input data types that the selected service provider uses to perform authentication for the selected voice action, wherein the one or more input data types for authentication are different from the speaker identifier; selecting, by the one or more computers of the server system, a subset of the multiple context data items that correspond to the identified one or more input data types; and providing, to the service provider by the one or more computers of the server system, (i) a request to perform the selected voice action and (ii) the selected subset of the multiple context data items that correspond to the identified one or more input data types.
2. The method of claim 1, wherein obtaining (i) audio data representing a voice command spoken by a speaker and (ii) a speaker identifier indicating that the voice command was spoken by the speaker comprises: obtaining the audio data representing the voice command spoken by the speaker; obtaining a voiceprint for the speaker; determining that the voiceprint for the speaker matches the audio data representing the voice command spoken by the speaker; and in response to determining that the voiceprint for the speaker matches the audio data representing the voice command spoken by the speaker, generating the speaker identifier for the speaker that spoke the voice command.
3. The method of claim 1, wherein selecting a voice action based at least on a transcription of the audio data comprises: obtaining a set of voice actions, wherein each voice action identifies one or more terms that correspond to that voice action; determining that one or more terms in the transcription match the one or more terms that correspond to the voice action; and in response to determining that the one or more terms in the transcription match the one or more terms that correspond to the voice action, selecting the voice action from among the set of voice actions.
4. The method of claim 1, wherein selecting a service provider corresponding to the selected voice action from among a plurality of different service providers comprises: obtaining a mapping of voice actions to the plurality of service providers, where for each voice action the mapping describes a service provider that can perform the voice action; determining that the mapping of voice actions indicates that the service provider can perform the selected voice action; and in response to determining that the mapping of voice actions indicates that the service provider can perform the selected voice action, selecting the service provider.
5. The method of claim 1, wherein identifying one or more input data types that the selected service provider uses to perform authentication for the selected voice action comprises: providing, to the selected service provider, a request for an identification of one or more input data types that the selected service provider uses to perform authentication for the selected voice action; receiving, from the selected service provider, a response to the request for the identification; and identifying the one or more input data types that the selected service provider uses to perform authentication for the selected voice action from the response to the request for the identification.
6. The method of claim 1, comprising: generating the transcription of the audio data using an automated speech recognizer.
7. The method of claim 1, comprising: receiving, from the service provider, an indication that the selected voice action has been performed.
8. The method of claim 1, comprising: receiving, from the service provider, an indication that additional authentication is needed to perform the selected voice action; and in response to receiving, from the service provider, the indication that additional authentication is needed to perform the selected voice action, providing a request for additional authentication.
9. The method of claim 1, wherein identifying one or more input data types that the selected service provider uses to perform authentication for the selected voice action comprises: identifying that the selected service provider uses one or more of an input data type that indicates whether the speaker's mobile computing device has been on a body since the mobile computing device was last unlocked, an input data type that indicates whether a speaker's mobile computing device is in short-range communication with a particular device, an input data type that indicates whether a speaker's mobile computing device is within a particular geographic area, or an input data type that indicates whether a speaker's face is in a view of a device.
10. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: obtaining (i) audio data representing a voice command spoken by a speaker and (ii) a speaker identifier for the speaker that spoke the voice command; receiving context data from a client device of the speaker, the context data indicating multiple context data items each indicating a different aspect of a current context of the client device, wherein at least one of the context data items is indicative of a location of the client device; selecting a voice action based at least on a transcription of the audio data; selecting a service provider corresponding to the selected voice action from among a plurality of different service providers; identifying one or more input data types that the selected service provider uses to perform authentication for the selected voice action, wherein the one or more input data types for authentication are different from the speaker identifier; selecting a subset of the multiple context data items that correspond to the identified one or more input data types; and providing, to the service provider, (i) a request to perform the selected voice action and (ii) the selected subset of the multiple context data items that correspond to the identified one or more input data types.
11. The system of claim 10, wherein obtaining (i) audio data representing a voice command spoken by a speaker and (ii) a speaker identifier for the speaker that spoke the voice command comprises: obtaining the audio data representing the voice command spoken by the speaker; obtaining a voiceprint for the speaker; determining that the voiceprint for the speaker matches the audio data representing the voice command spoken by the speaker; and in response to determining that the voiceprint for the speaker matches the audio data representing the voice command spoken by
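
As a rough illustration of how claims 1, 5, and 8 fit together, the following Python sketch asks a provider which input data types it uses, forwards only the matching context data items, and reacts to a request for additional authentication. Everything named here (`StubProvider`, its methods, the status strings, the context keys) is a hypothetical stand-in chosen for the example; the claims do not prescribe any of it.

```python
class StubProvider:
    """Hypothetical service provider used only for illustration."""

    # Assumed mapping from voice action to the input data types used for authentication.
    REQUIRED = {"transfer_money": ["in_geofence", "on_body_since_unlock"]}

    def required_inputs(self, action):
        # Claim 5: the provider answers a request for its input data types.
        return list(self.REQUIRED.get(action, []))

    def perform(self, action, context_subset):
        # Assumed policy: every required signal must be present and True.
        required = self.required_inputs(action)
        if all(context_subset.get(name) is True for name in required):
            return {"status": "performed"}
        return {"status": "additional_auth_needed"}


def select_context_subset(context_items, required_types):
    """Keep only the context data items matching the identified input data types."""
    return {name: context_items[name] for name in required_types if name in context_items}


def execute(provider, action, context_items):
    """Send the action plus the selected subset, then handle the provider's reply."""
    required = provider.required_inputs(action)
    subset = select_context_subset(context_items, required)
    response = provider.perform(action, subset)
    if response["status"] == "additional_auth_needed":
        # Claim 8: respond by requesting additional authentication.
        return "request additional authentication"
    return "done"


if __name__ == "__main__":
    context = {"in_geofence": True, "on_body_since_unlock": True, "battery_level": 80}
    print(execute(StubProvider(), "transfer_money", context))
```

Note that `battery_level` never reaches the provider in this sketch, since it does not correspond to any input data type the provider reported.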

Assignees

Google Inc

Inventors

James Barnaby John

Classifications

G06F3/167
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
G07C9/37
using biometric data, e.g. fingerprints, iris scans or voice recognition · CPC title
G10L15/22
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
H04L63/0861
using biometrical features, e.g. fingerprint, retina-scan (cryptographic mechanisms or cryptographic arrangements for entity authentication using biological data H04L9/3231) · CPC title
G10L17/00
Speaker identification or verification techniques · CPC title

Patent family

Related publications grouped by family.

View patent family 57543236

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2017358317A1 cover?: In some implementations, (i) audio data representing a voice command spoken by a speaker and (ii) a speaker identification result indicating that the voice command was spoken by the speaker are obtained. A voice action is selected based at least on a transcription of the audio data. A service provider corresponding to the selected voice action is selected from among a plurality of different ser…
Who is the assignee on this patent?: Google Inc
What technology area does this patent fall under?: Primary CPC classification G10L15/30. Mapped technology areas include Physics.
When was this patent published?: Publication date Dec 14, 2017 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).


Related patents

Voice action biasing system

Systems and methods for voice-controlled account servicing

Developer voice actions system

Automatic speaker identification using speech recognition features
