What technology area does this patent fall under?

Primary CPC classification G10L15/16. Mapped technology areas include Physics.

When was this patent published?

Publication date Apr 4, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Sub-matrix input for neural network layers

US11620989B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11620989-B2
Application number	US-201916452959-A
Country	US
Kind code	B2
Filing date	Jun 26, 2019
Priority date	Jan 27, 2015
Publication date	Apr 4, 2023
Grant date	Apr 4, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes generating, by a speech recognition system, a matrix from a predetermined quantity of vectors that each represent input for a layer of a neural network, generating a plurality of sub-matrices from the matrix, using, for each of the sub-matrices, the respective sub-matrix as input to a node in the layer of the neural network to determine whether an utterance encoded in an audio signal comprises a keyword for which the neural network is trained.
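The matrix and sub-matrix steps in the abstract can be pictured with a minimal Python sketch. The abstract does not say how the sub-matrices are cut from the matrix, so the contiguous, non-overlapping row blocks used here are an assumption (the non-overlap wording is borrowed from claim 1). The helper names `stack_frames` and `split_rows` and the sample numbers are hypothetical and do not come from the patent.

```python
from typing import List

Matrix = List[List[float]]


def stack_frames(frames: List[List[float]]) -> Matrix:
    """Build a matrix whose columns are the input vectors (one per frame)."""
    n_features = len(frames[0])
    if any(len(frame) != n_features for frame in frames):
        raise ValueError("all frame vectors must have the same length")
    # Row i holds feature i across all frames.
    return [[frame[i] for frame in frames] for i in range(n_features)]


def split_rows(matrix: Matrix, rows_per_block: int) -> List[Matrix]:
    """Split a matrix into contiguous, non-overlapping blocks of rows.

    Assumption: row blocks stand in for the patent's sub-matrices.
    """
    if rows_per_block <= 0 or len(matrix) % rows_per_block != 0:
        raise ValueError("row count must be a multiple of rows_per_block")
    return [
        matrix[start:start + rows_per_block]
        for start in range(0, len(matrix), rows_per_block)
    ]


# Three made-up frame vectors with four features each.
frames = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
]
matrix = stack_frames(frames)                 # 4 rows (features) x 3 columns (frames)
sub_matrices = split_rows(matrix, rows_per_block=2)
```

Each element of `sub_matrices` would then be handed to a different node of the layer; how a node consumes its sub-matrix is sketched separately under the first claim.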

First claim

Opening claim text (preview).

What is claimed is:
1. A method performed by one or more computing devices, the method comprising: obtaining, by the one or more computing devices, a set of input values indicating acoustic characteristics of an utterance; receiving, by the one or more computing devices, the set of input values as input to a first layer of a neural network, the first layer of the neural network having nodes, each node of the first layer comprising a corresponding set of weights that is different than the corresponding set of weights of each other node of the first layer, and each node of the first layer is configured to receive, as input, a different respective subset of the set of input values, wherein the different respective subsets are non-overlapping; for each respective node of the first layer, generating, by the one or more computing devices, as output, a corresponding initial output value by applying the corresponding set of weights of the respective node to the respective subset of the set of input values; receiving, by the one or more computing devices, each of the initial output values as input to a second layer of the neural network, the second layer of the neural network having nodes, each node of the second layer is configured to receive, as input, a subset of the initial output values and generate, as output, a corresponding final output value; and determining, by the one or more computing devices, whether the utterance includes a keyword based on each of the final output values.
2. The method of claim 1, wherein generating the corresponding initial output value comprises, for each respective node of the first layer, applying a different function to the respective subset of the set of input values.
3. The method of claim 1, wherein one or more of the nodes of the first layer are configured to each receive a respective subset of the set of input values that are localized.
4. The method of claim 1, wherein one or more of the nodes of the first layer are configured to each receive a respective subset of the set of input values that are localized in frequency.
5. The method of claim 1, wherein determining whether the utterance includes the keyword based on each of the final output values comprises determining whether the utterance includes the keyword from among a set of predetermined keywords that are each designated as a signal that a mobile device should activate.
6. The method of claim 1, wherein determining whether the utterance includes the keyword based on each of the final output values comprises determining whether the utterance contains the keyword spoken by a particular user.
7. The method of claim 1, wherein the neural network is trained to determine whether the utterance includes the keyword.
8. The method of claim 1, wherein each one of the final output values comprises a posterior probability score.
9. The method of claim 1, wherein the set of input values comprises audio features derived from audio data of the utterance.
10. The method of claim 1, wherein the first layer of the neural network comprises a first hidden layer of the neural network.
11. The method of claim 1, wherein each node of the second layer corresponds to at least one node of the first layer.
12. A device comprising: one or more hardware processors and one or more data storage devices, the one or more hardware processors and the one or more data storage devices being configured to implement a keyword detection function by causing the device to perform operations comprising: obtaining a set of input values indicating acoustic characteristics of an utterance; receiving the set of input values as input to a first layer of a neural network, the first layer of the neural network having nodes, each node of the first layer comprising a corresponding set of weights that is different than the corresponding set of weights of each other node of the first layer, and each node of the first layer is configured to receive, as input, a different respective subset of the set of input values, wherein the different respective subsets are non-overlapping; for each respective node of the first layer, generating, as output, a corresponding initial output value by applying the corresponding set of weights of the respective node to the respective subset of the set of input values; receiving each of the initial output values as input to a second layer of the neural network, the second layer of the neural network having nodes, each node of the second layer is configured to receive, as input, a subset of the initial output values and generate, as output, a corresponding final output value; and determining whether the utterance includes a keyword based on each of the final output values.
13. The device of claim 12, wherein generating the corresponding initial output value comprises, for each respective node of the first layer, applying a different function to the respective subset of the set of input values.
14. One or more non-transitory data storage devices storing instructions that, when executed by one or more processing devices, cause the one or more processing devices to perform operations comprising: obtaining, by the one or more processing devices, a set of input values indicating acoustic characteristics of an utterance; receiving, by the one or more processing devices, the set of input values as input to a first layer of a neural network, the first layer of the neural network having nodes, each node of the first layer comprising a corresponding set of weights that is different than the corresponding set of weights of each other node of the first layer, and each node of the first layer is configured to receive, as input, a different respective subset of the set of input values, wherein the different respective subsets are non-overlapping; for each respective node of the first layer, generating, by the one or more processing devices, as output, a corresponding initial output value by applying the corresponding set of weights of the respective node to the respective subset of the set of input values; receiving, by the one or more processing devices, each of the initial output values as input to a second layer of the neural network, the second layer of the neural network having nodes, each node of the second layer is configured to receive, as input, a subset of the initial output values and generate, as output, a corresponding final output value; and determining, by the one or more processing devices, whether the utterance includes a keyword based on each of the final output values.
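The data flow recited in claim 1 (disjoint input subsets feeding first-layer nodes with their own weights, a second layer reading subsets of the initial outputs, then a keyword decision) can be illustrated with a small Python sketch. The claims do not specify an activation function or a decision rule, so the sigmoid and the "highest score meets a threshold" rule below are assumptions made only for illustration. All function names, weights, and input values are hypothetical.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def check_disjoint(subsets: Sequence[Sequence[int]]) -> None:
    """Raise ValueError if any input index is assigned to more than one node."""
    seen = set()
    for indices in subsets:
        if seen.intersection(indices):
            raise ValueError("first-layer subsets must not overlap")
        seen.update(indices)


def apply_layer(
    values: Sequence[float],
    subsets: Sequence[Sequence[int]],
    weights: Sequence[Sequence[float]],
) -> List[float]:
    """Each node applies its own weights to the values at its own indices."""
    outputs = []
    for indices, node_weights in zip(subsets, weights):
        if len(indices) != len(node_weights):
            raise ValueError("one weight per input index is required")
        total = sum(w * values[i] for w, i in zip(node_weights, indices))
        outputs.append(sigmoid(total))  # assumed activation, not from the claims
    return outputs


def includes_keyword(final_outputs: Sequence[float], threshold: float = 0.5) -> bool:
    """Assumed decision rule: keyword present if the best score meets the threshold."""
    return max(final_outputs) >= threshold


# Made-up acoustic input values and weights.
inputs = [0.2, 0.4, 0.6, 0.8]
first_subsets = [[0, 1], [2, 3]]            # non-overlapping, one subset per node
first_weights = [[1.0, -1.0], [0.5, 0.5]]   # a different weight set per node
second_subsets = [[0, 1]]                   # one node reading both initial outputs
second_weights = [[2.0, 2.0]]

check_disjoint(first_subsets)
initial_outputs = apply_layer(inputs, first_subsets, first_weights)
final_outputs = apply_layer(initial_outputs, second_subsets, second_weights)
detected = includes_keyword(final_outputs)
```

Because each first-layer node only touches its own slice of the input, the number of weights grows with the subset size rather than with the full input size, which is one plausible reading of why the claims stress non-overlapping subsets.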

Assignees

Google LLC

Inventors

Classifications

G06N3/09
Supervised learning · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06N3/0495
Quantised networks; Sparse networks; Compressed networks · CPC title
G10L17/18
Artificial neural networks; Connectionist approaches · CPC title
G10L2015/088
Word spotting · CPC title

Patent family

Related publications grouped by family.

View patent family 56432691

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11620989B2 cover?: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes generating, by a speech recognition system, a matrix from a predetermined quantity of vectors that each represent input for a layer of a neural network, generating a plurality of sub-matrices from the matrix, using, for each of the sub-matric…
Who is the assignee on this patent?: Google LLC

Related patents

Speaker recognition using neural networks

Sectioned memory networks for online word-spotting in continuous speech

Cluster specific speech model

Speaker verification using neural networks

Speaker identification

Method and Apparatus for Using Convolutional Neural Networks in Speech Recognition
