What technology area does this patent fall under?

Primary CPC classification G06N3/084. Mapped technology areas include Physics.

When was this patent published?

Publication date: Aug 23, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).

On-device projection neural networks for natural language understanding

US11423233B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11423233-B2
Application number	US-202117141473-A
Country	US
Kind code	B2
Filing date	Jan 5, 2021
Priority date	Aug 2, 2018
Publication date	Aug 23, 2022
Grant date	Aug 23, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure provides projection neural networks and example applications thereof. In particular, the present disclosure provides a number of different architectures for projection neural networks, including two example architectures which can be referred to as: Self-Governing Neural Networks (SGNNs) and Projection Sequence Networks (ProSeqoNets). Each projection neural network can include one or more projection layers that project an input into a different space. For example, each projection layer can use a set of projection functions to project the input into a bit-space, thereby greatly reducing the dimensionality of the input and enabling computation with lower resource usage. As such, the projection neural networks provided herein are highly useful for on-device inference in resource-constrained devices. For example, the provided SGNN and ProSeqoNet architectures are particularly beneficial for on-device inference such as, for example, solving natural language understanding tasks on-device.
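The abstract's idea of projecting an input into a bit-space can be illustrated with a small sketch. This is not the patent's implementation: it assumes random-hyperplane (sign-of-dot-product) projections, and the names `make_projection_vectors`, `project_to_bits`, and `hamming`, as well as the sizes 256 and 32, are hypothetical choices for illustration. It shows a 256-dimensional float vector reduced to 32 bits, with similar inputs tending to receive similar bit patterns.

```python
import random


def make_projection_vectors(num_bits, dim, seed):
    """Draw one random Gaussian vector per output bit (assumed construction)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def project_to_bits(x, projection_vectors):
    """Map a float vector to bits: 1 if the dot product is positive, else 0."""
    bits = []
    for v in projection_vectors:
        dot = sum(xi * vi for xi, vi in zip(x, v))
        bits.append(1 if dot > 0 else 0)
    return bits


def hamming(a, b):
    """Count the positions where two bit lists differ."""
    return sum(ai != bi for ai, bi in zip(a, b))


dim, num_bits = 256, 32  # illustrative sizes, not taken from the patent
vectors = make_projection_vectors(num_bits, dim, seed=0)

rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
x_near = [xi + 0.01 * rng.gauss(0.0, 1.0) for xi in x]  # slightly perturbed copy
x_far = [-xi for xi in x]                               # opposite direction

bits_x = project_to_bits(x, vectors)
bits_near = project_to_bits(x_near, vectors)
bits_far = project_to_bits(x_far, vectors)

print("bits:", "".join(map(str, bits_x)))
print("distance to near input:", hamming(bits_x, bits_near))
print("distance to far input:", hamming(bits_x, bits_far))
```

Downstream layers then operate on the 32 bits rather than the 256 floats, which is the source of the lower resource usage the abstract describes.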

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to implement: a machine-learned multi-layered projection model configured to receive a projection model input and to generate a projection model output from the projection model input, the machine-learned multi-layered projection model comprising: a sequence of one or more projection layers, wherein each projection layer has a plurality of projection layer parameters, wherein each projection layer is configured to: receive a layer input; apply a plurality of projection layer functions to the layer input, each projection layer function generating a respective projection function output that projects the layer input to a different space, and generate a layer output by applying the projection layer parameters for the projection layer to the projection function outputs; and a sequence of one or more additional model layers configured to receive a layer output generated by a highest projection layer in the sequence of one or more projection layers and to generate one or more additional model layer outputs; wherein execution of the instructions causes the one or more computers to perform operations comprising: obtaining the projection model input; inputting the projection model input into the machine-learned multi-layered projection model; and receiving the projection model output generated by the machine-learned multi-layered projection model.

2. The system of claim 1, wherein the machine-learned multi-layered projection model further comprises: an output layer configured to receive the additional model layer output generated by a highest additional model layer in the sequence of one or more additional model layers and to generate the projection model output.

3. The system of claim 1, wherein the sequence of one or more projection layers comprises a plurality of projection layers one after the other.

4. The system of claim 1, wherein the sequence of one or more additional model layers comprises one or more neural model layers.

5. The system of claim 1, wherein the sequence of one or more additional model layers comprises one or more additional projection layers.

6. The system of claim 1, wherein the sequence of one or more additional model layers comprises one or more projection sequence layers.

7. The system of claim 1, wherein the machine-learned multi-layered projection model is trained to perform on-device text classification.

8. The system of claim 7, wherein execution of the instructions by the one or more computers causes the one or more computers to: receive an input text; convert the input text into an intermediate feature vector; and input the intermediate feature vector as the projection model input to the machine-learned multi-layered projection model.

9. The system of claim 8, wherein the intermediate feature vector comprises one or more of the following intermediate features that have been generated from or associated with the input text: skip-grams; n-grams; part of speech tags; dependency relationships; knowledge graph information; or contextual information.

10. The system of claim 1, wherein, for each projection layer, the plurality of projection layer functions are precomputed and held static.

11. The system of claim 1, wherein, for each projection layer, the plurality of projection layer functions are modeled using locality sensitive hashing.

12. The system of claim 1, wherein execution of the instructions by the one or more computers causes the one or more computers to: dynamically compute the plurality of projection layer functions at inference time using one or more seeds.

13. The system of claim 1, wherein the machine-learned multi-layered projection model comprises a self-governing neural model that performs natural language processing without initializing, loading, or storing any feature or vocabulary weight matrices.

14. The system of claim 1, wherein the machine-learned multi-layered projection model has been trained based solely on its own performance relative to training data.

15. The system of claim 1, wherein, for each projection layer, each projection function is associated with a respective set of projection vectors, and wherein applying each projection function to the layer input comprises: for each projection vector: determining a dot product between the layer input and the projection vector; when the dot product is negative, assigning a first value to a corresponding position in the projection function output; and when the dot product is positive, assigning a second value to the corresponding position in the projection function output.

16. The system of claim 1, wherein, for each projection layer, the projection functions are each encoded as sparse matrices and are used to generate a binary representation from the layer input.

17. The system of claim 1, wherein the projection layer parameters include a parameter matrix and a bias vector, and wherein generating the layer output by applying the projection layer parameters for the projection layer to the projection function outputs comprises: applying the parameter matrix to the projection function outputs and then adding the bias vector.

18. A system comprising one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to implement: a projection sequence model configured to receive an input and to generate an output from the input, the projection sequence model comprising: a sequence of one or more projection layers, wherein each projection layer has a plurality of projection layer parameters, wherein each projection layer is configured to: receive a layer input; apply a plurality of projection layer functions to the layer input, each projection layer function generating a respective projection function output that projects the layer input to a different space, and generate a layer output by applying the projection layer parameters for the projection layer to the projection function outputs; and one or more projection sequence layers positioned after the sequence of one or more projection layers, wherein each of the one or more projection sequence layers is configured to provide first internal state data to a subsequent iteration of such projection sequence layer in a subsequent iteration of the projection sequence model and to receive second internal state data from the subsequent iteration of such projection sequence layer in the subsequent iteration of the projection sequence model; wherein execution of the instructions causes the one or more computers to perform operations comprising: obtaining the input; inputting the input into the projection sequence model; and receiving the output generated by the projection sequence model.

19. The system of claim 18, wherein the input comprises text and the output comprises segments of the text that have been identified and classified into respective ones of a plurality of classes by the projection sequence model.

20. The system of claim 18, wherein each projection sequence layer comprises: a first set of nodes configured to provide the first internal state data of the first set of nodes to a subsequent iteration of the first set of nodes in the subsequent iteration of such projection sequence layer in the subsequent iteration of t
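Several dependent claims describe concrete steps that can be sketched together: converting text into n-gram features (claims 8 and 9), computing projection functions at inference time from seeds (claim 12), the sign-of-dot-product rule (claim 15), and applying a parameter matrix followed by a bias vector (claim 17). The following is a minimal sketch under stated assumptions, not the claimed system: hashing n-grams into a fixed-size vector with MD5, Gaussian projection vectors, the treatment of a zero dot product as the first value, and all function names and sizes are hypothetical. The weights are untrained placeholders.

```python
import hashlib
import random


def text_to_features(text, dim=64, n=3):
    """Hash character n-grams into a fixed-size count vector.

    Assumption: hashing stands in for a vocabulary lookup, so no
    vocabulary table has to be stored.
    """
    features = [0.0] * dim
    padded = f" {text.lower()} "
    for i in range(len(padded) - n + 1):
        gram = padded[i:i + n]
        digest = hashlib.md5(gram.encode("utf-8")).digest()
        features[int.from_bytes(digest[:4], "big") % dim] += 1.0
    return features


def projection_function(x, seed, num_bits):
    """Regenerate projection vectors from a seed and apply the sign rule.

    Nothing is stored between calls: the same seed reproduces the same
    projection vectors each time.
    """
    rng = random.Random(seed)
    bits = []
    for _ in range(num_bits):
        dot = sum(xi * rng.gauss(0.0, 1.0) for xi in x)
        bits.append(1.0 if dot > 0 else 0.0)  # zero maps to 0.0 by assumption
    return bits


def projection_layer(x, seeds, num_bits, weights, bias):
    """Concatenate all projection outputs, then compute W @ bits + b."""
    bits = []
    for seed in seeds:
        bits.extend(projection_function(x, seed, num_bits))
    return [sum(w * b for w, b in zip(row, bits)) + b_k
            for row, b_k in zip(weights, bias)]


seeds, num_bits, num_classes = [11, 22, 33], 8, 2  # illustrative values
init = random.Random(0)
weights = [[init.uniform(-0.1, 0.1) for _ in range(len(seeds) * num_bits)]
           for _ in range(num_classes)]            # untrained placeholder
bias = [0.0] * num_classes

features = text_to_features("turn on the lights")
logits = projection_layer(features, seeds, num_bits, weights, bias)
print(logits)
```

In this sketch the only stored parameters are the small matrix `weights` and the vector `bias`; the projection vectors are recomputed from the seeds on every call, trading extra computation for a smaller memory footprint.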

Assignees

Google LLC

Inventors

Classifications

CPC code	CPC title
G06N7/01	Probabilistic graphical models, e.g. probabilistic networks
G06N3/045	Combinations of networks
G06N3/044	Recurrent networks, e.g. Hopfield networks
G06N3/0442	Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
G06N3/09	Supervised learning

Patent family

Related publications grouped by family.

View patent family 69228678

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11423233B2 cover?: The present disclosure provides projection neural networks and example applications thereof. In particular, the present disclosure provides a number of different architectures for projection neural networks, including two example architectures which can be referred to as: Self-Governing Neural Networks (SGNNs) and Projection Sequence Networks (ProSeqoNets). Each projection neural network can in…
Who is the assignee on this patent?: Google LLC