Implicit bridging of machine learning tasks

US10713593B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10713593-B2
Application number	US-201615394708-A
Country	US
Kind code	B2
Filing date	Dec 29, 2016
Priority date	Nov 4, 2016
Publication date	Jul 14, 2020
Grant date	Jul 14, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on computer storage media for performing machine learning tasks. One method includes receiving (i) a model input, and (ii) data identifying a first machine learning task to be performed on the model input to generate a first type of model output for the model input; augmenting the model input with an identifier for the first machine learning task to generate an augmented model input; and processing the augmented model input using a machine learning model, wherein the machine learning model has been trained on training data to perform a plurality of machine learning tasks including the first machine learning task, and wherein the machine learning model has been configured through training to process the augmented model input to generate a machine learning model output of the first type for the model input.
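The augmentation step in the abstract can be illustrated with a minimal sketch. It assumes the model input is already a list of string tokens and that the identifier is a single tag token placed at the front, which is the form named in claim 2. The function name `augment_input` and the tag spellings `<2es>` and `<en2es>` are hypothetical choices for illustration and are not taken from the patent text.

```python
from typing import List, Optional


def augment_input(tokens: List[str], target_lang: str,
                  source_lang: Optional[str] = None) -> List[str]:
    """Prepend an identifier token naming the task (here, the target language).

    The tag spelling is an assumption made for this sketch; the patent only
    requires that the identifier identifies at least the target language.
    """
    if source_lang is not None:
        # Variant in which the identifier names both source and target.
        tag = f"<{source_lang}2{target_lang}>"
    else:
        tag = f"<2{target_lang}>"
    # Return a new list so the caller's tokens are left unmodified.
    return [tag] + list(tokens)


augmented = augment_input(["Hello", "world"], "es")
print(augmented)
```

The augmented sequence is then what a single multi-task model would consume; the same model serves every task, and only the identifier changes.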

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method comprising: receiving (i) a model input comprising text in a source language, and (ii) data identifying a target language that the text in the source language is to be translated into by the machine learning model; augmenting the model input with an identifier that identifies at least the target language to generate an augmented model input; and processing the augmented model input using a machine learning model to generate a model output that is a translation of the model input into the target language, wherein the machine learning model has been trained on training data to translate model inputs into a plurality of different languages including the target language, and wherein the machine learning model comprises: an encoder neural network; and a decoder neural network that is shared between the plurality of different languages and that is configured to generate outputs from a shared vocabulary that includes outputs from all of the plurality of different languages.
2. The method of claim 1, wherein augmenting the model input with an identifier comprises prepending a token identifier that identifies at least the target language to the model input.
3. The method of claim 1, wherein the training data comprises a plurality of paired datasets, wherein each of the paired datasets comprises an input dataset paired with an output dataset, and wherein the plurality of paired datasets does not include a pairing of datasets comprising an input dataset in the source language paired with an output dataset in the target language.
4. The method of claim 1, wherein the encoder neural network and the decoder neural network comprise respective recurrent neural networks.
5. The method of claim 1, wherein the machine learning model has been trained on the training data to translate model inputs in a first plurality of different languages including the source language into any of the plurality of different languages that include the target language.
6. The method of claim 5, wherein the identifier identifies both the source language and the target language.
7. The method of claim 5, wherein the encoder neural network is shared among the first plurality of different languages.
8. A system comprising one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: receiving (i) a model input comprising text in a source language, and (ii) data identifying a target language that the text in the source language is to be translated into by the machine learning model; augmenting the model input with an identifier that identifies at least the target language to generate an augmented model input; and processing the augmented model input using a machine learning model to generate a model output that is a translation of the model input into the target language, wherein the machine learning model has been trained on training data to translate model inputs into a plurality of different languages including the target language, and wherein the machine learning model comprises: an encoder neural network; and a decoder neural network that is shared between the plurality of different languages and that is configured to generate outputs from a shared vocabulary that includes outputs from all of the plurality of different languages.
9. The system of claim 8, wherein the encoder neural network and the decoder neural network comprise respective recurrent neural networks.
10. The system of claim 8, wherein the decoder neural network comprises an attention mechanism.
11. The system of claim 8, wherein the augmented model input comprises a model input with a prepended token identifier for at least the target language.
12. The system of claim 8, wherein the machine learning model has been trained on the training data to translate model inputs in a first plurality of different languages including the source language into any of the plurality of different languages that include the target language.
13. The system of claim 12, wherein the identifier identifies both the source language and the target language.
14. The system of claim 12, wherein the encoder neural network is shared among the first plurality of different languages.
15. One or more non-transitory computer-readable storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations comprising: receiving (i) a model input comprising text in a source language, and (ii) data identifying a target language that the text in the source language is to be translated into by the machine learning model; augmenting the model input with an identifier that identifies at least the target language to generate an augmented model input; and processing the augmented model input using a machine learning model to generate a model output that is a translation of the model input into the target language, wherein the machine learning model has been trained on training data to translate model inputs into a plurality of different languages including the target language, and wherein the machine learning model comprises: an encoder neural network; and a decoder neural network that is shared between the plurality of different languages and that is configured to generate outputs from a shared vocabulary that includes outputs from all of the plurality of different languages.
16. The computer-readable storage media of claim 15, wherein augmenting the model input with an identifier comprises prepending a token identifier that identifies at least the target language to the model input.
17. The computer-readable storage media of claim 15, wherein the training data comprises a plurality of paired datasets, wherein each of the paired datasets comprises an input dataset paired with an output dataset, and wherein the plurality of paired datasets does not include a pairing of datasets comprising an input dataset in the source language paired with an output dataset in the target language.
18. The computer-readable storage media of claim 15, wherein the machine learning model has been trained on the training data to translate model inputs in a first plurality of different languages including the source language into any of the plurality of different languages that include the target language.
19. The computer-readable storage media of claim 17, wherein the identifier identifies both the source language and the target language.
20. The computer-readable storage media of claim 17, wherein the encoder neural network is shared among the first plurality of different languages.
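Claims 3 and 17 describe training data whose paired datasets cover several language directions but omit the direction actually requested at inference time. The sketch below shows one way such data could be laid out, assuming each paired dataset is keyed by a (source, target) language pair and that the identifier is a prepended tag. The helper names, the `<2xx>` tag format, and the toy sentences are all hypothetical and only illustrate the idea.

```python
from typing import Dict, List, Tuple

LangPair = Tuple[str, str]


def build_training_examples(
    paired_datasets: Dict[LangPair, List[Tuple[str, str]]]
) -> List[Tuple[str, str]]:
    """Flatten paired datasets into (augmented input, output) examples.

    Each input is prefixed with a target-language tag (assumed format).
    """
    examples = []
    for (_src_lang, tgt_lang), pairs in paired_datasets.items():
        for src_text, tgt_text in pairs:
            examples.append((f"<2{tgt_lang}> {src_text}", tgt_text))
    return examples


def is_unseen_direction(
    paired_datasets: Dict[LangPair, List[Tuple[str, str]]],
    source_lang: str,
    target_lang: str,
) -> bool:
    """True when no paired dataset maps source_lang directly to target_lang."""
    return (source_lang, target_lang) not in paired_datasets


# Toy data: English->Spanish and Portuguese->English only.
paired = {
    ("en", "es"): [("Hello", "Hola")],
    ("pt", "en"): [("Olá", "Hello")],
}
examples = build_training_examples(paired)
print(examples)
print(is_unseen_direction(paired, "pt", "es"))
```

In this toy layout, Portuguese-to-Spanish never appears as a paired dataset, yet a request for it would still be expressed the same way, by tagging a Portuguese input with the Spanish identifier.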

Assignees

Google LLC

Inventors

Classifications

G06N3/045 (Primary)
Combinations of networks · CPC title
G06N3/044
Recurrent networks, e.g. Hopfield networks · CPC title
G06N3/0455 (Primary)
Auto-encoder networks; Encoder-decoder networks · CPC title
G06N3/0442
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
G06N3/09
Supervised learning · CPC title

Patent family

Related publications grouped by family.

View patent family 62063853

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10713593B2 cover? Methods, systems, and apparatus, including computer programs encoded on computer storage media for performing machine learning tasks. One method includes receiving (i) a model input, and (ii) data identifying a first machine learning task to be performed on the model input to generate a first type of model output for the model input; augmenting the model input with an identifier for the first m…
Who is the assignee on this patent? Google LLC
What technology area does this patent fall under? The primary CPC classification is G06N3/045. Mapped technology areas include Physics.
When was this patent published? The publication date is Jul 14, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb? We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Using meta-information in neural machine translation

Method and Device for Machine Translation

Multilingual prosody generation
