Computer-implemented generation and utilization of a universal encoder component

US10963644B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10963644-B2
Application number  US-201816234491-A
Country             US
Kind code           B2
Filing date         Dec 27, 2018
Priority date       Dec 27, 2018
Publication date    Mar 30, 2021
Grant date          Mar 30, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Computer-implemented techniques are described herein for generating and utilizing a universal encoder component (UEC). The UEC maps a linguistic expression in a natural language to a language-agnostic representation of the linguistic expression. The representation is said to be agnostic with respect to language because it captures semantic content that is largely independent of the syntactic rules associated with the natural language used to compose the linguistic expression. The representation is also agnostic with respect to task because a downstream training system can leverage it to produce different kinds of machine-trained components that serve different respective tasks. The UEC facilitates the generation of downstream machine-trained models by permitting a developer to train a model based on input examples expressed in a language j_α, and thereafter apply it to the interpretation of documents in language j_β, with no additional training required.
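The train-in-one-language, apply-in-another workflow from the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `CONCEPTS` lookup table is a hypothetical stand-in for a trained universal encoder (the patent describes a machine-trained model, not a table), and `encode`, `train`, and `predict` are invented names for a minimal count-based classifier.

```python
from collections import Counter

# Hypothetical stand-in for a trained universal encoder: a lookup from
# (language, word) to a shared concept id. The patent's encoder is a
# machine-trained model; this table only mimics its input/output contract.
CONCEPTS = {
    ("en", "good"): "POS", ("en", "great"): "POS",
    ("en", "bad"): "NEG", ("en", "awful"): "NEG",
    ("es", "bueno"): "POS", ("es", "genial"): "POS",
    ("es", "malo"): "NEG", ("es", "horrible"): "NEG",
}


def encode(text, lang):
    """Map a sentence to a language-agnostic bag of concept ids."""
    words = text.lower().split()
    return Counter(CONCEPTS[(lang, w)] for w in words if (lang, w) in CONCEPTS)


def train(examples):
    """Fit a tiny task-specific model: per-label concept counts."""
    model = {}
    for text, lang, label in examples:
        model.setdefault(label, Counter()).update(encode(text, lang))
    return model


def predict(model, text, lang):
    """Pick the label whose concept counts overlap most with the input."""
    rep = encode(text, lang)
    return max(
        sorted(model),
        key=lambda label: sum(model[label][c] * n for c, n in rep.items()),
    )


# Train only on English examples (language j_α) ...
english_train = [("good great", "en", "positive"), ("bad awful", "en", "negative")]
model = train(english_train)

# ... then classify Spanish input (language j_β) with no further training.
print(predict(model, "genial bueno", "es"))  # prints "positive"
```

The downstream model never sees language-specific tokens, only the shared representation, which is why no Spanish training examples are needed in this sketch.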

First claim

Opening claim text (preview).

What is claimed is:

1. One or more computing devices for generating a task-specific machine-trained component, comprising: one or more hardware processors that execute operations based on machine-readable instructions stored in a memory and/or based on logic embodied in a task-specific collection of logic gates, the operations including: receiving a universal encoder component that has been produced by a first computer-implemented training system; and using a second computer-implemented training system to generate, by training, a task-specific machine-trained component, following production of the universal encoder component by the first training system, based on a set of input training examples expressed in at least one natural language, the second training system using the trained universal encoder component to convert each input training example into a language-agnostic representation of the input training example, the task-specific machine-trained component, once trained, providing a computer-implemented tool for mapping an input document, expressed in a particular input natural language, into an output result, said mapping applying to a case in which the particular input natural language of the input document is not among said at least one natural language that was used to train the task-specific machine-trained component, but wherein the universal encoder component has been trained to convert the input document expressed in the particular input natural language into a linguistic-agnostic representation of the input document, the second training system corresponding to a different training system or a same training system as the first training system, the first training system and the second training system being implemented by said one or more hardware processors.

2. The one or more computing devices of claim 1, wherein the operations further include producing plural task-specific machine-trained components using the universal encoder component that perform plural respective different tasks, the universal encoder component also being agnostic with respect to task.

3. The one or more computing devices of claim 1, wherein the first training system produces the universal encoder component using a generative adversarial network (GAN).

4. The one or more computing devices of claim 1, wherein the first training system generates the universal encoder component by simultaneously training a language model component and a discriminator component, and wherein the first training system generates the universal encoder component based on a training objective that takes into consideration at least: loss information based on a measure of predictive accuracy of the language model component; and loss information based on a measure of coherence among three or more distributions of language-agnostic representations of input training examples expressed in different natural languages, said measure of coherence being based on output information generated by the discriminator component.

5. The one or more computing devices of claim 4, wherein the measure of coherence reflects a combined measure of coherence among a plurality of pairings of distributions, each pairing of distributions including a first distribution of language-agnostic representations associated with a first natural language represented by the input training examples used by the first training system, and a second distribution of language-agnostic representations associated with a second natural language represented by the input training examples used by the first training system.

6. The one or more computing devices of claim 5, wherein the combined measure of coherence is generated by computing, for each particular pairing of distributions, a distance between the particular pairing of distributions, and wherein the combined measure of coherence is based on a combination of distances associated with different respective pairings of distributions.

7. The one or more computing devices of claim 4, wherein a particular set of input training examples in the different natural languages that are fed together to the first training system do not represent translations of a same underlying content into the different natural languages.

8. A computer-readable storage medium for storing computer-readable instructions, the computer-readable instructions, when executed by one or more hardware processors, providing a task-specific machine-trained component that performs operations of: receiving an input document expressed in a particular input natural language; converting the input document into a language-agnostic representation of the input document using a universal encoder component; and mapping the language-agnostic representation to an output result, the task-specific machine-trained component having been trained based on input training examples expressed in at least one natural language, the universal encoder component being trained in a first machine-training process, and the task-specific machine-trained component being trained in a second machine-training process, and the task-specific machine-trained component incorporating the universal encoder component that is trained in the first machine-training process, wherein the universal encoder component is produced by the first machine-training process by simultaneously training a language model component and a discriminator component, and wherein the first machine-training process generates the universal component based on a training objective that takes into consideration at least: loss information based on a measure of predictive accuracy of the language model component; and loss information based on a measure of coherence among three or more distributions of language-agnostic representations of input training examples expressed in different natural languages, said measure of coherence being based on output information generated by the discriminator component.

9. The computer-readable storage medium of claim 8, wherein the universal encoder component is also agnostic with respect to a task performed by the task-specific machine-trained component.

10. The computer-readable storage medium of claim 8, wherein the measure of coherence reflects a combined measure of coherence among a plurality of pairings of distributions, each pairing of distributions including a first distribution of language agnostic representations associated with a first natural language represented by the input training examples used by the first machine-training process, and a second distribution of language agnostic representations associated with a second natural language represented by the input training examples used by the first machine-training process.

11. The computer-readable storage medium of claim 10, wherein the combined measure of coherence is generated by computing, for each particular pairing of distributions, a distance between the particular pairing of distributions, and wherein the combined measure of coherence is based on a combination of distances associated with different respective pairings of distributions.

12. A method, implemented by one or more computing devices, for performing machine-training in a training system, comprising: in a training operation: using plural language-specific encoder components to convert input training examples expressed in respective different natural languages into respective language-specific representations of the input training examples; using a language-agnostic encoder component to convert each language-specific representation into a language-agnostic representation; for a given natural language associated with a given input training exam
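
The two-part training objective recited in claims 4 to 6 (a language-model loss plus a coherence term combined over pairings of per-language distributions) can be sketched numerically. This is a minimal illustration under stated assumptions, not the patented method: the claims derive the coherence measure from a discriminator component's output, whereas this sketch swaps in a plain Euclidean distance between centroids; the `weight` parameter, the function names, and the sample vectors are all hypothetical.

```python
from itertools import combinations
from math import dist


def mean_vector(vectors):
    """Centroid of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def combined_coherence(reps_by_language):
    """Sum a distance over every pairing of per-language distributions.

    Assumption: the distance is the Euclidean gap between centroids. The
    claims instead base it on a discriminator component's output.
    """
    centroids = {lang: mean_vector(v) for lang, v in reps_by_language.items()}
    return sum(
        dist(centroids[a], centroids[b])
        for a, b in combinations(sorted(centroids), 2)
    )


def training_objective(lm_loss, reps_by_language, weight=1.0):
    """Language-model loss plus a weighted coherence penalty.

    The weight is a hypothetical knob, not something the claims specify.
    """
    return lm_loss + weight * combined_coherence(reps_by_language)


# Made-up language-agnostic representations for three languages.
reps = {
    "en": [[0.0, 0.0], [2.0, 0.0]],  # centroid (1, 0)
    "es": [[1.0, 0.0], [1.0, 0.0]],  # centroid (1, 0)
    "de": [[1.0, 3.0], [1.0, 5.0]],  # centroid (1, 4)
}

# Pairwise distances: de-en = 4, de-es = 4, en-es = 0, so coherence = 8.
print(training_objective(lm_loss=0.5, reps_by_language=reps))  # prints 8.5
```

Minimizing the second term pulls the per-language distributions toward one another, which is the intuition behind calling the resulting representation language-agnostic.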

Assignees

Inventors

Classifications

  • Probabilistic or stochastic networks

  • Combinations of networks

  • Recurrent networks, e.g. Hopfield networks

  • Adversarial learning

  • Auto-encoder networks; Encoder-decoder networks

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10963644B2 cover?
Computer-implemented techniques are described herein for generating and utilizing a universal encoder component (UEC). The UEC maps a linguistic expression in a natural language to a language-agnostic representation of the linguistic expression. The representation is said to be agnostic with respect to language because it captures semantic content that is largely independent of the syntactic ru…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G06F40/30. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 30, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).