System and method for natural language processing using neural network with cross-task training

US12086539B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12086539-B2
Application number  US-202017093478-A
Country             US
Kind code           B2
Filing date         Nov 9, 2020
Priority date       Dec 9, 2019
Publication date    Sep 10, 2024
Grant date          Sep 10, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for using a neural network model for natural language processing (NLP) includes receiving training data associated with a source domain and a target domain; and generating one or more query batches. Each query batch includes one or more source tasks associated with the source domain and one or more target tasks associated with the target domain. For each query batch, class representations are generated for each class in the source domain and the target domain. A query batch loss for the query batch is generated based on the corresponding class representations. An optimization is performed on the neural network model by adjusting its network parameters based on the query batch loss. The optimized neural network model is used to perform one or more new NLP tasks.
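
The abstract says class representations are generated for each class in the source and target domains, but this page does not say how. The sketch below shows one possible reading, assuming each class representation is the mean of the embedding vectors of that class's samples (a prototype-style average). The function name `class_representations`, the label strings, and the two-dimensional embeddings are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


def class_representations(support):
    """Average the embeddings of each class (assumed prototype-style mean).

    support: list of (class_label, embedding) pairs drawn from both the
    source domain and the target domain.
    """
    sums = {}
    counts = defaultdict(int)
    for label, vec in support:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        # Accumulate the embedding component-wise for this class.
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


# Hypothetical toy embeddings: two source-domain samples, one target-domain sample.
support = [
    ("source:entailment", [1.0, 0.0]),
    ("source:entailment", [3.0, 0.0]),
    ("target:positive", [0.0, 2.0]),
]
reps = class_representations(support)
```

Keeping source and target classes in a single dictionary mirrors the idea that one query batch is compared against classes from both domains at once.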

First claim

Opening claim text (preview).

What is claimed is:

1. A method for using a neural network model for natural language processing (NLP), comprising: receiving training data associated with a source domain and a target domain; generating one or more query batches, wherein a source subquery batch includes one or more source tasks associated with the source domain, and wherein a target subquery batch includes one or more target tasks associated with the target domain; for each query batch, generating combination class representations associated with a combination of classes in the source domain and the target domain; generating a source loss using the source subquery batch and the combination class representations; generating a target loss using the target subquery batch and the combination class representations; and generating a query batch loss using the source loss and target loss; and performing an optimization on the neural network model by adjusting its network parameters based on the query batch loss, wherein the optimized neural network model is used to perform one or more new NLP tasks.

2. The method of claim 1, wherein a first new NLP task is from one of the target domain and a new domain, wherein the new domain is different from the source domain and target domain.

3. The method of claim 2, wherein a second new NLP task is from the other of the target domain and the new domain.

4. The method of claim 1, wherein the neural network model includes a textual entailment model, wherein the one or more source tasks and the one or more target tasks include one or more textual entailment tasks regarding a relation of a premise sentence including a premise and a hypothesis sentence including a hypothesis, where the relation indicates whether the hypothesis is true given the premise.

5. The method of claim 1, wherein the generating the source loss includes: generating a probability distribution by comparing a query of the source subquery batch with the combination class representations; and generating the source loss based on the probability distribution.

6. The method of claim 1, wherein the generating the target loss includes: generating a probability distribution by comparing a query of the target subquery batch with the combination class representations; and generating the target loss based on the probability distribution.

7. The method of claim 1, wherein few-shot learning is performed to the neural network model.

8. A non-transitory machine-readable medium comprising a plurality of machine-readable instructions which, when executed by one or more processors, are adapted to cause the one or more processors to perform a method comprising: receiving training data associated with a source domain and a target domain; generating one or more query batches, wherein a source subquery batch includes one or more source tasks associated with the source domain, and wherein a target subquery batch includes one or more target tasks associated with the target domain; for each query batch, generating combination class representations associated with a combination of classes in the source domain and the target domain; generating a source loss using the source subquery batch and the combination class representations; generating a target loss using the target subquery batch and the combination class representations; and generating a query batch loss using the source loss and target loss; and performing an optimization on the neural network model by adjusting its network parameters based on the query batch loss, wherein the optimized neural network model is used to perform one or more new NLP tasks.

9. The non-transitory machine-readable medium of claim 8, wherein a first new NLP task is from one of the target domain and a new domain different from the source domain and target domain.

10. The non-transitory machine-readable medium of claim 9, wherein a second new NLP task is from the other of the target domain and the new domain.

11. The non-transitory machine-readable medium of claim 8, wherein the neural network model includes a textual entailment model, wherein the one or more source tasks and one or more target tasks include one or more textual entailment tasks regarding a relation of a premise sentence including a premise and a hypothesis sentence including a hypothesis, where the relation indicates whether the hypothesis is true given the premise.

12. The non-transitory machine-readable medium of claim 8, wherein the generating the source loss includes: generating a probability distribution by comparing a query of the source subquery batch with the combination class representations; and generating the source loss based on the probability distribution.

13. The non-transitory machine-readable medium of claim 8, wherein the generating the target loss includes: generating a probability distribution by comparing a query of the target subquery batch with the combination class representations; and generating the target loss based on the probability distribution.

14. The non-transitory machine-readable medium of claim 12, wherein the generating the class representations for each query batch includes: generating source sample sets for the classes from the source domain; and generating the class representations based on the source sample sets and the target subquery batch.

15. A system, comprising: a non-transitory memory; and one or more hardware processors coupled to the non-transitory memory and configured to read instructions from the non-transitory memory to cause the system to perform a method comprising: receiving training data associated with a source domain and a target domain; generating one or more query batches, wherein a source subquery batch includes one or more source tasks associated with the source domain, and wherein a target subquery batch includes one or more target tasks associated with the target domain; for each query batch, generating combination class representations associated with a combination of classes in the source domain and the target domain; generating a source loss using the source subquery batch and the combination class representations; generating a target loss using the target subquery batch and the combination class representations; and generating a query batch loss using the source loss and target loss; and performing an optimization on the neural network model by adjusting its network parameters based on the query batch loss, wherein the optimized neural network model is used to perform one or more new NLP tasks.

16. The system of claim 15, wherein a first new NLP task is associated with one of the target domain and a new domain, wherein the new domain is different from the source domain and target domain.

17. The system of claim 16, wherein a second new NLP task is associated with the other of the target domain and the new domain.

18. The system of claim 15, wherein the neural network model includes a textual entailment model, wherein the one or more source tasks and one or more target tasks include one or more textual entailment tasks regarding a relation of a premise sentence including a premise and a hypothesis sentence including a hypothesis, where the relation indicates whether the hypothesis is true given the premise.

19. The system of claim 15, wherein the generating the source loss includes: generating a probability distribution by comparing a query of the source subquery batch with the combination class representations; and generating the source loss
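
Claims 5 and 6 describe generating a probability distribution by comparing a query with the combination class representations, and claim 1 combines a source loss and a target loss into a query batch loss. The sketch below illustrates one way this could look. The comparison function (softmax over negative squared Euclidean distance), the loss (mean negative log-likelihood), and the combination rule (a plain sum) are all assumptions made for illustration; the claim text shown here does not specify them. All function names and toy values are hypothetical.

```python
import math


def class_probabilities(query, reps):
    """Compare one query embedding with every class representation.

    Assumption: similarity is negative squared Euclidean distance,
    normalized with a softmax.
    """
    labels = list(reps)
    scores = [
        -sum((q - r) ** 2 for q, r in zip(query, reps[label]))
        for label in labels
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}


def subquery_loss(subquery, reps):
    """Mean negative log-likelihood of the true class (assumed loss)."""
    losses = [
        -math.log(class_probabilities(query, reps)[label])
        for query, label in subquery
    ]
    return sum(losses) / len(losses)


def query_batch_loss(source_subquery, target_subquery, reps):
    """Combine source and target losses; an unweighted sum is assumed."""
    return subquery_loss(source_subquery, reps) + subquery_loss(target_subquery, reps)


# Hypothetical class representations covering both domains.
reps = {"source:entailment": [2.0, 0.0], "target:positive": [0.0, 2.0]}
source_subquery = [([2.0, 0.0], "source:entailment")]
target_subquery = [([0.0, 1.0], "target:positive")]

probs = class_probabilities([2.0, 0.0], reps)
loss = query_batch_loss(source_subquery, target_subquery, reps)
```

In a full training loop, the resulting `loss` would be the quantity a gradient-based optimizer uses to adjust the network parameters; that step is omitted here because it depends on the model and framework.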

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12086539B2 cover?
A method for using a neural network model for natural language processing (NLP) includes receiving training data associated with a source domain and a target domain; and generating one or more query batches. Each query batch includes one or more source tasks associated with the source domain and one or more target tasks associated with the target domain. For each query batch, class representati…
Who is the assignee on this patent?
Salesforce Inc
What technology area does this patent fall under?
Primary CPC classification G06F40/30. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 10, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).