Machine learning model training method and apparatus

US12067483B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12067483-B2
Application number  US-201916431393-A
Country             US
Kind code           B2
Filing date         Jun 4, 2019
Priority date       Jan 11, 2018
Publication date    Aug 20, 2024
Grant date          Aug 20, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present invention provide a machine learning model training method, including: obtaining target task training data and N categories of support task training data; inputting the target task training data and the N categories of support task training data into a memory model to obtain target task training feature data and N categories of support task training feature data; training the target task model based on the target task training feature data and obtaining a first loss of the target task model, and separately training respectively corresponding support task models based on the N categories of support task training feature data and obtaining respective second losses of the N support task models; and updating the memory model, the target task model, and the N support task models based on the first loss and the respective second losses of the N support task models.
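The training step summarized in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the patented implementation: the memory model is reduced to a single scalar `m` that maps a raw input to a feature, each task model is a scalar weight applied to that feature, the loss is mean squared error, and the first loss and the second losses are combined by a plain sum. All function names, the learning rate, and the toy data are hypothetical.

```python
# Illustrative sketch only. Scalar linear models, squared error, and a plain
# sum of losses are assumptions made for this example, not taken from the patent.

def mse_and_grads(m, w, batch):
    """Mean squared error of pred = w * (m * x), plus gradients w.r.t. m and w."""
    n = len(batch)
    loss = grad_m = grad_w = 0.0
    for x, y in batch:
        feature = m * x            # memory model: training data -> feature data
        err = w * feature - y      # task model prediction minus label
        loss += err * err / n
        grad_m += 2 * err * w * x / n
        grad_w += 2 * err * feature / n
    return loss, grad_m, grad_w


def joint_update(m, w_target, w_support, target_batch, support_batches, lr=0.02):
    """One update of the memory model, the target model, and the N support models."""
    first_loss, grad_m, grad_target = mse_and_grads(m, w_target, target_batch)
    combined_loss = first_loss
    support_grads = []
    for w_k, batch_k in zip(w_support, support_batches):
        second_loss, grad_m_k, grad_w_k = mse_and_grads(m, w_k, batch_k)
        combined_loss += second_loss   # assumed combination rule: plain sum
        grad_m += grad_m_k             # the shared memory model sees every task
        support_grads.append(grad_w_k)
    new_m = m - lr * grad_m
    new_w_target = w_target - lr * grad_target
    new_w_support = [w - lr * g for w, g in zip(w_support, support_grads)]
    return new_m, new_w_target, new_w_support, combined_loss


# Toy labeled data: one target task and N = 2 support tasks.
target_batch = [(1.0, 2.0), (2.0, 4.0)]             # target task: y = 2x
support_batches = [[(1.0, 3.0), (2.0, 6.0)],        # support task 1: y = 3x
                   [(1.0, -1.0), (2.0, -2.0)]]      # support task 2: y = -x

m, w_t, w_s = 1.0, 0.5, [0.5, 0.5]
history = []
for _ in range(200):
    m, w_t, w_s, loss = joint_update(m, w_t, w_s, target_batch, support_batches)
    history.append(loss)
```

The point of the sketch is that the memory model's gradient accumulates contributions from the target task and from every support task, while each task model is updated only from its own loss.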

First claim

Opening claim text (preview).

What is claimed is:

1. A machine learning model training method, comprising: obtaining target task training data and N categories of support task training data, wherein both the target task training data and the N categories of support task training data are labeled data, the target task training data corresponds to a target task model, the N categories of support task training data are in a one-to-one correspondence with N support task models, and N is a positive integer; inputting the target task training data and the N categories of support task training data into a memory model to obtain target task training feature data and N categories of support task training feature data, wherein both the target task training feature data and the N categories of support task training feature data are labeled data, the target task training feature data corresponds to the target task training data, and the N categories of support task training feature data are in a one-to-one correspondence with the N categories of support task training data; training the target task model based on the target task training feature data; obtaining a first loss of the target task model; separately training the respectively corresponding support task models based on the N categories of support task training feature data; obtaining respective second losses of the N support task models; and updating the memory model, the target task model, and the N support task models based on the first loss and the respective second losses of the N support task models, wherein the updating comprises: combining the first loss and the respective second losses of the N support task models to obtain a target loss; and updating a first parameter of the memory model, a second parameter of the target task model, and respective third parameters of the N support task models based on the target loss, the method further including determining whether a quantity of updating times exceeds a first threshold; and in response to determining that the quantity of updating times does not exceed the first threshold, obtaining target task training data and N categories of support task training data, and repeatedly training the target task model based on the target task training feature data and training the respectively corresponding support task models based on the N categories of support task training feature data, until the quantity of updating times exceeds the first threshold.

2. The method according to claim 1, wherein the target task training data comprises first target task training data and second target task training data; the inputting the target task training data into a memory model to obtain target task training feature data comprises: inputting the first target task training data and the second target task training data into the memory model to obtain first target task training feature data and second target task training feature data, wherein the target task training feature data comprises the first target task training feature data and the second target task training feature data, the first target task training feature data corresponds to the first target task training data, and the second target task training feature data corresponds to the second target task training data; the training the target task model based on the target task training feature data comprises training the target task model based on the first target task training feature data; and the obtaining a first loss of the target task model comprises: obtaining the first loss of the target task model based on the second target task training feature data and the trained target task model.

3. The method according to claim 2, wherein the second target task training feature data comprises target task feature information and a corresponding target task label; and the obtaining the first loss of the target task model based on the second target task training feature data and the trained target task model comprises: obtaining a first test result based on the target task feature information and the trained target task model; and calculating the first loss based on the first test result and the target task label.

4. The method according to claim 3, wherein the second target task training data comprises a plurality of target task test samples, correspondingly, the second target task training feature data comprises a plurality of target task test feature samples, and each target task test feature sample comprises first target task feature information and a corresponding first target task label; the obtaining a first test result based on the target task feature information and the trained target task model comprises: obtaining, based on first target task feature information respectively corresponding to the plurality of target task test feature samples and the trained target task model, first test results respectively corresponding to the plurality of target task test feature samples; and the calculating the first loss based on the first test result and the target task label comprises: calculating, based on the first test results respectively corresponding to the plurality of target task test feature samples and first target task labels respectively corresponding to the plurality of target task test feature samples, losses respectively corresponding to the plurality of target task test feature samples; and calculating the first loss based on the losses respectively corresponding to the plurality of target task test feature samples.

5. The method according to claim 1, wherein the target task training data comprises a plurality of pieces of first training labeled data, the target task training feature data comprises a plurality of pieces of first training feature data, and the plurality of pieces of first training feature data are in a one-to-one correspondence with the plurality of pieces of first training labeled data; and the training the target task model based on the target task training feature data comprises: training the target task model based on the plurality of pieces of first training feature data; and the obtaining a first loss of the target task model comprises: obtaining a plurality of losses of the target task model, wherein the plurality of losses of the target task model are in a one-to-one correspondence with the plurality of pieces of first training feature data; and calculating the first loss based on the plurality of losses of the target task model.

6. The method according to claim 1, wherein at least one category of the N categories of support task training data comprises first support task training data and second support task training data; the inputting the N categories of support task training data into a memory model to obtain N categories of support task training feature data comprises: for each category of the at least one category of support task training data, inputting the first support task training data and the second support task training data into the memory model to obtain first support task training feature data and second support task training feature data, wherein the first support task training feature data corresponds to the first support task training data, and the second support task training feature data corresponds to the second support task training data; the separately training the respectively corresponding support task models based on the N categories of support task training feature data comprises: for a support task model j, training the support task model j based on the first support task training feature data corresponding to the support task model j, wherein the support task model j is one of support task models corresponding to the at least one category of support task training data; and obt
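Two mechanics in the claims above lend themselves to a short sketch: the train-then-test split of claims 2 to 4 (train on the first feature data, compute the first loss on the second feature data) and the stopping rule of claim 1 (repeat while the update count does not exceed the first threshold). The code below is a hedged illustration only. The least-squares fit standing in for "training", the squared-error loss, the mean used to aggregate per-sample losses, and all function names are assumptions, since the claims only say the first loss is calculated "based on" the per-sample losses.

```python
# Illustrative sketch only. The closed-form fit, squared error, and mean
# aggregation are assumptions; the claims do not fix any of them.

def train_target_model(first_feature_samples):
    """Fit a scalar weight on the first target task training feature data."""
    num = sum(f * y for f, y in first_feature_samples)
    den = sum(f * f for f, _ in first_feature_samples)
    w = num / den                         # least-squares fit through the origin
    return lambda feature: w * feature    # the "trained target task model"


def first_loss(trained_model, test_feature_samples):
    """Compute per-sample losses on the second feature data, then aggregate."""
    per_sample = []
    for feature, label in test_feature_samples:
        test_result = trained_model(feature)            # first test result
        per_sample.append((test_result - label) ** 2)   # assumed squared error
    return sum(per_sample) / len(per_sample)            # assumed aggregation: mean


def train_until_threshold(update_once, first_threshold):
    """Repeat updates while the update count does not exceed the threshold."""
    update_count = 0
    while not update_count > first_threshold:
        update_once()
        update_count += 1
    return update_count


# Hypothetical feature data, as if already produced by the memory model.
first_features = [(1.0, 2.0), (2.0, 4.0)]     # used to train the target model
second_features = [(3.0, 6.0), (4.0, 9.0)]    # used only to compute the loss

model = train_target_model(first_features)
loss = first_loss(model, second_features)

calls = []
n_updates = train_until_threshold(lambda: calls.append(1), first_threshold=3)
```

Read literally, the stopping rule ends once the count exceeds the threshold, so a threshold of 3 yields four updates in this sketch when counting starts from zero.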

Assignees

Huawei Tech Co Ltd

Inventors

Classifications

  • G06N3/08 (Primary)

    Learning methods · CPC title

  • Supervised learning · CPC title

  • Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Subject matter not provided for in other groups of this subclass · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12067483B2 cover?
Embodiments of the present invention provide a machine learning model training method, including: obtaining target task training data and N categories of support task training data; inputting the target task training data and the N categories of support task training data into a memory model to obtain target task training feature data and N categories of support task training feature data; trai…
Who is the assignee on this patent?
Huawei Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 20, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).