Method and apparatus for training machine reading comprehension model, and non-transitory computer-readable recording medium

US12566969B2 · US · B2

Patent metadata
Publication number: US-12566969-B2
Application number: US-202318316744-A
Country: US
Kind code: B2
Filing date: May 12, 2023
Priority date: May 19, 2022
Publication date: Mar 3, 2026
Grant date: Mar 3, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and an apparatus for training a machine reading comprehension model, and a non-transitory computer-readable recording medium are provided. A training process is repeatedly performed using a training sample set to obtain a machine reading comprehension model. The training process includes inputting a sample article and a sample question into the machine reading comprehension model, generating a first predicted answer, and calculating a first loss between the first predicted answer and a sample answer; replacing the sample question with a mask to obtain a mask question, inputting the sample article and the mask question into the machine reading comprehension model, generating a second predicted answer corresponding to the mask question, and calculating a second loss between the second predicted answer and the sample answer; and updating the machine reading comprehension model so as to minimize a total loss.
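The two forward passes described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `[MASK]` token, the `Model` callable interface, the cross-entropy loss, and the stub model are all assumptions introduced here, and the parameter update step is omitted.

```python
import math
from typing import Callable, Dict, Tuple

MASK = "[MASK]"  # assumed mask token; the abstract does not name one

# Hypothetical interface: a model maps (article, question) to a
# probability distribution over candidate answers.
Model = Callable[[str, str], Dict[str, float]]


def cross_entropy(probs: Dict[str, float], answer: str) -> float:
    """Negative log-probability of the reference answer (assumed loss)."""
    return -math.log(max(probs.get(answer, 0.0), 1e-12))


def mask_question(question: str) -> str:
    """Replace every whitespace-separated word of the question with the mask."""
    return " ".join(MASK for _ in question.split())


def two_pass_losses(
    model: Model, article: str, question: str, answer: str
) -> Tuple[float, float]:
    """Return (first_loss, second_loss) for one training sample."""
    # Pass 1: article + original question.
    first_loss = cross_entropy(model(article, question), answer)
    # Pass 2: article + mask question, scored against the same answer.
    second_loss = cross_entropy(model(article, mask_question(question)), answer)
    return first_loss, second_loss


def stub_model(article: str, question: str) -> Dict[str, float]:
    """Toy stand-in for a real model, for illustration only."""
    if MASK in question:
        return {"Paris": 0.5, "Lyon": 0.5}
    return {"Paris": 0.9, "Lyon": 0.1}


first, second = two_pass_losses(
    stub_model,
    "Paris is the capital of France.",
    "What is the capital of France?",
    "Paris",
)
```

In a real training loop the two losses would be combined into a total loss and passed to an optimizer; the stub model here only makes the data flow visible.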

First claim

Opening claim text (preview).

What is claimed is:

1. A method for training a machine reading comprehension model, the method comprising: obtaining a training sample set, the training sample set including a plurality of training samples, and each of the training samples including a sample article, a sample question, and a sample answer corresponding to the sample question; and repeatedly performing a training process using the training sample set until a preset training termination condition is met, so as to obtain a trained machine reading comprehension model, wherein the training process includes inputting the sample article and the sample question into the machine reading comprehension model, generating a first predicted answer corresponding to the sample question using the machine reading comprehension model, and calculating a first loss between the first predicted answer and the sample answer corresponding to the sample question; replacing at least a part of the sample question with at least one mask to obtain a mask question, inputting the sample article and the mask question into the machine reading comprehension model, generating a second predicted answer corresponding to the mask question using the machine reading comprehension model, and calculating a second loss between the second predicted answer and the sample answer corresponding to the sample question; and calculating a total loss according to the first loss and the second loss, and updating the machine reading comprehension model so as to minimize the total loss, wherein the replacing of at least the part of the sample question with the at least one mask to obtain the mask question includes at least one of replacing all words of the sample question with the mask to obtain a first mask question; and replacing a part of the words of the sample question with the mask to obtain a second mask question, and wherein the replacing of the part of the words of the sample question with the mask to obtain the second mask question includes replacing all words except a preset interrogative sentence in the sample question with the mask to obtain the second mask question.

2. The method for training a machine reading comprehension model as claimed in claim 1, wherein the total loss is positively correlated with the first loss, and the total loss is negatively correlated with the second loss.

3. The method for training a machine reading comprehension model as claimed in claim 2, wherein the calculating of the total loss according to the first loss and the second loss includes calculating the total loss using a first formula, the first formula being $\mathrm{loss} = \alpha_1 \cdot \mathrm{loss}_1 - \beta_1 \cdot \mathrm{loss}_{2,1}$, in a case where the mask question includes the first mask question, the first formula being $\mathrm{loss} = \alpha_2 \cdot \mathrm{loss}_1 - \beta_2 \cdot \mathrm{loss}_{2,2}$, in a case where the mask question includes the second mask question, and the first formula being $\mathrm{loss} = \alpha_3 \cdot \mathrm{loss}_1 - \beta_3 \cdot \mathrm{loss}_{2,1} - \beta_4 \cdot \mathrm{loss}_{2,2}$, in a case where the mask question includes the first mask question and the second mask question, where $\mathrm{loss}$ represents the total loss, $\mathrm{loss}_1$ represents the first loss, $\mathrm{loss}_{2,1}$ represents the second loss between the second predicted answer corresponding to the first mask question and the sample answer, $\mathrm{loss}_{2,2}$ represents the second loss between the second predicted answer corresponding to the second mask question and the sample answer, and $\alpha_1$, $\beta_1$, $\alpha_2$, $\beta_2$, $\alpha_3$, $\beta_3$ and $\beta_4$ represent preset weights, respectively, and are positive values.

4. The method for training a machine reading comprehension model as claimed in claim 1, the method further comprising: predicting an answer with respect to an input article and an input question using the trained machine reading comprehension model.

5. An apparatus for training a machine reading comprehension model, the apparatus comprising: a memory storing computer-executable instructions; and one or more processors configured to execute the computer-executable instructions such that the one or more processors are configured to obtain a training sample set, the training sample set including a plurality of training samples, and each of the training samples including a sample article, a sample question, and a sample answer corresponding to the sample question; and repeatedly perform a training process using the training sample set until a preset training termination condition is met, so as to obtain a trained machine reading comprehension model, wherein the training process includes inputting the sample article and the sample question into the machine reading comprehension model, generating a first predicted answer corresponding to the sample question using the machine reading comprehension model, and calculating a first loss between the first predicted answer and the sample answer corresponding to the sample question; replacing at least a part of the sample question with at least one mask to obtain a mask question, inputting the sample article and the mask question into the machine reading comprehension model, generating a second predicted answer corresponding to the mask question using the machine reading comprehension model, and calculating a second loss between the second predicted answer and the sample answer corresponding to the sample question; and calculating a total loss according to the first loss and the second loss, and updating the machine reading comprehension model so as to minimize the total loss, wherein the replacing of at least the part of the sample question with the at least one mask to obtain the mask question includes at least one of replacing all words of the sample question with the mask to obtain a first mask question; and replacing a part of the words of the sample question with the mask to obtain a second mask question, wherein the one or more processors are configured to replace all words except a preset interrogative sentence in the sample question with the mask to obtain the second mask question.

6. The apparatus for training a machine reading comprehension model as claimed in claim 5, wherein the total loss is positively correlated with the first loss, and the total loss is negatively correlated with the second loss.

7. The apparatus for training a machine reading comprehension model as claimed in claim 6, wherein the one or more processors are configured to calculate the total loss using a first formula, the first formula being $\mathrm{loss} = \alpha_1 \cdot \mathrm{loss}_1 - \beta_1 \cdot \mathrm{loss}_{2,1}$, in a case where the mask question includes the first mask question, the first formula being $\mathrm{loss} = \alpha_2 \cdot \mathrm{loss}_1 - \beta_2 \cdot \mathrm{loss}_{2,2}$, in a case where the mask question includes the second mask question, and the first formula being $\mathrm{loss} = \alpha_3 \cdot \mathrm{loss}_1 - \beta_3 \cdot \mathrm{loss}_{2,1} - \beta_4 \cdot \mathrm{loss}_{2,2}$, in a case where the mask question includes the first mask question and the second mask question, where $\mathrm{loss}$ represents the total loss, $\mathrm{loss}_1$ represents the first loss, $\mathrm{loss}_{2,1}$ represents the second loss between the second predicted answer corresponding to the first mask question and the sample answer, $\mathrm{loss}_{2,2}$ represents the second loss between the second predicted answer corresponding to the second mask question and the sample answer, and $\alpha_1$, $\beta_1$, $\alpha_2$, $\beta_2$, $\alpha_3$, $\beta_3$ and $\beta_4$ represent preset weights, respectively, and are positive values.

8. The apparatus for training a machine reading comprehension model as claimed in claim 5, wherein the one or more processors are further configured to predict an answer with respect to an input article and an input question using the trained machine reading comprehension model.

9. A non-transitory computer-readable recording medium having computer-executable instructions for execution by one or more processors, wherein, the comp
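As a rough illustration of the two masking variants in claim 1 and the total-loss formulas in claims 3 and 7, the sketch below may help. Several details are assumptions made here rather than taken from the claims: the `[MASK]` token, whitespace tokenization, the `INTERROGATIVES` word list standing in for the "preset interrogative" items, and the default weight values. For brevity the function uses a single `alpha` where the claims use separate weights (α₁, α₂, α₃) per case, and `beta_1`/`beta_2` stand in for the β weights on the first and second mask question losses.

```python
from typing import FrozenSet, Optional

MASK = "[MASK]"  # assumed mask token

# Assumed preset interrogative words; the claims do not list them.
INTERROGATIVES: FrozenSet[str] = frozenset(
    {"what", "who", "when", "where", "why", "which", "how"}
)


def first_mask_question(question: str) -> str:
    """First mask question: every word is replaced with the mask."""
    return " ".join(MASK for _ in question.split())


def second_mask_question(
    question: str, keep: FrozenSet[str] = INTERROGATIVES
) -> str:
    """Second mask question: every word except preset interrogatives is masked."""
    words = question.split()
    return " ".join(
        w if w.strip("?,.!").lower() in keep else MASK for w in words
    )


def total_loss(
    loss_1: float,
    loss_2_1: Optional[float] = None,  # second loss for the first mask question
    loss_2_2: Optional[float] = None,  # second loss for the second mask question
    alpha: float = 1.0,   # illustrative weight, must be positive
    beta_1: float = 0.1,  # illustrative weight, must be positive
    beta_2: float = 0.1,  # illustrative weight, must be positive
) -> float:
    """Weighted first loss minus weighted second loss(es)."""
    if loss_2_1 is None and loss_2_2 is None:
        raise ValueError("at least one masked-question loss is required")
    total = alpha * loss_1
    if loss_2_1 is not None:
        total -= beta_1 * loss_2_1
    if loss_2_2 is not None:
        total -= beta_2 * loss_2_2
    return total


q = "What is the capital of France?"
fully_masked = first_mask_question(q)
partly_masked = second_mask_question(q)
combined = total_loss(1.0, loss_2_1=0.5, loss_2_2=0.25, beta_1=0.2, beta_2=0.4)
```

Because the second-loss terms enter with a minus sign, minimizing the total rewards a model that answers well on the real question while answering poorly when the question is masked, which matches the positive and negative correlations stated in claims 2 and 6.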

Assignees

Inventors

Classifications

  • G06F40/30 (Primary)

    Semantic analysis · CPC title

  • Editing, e.g. inserting or deleting · CPC title

  • Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30) · CPC title

  • Adversarial learning · CPC title

  • G06N3/09 (Primary)

    Supervised learning · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12566969B2 cover?
A method and an apparatus for training a machine reading comprehension model, and a non-transitory computer-readable recording medium are provided. A training process is repeatedly performed using a training sample set to obtain a machine reading comprehension model. The training process includes inputting a sample article and a sample question into the machine reading comprehension model, gene…
Who is the assignee on this patent?
Li Hongyu, Dong Bin, Jiang Shanshan, and 2 more
What technology area does this patent fall under?
Primary CPC classification G06F40/30. Mapped technology areas include Physics.
When was this patent published?
Publication date: March 3, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).