Learning to combine explicit diversity conditions for effective question answer generation

US2024256906A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2024256906-A1
Application number	US-202318401074-A
Country	US
Kind code	A1
Filing date	Dec 29, 2023
Priority date	Jan 27, 2023
Publication date	Aug 1, 2024
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method includes predicting, using the at least one processing device, a question type for each section of a document using a trained question type prediction model, each section including a different portion of the document. The method also includes generating, using the at least one processing device, multiple question-answer pairs using a trained question-answer generation model that receives the predicted question types and the document as input. Each question-answer pair includes (i) a question having a type corresponding to one of the predicted question types and being associated with content in the section corresponding to the type and (ii) an answer to the question. The method further includes outputting, using the at least one processing device, the question-answer pairs for use in training a question answering model.
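The flow described in the abstract (predict a question type per section, then generate question-answer pairs conditioned on those types) can be illustrated with a minimal Python sketch. This is not the patented implementation: `predict_question_type` and `generate_qa` are hypothetical rule-based stand-ins for the trained models, and splitting sections on blank lines is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class QAPair:
    question: str
    answer: str
    question_type: str
    section_index: int


# Hypothetical question templates, one per question type used in this toy example.
TEMPLATES = {
    "what": "What is described in section {i}?",
    "when": "When did the event in section {i} take place?",
}


def split_sections(document: str) -> List[str]:
    """Assumption: sections are separated by blank lines."""
    return [s.strip() for s in document.split("\n\n") if s.strip()]


def predict_question_type(section: str) -> str:
    """Toy stand-in for the trained question type prediction model."""
    return "when" if any(ch.isdigit() for ch in section) else "what"


def generate_qa(document: str, section: str, index: int, question_type: str) -> QAPair:
    """Toy stand-in for the trained question-answer generation model.

    It accepts the whole document plus the predicted type, mirroring the
    inputs named in the abstract, but simply echoes the section as the answer.
    """
    question = TEMPLATES[question_type].format(i=index)
    return QAPair(question, section, question_type, index)


def build_qa_pairs(document: str) -> List[QAPair]:
    """Predict one question type per section, then generate one pair per section."""
    sections = split_sections(document)
    types = [predict_question_type(s) for s in sections]
    return [
        generate_qa(document, section, i, qtype)
        for i, (section, qtype) in enumerate(zip(sections, types))
    ]


if __name__ == "__main__":
    doc = "The device was announced.\n\nIt shipped in 2023."
    for pair in build_qa_pairs(doc):
        print(pair.question_type, "|", pair.question, "|", pair.answer)
```

In a real system both stand-ins would be replaced by trained models, and the resulting pairs would be passed on as training data for a question answering model.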

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: predicting, using at least one processing device of an electronic device, a question type for each section of a document using a trained question type prediction model, each section including a different portion of the document; generating, using the at least one processing device, multiple question-answer pairs using a trained question-answer generation model that receives the predicted question types and the document as input, each question-answer pair comprising (i) a question having a type corresponding to one of the predicted question types and being associated with content in the section corresponding to the type and (ii) an answer to the question; and outputting, using the at least one processing device, the question-answer pairs for use in training a question answering model.
2. The method of claim 1, wherein the type of the question in each question-answer pair indicates that the associated question starts with one of: what, when, where, who, whom, which, whose, why, or how.
3. The method of claim 1, wherein the question type prediction model is trained to predict one or more question types for the document based on one or more entities described in the document.
4. The method of claim 3, further comprising: predicting a question type for each of the one or more entities using the question type prediction model; inputting the document and the question type for each of the one or more entities to the question-answer generation model; and generating multiple additional question-answer pairs using the question-answer generation model.
5. The method of claim 1, wherein: the question type prediction model is trained to output a list of possible question types for possible combinations of section and entity for the document; and generating the multiple question-answer pairs using the trained question-answer generation model comprises generating at least one question-answer pair for each of the possible combinations.
6. The method of claim 5, wherein: each question-answer pair is associated with a quality score; and a particular question-answer pair is not output if the quality score of the particular question-answer pair is less than a specified threshold.
7. The method of claim 1, further comprising: comparing the question-answer pairs to each other to determine any duplicate question-answer pairs; and removing one or more duplicate question-answer pairs before outputting the question-answer pairs.
8. The method of claim 1, wherein the question type prediction model and the question-answer generation model comprise large language models.
9. An electronic device comprising: at least one processing device configured to: predict a question type for each section of a document using a trained question type prediction model, each section including a different portion of the document; generate multiple question-answer pairs using a trained question-answer generation model that receives the predicted question types and the document as input, each question-answer pair comprising (i) a question having a type corresponding to one of the predicted question types and being associated with content in the section corresponding to the type and (ii) an answer to the question; and output the question-answer pairs for use in training a question answering model.
10. The electronic device of claim 9, wherein the type of the question in each question-answer pair indicates that the associated question starts with one of: what, when, where, who, whom, which, whose, why, or how.
11. The electronic device of claim 9, wherein the question type prediction model is trained to predict one or more question types for the document based on one or more entities described in the document.
12. The electronic device of claim 11, wherein the at least one processing device is further configured to: predict a question type for each of the one or more entities using the question type prediction model; input the document and the question type for each of the one or more entities to the question-answer generation model; and generate multiple additional question-answer pairs using the question-answer generation model.
13. The electronic device of claim 9, wherein: the question type prediction model is trained to output a list of possible question types for possible combinations of section and entity for the document; and to generate the multiple question-answer pairs using the trained question-answer generation model, the at least one processing device is configured to generate at least one question-answer pair for each of the possible combinations.
14. The electronic device of claim 13, wherein: each question-answer pair is associated with a quality score; and the at least one processing device is configured to not output a particular question-answer pair if the quality score of the particular question-answer pair is less than a specified threshold.
15. The electronic device of claim 9, wherein the at least one processing device is further configured to: compare the question-answer pairs to each other to determine any duplicate question-answer pairs; and remove one or more duplicate question-answer pairs before outputting the question-answer pairs.
16. The electronic device of claim 9, wherein the question type prediction model and the question-answer generation model comprise large language models.
17. A non-transitory machine-readable medium containing instructions that when executed cause at least one processor of an electronic device to: predict a question type for each section of a document using a trained question type prediction model, each section including a different portion of the document; generate multiple question-answer pairs using a trained question-answer generation model that receives the predicted question types and the document as input, each question-answer pair comprising (i) a question having a type corresponding to one of the predicted question types and being associated with content in the section corresponding to the type and (ii) an answer to the question; and output the question-answer pairs for use in training a question answering model.
18. The non-transitory machine-readable medium of claim 17, wherein: the question type prediction model is trained to predict one or more question types for the document based on one or more entities described in the document; and the non-transitory machine-readable medium further contains instructions that when executed cause the at least one processor to: predict a question type for each of the one or more entities using the question type prediction model; input the document and the question type for each of the one or more entities to the question-answer generation model; and generate multiple additional question-answer pairs using the question-answer generation model.
19. The non-transitory machine-readable medium of claim 17, wherein: the question type prediction model is trained to output a list of possible question types for possible combinations of section and entity for the document; and the instructions that when executed cause the at least one processor to generate the multiple question-answer pairs using the trained question-answer generation model comprise: instructions that when executed cause the at least one processor to generate at least one question-answer pair for each of the possible combinations.
20. …
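The post-processing steps in claims 6 and 7 (dropping pairs whose quality score is below a threshold and removing duplicates before output) might look roughly like the sketch below. The claims do not say how quality is scored or how duplicates are compared, so the `quality` field, the default threshold of 0.5, and the case- and whitespace-insensitive exact match are all assumptions chosen for illustration.

```python
from typing import Iterable, List, NamedTuple


class ScoredQA(NamedTuple):
    question: str
    answer: str
    quality: float  # hypothetical score; the claims do not define how it is computed


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings compare equal."""
    return " ".join(text.lower().split())


def filter_and_dedupe(pairs: Iterable[ScoredQA], threshold: float = 0.5) -> List[ScoredQA]:
    """Keep pairs scoring at or above the threshold, dropping later duplicates.

    Duplicate detection here is a normalized exact match on (question, answer);
    this is an assumed comparison, not one specified by the patent.
    """
    seen = set()
    kept = []
    for pair in pairs:
        if pair.quality < threshold:  # claim 6: below threshold means not output
            continue
        key = (normalize(pair.question), normalize(pair.answer))
        if key in seen:  # claim 7: remove duplicates before output
            continue
        seen.add(key)
        kept.append(pair)
    return kept


if __name__ == "__main__":
    candidates = [
        ScoredQA("Who wrote it?", "Ada", 0.9),
        ScoredQA("who  wrote it?", "ada", 0.8),   # duplicate after normalization
        ScoredQA("Why was it written?", "Unknown", 0.2),  # below threshold
        ScoredQA("When was it written?", "1843", 0.5),
    ]
    for pair in filter_and_dedupe(candidates):
        print(pair.question, "->", pair.answer)
```

Keeping the first occurrence of each duplicate preserves the input order; a semantic similarity measure could replace the exact match if near-duplicates also need to be caught.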

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

Code	CPC title
G06F40/216	Parsing using statistical methods
G06F40/35	Discourse or dialogue representation
G06F40/30	Semantic analysis
G06F40/44	Statistical methods, e.g. probability models
G06F40/56	Natural language generation

Patent family

Related publications grouped by family.

View patent family 91963386

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024256906A1 cover?: A method includes predicting, using the at least one processing device, a question type for each section of a document using a trained question type prediction model, each section including a different portion of the document. The method also includes generating, using the at least one processing device, multiple question-answer pairs using a trained question-answer generation model that receiv…
Who is the assignee on this patent?: Samsung Electronics Co Ltd
What technology area does this patent fall under?: Primary CPC classification G06N5/02. Mapped technology areas include Physics.
When was this patent published?: Publication date Aug 1, 2024 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).