Formulating questions using differences between sets of syntactic trees and differences between sets of semantic trees

US11914965B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11914965-B2
Application number  US-202117389914-A
Country             US
Kind code           B2
Filing date         Jul 30, 2021
Priority date       Sep 4, 2020
Publication date    Feb 27, 2024
Grant date          Feb 27, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed systems relate to generating questions from text. In an example, a method includes forming a first semantic tree from a first reference text and second semantic tree from a second reference text. The method includes identifying a set of semantic nodes that are in the first semantic tree but not in the second semantic tree. The method includes forming a first syntactic tree for the first reference text and a second syntactic tree for the second reference text. The method includes identifying a set of syntactic nodes that are in the first syntactic tree but not in the second syntactic tree. The method includes mapping the set of semantic nodes to the set of syntactic nodes by identifying a correspondence between a semantic node and a syntactic node, forming a question fragment from a normalized word, and providing the question fragment to a user device.
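The steps in the abstract can be pictured with a small Python sketch. It treats each tree as a flat set of nodes keyed by a normalized word, takes a plain set difference, and pairs semantic and syntactic nodes that share the same normalized word. The `Node` type, the helper names, and the sample data are assumptions made for illustration; the patent identifies common syntactic nodes with a trained machine-learning model and does not prescribe this representation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical node: a label (semantic role or syntactic category) and a normalized word."""
    label: str
    lemma: str


def nodes_only_in_first(first: set[Node], second: set[Node]) -> set[Node]:
    """Return the nodes that are in the first tree but not in the second."""
    return first - second


def map_by_lemma(semantic: set[Node], syntactic: set[Node]) -> dict[str, tuple[Node, Node]]:
    """Pair a semantic node with a syntactic node when both share a normalized word."""
    by_lemma = {node.lemma: node for node in syntactic}
    return {s.lemma: (s, by_lemma[s.lemma]) for s in semantic if s.lemma in by_lemma}


# Illustrative data only: the first text mentions an "account", the second does not.
sem_first = {Node("agent", "customer"), Node("action", "open"), Node("theme", "account")}
sem_second = {Node("agent", "customer"), Node("action", "open")}
syn_first = {Node("NN", "customer"), Node("VB", "open"), Node("NN", "account")}
syn_second = {Node("NN", "customer"), Node("VB", "open")}

sem_diff = nodes_only_in_first(sem_first, sem_second)
syn_diff = nodes_only_in_first(syn_first, syn_second)
mapping = map_by_lemma(sem_diff, syn_diff)
```

In this toy case the only normalized word left after both differences is "account", which is the word a question fragment would then be built from.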

First claim

Opening claim text (preview).

What is claimed is:

1. A method of generating questions from textual sources, the method comprising: forming a first semantic tree from a first reference text and second semantic tree from a second reference text, wherein each semantic tree comprises nodes and edges, wherein each node represents a role of a corresponding entity, and wherein each edge represents a relationship between two entities; identifying, from the first semantic tree and the second semantic tree, a set of semantic nodes that are (i) in the first semantic tree and (ii) not in the second semantic tree; forming a first syntactic tree for the first reference text and a second syntactic tree for the second reference text, wherein each syntactic tree comprises terminal nodes that represent words and syntactic nodes that represent syntactic categories; identifying, from the first syntactic tree and the second syntactic tree, a set of syntactic nodes that are in the first syntactic tree but not in the second syntactic tree, the set of syntactic nodes being identified based at least in part on providing the first syntactic tree and the second syntactic tree to a machine-learning model as input, the machine-learning model being previously trained to identify common syntactic nodes between two syntactic trees provided as input; mapping the set of semantic nodes to the set of syntactic nodes by identifying a correspondence between a first node in the set of semantic nodes and a second node in the set of syntactic nodes, wherein the first node and the second node are associated with a normalized word; forming a question fragment from the normalized word; and providing the question fragment to a user device.

2. The method of claim 1, wherein identifying the set of semantic nodes comprises: identifying, between the first semantic tree and the second semantic tree, a semantic maximal common subtree that comprises a maximum number of (a) nodes, each node representing a common entity that is common between the first semantic tree and the second semantic tree and (b) edges between the nodes that represent a semantic relationship between two or more of the common entities; and removing, from the first semantic tree, nodes that are in the semantic maximal common subtree.

3. The method of claim 1, wherein identifying the set of syntactic nodes comprises: removing, from the first syntactic tree, a set of nodes identified by the machine-learning model as being common to the first syntactic tree and the second syntactic tree.

4. The method of claim 1, wherein forming the question fragment comprises: identifying that the normalized word represents either (i) a noun, (ii) a verb, (iii) an adjective, or (iv) an adverb; and replacing the normalized word with a question word, wherein the question word is one of (i) what, (ii) where, (iii) whom, (iv) who, or (v) how.

5. The method of claim 4, wherein identifying that the normalized word represents either the noun, the verb, the adjective, or the adverb comprises constructing an additional syntactic tree from one or more of text associated with the normalized word, the set of semantic nodes, and the set of syntactic nodes, wherein the additional syntactic tree comprises additional nodes.

6. The method of claim 1, wherein forming the question fragment comprises: extracting a candidate question fragment from text associated with the normalized word; identifying a level of similarity between the candidate question fragment and a text fragment template; and responsive to determining that the level of similarity is greater than a threshold, identifying the candidate question fragment as the question fragment.

7. The method of claim 1, further comprising: receiving, from the user device, a response to the question fragment; and updating an entry in an ontology based on the response.

8. A system comprising: a non-transitory computer-readable medium storing computer-executable program instructions; and a processing device communicatively coupled to the non-transitory computer-readable medium for executing the computer-executable program instructions, wherein executing the computer-executable program instructions configures the processing device to perform operations comprising: forming a first semantic tree from a first reference text and second semantic tree from a second reference text, wherein each semantic tree comprises nodes and edges, wherein each node represents a role of a corresponding entity, and wherein each edge represents a relationship between two entities; identifying, from the first semantic tree and the second semantic tree, a set of semantic nodes that are (i) in the first semantic tree and (ii) not in the second semantic tree; forming a first syntactic tree for the first reference text and a second syntactic tree for the second reference text, wherein each syntactic tree comprises terminal nodes that represent words and syntactic nodes that represent syntactic categories; identifying, from the first syntactic tree and the second syntactic tree, a set of syntactic nodes that are in the first syntactic tree but not in the second syntactic tree, the set of syntactic nodes being identified based at least in part on providing the first syntactic tree and the second syntactic tree to a machine-learning model as input, the machine-learning model being previously trained to identify common syntactic nodes between two syntactic trees provided as input; mapping the set of semantic nodes to the set of syntactic nodes by identifying a correspondence between a first node in the set of semantic nodes and a second node in the set of syntactic nodes, wherein the first node and the second node are associated with a normalized word; forming a question fragment from the normalized word; and providing the question fragment to a user device.

9. The system of claim 8, wherein identifying the set of semantic nodes comprises: identifying, between the first semantic tree and the second semantic tree, a semantic maximal common subtree that comprises a maximum number of (a) nodes, each node representing a common entity that is common between the first semantic tree and the second semantic tree and (b) edges between the nodes that represent a semantic relationship between two or more of the common entities; and removing, from the first semantic tree, nodes that are in the semantic maximal common subtree.

10. The system of claim 8, wherein identifying the set of syntactic nodes comprises: removing, from the first syntactic tree, a set of nodes identified by the machine-learning model as being common to the first syntactic tree and the second syntactic tree.

11. The system of claim 8, wherein forming the question fragment comprises: identifying that the normalized word represents either (i) a noun, (ii) a verb, (iii) adjective, or (iv) adverb; and replacing the normalized word with a question word, wherein the question word is one of (i) what, (ii) where, (iii) whom, (iv) who, or (v) how.

12. The system of claim 11, wherein identifying that the normalized word represents either the noun, the verb, the adjective, or the adverb comprises constructing an additional syntactic tree from one or more of text associated with the normalized word, the set of semantic nodes, and the set of syntactic nodes, wherein the additional syntactic tree comprises additional nodes.

13. The system of claim 8, wherein forming the question fragment comprises: extracting a candidate question fragment from text associated with the normalized word; identifying a level of similarity between the candidate question fragment and a text fragment template; and r
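
Claims 4 and 6 describe two ways of forming the question fragment: replacing the normalized word with a question word, and accepting a candidate fragment only when it is similar enough to a template. A minimal sketch of both follows. The mapping from part of speech to question word, the 0.6 threshold, the sample sentence, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration; the claims list the permitted question words but specify none of these details.

```python
import re
from difflib import SequenceMatcher

# Hypothetical pairing of part of speech with a question word.
# Claim 4 names the question words but not which category each one goes with.
QUESTION_WORD = {"noun": "what", "verb": "what", "adjective": "how", "adverb": "how"}


def replace_with_question_word(text: str, word: str, pos: str) -> str:
    """Replace the first whole-word occurrence of `word` in `text` with a question word."""
    question_word = QUESTION_WORD[pos]
    return re.sub(rf"\b{re.escape(word)}\b", question_word, text, count=1)


def accept_fragment(candidate: str, template: str, threshold: float = 0.6) -> bool:
    """Accept the candidate when its similarity to the template is greater than the threshold."""
    similarity = SequenceMatcher(None, candidate.lower(), template.lower()).ratio()
    return similarity > threshold


# Illustrative input: "account" is the normalized word found by the tree comparison.
fragment = replace_with_question_word("account was opened by the customer", "account", "noun")
is_accepted = accept_fragment(fragment, "what was opened by the customer")
```

A character-level ratio is only a stand-in here; any similarity measure that yields a comparable score could be swapped in without changing the thresholding step.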

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11914965B2 cover?
Disclosed systems relate to generating questions from text. In an example, a method includes forming a first semantic tree from a first reference text and second semantic tree from a second reference text. The method includes identifying a set of semantic nodes that are in the first semantic tree but not in the second semantic tree. The method includes forming a first syntactic tree for the fir…
Who is the assignee on this patent?
Oracle Int Corp
What technology area does this patent fall under?
Primary CPC classification G06F40/30. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 27, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).