Correcting content generated by deep learning

US12175187B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12175187-B2
Application number: US-202217591161-A
Country: US
Kind code: B2
Filing date: Feb 2, 2022
Priority date: Mar 3, 2021
Publication date: Dec 24, 2024
Grant date: Dec 24, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods for correcting raw text generated by deep learning techniques are disclosed. The methods may be performed by systems/computing devices described herein. Raw text previously generated by the deep learning techniques may be obtained. A search query can be generated from a raw text sentence of the raw text. The search query is executed against a knowledge base or a corpus of text to obtain a set of search results, the set of search results comprising a plurality of candidate true sentences that can potentially be utilized to correct one or more entities or phrases of the raw text sentence. A candidate true sentence is selected from the plurality and used to correct the raw text sentence. For example, at least one entity or phrase of the candidate true sentence can be used to replace a corresponding entity or phrase of the raw text sentence.
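
The steps in the abstract (query generation, search, candidate selection, replacement) can be sketched roughly as follows. This is only an illustration under loud assumptions, not the patented method: the keyword query, the Jaccard token overlap used to pick a candidate, and the diff-based replacement are simple stand-ins chosen here, and `make_query`, `select_candidate`, `correct_sentence`, `correct_raw_text`, and `fake_search` are hypothetical names. The search step is a stub in place of a real search engine API.

```python
import difflib
import re
from typing import Callable, List

# Assumed stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "is", "was", "and", "to", "by"}


def tokenize(sentence: str) -> List[str]:
    """Split a sentence into alphanumeric tokens, dropping punctuation."""
    return re.findall(r"[A-Za-z0-9]+", sentence)


def make_query(raw_sentence: str) -> str:
    """Build a keyword query from the content words (assumed strategy)."""
    return " ".join(t for t in tokenize(raw_sentence) if t.lower() not in STOPWORDS)


def overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase token sets (stand-in for the alignment step)."""
    sa = {t.lower() for t in tokenize(a)}
    sb = {t.lower() for t in tokenize(b)}
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def select_candidate(raw: str, candidates: List[str]) -> str:
    """Pick the candidate true sentence closest to the raw sentence."""
    return max(candidates, key=lambda c: overlap(raw, c))


def correct_sentence(raw: str, candidate: str) -> str:
    """Replace token runs of `raw` that differ from `candidate`.

    Limitation of this sketch: only the first textual occurrence of each
    differing run is replaced, and runs split by punctuation are skipped.
    """
    raw_toks, cand_toks = tokenize(raw), tokenize(candidate)
    matcher = difflib.SequenceMatcher(None, raw_toks, cand_toks, autojunk=False)
    corrected = raw
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            old = " ".join(raw_toks[i1:i2])
            new = " ".join(cand_toks[j1:j2])
            corrected = corrected.replace(old, new, 1)
    return corrected


def correct_raw_text(raw_sentences: List[str],
                     search: Callable[[str], List[str]]) -> List[str]:
    """Run query -> search -> select -> replace for each raw sentence."""
    corrected = []
    for sentence in raw_sentences:
        candidates = search(make_query(sentence))
        if not candidates:
            corrected.append(sentence)  # nothing to correct against
            continue
        corrected.append(correct_sentence(sentence, select_candidate(sentence, candidates)))
    return corrected


def fake_search(query: str) -> List[str]:
    """Hypothetical stub standing in for a search engine API call."""
    return ["The Eiffel Tower is located in Paris.",
            "Berlin is the capital of Germany."]


raw = "The Eiffel Tower is located in Berlin."
print(correct_raw_text([raw], fake_search))
# prints ['The Eiffel Tower is located in Paris.']
```

Passing the search function as a parameter keeps the sketch runnable offline; in practice it would wrap whatever search API is available.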

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for correcting raw text generated by deep learning techniques, the computer-implemented method comprising: obtaining, by a computing device, the raw text generated by the deep learning techniques, the raw text comprising one or more raw text sentences; generating, by the computing device, a search query from a raw text sentence of the raw text; transmitting, by the computing device, the search query to an online search engine via an application programming interface, wherein providing the search query causes a set of search results to be obtained based on the online search engine executing the search query against a knowledge base or a corpus of text, the set of search results comprising a plurality of candidate true sentences that can potentially be utilized to correct one or more entities or phrases of the raw text sentence; selecting, by the computing device, a candidate true sentence from the plurality of candidate true sentences based at least in part on executing a generalization process using corresponding pairs of the raw text sentence and each of the plurality of candidate true sentences; generating, by the computing device, a corrected text sentence from the raw text sentence based at least in part on replacing at least one entity or phrase of the raw text sentence with a corresponding entity or phrase of the selected candidate true sentence, wherein replacing the at least one entity or the phrase corrects an untrue portion of the raw text sentence generated by the deep learning techniques; validating, by the computing device, the corrected text sentence based at least in part on comparing a first communicative discourse tree generated from the corrected text sentence with a second communicative discourse tree generated from the raw text sentence; and providing, by the computing device to a user device, the corrected text sentence.

2. The computer-implemented method of claim 1, wherein executing the generalization process comprises generating a syntactic alignment score and a semantic alignment score for each candidate true sentence based on executing a syntactic-semantic alignment algorithm configured to determine syntactic and semantic alignment between the raw text sentence and each of the plurality of candidate true sentences, the syntactic alignment score quantifying a degree to which a syntactic representation of the raw text sentence and a syntactic representation of a respective candidate true statement agree, the semantic alignment score quantifying a degree to which a semantic representation of the raw text sentence and a semantic representation of the respective candidate true statement agree.

3. The computer-implemented method of claim 2, wherein selecting the candidate true sentence from the plurality of candidate true sentences includes selecting the candidate true sentence having: 1) a highest syntactic alignment score of syntactic alignment scores corresponding to the plurality of candidate true sentences, and/or 2) a highest semantic alignment score of the plurality of candidate true sentences.

4. The computer-implemented method of claim 1, further comprising validating the corrected text sentence, wherein validating the corrected text sentence comprises: generating the first communicative discourse tree for the raw text sentence; generating the second communicative discourse tree for the corrected text sentence; and identifying an alignment between the first communicative discourse tree and the second communicative discourse tree based at least in part on the comparing.

5. The computer-implemented method of claim 1, further comprising: generating a second search query from a second raw text sentence of the raw text; executing the second search query against the knowledge base or the corpus of text to obtain a second set of search results, the second set of search results comprising a second plurality of candidate true sentences for correcting an entity or phrase of the second raw text sentence; selecting a particular candidate true sentence from the second plurality of candidate true sentences based at least in part on executing the generalization process using corresponding pairs of the second raw text sentence and each of the second plurality of candidate true sentences; and generating a second corrected text sentence from the second raw text sentence based at least in part on replacing at least one entity or phrase of the second raw text sentence with a corresponding entity or phrase of the particular candidate true sentence selected, wherein the corrected text further comprises the second corrected text sentence.

6. The computer-implemented method of claim 1, wherein generating the search query comprises at least one of: 1) generating a discourse tree of the raw text sentence, the discourse tree comprising a plurality of nodes, each nonterminal node representing a rhetorical relationship between two sentence fragments of the raw text sentence and each terminal node of the nodes of the discourse tree being associated with a sentence fragment of the raw text sentence, or 2) generating a communicative discourse tree of the raw text sentence, the communicative discourse tree comprising the discourse tree, wherein terminal node corresponding to a respective sentence fragment of the raw text sentence is further associated with a verb signature.

7. A computing device, comprising: one or more processors; and one or more memories storing computer-executable instructions for correcting raw text generated by deep learning techniques, that, when executed by the one or more processors, cause the computing device to: obtain the raw text generated by the deep learning techniques, the raw text comprising one or more raw text sentences; generate a search query from a raw text sentence of the raw text; transmit the search query to an online search engine via an application programming interface, wherein providing the search query causes a set of search results to be obtained based on the online search engine executing the search query against a knowledge base or a corpus of text, the set of search results comprising a plurality of candidate true sentences that can potentially be utilized to correct one or more entities or phrases of the raw text sentence; select a candidate true sentence from the plurality of candidate true sentences based at least in part on executing a generalization process using corresponding pairs of the raw text sentence and each of the plurality of candidate true sentences; generate a corrected text sentence from the raw text sentence based at least in part on replacing at least one entity or phrase of the raw text sentence with a corresponding entity or phrase of the selected candidate true sentence, wherein replacing the at least one entity or the phrase corrects an untrue portion of the raw text sentence generated by the deep learning techniques; validate the corrected text sentence based at least in part on comparing a first communicative discourse tree generated from the corrected text sentence with a second communicative discourse tree generated from the raw text sentence; and provide, to a user device, the corrected text sentence.

8. The computing device of claim 7, wherein executing the generalization process causes the computing device to generate a syntactic alignment score and a semantic alignment score for each candidate true sentence based on executing a syntactic-semantic alignment algorithm, the syntactic alignment score quantifying a degree to which a syntactic representation of the raw text sentence and a syntactic representation of a respective candidate true statement agree, the semantic alignment score quantifying a de
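
Claims 4 and 6 describe discourse trees whose nonterminal nodes carry a rhetorical relationship and whose terminal nodes carry a sentence fragment, with a verb signature added in the communicative variant. The sketch below shows one way such a tree could be represented and two trees compared. It is an assumption-laden illustration: the class names, the tuple form of the verb signature, and the `aligned` check (same relations and same verbs in the same positions, ignoring arguments) are hypothetical choices, the trees are built by hand, and no discourse parser is included.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass
class Terminal:
    """Leaf node: a sentence fragment, optionally with a verb signature.

    The verb signature is modelled here as (verb, arg1, arg2, ...);
    this tuple layout is an assumption made for the sketch.
    """
    fragment: str
    verb_signature: Optional[Tuple[str, ...]] = None


@dataclass
class Nonterminal:
    """Inner node: a rhetorical relationship between two subtrees."""
    relation: str
    left: "Node"
    right: "Node"


Node = Union[Terminal, Nonterminal]


def skeleton(node: Node) -> tuple:
    """Reduce a tree to its relations and verbs, dropping fragment text
    and verb arguments so that a replaced entity does not affect it."""
    if isinstance(node, Terminal):
        verb = node.verb_signature[0] if node.verb_signature else None
        return ("leaf", verb)
    return (node.relation, skeleton(node.left), skeleton(node.right))


def aligned(tree_a: Node, tree_b: Node) -> bool:
    """Assumed validation check: the two skeletons are identical."""
    return skeleton(tree_a) == skeleton(tree_b)


# Hand-built trees for a raw sentence and its corrected version.
raw_tree = Nonterminal(
    "Elaboration",
    Terminal("The Eiffel Tower is located in Berlin,",
             ("locate", "Eiffel Tower", "Berlin")),
    Terminal("and it attracts millions of visitors.",
             ("attract", "it", "visitors")),
)
corrected_tree = Nonterminal(
    "Elaboration",
    Terminal("The Eiffel Tower is located in Paris,",
             ("locate", "Eiffel Tower", "Paris")),
    Terminal("and it attracts millions of visitors.",
             ("attract", "it", "visitors")),
)
# A tree whose rhetorical relation differs from the raw tree.
altered_tree = Nonterminal("Contrast", corrected_tree.left, corrected_tree.right)

print(aligned(raw_tree, corrected_tree))  # prints True
print(aligned(raw_tree, altered_tree))    # prints False
```

Comparing only relations and verbs reflects the idea that a correction should swap an entity or phrase without changing how the sentence is organized; a stricter or looser comparison is equally plausible.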

Assignees

Oracle Int Corp

Inventors

Classifications

  • Lexical tools · CPC title

  • Phrasal analysis, e.g. finite state techniques or chunking · CPC title

  • Grammatical analysis; Style critique · CPC title

  • Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title

  • G06F40/30 · Primary

    Semantic analysis · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12175187B2 cover?
Methods for correcting raw text generated by deep learning techniques are disclosed. The methods may be performed by systems/computing devices described herein. Raw text previously generated by the deep learning techniques may be obtained. A search query can be generated from a raw text sentence of the raw text. The search query is executed against a knowledge base or a corpus of text to obtain …
Who is the assignee on this patent?
Oracle Int Corp
What technology area does this patent fall under?
Primary CPC classification G06F40/30. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 24, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).