Systems and methods for structured text translation with tag alignment
US-2021397799-A1 · Dec 23, 2021 · US
US12406150B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12406150-B2 |
| Application number | US-202117534899-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 24, 2021 |
| Priority date | Nov 25, 2020 |
| Publication date | Sep 2, 2025 |
| Grant date | Sep 2, 2025 |
Official abstract text for this publication.
Machine learning (ML) systems and methods for fact extraction and claim verification are provided. The system receives a claim and retrieves a document from a dataset. The document has a first relatedness score higher than a first threshold, which indicates that ML models of the system determine that the document is most likely to be relevant to the claim. The dataset includes supporting documents and claims including a first group of claims supported by facts from more than two supporting documents and a second group of claims not supported by the supporting documents. The system selects a set of sentences from the document. The set of sentences have second relatedness scores higher than a second threshold, which indicate that the ML models determine that the set of sentences are most likely to be relevant to the claim. The system determines whether the claim includes facts from the set of sentences.
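The three stages in the abstract (document retrieval, sentence selection, claim verification) can be illustrated with a minimal sketch. It is not the patented implementation: the token-overlap `relatedness` score stands in for the machine learning models, and the function names, threshold values, and the coverage-based verification rule are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Optional, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercased word tokens; a crude stand-in for a learned encoder."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def relatedness(claim: str, text: str) -> float:
    """Fraction of claim tokens found in `text` (assumed scoring rule)."""
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & tokens(text)) / len(claim_tokens)


def split_sentences(body: str) -> List[str]:
    """Naive sentence splitter on terminal punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", body) if s.strip()]


def verify(
    claim: str,
    documents: Dict[str, str],
    doc_threshold: float = 0.3,   # "first threshold" (value is an assumption)
    sent_threshold: float = 0.3,  # "second threshold" (value is an assumption)
) -> Tuple[Optional[str], List[str], bool]:
    # Stage 1: retrieve the document with the highest relatedness score,
    # and accept it only if the score exceeds the first threshold.
    scored = [(relatedness(claim, body), title) for title, body in documents.items()]
    if not scored:
        return None, [], False
    best_score, best_title = max(scored)
    if best_score <= doc_threshold:
        return None, [], False

    # Stage 2: keep sentences whose score exceeds the second threshold.
    evidence = [
        s for s in split_sentences(documents[best_title])
        if relatedness(claim, s) > sent_threshold
    ]

    # Stage 3: placeholder verification. The claim counts as supported only
    # if every claim token appears somewhere in the selected sentences.
    covered: Set[str] = set()
    for sentence in evidence:
        covered |= tokens(sentence)
    supported = bool(evidence) and tokens(claim) <= covered
    return best_title, evidence, supported
```

In a real system each stage would be backed by a trained model (claim 10 mentions pre-trained language representation models and natural language inference models); the overlap score here only makes the control flow and the two thresholds concrete.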
Opening claim text (preview).
What is desired to be protected by Letters Patent is set forth in the following claims:
1. A machine learning system for fact extraction and claim verification, comprising: a memory; and a processor in communication with the memory, the processor: receiving a claim comprising one or more sentences; retrieving, based at least in part on one or more machine learning models, a document from a dataset, the document having a first relatedness score higher than a first threshold, wherein the first relatedness score indicates that the one or more machine learning models determines that the document is most likely to be relevant to the claim, wherein the dataset comprises a plurality of supporting documents and a plurality of claims, the plurality of claims comprising a first group of claims supported by facts from more than two supporting documents from the plurality of supporting documents and a second group of claims not supported by the plurality of supporting documents; selecting, based at least in part on the one or more machine learning models, a set of sentences from the document, the set of sentences having second relatedness scores higher than a second threshold, wherein the second relatedness scores indicate that the one or more machine learning models determine that the set of sentences are most likely to be relevant to the claim; and determining, based at least in part on the one or more machine learning models, whether the claim includes one or more facts from the set of sentences.
2. The system of claim 1, wherein the first group of claims comprise an n-hop claim created at least by a valid (n-1)-hop claim supported by one or more facts from (n-1) supporting documents of the plurality of supporting documents, wherein n is an integer number equal to or greater than 2, wherein one or more entities of the valid (n-1)-hop claim are substituted by information from an additional supporting document of the plurality of supporting documents, the information describing the one or more entities.
3. The system of claim 2, wherein the additional supporting document comprises a hyperlink of the one or more entities in a text body of the additional supporting document, and a title of the additional supporting document is mentioned in a text body of a supporting document of the valid (n-1)-hop claim.
4. The system of claim 3, wherein the one or more entities comprise a title of the supporting document, or the one more entities are part of a text body of a supporting document of the valid (n-1)-hop claim.
5. The system of claim 1, wherein the second group of claims comprise claims having information that is not in the first group of claims, or claims having less information than the first group of claims.
6. The system of claim 1, wherein the processor automatically substitutes one or more words of at least one claim of the first group of claims with one or more new words predicted by a machine learning model to form at least one claim of the second group of claims.
7. The system of claim 1, wherein the processor automatically substitutes one or more entities of at least one claim of the first group of claims with one or more new entities that are not titles of any supporting documents of the at least one claim to form at least one claim of the second group of claims.
8. The system of claim 1, wherein at least one claim of the second group of claims is created by removing or adding one or more negation words, or substituting a phrase with its antonyms in at least one claim of the first group of claims.
9. The system of claim 1, wherein the first group of claims are labeled as supported claims, and the second group of claims are labeled as not-supported claims.
10. The system of claim 1, wherein the one or more machine learning models comprise one or more pre-trained language representations models and one or more natural language inference models.
11. The system of claim 1, wherein the processor retrieves, based at least in part on the one or more machine learning models, a plurality of documents from the plurality of supporting documents in response to a query associated with the claim, wherein the document is retrieved from the plurality of documents.
12. The system of claim 1, wherein the processor determines an accuracy of the one or more machine learning models by comparing the determinations of the one or more machine learning models with ground truth provided by the dataset.
13. The system of claim 1, wherein the dataset provides reasoning graphs of diverse shapes showing relationships between the first group of claims and the plurality of supporting documents.
14. A machine learning method for fact extraction and claim verification, comprising: receiving a claim comprising one or more sentences; retrieving, based at least in part on one or more machine learning models, a document from a dataset, the document having a first relatedness score higher than a first threshold, wherein the first relatedness score indicates that the one or more machine learning models determines that the document is most likely to be relevant to the claim, wherein the dataset comprises a plurality of supporting documents and a plurality of claims, the plurality of claims comprising a first group of claims supported by facts from more than two supporting documents from the plurality of supporting documents and a second group of claims not supported by the plurality of supporting documents; selecting, based at least in part on the one or more machine learning models, a set of sentences from the document, the set of sentences having second relatedness scores higher than a second threshold, wherein the second relatedness scores indicate that the one or more machine learning models determine that the set of sentences are most likely to be relevant to the claim; and determining, based at least in part on the one or more machine learning models, whether the claim includes one or more facts from the set of sentences.
15. The method of claim 14, wherein the first group of claims comprise an n-hop claim created at least by a valid (n-1)-hop claim supported by one or more facts from (n-1) supporting documents of the plurality of supporting documents, wherein n is an integer number equal to or greater than 2, wherein one or more entities of the valid (n-1)-hop claim are substituted by information from an additional supporting document of the plurality of supporting documents, the information describing the one or more entities.
16. The method of claim 15, wherein the additional supporting document comprises a hyperlink of the one or more entities in a text body of the additional supporting document, and a title of the additional supporting document is mentioned in a text body of a supporting document of the valid (n-1)-hop claim.
17. The method of claim 16, wherein the one or more entities comprise a title of the supporting document, or the one more entities are part of a text body of a supporting document of the valid (n-1)-hop claim.
18. The method of claim 14, wherein the second group of claims comprise claims having information that is not in the first group of claims, or claims having less information than the first group of claims.
19. The method of claim 14, further comprising automatically substituting one or more words of at least one claim of the first group of claims with one or more new words predicted by a machine learning model to form at least one claim of the second group of claims.
20. The method of claim 14, further compris
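Two of the dependent claims lend themselves to a small illustration: claim 8 (forming a not-supported claim by removing or adding negation words) and claim 12 (measuring accuracy against the dataset's ground truth). The sketch below is a simplified assumption, not the method disclosed in the patent: the negation word list, the rule of inserting "not" after the first copular verb, and the label strings are all hypothetical.

```python
from typing import List

# Hypothetical word lists chosen for illustration only.
NEGATION_WORDS = {"not", "never", "no"}
COPULAS = {"is", "was", "are", "were"}


def toggle_negation(claim: str) -> str:
    """Remove negation words if present; otherwise add 'not' after the
    first copular verb. Returns the claim unchanged if neither applies."""
    words = claim.split()
    kept = [w for w in words if w.lower() not in NEGATION_WORDS]
    if len(kept) < len(words):
        return " ".join(kept)
    for i, word in enumerate(words):
        if word.lower() in COPULAS:
            return " ".join(words[: i + 1] + ["not"] + words[i + 1:])
    return claim


def accuracy(predictions: List[str], ground_truth: List[str]) -> float:
    """Share of model determinations that match the dataset labels."""
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have equal length")
    if not ground_truth:
        return 0.0
    matches = sum(p == g for p, g in zip(predictions, ground_truth))
    return matches / len(ground_truth)
```

A claim produced this way would normally still need a check that it is in fact not supported by the documents, since toggling a negation in an already false sentence can make it true.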
Supervised learning · CPC title
Clustering; Classification · CPC title
Validation · CPC title
Named entity recognition · CPC title
Inference or reasoning models · CPC title