Framework for analyzing table data by question answering systems

US11762890B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11762890-B2
Application number: US-201816146698-A
Country: US
Kind code: B2
Filing date: Sep 28, 2018
Priority date: Sep 28, 2018
Publication date: Sep 19, 2023
Grant date: Sep 19, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A question answering (QA) system comprising memory for storing instructions, and a processor configured to execute the instructions to ingest source documents that include structured data and unstructured data to create a knowledge base, wherein the unstructured data includes table data; create table annotations to represent the table data; store the ingested structured data, unstructured data, and the table annotations in the knowledge base; and determine answers to questions using the knowledge base.
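As a rough illustration of what a table annotation could look like, the sketch below builds one record per column that links the table identifier to that column's cells in order, following the per-column linkage described in the first claim. The `ColumnAnnotation` class, the `annotate_table` helper, the identifier scheme, and the sample data are all hypothetical; the patent text shown here does not specify a concrete data layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ColumnAnnotation:
    """Hypothetical record: links a table identifier to one column's cells, in order."""
    table_id: str
    column_id: str
    header: str
    cells: List[str] = field(default_factory=list)


def annotate_table(table_id, headers, rows):
    """Build one annotation per column, keeping cells in top-to-bottom order."""
    annotations = []
    for col_index, header in enumerate(headers):
        cells = [str(row[col_index]) for row in rows]
        column_id = f"{table_id}:col{col_index}"  # assumed naming scheme
        annotations.append(ColumnAnnotation(table_id, column_id, header, cells))
    return annotations


# Illustrative data only, not taken from the patent.
annotations = annotate_table(
    "T1",
    ["City", "Population"],
    [["Springfield", "30000"], ["Shelbyville", "25000"]],
)

# A plain dict stands in for the knowledge base in this sketch.
knowledge_base = {a.column_id: a for a in annotations}
```

Keeping every column's cells in the same row order is what lets a position found in one annotation be reused to read another.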

First claim

Opening claim text (preview).

What is claimed is:
1. A method implemented by a question-answering (QA) system, the method comprising: ingesting source documents that include structured data and unstructured data to create a knowledge base, wherein the unstructured data includes table data; creating a plurality of table annotations to represent the table data, wherein the plurality of table annotations comprises a table annotation for each column in the table data that links content of all cells in the column together in order from a table identifier to a last cell in the column; storing the ingested structured data, unstructured data, and the table annotations in the knowledge base; and determining answers to questions using the knowledge base, wherein determining an answer to a question comprises performing a looping cells position mapping and folding method that loops through each cell data of a first table annotation in the plurality of table annotations until a keynote word search match is found, recording a cell position number of a cell in the first table annotation matching the keynote word search, and retrieving data in a corresponding cell position number from a second table annotation in the plurality of table annotations.
2. The method of claim 1, wherein the answer to the question is not found directly in the knowledge base.
3. The method of claim 1, further comprising: extracting the table data found in the source documents; parsing a table structure of a table that is part of the table data found in the source documents to identify table headers and content of table cells of the table; and determining annotation types of the table headers.
4. The method of claim 1, wherein the looping cells position mapping and folding method further comprises retrieving data in the corresponding cell position number from a third table annotation.
5. The method of claim 2, wherein determining the answer further comprises performing a curve fitting with graph axes intersection and folding method that plots one of a data cell position or a data cell content value to determine a function that is used to determine the answer.
6. The method of claim 3, further comprising identifying units of measurement associated with the content of the table cells.
7. The method of claim 3, wherein the table annotations link the table identifier of the table with a table column identifier associated with a table column of the table, an annotation type of a table header of the table column, a canonical name of the table header of the table column, and the content of the table cells of the table column.
8. The method of claim 3, wherein the table annotations link the table identifier of the table with a table row identifier associated with a table row of the table, an annotation type of a table header of the table row, a canonical name of the table header of the table row, and the content of the table cells of the table row.
9. The method of claim 3, wherein the content of all the table cells of a table row are linked in a single table annotation.
10. The method of claim 3, wherein the content of the table cells of a table row are each linked in a separate table annotation.
11. A question answering (QA) system comprising memory for storing instructions, and a processor configured to execute the instructions to: ingest source documents that include structured data and unstructured data to create a knowledge base, wherein the unstructured data includes table data; create a plurality of table annotations to represent the table data, wherein the plurality of table annotations comprises a table annotation for each column in the table data that links content of all cells in the column together in order from a table identifier to a last cell in the column; store the ingested structured data, unstructured data, and the table annotations in the knowledge base; and determine answers to questions using the knowledge base, wherein determining an answer to a question comprises performing a looping cells position mapping and folding method that loops through each cell data of a first table annotation in the plurality of table annotations until a keynote word search match is found, recording a cell position number of a cell in the first table annotation matching the keynote word search, and retrieving data in a corresponding cell position number from a second table annotation in the plurality of table annotations.
12. The QA system of claim 11, wherein when an answer to a question is not found directly in the knowledge base, the processor is configured to further execute the instructions to determine the answer by performing a curve fitting with graph axes intersection and folding method that plots one of a data cell position or a data cell content value to determine a function that is used to determine the answer.
13. The QA system of claim 11, wherein the creating table annotations to represent the table data comprises: extracting the table data found in the source documents; parsing a table structure of a table that is part of the table data found in the source documents to identify table headers and content of table cells of the table; determining annotation types of the table headers; and identifying units of measurement associated with the content of the table cells.
14. The QA system of claim 11, wherein the table annotations link the table identifier of the table with a table column identifier associated with a table column of the table, an annotation type of a table header of the table column, a canonical name of the table header of the table column, and the content of the cells of the table column.
15. A computer program product for utilizing table data in a question answering (QA) system, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to: ingest source documents that include structured data and unstructured data to create a knowledge base, wherein the unstructured data includes table data; extract the table data found in the source documents; parse a table structure of a table that is part of the table data found in the source documents to identify table headers and content of table cells of the table; determine annotation types of the table headers; create a plurality of table annotations to represent the table data, wherein the plurality of table annotations comprises a table annotation for each column in the table data that links content of all cells in the column together in order from a table identifier to a last cell in the column; store the ingested structured data, unstructured data, and the table annotations in the knowledge base; and determine answers to questions using the knowledge base, wherein the program instructions for determining an answer comprises performing a looping cells position mapping and folding method that loops through each cell data of a first table annotation in the plurality of table annotations until a keynote word search match is found, recording a cell position number of a cell in the first table annotation matching the keynote word search, and retrieving data in a corresponding cell position number from at least one additional table annotation in the plurality of table annotations.
16. The computer program product of claim 15, wherein the program instructions for determining an answer further comprises: a curve fitting with graph axes intersection and folding method that plots one of a data cell position or a data cell content value to deter
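The two answer-determination steps named in the claims can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the case-insensitive substring match used for the keyword search, the choice of a straight-line least-squares fit for the curve-fitting step, and the sample columns are all hypothetical, since the claim text does not fix any of them.

```python
def find_cell_position(cells, keyword):
    """Loop through the cells of one column until a keyword match is found.

    Returns the matching cell position, or None if no cell matches.
    Matching rule (case-insensitive substring) is an assumption.
    """
    for position, cell in enumerate(cells):
        if keyword.lower() in cell.lower():
            return position
    return None


def lookup(first_column, other_columns, keyword):
    """Record the matching position in the first column, then read the
    same position from each of the other columns."""
    position = find_cell_position(first_column, keyword)
    if position is None:
        return None
    return [column[position] for column in other_columns]


def fit_line(xs, ys):
    """Ordinary least-squares straight line; a stand-in for the
    unspecified curve-fitting function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative columns only, not taken from the patent.
years = ["2016", "2017", "2018"]
revenue = ["10", "12", "14"]

direct_answer = lookup(years, [revenue], "2017")

# When the requested value is not stored, estimate it from a fitted function.
slope, intercept = fit_line([float(y) for y in years], [float(r) for r in revenue])
estimate_2019 = slope * 2019 + intercept
```

The lookup relies on every column annotation listing its cells in the same row order, so a position recorded in one column addresses the same row in any other.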

Assignees

Inventors

Classifications

  • G06F40/169 · Primary

    Annotation, e.g. comment data or footnotes · CPC title

  • G06F16/316 · Primary

    Indexing structures · CPC title

  • of tables; using ruled lines · CPC title

  • Translation of natural language queries to structured queries · CPC title

  • Inference or reasoning models · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11762890B2 cover?
A question answering (QA) system comprising memory for storing instructions, and a processor configured to execute the instructions to ingest source documents that include structured data and unstructured data to create a knowledge base, wherein the unstructured data includes table data; create table annotations to represent the table data; store the ingested structured data, unstructured data,…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F40/169. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 19, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).