Framework for Analyzing Table Data by Question Answering Systems

US2020104414A1 · US · A1

Patent metadata
Publication number: US-2020104414-A1
Application number: US-201816146698-A
Country: US
Kind code: A1
Filing date: Sep 28, 2018
Priority date: Sep 28, 2018
Publication date: Apr 2, 2020
Grant date: (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A question answering (QA) system comprising memory for storing instructions, and a processor configured to execute the instructions to ingest source documents that include structured data and unstructured data to create a knowledge base, wherein the unstructured data includes table data; create table annotations to represent the table data; store the ingested structured data, unstructured data, and the table annotations in the knowledge base; and determine answers to questions using the knowledge base.
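
To illustrate the table annotations the abstract refers to, here is a minimal Python sketch. It assumes the column-wise layout recited in claim 8 (table identifier, column identifier, annotation type, canonical header name, cell content) and the unit detection mentioned in claim 7. The names `TableAnnotation` and `annotate_table`, the "number"/"text" type labels, and the "Header (unit)" header convention are hypothetical choices made for this example; the page does not say how the patent implements any of them.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TableAnnotation:
    """One annotation per table column (hypothetical layout)."""
    table_id: str
    column_id: int
    annotation_type: str      # assumed labels: "number" or "text"
    canonical_name: str       # normalized header text
    unit: Optional[str]       # unit of measurement, if the header names one
    cells: List[str]          # content of every cell in the column


def annotate_table(table_id: str, rows: List[List[str]]) -> List[TableAnnotation]:
    """Build column annotations; rows[0] is assumed to be the header row."""
    header, *body = rows
    annotations = []
    for col, raw_header in enumerate(header):
        # Split a header such as "Height (cm)" into a name and an optional unit.
        match = re.fullmatch(r"\s*(.*?)\s*(?:\(([^)]*)\))?\s*", raw_header)
        name, unit = match.group(1), match.group(2)
        cells = [row[col] for row in body]
        # Treat the column as numeric only if every cell parses as a number.
        is_number = bool(cells) and all(
            re.fullmatch(r"-?\d+(\.\d+)?", cell) for cell in cells
        )
        annotations.append(
            TableAnnotation(
                table_id=table_id,
                column_id=col,
                annotation_type="number" if is_number else "text",
                canonical_name=name.lower().replace(" ", "_"),
                unit=unit,
                cells=cells,
            )
        )
    return annotations


example_rows = [
    ["City", "Population (millions)"],
    ["Paris", "2.1"],
    ["Rome", "2.8"],
]
example_annotations = annotate_table("t1", example_rows)
```

Here all cells of a column are linked in a single annotation, which corresponds to the variant in claim 9; claim 10 describes the alternative of one annotation per cell.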

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for utilizing table data in a question answering (QA) system, the computer-implemented method comprising: ingesting, by the QA system, source documents that include structured data and unstructured data to create a knowledge base, wherein the unstructured data includes table data; creating, by the QA system, table annotations to represent the table data; storing, by the QA system, the ingested structured data, unstructured data, and the table annotations in the knowledge base; and determining, by the QA system, answers to questions using the knowledge base.

2. The computer-implemented method of claim 1, wherein determining, by the QA system, answers to questions using the knowledge base comprises performing a looping cells position mapping and folding method that loops through each cell data of a first table annotation until a keynote words search match is found, recording a cell position number of a cell matching the keynote words search, and retrieving data in a corresponding cell position number from a second table annotation.

3. The computer-implemented method of claim 1, wherein an answer to a question is not found directly in the knowledge base.

4. The computer-implemented method of claim 1, further comprising: extracting the table data found in the source documents; parsing a table structure of a table that is part of the table data found in the source documents to identify table headers and content of table cells of the table; and determining annotation types of the table headers.

5. The computer-implemented method of claim 2, wherein the looping cells position mapping and folding method further comprises retrieving data in the corresponding cell position number from a third table annotation.

6. The computer-implemented method of claim 3, wherein the answer is determined, by the QA system, by performing a curve fitting with graph axes intersection and folding method that plots one of a data cell position or a data cell content value to determine a function that is used to determine the answer.

7. The computer-implemented method of claim 4, further comprising identifying units of measurement associated with the content of the table cells.

8. The computer-implemented method of claim 4, wherein the table annotations links a table identifier of the table with a table column identifier associated with a table column of the table, an annotation type of a table header of the table column, a canonical name of the table header of the table column, and the content of the table cells of the table column.

9. The computer-implemented method of claim 4, wherein the content of all the table cells of a table column are linked in a single table annotation.

10. The computer-implemented method of claim 4, wherein the content of the table cells of a table column are each linked in a separate table annotation.

11. The computer-implemented method of claim 4, wherein the table annotations links a table identifier of the table with a table row identifier associated with a table row of the table, an annotation type of a table header of the table row, a canonical name of the table header of the table row, and the content of the table cells of the table row.

12. The computer-implemented method of claim 4, wherein the content of all the table cells of a table row are linked in a single table annotation.

13. The computer-implemented method of claim 4, wherein the content of the table cells of a table row are each linked in a separate table annotation.

14. A question answering (QA) system comprising memory for storing instructions, and a processor configured to execute the instructions to: ingest source documents that include structured data and unstructured data to create a knowledge base, wherein the unstructured data includes table data; create table annotations to represent the table data; store the ingested structured data, unstructured data, and the table annotations in the knowledge base; and determine answers to questions using the knowledge base.

15. The QA system of claim 14, wherein determining answers to questions using the knowledge base comprises performing a looping cells position mapping and folding method that loops through each cell data of a first table annotation until a keynote words search match is found, recording a cell position number of a cell matching the keynote words search, and retrieving data in a corresponding cell position number from at least one additional table annotation.

16. The QA system of claim 14, wherein an answer to a question is not found directly in the knowledge base, and wherein the answer is determined by performing a curve fitting with graph axes intersection and folding method that plots one of a data cell position or a data cell content value to determine a function that is used to determine the answer.

17. The QA system of claim 14, wherein the creating table annotations to represent the table data comprises: extracting the table data found in the source documents; parsing a table structure of a table that is part of the table data found in the source documents to identify table headers and content of table cells of the table; determining annotation types of the table headers; and identifying units of measurement associated with the content of the table cells.

18. The QA system of claim 14, wherein the table annotations links a table identifier of the table with a table column identifier associated with a table column of the table, an annotation type of a table header of the table column, a canonical name of the table header of the table column, and the content of the table cells of the table column.

19. A computer program product for utilizing table data in a question answering (QA) system, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to: ingest source documents that include structured data and unstructured data to create a knowledge base, wherein the unstructured data includes table data; extract the table data found in the source documents; parse a table structure of a table that is part of the table data found in the source documents to identify table headers and content of table cells of the table; determine annotation types of the table headers; create table annotations to represent the table data by linking a table identifier of the table with a table column identifier associated with a table column of the table, an annotation type of a table header of the table column, a canonical name of the table header of the table column, and the content of the table cells of the table column; store the ingested structured data, unstructured data, and the table annotations in the knowledge base; and determine answers to questions using the knowledge base.

20. The computer program product of claim 19, wherein the program instructions for determining an answer comprises: a looping cells position mapping and folding method that loops through each cell data of a first table annotation until a keynote words search match is found, recording a cell position number of a cell matching the keynote words search, and retrieving data in a corresponding cell position number from at least one additional table annotation; and a curve fitting with graph axes intersection and folding method that plots one of a
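
The two answer-finding methods named in the claims can be sketched in Python as follows. This is only one reading of the claim language under stated assumptions: each table annotation is modeled as a plain list of cell strings, the keyword match is a case-insensitive substring test, and the "curve fitting" step is stood in for by an ordinary least-squares straight line. The claims do not define "folding" or "graph axes intersection" on this page, so neither is modeled here. The function names `lookup_by_position`, `fit_line`, and `predict` are hypothetical.

```python
from typing import List, Optional, Tuple


def lookup_by_position(
    first_cells: List[str], second_cells: List[str], keyword: str
) -> Optional[str]:
    """Loop over the first annotation's cells until the keyword matches,
    record that cell position, and return the cell at the same position
    in the second annotation (None if nothing matches)."""
    for position, cell in enumerate(first_cells):
        if keyword.lower() in cell.lower():
            return second_cells[position]
    return None


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Least-squares straight line y = slope * x + intercept.
    A linear fit is an assumption; the claims do not name the function family."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(xs: List[float], ys: List[float], x_new: float) -> float:
    """Estimate a value that is not stored in any table cell."""
    slope, intercept = fit_line(xs, ys)
    return slope * x_new + intercept


# Made-up example data: two column annotations from the same table.
year_cells = ["2016", "2017", "2018"]
sales_cells = ["10", "12", "14"]

# Answer found directly: the sales value in the row where the year is 2017.
direct_answer = lookup_by_position(year_cells, sales_cells, "2017")

# Answer not found directly: estimate sales for 2019 from the fitted line.
estimated_answer = predict(
    [float(y) for y in year_cells],
    [float(s) for s in sales_cells],
    2019.0,
)
```

The lookup covers questions whose answer sits in a table cell (claims 2, 15); the fit covers the case in claims 3, 6, and 16 where the answer is not stored and must be derived from the cell values.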

Assignees

Inventors

Classifications

  • Translation of natural language queries to structured queries

  • of tables; using ruled lines

  • G06F40/169 (Primary): Annotation, e.g. comment data or footnotes

  • Computing arrangements using knowledge-based models

  • G06F16/316 (Primary): Indexing structures

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020104414A1 cover?

A question answering (QA) system comprising memory for storing instructions, and a processor configured to execute the instructions to ingest source documents that include structured data and unstructured data to create a knowledge base, wherein the unstructured data includes table data; create table annotations to represent the table data; store the ingested structured data, unstructured data,…

Who is the assignee on this patent?

IBM

What technology area does this patent fall under?

Primary CPC classification G06F40/169. Mapped technology areas include Physics.

When was this patent published?

Published Apr 2, 2020 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

This page lists 8 related publications (citations in our corpus or others sharing the same primary CPC).