Augmented text search with syntactic information

US9720905B2 · US · B2

Patent metadata
Publication number: US-9720905-B2
Application number: US-201514746564-A
Country: US
Kind code: B2
Filing date: Jun 22, 2015
Priority date: Jun 22, 2015
Publication date: Aug 1, 2017
Grant date: Aug 1, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An approach is provided in which a knowledge manager generates syntactic annotation tokens that correspond to syntactic relationships between terms included in a source document. The knowledge manager creates a knowledge structure that stores the syntactic annotation tokens in parallel fields and stores the source document terms in original text fields, which align to the parallel fields. In turn, the knowledge manager utilizes the knowledge structure to generate answers to questions based upon the syntactic annotation tokens.
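
The alignment between original text fields and parallel fields can be pictured with a small sketch. This is only an illustration under stated assumptions, not the patented implementation: the token format `label(head,dependent)`, the class name `KnowledgeStructure`, the helper `make_token`, and the use of a shared list index to model alignment are all hypothetical, and the syntactic relationships are written by hand rather than produced by a parser.

```python
from dataclasses import dataclass, field


def make_token(label, head, dependent):
    # Hypothetical token format: relation label plus the two related terms.
    return f"{label}({head},{dependent})"


@dataclass
class KnowledgeStructure:
    # original_text[i] and parallel[i] describe the same passage. The shared
    # index i is how this sketch models "alignment" (an assumption).
    original_text: list = field(default_factory=list)
    parallel: list = field(default_factory=list)

    def add_passage(self, terms, relations):
        """Store a passage's terms and the tokens derived from its relations.

        relations holds hand-written (label, head, dependent) triples of the
        kind a syntactic parser might produce.
        """
        tokens = {make_token(label, head, dep) for label, head, dep in relations}
        self.original_text.append(list(terms))
        self.parallel.append(tokens)

    def match(self, question_tokens):
        """Return original text fields whose parallel field shares a token."""
        wanted = set(question_tokens)
        return [
            self.original_text[i]
            for i, tokens in enumerate(self.parallel)
            if tokens & wanted
        ]


ks = KnowledgeStructure()
ks.add_passage(["dogs", "chase", "cats"],
               [("subj", "chase", "dogs"), ("obj", "chase", "cats")])
ks.add_passage(["cats", "chase", "mice"],
               [("subj", "chase", "cats"), ("obj", "chase", "mice")])

# Question-based token for a question such as "What do cats chase?"
question_tokens = [make_token("subj", "chase", "cats")]
candidates = ks.match(question_tokens)
```

In this sketch a plain keyword search for "cats" and "chase" would hit both passages, whereas the token match selects only the passage in which "cats" is the subject of "chase". Candidate answers would then be drawn from the terms of the returned original text field.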

First claim

Opening claim text (preview).

The invention claimed is:

1. An information handling system comprising: one or more processors; a memory coupled to at least one of the processors; and a set of computer program instructions stored in the memory and executed by at least one of the processors in order to perform actions of: generating a plurality of syntactic annotation tokens based upon a plurality of syntactic relationships between a plurality of terms included in a source document; creating a knowledge structure that includes the plurality of terms in a plurality of original text fields and includes the plurality of syntactic annotation tokens in a plurality of parallel fields, wherein each of the plurality of syntactic annotation tokens align to at least one of the plurality of original text fields; and generating one or more question-based syntactic annotation tokens corresponding to a question; and utilizing the knowledge structure in a question answer system to generate one or more answers to the question, wherein the question answer system matches at least one of the one or more question-based syntactic annotation tokens to at least one of the plurality of syntactic annotation tokens during the generation of at least one of the one or more answers.

2. The information handling system of claim 1 wherein at least one of the one or more processors perform additional actions comprising: analyzing the question and identifying one or more question-based syntactic relationships between question terms in the question; generating the one or more question-based syntactic annotation tokens based upon the question-based syntactic relationships; including the question-based syntactic annotation tokens in a query; and using the query in the querying of the knowledge structure.

3. The information handling system of claim 2 wherein at least one of the one or more processors perform additional actions comprising: identifying a selected one of the plurality of original text fields that align to the selected parallel field; and generating one or more candidate answers utilizing one or more of the plurality of terms included in the selected original text field.

4. The information handling system of claim 1 wherein, during the generation of the knowledge structure, at least one of the one or more processors perform additional actions comprising: matching a selected one of the plurality of terms to an abstract concept entry, wherein the abstract concept entry includes the selected term and a different term; for each of the plurality of syntactic annotation tokens that include the selected term, creating an abstract syntactic annotation token that replaces the selected term with the different term, resulting in one or more abstract syntactic annotation tokens; and including one or more abstract syntactic annotation tokens in the knowledge structure.

5. The information handling system of claim 1 wherein, during the generation of the knowledge structure, at least one of the one or more processors perform additional actions comprising: selecting at least one term from the plurality of terms; for each of the plurality of syntactic annotation tokens that include the selected term, creating a variable syntactic annotation token that replaces the selected term with a variable, resulting in a plurality of variable syntactic annotation tokens; and including the plurality of variable syntactic annotation tokens in the knowledge structure.

6. The information handling system of claim 1 wherein, during the generation of the knowledge structure, at least one of the one or more processors perform additional actions comprising: for one or more of the plurality of syntactic annotation tokens, creating a relaxed syntactic annotation token that replaces a syntactic relationship identifier with a variable, resulting in a plurality of relaxed syntactic annotation tokens; and including the plurality of relaxed syntactic annotation tokens in the knowledge structure.

7. The information handling system of claim 1 wherein, prior to the generating of the plurality of syntactic annotation tokens, at least one of the one or more processors perform additional actions comprising: parsing the source document using an English Slot Grammar (ESG) parser, resulting in a plurality of syntactic items that each identify a syntactic relationship between a first one of the plurality of terms and a second one of the plurality of terms, wherein the plurality of syntactic annotation tokens are generated from the plurality of syntactic items.

8. A computer program product stored in a computer readable storage medium, comprising computer program code that, when executed by an information handling system, causes the information handling system to perform actions comprising: generating a plurality of syntactic annotation tokens based upon a plurality of syntactic relationships between a plurality of terms included in a source document; creating a knowledge structure that includes the plurality of terms in a plurality of original text fields and includes the plurality of syntactic annotation tokens in a plurality of parallel fields, wherein each of the plurality of syntactic annotation tokens align to at least one of the plurality of original text fields; and generating one or more question-based syntactic annotation tokens corresponding to a question; and utilizing the knowledge structure in a question answer system to generate one or more answers to the question, wherein the question answer system matches at least one of the one or more question-based syntactic annotation tokens to at least one of the plurality of syntactic annotation tokens during the generation of at least one of the one or more answers.

9. The computer program product of claim 8 wherein the information handling system performs additional actions comprising: analyzing the question and identifying one or more question-based syntactic relationships between question terms in the question; generating the one or more question-based syntactic annotation tokens based upon the question-based syntactic relationships; including the question-based syntactic annotation tokens in a query; and using the query in the querying of the knowledge structure.

10. The computer program product of claim 9 wherein the information handling system performs additional actions comprising: identifying a selected one of the plurality of original text fields that align to the selected parallel field; and generating one or more candidate answers utilizing one or more of the plurality of terms included in the selected original text field.

11. The computer program product of claim 8 wherein, during the generation of the knowledge structure, the information handling system performs additional actions comprising: matching a selected one of the plurality of terms to an abstract concept entry, wherein the abstract concept entry includes the selected term and a different term; for each of the plurality of syntactic annotation tokens that include the selected term, creating an abstract syntactic annotation token that replaces the selected term with the different term, resulting in one or more abstract syntactic annotation tokens; and including one or more abstract syntactic annotation tokens in the knowledge structure.

12. The computer program product of claim 8 wherein, during the generation of the knowledge structure, the information handling system performs additional actions comprising: selecting at least one term from the plurality of terms; for each of the plurality of syntactic annotation tokens that include the selected term, creating a variable syntactic annotation token that
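
Claims 4 to 6 describe three kinds of derived tokens: abstract, variable, and relaxed. The following is a minimal sketch of how such variants might be generated, assuming a token is modeled as a `(label, head, dependent)` tuple and the variable is written as `"?"`. Both conventions, and all function names, are hypothetical and do not come from the claim text.

```python
VARIABLE = "?"  # hypothetical placeholder standing in for "a variable"


def replace_term(token, old, new):
    """Return a copy of token with every occurrence of term old set to new."""
    label, head, dep = token
    return (label, new if head == old else head, new if dep == old else dep)


def abstract_tokens(tokens, selected, different):
    # Claim 4 style: for tokens that include the selected term, swap it for
    # the different term listed in the same abstract concept entry.
    return [replace_term(t, selected, different)
            for t in tokens if selected in t[1:]]


def variable_tokens(tokens, selected):
    # Claim 5 style: for tokens that include the selected term, swap it for
    # a variable.
    return [replace_term(t, selected, VARIABLE)
            for t in tokens if selected in t[1:]]


def relaxed_tokens(tokens):
    # Claim 6 style: swap the syntactic relationship identifier for a variable.
    return [(VARIABLE, head, dep) for _, head, dep in tokens]


tokens = [("subj", "chase", "dogs"), ("obj", "chase", "cats")]

# All variants would be stored in the knowledge structure next to the
# original tokens.
expanded = (tokens
            + abstract_tokens(tokens, "dogs", "canines")
            + variable_tokens(tokens, "cats")
            + relaxed_tokens(tokens))
```

Each variant loosens one part of the token, which lets a question-based token match a passage even when the question uses a related term, leaves one term open, or does not fix the syntactic relationship.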

Assignees

Inventors

Classifications

  • G06F16/36 (Primary)

    Creation of semantic tools, e.g. ontology or thesauri · CPC title

  • Named entity recognition · CPC title

  • Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually · CPC title

  • G06F40/30 (Primary)

    Semantic analysis · CPC title

  • Natural language query formulation or dialogue systems · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9720905B2 cover?
An approach is provided in which a knowledge manager generates syntactic annotation tokens that correspond to syntactic relationships between terms included in a source document. The knowledge manager creates a knowledge structure that stores the syntactic annotation tokens in parallel fields and stores the source document terms in original text fields, which align to the parallel fields. In tu…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/36. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 1, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).