Database query generation using natural language text
US-11860916-B2 · Jan 2, 2024 · US
US2021109952A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2021109952-A1 |
| Application number | US-201916601082-A |
| Country | US |
| Kind code | A1 |
| Filing date | Oct 14, 2019 |
| Priority date | Oct 14, 2019 |
| Publication date | Apr 15, 2021 |
| Grant date | — |
Official abstract text for this publication.
Examples described herein generally relate to a computer system including a knowledge graph storing a plurality of entities. The computer system compares source documents within an enterprise intranet to a plurality of templates defining potential entity attributes to identify extracts matching at least one of the plurality of templates. The computer system parses the extracts according to respective templates of the plurality of templates that match the extracts to determine instances. The computer system performs incremental clustering on a number of the instances to determine potential entity names. The computer system queries the knowledge graph with the potential entity names to obtain a set of candidate entity records. The computer system links the potential entity names with at least partial matching ones of the set of candidate entity records to define updated matching candidate entity records. The computer system updates the knowledge graph with the updated matching candidate entity records.
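The steps in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the regex-based `TEMPLATES`, grouping instances by a normalized name as a simple stand-in for the incremental clustering step, the in-memory dictionary used as the knowledge graph, and the 0.8 similarity threshold for partial matching.

```python
import re
from difflib import SequenceMatcher

# Hypothetical templates: each regex captures an entity name and one attribute value.
NAME = r"[A-Z]\w*(?: [A-Z]\w*)*"
TEMPLATES = {
    "owner": re.compile(rf"(?P<name>{NAME}) is owned by (?P<value>{NAME})"),
    "kind": re.compile(rf"(?P<name>{NAME}) is an? (?P<value>[a-z]+(?: [a-z]+)*)"),
}


def extract_instances(documents):
    """Compare documents to the templates and parse each matching extract."""
    instances = []
    for doc in documents:
        for attribute, pattern in TEMPLATES.items():
            for m in pattern.finditer(doc):
                instances.append((m["name"], attribute, m["value"]))
    return instances


def cluster_by_name(instances, clusters):
    """Stand-in for clustering: fold instances into clusters keyed by normalized name."""
    for name, attribute, value in instances:
        key = " ".join(name.casefold().split())
        cluster = clusters.setdefault(key, {"surface": name, "attributes": {}})
        cluster["attributes"][attribute] = value
    return clusters


def link_and_update(kg, clusters, threshold=0.8):
    """Link each potential name to a partially matching record, else add a new record."""
    candidates = list(kg)  # snapshot of candidate records before any update
    for key, cluster in clusters.items():
        scored = [(SequenceMatcher(None, key, c.casefold()).ratio(), c) for c in candidates]
        score, best = max(scored, default=(0.0, None))
        if best is not None and score >= threshold:
            kg[best].update(cluster["attributes"])            # updated matching record
        else:
            kg[cluster["surface"]] = dict(cluster["attributes"])  # new entity record
    return kg


documents = [
    "Project Falcon is owned by Data Platform. Project Falcon is a search service.",
    "Team Atlas is owned by Finance.",
]
kg = {"Project-Falcon": {}}  # existing record with a slightly different spelling
clusters = cluster_by_name(extract_instances(documents), {})
link_and_update(kg, clusters)
```

In this sketch the existing record `Project-Falcon` absorbs the attributes parsed for "Project Falcon" through a partial match, while "Team Atlas" has no sufficiently similar candidate and is stored as a new record.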
Opening claim text (preview).
What is claimed is:

1. A computer system, comprising: a knowledge graph storing a plurality of entities associated with an enterprise; a memory storing computer-executable instructions; and a processor configured to execute the instructions to: compare enterprise source documents within an enterprise intranet to a plurality of templates defining potential entity attributes to identify extracts of the enterprise source documents matching at least one of the plurality of templates; parse the extracts according to respective templates of the plurality of templates that match the extracts to determine instances; perform clustering on a number of the instances to determine potential entity names; query the knowledge graph with the potential entity names to obtain a set of candidate entity records; link the potential entity names with at least partial matching ones of the set of candidate entity records to define updated matching candidate entity records including attributes corresponding to instances associated with the potential entity names; and update the knowledge graph with the updated matching candidate entity records and with new entity records for unmatched potential entity names, wherein the unmatched potential entity names are defined by ones of the potential entity names that do not match with any of the set of candidate entity records.

2. The computer system of claim 1, wherein the number of the instances is based on an amount of the memory required to store the number of the instances and associated clustering metadata, and wherein performing the clustering on the number of the instances and performing the clustering on a second set of the number of the instances uses less memory than performing the clustering on a set of instances including twice the number of the instances.

3. The computer system of claim 1, wherein the processor is configured to: determine that one of the enterprise source documents associated with a candidate entity record of the set of candidate entity records is more relevant to one of the potential entity names than the candidate entity record; link the one of the enterprise source documents to the one of the potential entity names; and store the one of the potential entity names in the knowledge graph as a new entity record.

4. The computer system of claim 1, wherein a level of uncertainty is associated with a potential entity name, wherein the processor is configured to query the knowledge graph using alternative potential entity names based on the level of uncertainty.

5. The computer system of claim 1, wherein the processor is configured to determine a level of uncertainty associated with a candidate entity record of the set of candidate entity records based on supporting documents associated with the candidate entity record in the knowledge graph.

6. The computer system of claim 5, wherein the processor is configured to link one or more of the potential entity names with the candidate entity record based on the one or more of the potential entity names partially matching the candidate entity record according to the level of uncertainty of the candidate entity record.

7. The computer system of claim 1, wherein the processor is configured to determine a status of each of the updated matching candidate entity records and each of the new entity records as one of established or formative based on a level of uncertainty for a respective entity record.

8. The computer system of claim 1, wherein the processor is configured to display at least a portion of an entity page including a plurality of attributes of an entity record in the knowledge graph to a user based on permissions of the user to view the enterprise source documents associated with the entity record.

9. A method of incrementally building a knowledge graph storing a plurality of entities associated with an enterprise, comprising: comparing enterprise source documents within an enterprise intranet to a plurality of templates defining potential entity attributes to identify extracts of the enterprise source documents matching at least one of the plurality of templates; parsing the extracts according to respective templates of the plurality of templates that match the extracts to determine instances; performing clustering on a number of the instances to determine potential entity names; querying the knowledge graph with the potential entity names to obtain a set of candidate entity records; linking the potential entity names with at least partial matching ones of the set of candidate entity records to define updated matching candidate entity records including attributes corresponding to instances associated with the potential entity names; and updating the knowledge graph with the updated matching candidate entity records and with new entity records for unmatched potential entity names, wherein the unmatched potential entity names are defined by ones of the potential entity names that do not match with any of the set of candidate entity records.

10. The method of claim 9, wherein the number of the instances is based on an amount of computer memory required to store the number of the instances and associated clustering metadata, and wherein performing the clustering on the number of the instances and performing the clustering on a second set of the number of the instances uses less memory than performing the clustering on a set of instances including twice the number of the instances.

11. The method of claim 9, wherein linking the potential entity names with at least partial matching ones of the set of candidate entity records to define updated matching candidate entity records comprises: determining that one of the enterprise source documents associated with a candidate entity record of the set of candidate entity records is more relevant to one of the potential entity names than the candidate entity record; linking the one of the enterprise source documents to the one of the potential entity names; and storing the one of the potential entity names in the knowledge graph as a new entity record.

12. The method of claim 9, wherein a level of uncertainty is associated with each attribute associated with a potential entity name, wherein querying the knowledge graph comprises querying the knowledge graph using alternative potential entity names based on the level of uncertainty.

13. The method of claim 9, further comprising determining a level of uncertainty associated with a candidate entity record of the set of candidate entity records based on supporting documents associated with the candidate entity record in the knowledge graph, wherein linking the potential entity names with at least partial matching ones of the set of candidate entity records comprises linking one or more of the potential entity names with the candidate entity record based on the one or more of the potential entity names partially matching the candidate entity record according to the level of uncertainty of the candidate entity record.

14. The method of claim 9, further comprising displaying at least a portion of an entity page including a plurality of attributes of an entity record in the knowledge graph to a user based on permissions of the user to view the enterprise source documents associated with the entity record.

15. A non-transitory computer-readable medium storing computer-executable instructions that when executed by a computer processor cause the computer processor to: compare enterprise source documents within an enterprise intranet to a plurality of templates defining potenti
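Two of the dependent claims lend themselves to a short illustration: sizing the number of instances per clustering pass from available memory (claims 2 and 10), and labelling a record as established or formative from its level of uncertainty (claim 7). The sketch below is only one possible reading. The per-instance byte estimate, the formula that derives uncertainty from the count of supporting documents, and the 0.25 threshold are all assumptions introduced here and are not stated in the claims.

```python
from itertools import islice


def batch_size_for(memory_budget_bytes, bytes_per_instance):
    """Assumed sizing rule: as many instances (plus metadata) as fit in the budget."""
    return max(1, memory_budget_bytes // bytes_per_instance)


def batches(instances, size):
    """Yield successive batches so each clustering pass holds at most `size` instances."""
    it = iter(instances)
    while chunk := list(islice(it, size)):
        yield chunk


def uncertainty(supporting_docs):
    """Assumed heuristic: more supporting documents means lower uncertainty."""
    return 1.0 / (1 + len(supporting_docs))


def status(record, threshold=0.25):
    """Label a record 'established' or 'formative' from its uncertainty."""
    if uncertainty(record["supporting_docs"]) <= threshold:
        return "established"
    return "formative"


size = batch_size_for(memory_budget_bytes=1000, bytes_per_instance=300)
chunks = list(batches(range(7), size))
well_supported = {"supporting_docs": ["spec.docx", "wiki/falcon", "roadmap.pptx"]}
thinly_supported = {"supporting_docs": ["email-thread"]}
```

Clustering one batch at a time keeps peak memory proportional to the batch size, which is the property the memory claims describe when they compare two passes over a given number of instances with one pass over twice that number.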
Creation or modification of classes or clusters · CPC title
Clustering or classification · CPC title
Query processing support for facilitating data mining operations in structured databases · CPC title
Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title