Who is the assignee on this patent?

Microsoft Technology Licensing LLC

What technology area does this patent fall under?

Primary CPC classification G06F16/355. Mapped technology areas include Physics.

When was this patent published?

Publication date: Apr 15, 2021 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Incremental clustering for enterprise knowledge graph

US2021109952A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2021109952-A1
Application number	US-201916601082-A
Country	US
Kind code	A1
Filing date	Oct 14, 2019
Priority date	Oct 14, 2019
Publication date	Apr 15, 2021
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Examples described herein generally relate to a computer system including a knowledge graph storing a plurality of entities. The computer system compares source documents within an enterprise intranet to a plurality of templates defining potential entity attributes to identify extracts matching at least one of the plurality of templates. The computer system parses the extracts according to respective templates of the plurality of templates that match the extracts to determine instances. The computer system performs incremental clustering on a number of the instances to determine potential entity names. The computer system queries the knowledge graph with the potential entity names to obtain a set of candidate entity records. The computer system links the potential entity names with at least partial matching ones of the set of candidate entity records to define updated matching candidate entity records. The computer system updates the knowledge graph with the updated matching candidate entity records.
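The pipeline in the abstract (template matching, parsing into instances, clustering into potential entity names, querying, linking, and updating) can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the regex templates, the dictionary-backed graph, the sample sentences, and the function names `extract_instances`, `cluster_by_name`, and `update_graph` are all assumptions made for the example, and exact matching on case-folded names stands in for the clustering and partial-matching steps.

```python
import re
from collections import defaultdict

# Hypothetical templates: each regex captures an entity name and one attribute value.
TEMPLATES = {
    "owner": re.compile(r"^(?P<name>.+?) is owned by (?P<value>.+)$"),
    "definition": re.compile(r"^(?P<name>.+?) is an? (?P<value>.+)$"),
}


def extract_instances(sentences):
    """Compare each sentence to every template and parse matches into instances."""
    instances = []
    for sentence in sentences:
        for attribute, pattern in TEMPLATES.items():
            match = pattern.match(sentence)
            if match:
                instances.append(
                    {"name": match["name"], "attribute": attribute, "value": match["value"]}
                )
    return instances


def cluster_by_name(instances):
    """Group instances under a case-folded name (a simplified stand-in for clustering)."""
    clusters = defaultdict(list)
    for instance in instances:
        clusters[instance["name"].casefold()].append(instance)
    return dict(clusters)


def update_graph(graph, clusters):
    """Link clusters to existing records, create records for unmatched names."""
    new_keys = []
    for key, members in clusters.items():
        record = graph.get(key)  # query the graph with the potential entity name
        if record is None:  # unmatched name: add a new entity record
            record = {"name": members[0]["name"], "attributes": {}}
            graph[key] = record
            new_keys.append(key)
        for member in members:  # attach attributes from the linked instances
            record["attributes"][member["attribute"]] = member["value"]
    return new_keys


sentences = [
    "Project Falcon is owned by the Search team",
    "project falcon is a ranking service",
    "Atlas is a deployment tool",
]
graph = {"project falcon": {"name": "Project Falcon", "attributes": {}}}
new_keys = update_graph(graph, cluster_by_name(extract_instances(sentences)))
```

In this toy run the existing "Project Falcon" record gains two attributes, while "Atlas" has no candidate record and is added as a new entity.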

First claim

Opening claim text (preview).

What is claimed is:

1. A computer system, comprising: a knowledge graph storing a plurality of entities associated with an enterprise; a memory storing computer-executable instructions; and a processor configured to execute the instructions to: compare enterprise source documents within an enterprise intranet to a plurality of templates defining potential entity attributes to identify extracts of the enterprise source documents matching at least one of the plurality of templates; parse the extracts according to respective templates of the plurality of templates that match the extracts to determine instances; perform clustering on a number of the instances to determine potential entity names; query the knowledge graph with the potential entity names to obtain a set of candidate entity records; link the potential entity names with at least partial matching ones of the set of candidate entity records to define updated matching candidate entity records including attributes corresponding to instances associated with the potential entity names; and update the knowledge graph with the updated matching candidate entity records and with new entity records for unmatched potential entity names, wherein the unmatched potential entity names are defined by ones of the potential entity names that do not match with any of the set of candidate entity records.

2. The computer system of claim 1, wherein the number of the instances is based on an amount of the memory required to store the number of the instances and associated clustering metadata, and wherein performing the clustering on the number of the instances and performing the clustering on a second set of the number of the instances uses less memory than performing the clustering on a set of instances including twice the number of the instances.

3. The computer system of claim 1, wherein the processor is configured to: determine that one of the enterprise source documents associated with a candidate entity record of the set of candidate entity records is more relevant to one of the potential entity names than the candidate entity record; link the one of the enterprise source documents to the one of the potential entity names; and store the one of the potential entity names in the knowledge graph as a new entity record.

4. The computer system of claim 1, wherein a level of uncertainty is associated with a potential entity name, wherein the processor is configured to query the knowledge graph using alternative potential entity names based on the level of uncertainty.

5. The computer system of claim 1, wherein the processor is configured to determine a level of uncertainty associated with a candidate entity record of the set of candidate entity records based on supporting documents associated with the candidate entity record in the knowledge graph.

6. The computer system of claim 5, wherein the processor is configured to link one or more of the potential entity names with the candidate entity record based on the one or more of the potential entity names partially matching the candidate entity record according to the level of uncertainty of the candidate entity record.

7. The computer system of claim 1, wherein the processor is configured to determine a status of each of the updated matching candidate entity records and each of the new entity records as one of established or formative based on a level of uncertainty for a respective entity record.

8. The computer system of claim 1, wherein the processor is configured to display at least a portion of an entity page including a plurality of attributes of an entity record in the knowledge graph to a user based on permissions of the user to view the enterprise source documents associated with the entity record.

9. A method of incrementally building a knowledge graph storing a plurality of entities associated with an enterprise, comprising: comparing enterprise source documents within an enterprise intranet to a plurality of templates defining potential entity attributes to identify extracts of the enterprise source documents matching at least one of the plurality of templates; parsing the extracts according to respective templates of the plurality of templates that match the extracts to determine instances; performing clustering on a number of the instances to determine potential entity names; querying the knowledge graph with the potential entity names to obtain a set of candidate entity records; linking the potential entity names with at least partial matching ones of the set of candidate entity records to define updated matching candidate entity records including attributes corresponding to instances associated with the potential entity names; and updating the knowledge graph with the updated matching candidate entity records and with new entity records for unmatched potential entity names, wherein the unmatched potential entity names are defined by ones of the potential entity names that do not match with any of the set of candidate entity records.

10. The method of claim 9, wherein the number of the instances is based on an amount of computer memory required to store the number of the instances and associated clustering metadata, and wherein performing the clustering on the number of the instances and performing the clustering on a second set of the number of the instances uses less memory than performing the clustering on a set of instances including twice the number of the instances.

11. The method of claim 9, wherein linking the potential entity names with at least partial matching ones of the set of candidate entity records to define updated matching candidate entity records comprises: determining that one of the enterprise source documents associated with a candidate entity record of the set of candidate entity records is more relevant to one of the potential entity names than the candidate entity record; linking the one of the enterprise source documents to the one of the potential entity names; and storing the one of the potential entity names in the knowledge graph as a new entity record.

12. The method of claim 9, wherein a level of uncertainty is associated with each attribute associated with a potential entity name, wherein querying the knowledge graph comprises querying the knowledge graph using alternative potential entity names based on the level of uncertainty.

13. The method of claim 9, further comprising determining a level of uncertainty associated with a candidate entity record of the set of candidate entity records based on supporting documents associated with the candidate entity record in the knowledge graph, wherein linking the potential entity names with at least partial matching ones of the set of candidate entity records comprises linking one or more of the potential entity names with the candidate entity record based on the one or more of the potential entity names partially matching the candidate entity record according to the level of uncertainty of the candidate entity record.

14. The method of claim 9, further comprising displaying at least a portion of an entity page including a plurality of attributes of an entity record in the knowledge graph to a user based on permissions of the user to view the enterprise source documents associated with the entity record.

15. A non-transitory computer-readable medium storing computer-executable instructions that when executed by a computer processor cause the computer processor to: compare enterprise source documents within an enterprise intranet to a plurality of templates defining potenti…
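Claims 2 and 10 tie the number of instances clustered at once to available memory, and claim 7 labels records as established or formative according to a level of uncertainty. A rough sketch of both ideas, under stated assumptions: instances are reduced to plain name strings, clusters are keyed by a whitespace- and case-normalized name, and uncertainty is proxied by a simple support count with a hypothetical `min_support` threshold. None of these choices come from the patent text.

```python
from itertools import islice


def batches(iterable, size):
    """Yield lists of at most `size` items so only one batch is held at a time."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def incremental_cluster(instance_stream, batch_size):
    """Fold each batch into running cluster summaries (the clustering metadata)."""
    clusters = {}  # normalized name -> {"name": first spelling seen, "count": support}
    for batch in batches(instance_stream, batch_size):
        for name in batch:
            key = " ".join(name.casefold().split())
            summary = clusters.setdefault(key, {"name": name, "count": 0})
            summary["count"] += 1
    return clusters


def status(summary, min_support=2):
    """Assumed rule: enough supporting instances means low uncertainty."""
    return "established" if summary["count"] >= min_support else "formative"


stream = ["Project Falcon", "project  falcon", "Atlas", "PROJECT FALCON"]
clusters = incremental_cluster(stream, batch_size=2)
labels = {key: status(summary) for key, summary in clusters.items()}
```

Because the summaries carry everything needed between batches, the result does not depend on the batch size; only peak memory does.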

Assignees

Microsoft Technology Licensing LLC

Inventors

Classifications

G06F16/355 (Primary): Creation or modification of classes or clusters
G06F16/285 (Primary): Clustering or classification
G06F16/2465: Query processing support for facilitating data mining operations in structured databases
G06F16/9024: Graphs; Linked lists (G06F16/9027 takes precedence)

Patent family

Related publications grouped by family.

View patent family 75383067
