Matching co-referring entities from serialized data for schema inference

US9977817B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9977817-B2
Application number: US-201414518361-A
Country: US
Kind code: B2
Filing date: Oct 20, 2014
Priority date: Oct 20, 2014
Publication date: May 22, 2018
Grant date: May 22, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system and method provide for identifying coreference from serialized data coming from different services. The method includes generating a tree structure from serialized data. The serialized data includes responses to queries from the different services. The responses each identify a hierarchical relationship between a respective set of objects. Nodes of the tree structure each have a name corresponding to a respective one of the objects. The tree structure is traversed in a breadth first manner and, for each node in the tree structure, a respective pairwise similarity is computed with each of the other nodes of the tree structure. The computed pairwise similarity is compared with a threshold to identify co-referring nodes that refer to a same entity. The threshold is a function of a depth of the node in the tree structure.
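The first two steps of the abstract (building a tree from serialized responses, then visiting it breadth first while tracking depth) can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Node` class, the `build_tree` and `breadth_first` helpers, the synthetic `root` node that groups the services, and the sample `billing` and `shipping` responses are all hypothetical. Folding the objects of a list into the node named by the parent key follows claim 12; keeping only the last value for a repeated key is a simplification.

```python
import json
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


def build_tree(name, value):
    """Turn a parsed JSON value into a Node named after its key."""
    node = Node(name)
    if isinstance(value, dict):
        for key, child in value.items():
            node.children.append(build_tree(key, child))
    elif isinstance(value, list):
        # Fold all objects of the list into this single node (cf. claim 12).
        # Simplification: a repeated key keeps only its last value.
        merged = {}
        for item in value:
            if isinstance(item, dict):
                merged.update(item)
        for key, child in merged.items():
            node.children.append(build_tree(key, child))
    return node


def breadth_first(root):
    """Yield (node, depth) pairs level by level."""
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        yield node, depth
        for child in node.children:
            queue.append((child, depth + 1))


# Hypothetical responses from two services of the same organization.
responses = {
    "billing": json.loads('{"customer": {"id": 1, "address": {"city": "Paris"}}}'),
    "shipping": json.loads('{"client": {"id": 7, "orders": [{"sku": "A"}, {"qty": 2}]}}'),
}
# Assumption: a synthetic root groups the per-service trees into one tree.
root = Node("root", [build_tree(svc, body) for svc, body in responses.items()])
levels = [(node.name, depth) for node, depth in breadth_first(root)]
```

The depth returned with each node is what a depth-dependent threshold would consume when the pairwise similarities are compared.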

First claim

Opening claim text (preview).

What is claimed is:
1. A method for identifying coreference from serialized data comprising: generating a tree structure from serialized data, the serialized data comprising responses to queries from different services, the responses each identifying a hierarchical relationship between a respective set of objects, nodes of the tree structure each having a name corresponding to a respective one of the objects; traversing the tree structure in a breadth first manner and for each node in the tree structure, computing a respective pairwise similarity with other nodes of the tree structure; comparing the computed pairwise similarity with a threshold to identify co-referring nodes that refer to a same entity, the threshold being a function of a depth of the node in the tree structure; merging two nodes identified as being co-referring nodes, the merging including identifying all of the children of the two nodes as having both of the two nodes as their parents; generating a directed acyclic graph which includes the merged nodes; and outputting information based on the identified co-referring nodes, the information comprising the directed acyclic graph or information based thereon, wherein at least one of the generating of the tree structure, computing a respective pairwise similarity, and identifying co-referring nodes is performed with a processor.
2. The method of claim 1, wherein the computing of the pairwise similarity comprises computing a first similarity based on a similarity of the two nodes being compared and computing a second similarity based on a similarity of children of the two nodes being compared and aggregating the first and second similarities.
3. The method of claim 2, wherein the computing of the first similarity is also based on parents of the two nodes being compared.
4. The method of claim 2, wherein the computing of the second similarity comprises identifying a number of overlapping children of the first and second nodes.
5. The method of claim 4, wherein the second similarity is a function of: $\mathrm{sim}_2(n, n') = \frac{|n.\mathrm{children} \cap n'.\mathrm{children}|}{\min(|n.\mathrm{children}|,\ |n'.\mathrm{children}|)}$, where n represents the first node and n′ represents the second node.
6. The method of claim 2, wherein the aggregating comprises multiplying the first and second similarities.
7. The method of claim 6, wherein the threshold is a function of $m + (M - m)\sqrt{1 - \frac{h}{H}}$, where m and M are predefined constant values, h is a depth of the node and H is a maximum depth of the tree structure.
8. The method of claim 1, wherein the threshold is a concave decreasing function of the depth.
9. The method of claim 1, further comprising merging two nodes identified as being co-referring objects, the merging including identifying all of the children of the two nodes as having both of the two nodes as their parents.
10. The method of claim 9, comprising generating a directed acyclic graph which includes the merged nodes.
11. The method of claim 1, wherein the serialization format of the serialized data is JSON.
12. The method of claim 1, wherein the generating of the tree structure from serialized data comprises merging objects of a list into one node which has as its name the key-value of its parent key.
13. The method of claim 1, wherein the services are services of a same organization that use at least one of: different names for the same object, and a same name for different objects.
14. The method of claim 1, wherein the serialized data comprises responses to queries of a database, the method further comprising: enriching a database with information on the identified co-referring nodes.
15. A method for identifying coreference from serialized data comprising: generating a tree structure from serialized data, the serialized data comprising responses to database queries from different services, the responses each identifying a hierarchical relationship between a respective set of objects, nodes of the tree structure each having a name corresponding to a respective one of the objects; traversing the tree structure in a breadth first manner and for each node in the tree structure, computing a respective pairwise similarity with other nodes of the tree structure, the computing of the pairwise similarity comprising: computing a first similarity based on a similarity of the two nodes being compared, and computing a second similarity based on a similarity of children of the two nodes being compared and aggregating the first and second similarities, wherein at least one of: a) the computing of the first similarity is also based on parents of the two nodes being compared and comprises computing a similarity based on a maximal value of the name of the second node and a combination of the name of the second node with its parent's name, and b) the computing of the second similarity comprises identifying a number of overlapping children of the first and second nodes, the second similarity being a function of: sim₂(n, n′ …
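The similarity, threshold, and merge steps recited in claims 2 to 10 can be illustrated with a short sketch. Several choices here are assumptions rather than anything the claims fix: the first similarity uses `difflib.SequenceMatcher` on node names, children are compared by name, the threshold is read as m + (M − m)·sqrt(1 − h/H) (the square root is an assumed reading, chosen because it is consistent with the concave decreasing threshold of claim 8), and the constants `m=0.5` and `M=0.9` are placeholder values. All class and function names are hypothetical.

```python
import math
from difflib import SequenceMatcher


class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.parents = []
        for child in self.children:
            child.parents.append(self)


def sim1(a, b):
    # Assumption: name similarity via difflib; the claims do not name a measure.
    return SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()


def sim2(a, b):
    # Overlapping children divided by the smaller child count (claim 5).
    # Assumption: children are matched by name.
    names_a = {c.name for c in a.children}
    names_b = {c.name for c in b.children}
    smaller = min(len(names_a), len(names_b))
    if smaller == 0:
        return 0.0
    return len(names_a & names_b) / smaller


def threshold(h, H, m=0.5, M=0.9):
    # Assumed reading of claim 7; equals M at the root and m at depth H.
    # Requires H > 0.
    return m + (M - m) * math.sqrt(1 - h / H)


def co_refer(a, b, depth, max_depth):
    # Aggregate by multiplication (claim 6) and compare with the threshold.
    return sim1(a, b) * sim2(a, b) >= threshold(depth, max_depth)


def merge(a, b):
    """Give every child of either node both nodes as parents (a DAG)."""
    union = {id(c): c for c in a.children + b.children}
    for node in (a, b):
        node.children = list(union.values())
    for child in union.values():
        for parent in (a, b):
            if parent not in child.parents:
                child.parents.append(parent)


a = Node("customer", [Node("id"), Node("name"), Node("address")])
b = Node("customer", [Node("id"), Node("name")])
c = Node("invoice", [Node("total")])
if co_refer(a, b, depth=2, max_depth=4):
    merge(a, b)
```

Because the threshold decreases with depth, nodes near the root must be more similar to be merged than nodes near the leaves. After `merge(a, b)` the children still include two distinct `id` nodes; in a breadth-first pass those would be compared in turn when their own level is visited.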

Assignees

  • Conduent Business Services LLC

Inventors

Classifications

  • G06F16/25 · Primary

    Integrating or interfacing systems involving database management systems · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Conduent Business Services LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/25. Mapped technology areas include Physics.
When was this patent published?
Publication date May 22, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).