Database system for triggering event notifications based on updates to database records
US-2024419652-A1 · Dec 19, 2024 · US
US9298753B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9298753-B2 |
| Application number | US-201313907723-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 31, 2013 |
| Priority date | May 31, 2013 |
| Publication date | Mar 29, 2016 |
| Grant date | Mar 29, 2016 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
The present invention extends to methods, systems, and computer program products for resolving pairwise links to groups. Embodiments of the invention use an iterative algorithm to transform a collection of pairwise links to groups of records that correspond to the same entity. The algorithm can be stopped after any number of iterations for an increasing accurate approximation result. The algorithm essentially guarantees a correct solution for groups of size up to the number of iterations. This algorithm scales linearly on the size of the record set, with little impact from the number of links.
Opening claim text (preview).
What is claimed: 1. A method for resolving a set of pairwise links to groups, the method comprising for one or more iterations over a plurality of records until a stop condition is satisfied, each iteration including: accessing a plurality of neighborhood data structures that are active, each neighborhood data structure of the plurality of neighborhood data structures corresponding to a different specified record from among the plurality of records, each neighborhood data structure of the plurality of neighborhood data structures including a record identifier as a label, the record identifier corresponding to a specified record included in the plurality of records, each neighborhood data structure of the plurality of neighborhood data structures also including an activity state indicator and a neighborhood, the activity state indicator indicating whether or not the neighborhood data structure is active, the neighborhood including a list of record identifiers for other records expressly linked to the specified record by a pairwise link, the other records also included in the plurality of records; for each neighborhood data structure of the plurality of neighborhood data structures, creating a message for each of the other records expressly linked to the specified record, the message including the label and the activity state indicator for the neighborhood data structure; subsequent to creating the messages, grouping the plurality of neighborhood data structures and the messages by the record identifiers such that each group includes one neighborhood data structure of the plurality of neighborhood data structures and one or more messages; for each grouping of the plurality of neighborhood data structures and the one or more messages, setting the label for the neighborhood data structures in the group to a minimum record identifier included in the group of the plurality of neighborhood data structures and the one or more messages; and when the minimum record identifier for a group of the plurality of neighborhood data structures and the one or more messages is the record identifier for the specified record of a particular neighborhood data structure, removing the particular neighborhood data structure from the plurality of neighborhood data structures that are active by setting the activity state indicator to indicate that the particular neighborhood data structure is not active. 2. The method as recited in claim 1 , further comprising subsequent to the stop condition being satisfied, for each neighborhood data structure of the plurality of neighborhood data structures: writing out the label. 3. The method of claim 1 , wherein the minimum record identifier is based on a lexicographical order associated with the record identifiers. 4. The method of claim 1 , wherein the stop condition indicates a specified maximum number of the one or more iterations to perform. 5. The method of claim 1 , wherein the stop condition is based on detecting that no neighborhood data structures of the plurality of neighborhood data structures are active. 6. The method claim 1 , wherein one or more of the plurality of records is associated with a retail purchase transaction. 7. A computer program product for implementing a method for resolving a set of pairwise links to groups, the computer program product comprising one or more computer storage devices having stored thereon computer-executable instructions that, when executed at a processor, perform the method including one or more iterations over a plurality of records until a stop condition is satisfied, each iteration including: access a plurality of neighborhood data structures that are active, each neighborhood data structure of the plurality of neighborhood data structures corresponding to a different specified record from among the plurality of records, each neighborhood data structure of the plurality of neighborhood data structures including a record identifier as a label, the record identifier corresponding to a specified record included in the plurality of records, each neighborhood data structure of the plurality of neighborhood data structures also including an activity state indicator and a neighborhood, the activity state indicator indicating whether or not the neighborhood data structure is active, the neighborhood including a list of record identifiers for other records expressly linked to the specified record by a pairwise link, the other records also included in the plurality of records; for each neighborhood data structure of the plurality of neighborhood data structures, create a message for each of the other records expressly linked to the specified record, the message including the label and the activity state indicator for the neighborhood data structure; subsequent to creating the messages, group the plurality of neighborhood data structures and the messages by the record identifiers such that each group includes one neighborhood data structure of the plurality of neighborhood data structures and one or more messages; for each grouping of the plurality of neighborhood data structures and the one or more messages, set the label for the neighborhood data structures in the group to a minimum record identifier included in the group of the plurality of neighborhood data structures and the one or more messages; and when the minimum record identifier for a group of the plurality of neighborhood data structures and the one or more messages is the record identifier for the specified record of a particular neighborhood data structures, remove the particular neighborhood data structure from the plurality of neighborhood data structures that are active by setting the activity state indicator to indicate that the particular neighborhood data structure is not active. 8. The computer program product of claim 7 , further comprising computer-executable instructions that, when executed subsequent to the stop condition being satisfied, for each neighborhood data structure of the plurality of neighborhood data structures, perform the method of: write out the label. 9. The computer program product of claim 7 , wherein the minimum record identifier is based on a lexicographical order associated with the record identifiers. 10. The computer program product of claim 7 , wherein the stop condition indicates a specified maximum number of the one or more iterations to perform. 11. The computer program product of claim 7 , wherein the stop condition is based on detecting that no neighborhood data structures of the plurality of neighborhood data structures are active. 12. At a computer system, the computer system including one or more processors, a system memory, and one or more local storage devices, the computer system communicatively coupled to a cluster of servers, a plurality of pairwise links of records being stored across the cluster of servers, each pairwise link of records of the plurality of pairwise links of records including a pair of records that correspond to a same entity, each record of the pair of records being identified by an assigned identifier, the system memory being insufficient to store an entirety of the plurality of pairwise links of records, a method for resolving the plurality of pairwise links of records into one or more groups of the records corresponding to the same entity, the method comprising: a neighborhood creation phase, including for each of the records: creating a neighborhood data structure for the record, the neighborhood data structure comprising a list of neighbors, the neighbors being other records linked to the record in at least one pairwise link of the plurality of pairwise links of records, the other records included i
Accounting · CPC title
Geographical information databases · CPC title
Clustering or classification · CPC title
Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title
Physics · mapped topic
Related publications grouped by family.
Answers are generated from the same data shown on this page.