Method and system for focused multi-blocking to increase link identification rates in record comparison

US9760654B2 · US · B2

Patent metadata
Field                Value
Publication number   US-9760654-B2
Application number   US-201313871885-A
Country              US
Kind code            B2
Filing date          Apr 26, 2013
Priority date        Apr 26, 2013
Publication date     Sep 12, 2017
Grant date           Sep 12, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques for comparing customer records to identify linked customer records pertaining to a single customer entity are provided. The techniques include identifying a target group of electronic customer records having data fields containing data pertaining to a customer, identifying one or more focused blockers identifying a data value for an electronic customer record data field, and analyzing the target group of electronic customer records to identify a focused group of electronic customer records containing the focused blocker data value. The techniques also include comparing pairs of electronic customer records from the focused group of electronic customer records to identify linked records which pertain to a single customer entity.
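The flow described in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not the patented implementation: the record fields, the sample data, and the `is_link` comparison rule are all assumptions made up for the example. It shows how a focused blocker (here, a zip code value) narrows the target group to a focused group, so that pairwise comparison runs only inside that group.

```python
from itertools import combinations

# Hypothetical customer records; the field names are assumptions for illustration.
records = [
    {"id": 1, "first": "Ann", "last": "Smith", "zip": "72712"},
    {"id": 2, "first": "Anne", "last": "Smith", "zip": "72712"},
    {"id": 3, "first": "Bob", "last": "Jones", "zip": "72712"},
    {"id": 4, "first": "Ann", "last": "Smith", "zip": "10001"},
]


def focused_group(target_group, field, value):
    """Return the records whose `field` holds the focused blocker data value."""
    return [r for r in target_group if r.get(field) == value]


def is_link(a, b):
    """Toy comparison rule (an assumption): same last name and same first initial."""
    return a["last"] == b["last"] and a["first"][:1] == b["first"][:1]


def link_within_group(group):
    """Compare every pair in the focused group and return the linked id pairs."""
    return [(a["id"], b["id"]) for a, b in combinations(group, 2) if is_link(a, b)]


# Focused blocker: data value "72712" for the "zip" data field.
group = focused_group(records, "zip", "72712")
links = link_within_group(group)
print(links)
```

With a single blocker, record 4 is never compared against records 1 and 2 because its zip code differs. That gap is what running several blockers side by side is meant to close.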

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising:
  a computer system identifying a target group of electronic customer records, each electronic customer record having data fields containing data pertaining to a customer;
  the computer system, at one or more master nodes, using a mapping system to divide linking tasks into sub-tasks that are distributed to worker nodes in a multi-level tree structure for parallel processing;
  the computer system, at one or more first worker nodes of the worker nodes, identifying a first focused blocker, the first focused blocker identifying a data value for an electronic customer record data field;
  the computer system analyzing one or more electronic customer records from within the target group of electronic customer records to, for each electronic customer record within the target group of electronic customer records, identify one or more first focused blocker keys, the one or more first focused blocker keys comprising one or more data values from the each electronic customer record of the target group of electronic customer records corresponding to the data value for the electronic customer record data field;
  the computer system further analyzing the one or more electronic customer records from within the target group of electronic customer records, and producing one or more additional focused blocker keys based from the further analysis;
  the computer system associating the one or more additional focused blocker keys with the one or more electronic customer records from within the target group of electronic customer records;
  the computer system, at the one or more first worker nodes of the worker nodes, analyzing the target group of electronic customer records to identify a first focused group of electronic customer records, the first focused group of electronic customer records comprising: electronic customer records comprising a first focused blocker data value; and electronic customer records associated with at least one of the one or more additional focused blocker keys;
  the computer system, at the one or more first worker nodes of the worker nodes, comparing pairs of electronic customer records from the first focused group of electronic customer records to identify linked records, each linked record comprising two or more electronic customer records which pertain to a single customer entity;
  in parallel to the computer system identifying the first focused blocker, the computer system, at one or more second worker nodes of the worker nodes, identifying a second focused blocker, the second focused blocker identifying a third data value for the electronic customer record data field, wherein the third data value is different than the data value identified by the first focused blocker for the electronic customer record data field;
  in parallel to the computer system analyzing the target group of electronic customer records to identify the first focused group of electronic customer records, the computer system, at the one or more second worker nodes of the worker nodes, analyzing the target group of electronic customer records to identify a second focused group of electronic customer records, the second focused group of electronic customer records comprising electronic customer records comprising the third data value; and
  in parallel to the computer system comparing the pairs of electronic customer records from the first focused group of electronic customer records, the computer system, at the one or more second worker nodes of the worker nodes, comparing pairs of electronic customer records from the second focused group of electronic customer records to identify second linked records, each linked record of the second linked records comprising two or more electronic customer records which pertain to a second single customer entity; and
  the computer system, at the one or more master nodes, using a reduction system to receive and combine responses from the worker nodes for output.

2. The method of claim 1, wherein the first focused blocker value comprises a first data value for a first electronic customer record data field and a second data value for a second electronic customer record data field.

3. The method of claim 2, wherein the first focused group of electronic customer records further comprises electronic customer records comprising the first data value and the second data value, the data value comprising the first and second data values.

4. The method of claim 2, wherein the first focused blocker identifies the first data value and the second data value selected from a group consisting of: a double metaphone of a last name and a zip code; the last name and the zip code; the last name and a house number; and the house number and the zip code.

5. The method of claim 1, wherein the target group of electronic customer records comprises a first group of electronic customer records obtained from a first source and a second group of electronic customer records obtained from a second source.

6. The method of claim 1, wherein the single customer entity is selected from a group consisting of: a single customer; a household; and a business.

7. The method of claim 1, wherein the method further comprises the computer system comparing all possible pairs of electronic customer records from the first focused group of electronic customer records to identify the linked records.

8. The method of claim 1, wherein:
  the first focused blocker data value comprises a first data value for a first electronic customer record data field and a second data value for a second electronic customer record data field;
  the first focused group of electronic customer records further comprises electronic customer records comprising the first data value and the second data value;
  the first focused blocker identifies the first data value and the second data value selected from a group consisting of: a double metaphone of a last name and a zip code; the last name and the zip code; the last name and a house number; and the house number and the zip code;
  the target group of electronic customer records comprises a first group of electronic customer records obtained from a first source and a second group of electronic customer records obtained from a second source; and
  the single customer entity is selected from a group consisting of: a single customer; a household; and a business.

9. The method of claim 1, further comprising:
  the computer system collecting information based at least in part on the linked records; and
  the computer system modifying one or more of the one or more electronic customer records based at least in part on the information.

10. A computer-implemented method comprising:
  a computer system identifying a target group of electronic records, each electronic record having data fields containing data pertaining to one or more events;
  the computer system, at one or more master nodes, using a mapping system to divide linking tasks into sub-tasks that are distributed to worker nodes in a multi-level tree structure for parallel processing;
  the computer system, at one or more first worker nodes of the worker nodes, identifying a first focused blocker, the first focused blocker identifying a data value for an electronic record data field;
  the computer system analyzing one or more electronic records from within the target group of electronic records to, for each electronic customer record within the target group of electronic records, identify one or more first focused blocker keys, the one or more first focused blocker keys comprising one or more data values from the each electronic customer record of the target group of electronic records cor
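The parallel structure of claim 1, combined with the two-field blockers listed in claim 4, can be approximated as follows. This is a rough sketch under stated assumptions rather than the claimed system: a flat thread pool stands in for the multi-level tree of worker nodes, the functions `map_tasks`, `worker`, and `reduce_links` are hypothetical names, the sample records and the `is_link` rule are invented, and the double metaphone blocker is left out because it needs a phonetic-encoding library.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

# Hypothetical records; field names and values are assumptions for illustration.
records = [
    {"id": 1, "last": "Smith", "house": "12", "zip": "72712"},
    {"id": 2, "last": "Smith", "house": "12", "zip": "72712"},
    {"id": 3, "last": "Smith", "house": "12", "zip": "72716"},
    {"id": 4, "last": "Jones", "house": "40", "zip": "10001"},
    {"id": 5, "last": "Jones", "house": "40", "zip": "10001"},
]


def blocker_key(record, fields):
    """Build a focused blocker key from the record's values for `fields`."""
    return tuple(record[f] for f in fields)


def is_link(a, b):
    """Toy comparison rule (an assumption): same last name and house number."""
    return a["last"] == b["last"] and a["house"] == b["house"]


def map_tasks(target_group, blockers):
    """Master step: one sub-task per distinct (blocker fields, key value)."""
    tasks = {(fields, blocker_key(r, fields)) for fields in blockers for r in target_group}
    return sorted(tasks)


def worker(task):
    """Worker step: form the focused group for one key and compare its pairs."""
    fields, value = task
    group = [r for r in records if blocker_key(r, fields) == value]
    return {(a["id"], b["id"]) for a, b in combinations(group, 2) if is_link(a, b)}


def reduce_links(results):
    """Master step: combine the workers' responses into one set of linked pairs."""
    linked = set()
    for pairs in results:
        linked |= pairs
    return sorted(linked)


# Three of the field combinations named in claim 4.
blockers = [("last", "zip"), ("last", "house"), ("house", "zip")]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(worker, map_tasks(records, blockers)))
linked_pairs = reduce_links(results)
print(linked_pairs)
```

In this sample data, record 3 carries a different zip code, so the last-name-and-zip blocker never groups it with records 1 and 2. The last-name-and-house-number blocker does, which illustrates how running several blockers can raise the number of links found while each worker still compares only a small group.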

Assignees

Wal Mart Stores Inc

Inventors

Classifications

  • Physics · mapped topic

  • Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9760654B2 cover?
It covers techniques for comparing customer records to identify linked records that pertain to a single customer entity. A focused blocker names a data value for a record data field; the records in the target group that contain that value form a focused group, and pairs of records within that group are compared to find links.
Who is the assignee on this patent?
Wal Mart Stores Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/30958. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 12, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).