What technology area does this patent fall under?

Primary CPC classification G06F17/30985. Mapped technology areas include Physics.

When was this patent published?

Publication date: Mar 21, 2017 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Combined deterministic and probabilistic matching for data management

US9600602B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9600602-B2
Application number	US-201414301790-A
Country	US
Kind code	B2
Filing date	Jun 11, 2014
Priority date	Aug 30, 2013
Publication date	Mar 21, 2017
Grant date	Mar 21, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for data management. The method includes a computer selecting a first data record and a second data record. The computer determines whether the first data record and the second data record share a deterministic matching category. Responsive to determining the first data record does not share a deterministic matching category with the second data record, the computer determines whether the first data record and the second data record share a probabilistic matching category.
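
The two-stage check described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Record` fields (`national_id`, `surname`, `birth_year`) and both category functions are hypothetical, since the abstract names no attributes and does not say how a matching category is derived.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    # Hypothetical attributes; the abstract does not name any fields.
    national_id: Optional[str]
    surname: str
    birth_year: int


def deterministic_category(rec: Record) -> Optional[str]:
    # Assumption: records with the same exact identifier share a category.
    return rec.national_id


def probabilistic_category(rec: Record) -> str:
    # Assumption: a coarse key (surname initial + birth year) groups similar records.
    return f"{rec.surname[:1].upper()}-{rec.birth_year}"


def shared_category(a: Record, b: Record) -> str:
    """Return which kind of matching category, if any, the two records share."""
    det_a, det_b = deterministic_category(a), deterministic_category(b)
    if det_a is not None and det_a == det_b:
        return "deterministic"
    # Reached only when no deterministic category is shared.
    if probabilistic_category(a) == probabilistic_category(b):
        return "probabilistic"
    return "none"


r1 = Record("123-45", "Smith", 1980)
r2 = Record(None, "Smyth", 1980)
print(shared_category(r1, r2))  # prints "probabilistic"
```

The probabilistic check runs only as a fallback, which mirrors the "responsive to determining the first data record does not share a deterministic matching category" wording.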

First claim

Opening claim text (preview).

What is claimed is:

1. A method for data management, the method comprising:
   a computer selecting a first data record and a second data record, wherein one or more attributes of the data records used by a deterministic algorithm are independent of one or more attributes of the data records used by a probabilistic algorithm, and wherein the one or more attributes of the data records used by the deterministic algorithm overlap the one or more attributes of the data records used by the probabilistic algorithm;
   the computer determining whether the first data record and the second data record share a deterministic matching category, wherein a matching category includes one or more data record types;
   responsive to determining the first data record does not share a deterministic matching category with the second data record, the computer determining whether the first data record and the second data record share a probabilistic matching category;
   responsive to determining that the first data record and the second data record share a probabilistic matching category, the computer determining whether one or more additional records in the probabilistic matching category match;
   the computer setting a comparison score as a number of matching records in the probabilistic matching category;
   the computer determining whether the comparison score meets or exceeds a predetermined threshold value;
   responsive to determining that the comparison score meets or exceeds the predetermined threshold value, the computer retrieving matched data records; and
   subsequent to retrieving matched data records, the computer applying data steward rules to the matched data records, wherein the data steward rules define additional requirements to determine a final match outcome.

2. The method of claim 1, wherein the computer determining a comparison score for the first data record and the second data record further comprises: the computer determining at least one of an attribute of the first data record matches at least one of an attribute of the second data record; and the computer setting the comparison score as a number of matching attributes between the first data record and the second data record.

3. The method of claim 1, further comprising the step of: responsive to determining that the first data record and the second data record share a deterministic matching category, the computer retrieving matched data records.

4. The method of claim 1, wherein the first data record and the second data record are stored in one or more databases within a clouded federated database system.
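
The scoring, threshold, and data steward steps in the claims can be illustrated with a short sketch. It follows the claim 2 reading, where the comparison score is the number of matching attributes, and it is only an illustration under assumptions: the attribute list `PROBABILISTIC_ATTRS`, the `same_country` rule, and the dictionary record layout are all hypothetical and do not come from the patent text.

```python
from typing import Callable, Dict, List

Rec = Dict[str, str]
StewardRule = Callable[[Rec, Rec], bool]

# Hypothetical attribute list used for the probabilistic comparison.
PROBABILISTIC_ATTRS: List[str] = ["surname", "city", "birth_year", "phone"]


def comparison_score(a: Rec, b: Rec, attrs: List[str]) -> int:
    """Count attributes present in both records with equal values."""
    return sum(1 for k in attrs if k in a and k in b and a[k] == b[k])


def final_match(a: Rec, b: Rec, threshold: int, rules: List[StewardRule]) -> bool:
    """Apply the threshold first, then every data steward rule."""
    if comparison_score(a, b, PROBABILISTIC_ATTRS) < threshold:
        return False
    # Steward rules add requirements on top of a score that met the threshold.
    return all(rule(a, b) for rule in rules)


def same_country(a: Rec, b: Rec) -> bool:
    # Hypothetical steward rule: both records must list the same country.
    return a.get("country") == b.get("country")


a = {"surname": "Smith", "city": "Austin", "birth_year": "1980",
     "phone": "555-0100", "country": "US"}
b = {"surname": "Smith", "city": "Austin", "birth_year": "1980",
     "phone": "555-0199", "country": "US"}

print(comparison_score(a, b, PROBABILISTIC_ATTRS))  # prints 3
print(final_match(a, b, threshold=3, rules=[same_country]))  # prints True
```

Keeping the steward rules as plain callables means a reviewer can add or remove requirements without touching the scoring logic, which fits the claim's description of the rules as "additional requirements" applied after retrieval.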

Assignees

IBM

Inventors

Classifications

G06F17/30985 (Primary)
Physics · mapped topic
G06F17/30303
Physics · mapped topic
G06F16/215 (Primary)
Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title
G06F16/90344 (Primary)
by using string matching techniques · CPC title

Patent family

Related publications grouped by family.

View patent family 52584731

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9600602B2 cover?: A method for data management. The method includes a computer selecting a first data record and a second data record. The computer determines whether the first data record and the second data record share a deterministic matching category. Responsive to determining the first data record does not share a deterministic matching category with the second data record, the computer determines whether …
Who is the assignee on this patent?: IBM