System and method for providing trusted links between applications
US-11972029-B2 · Apr 30, 2024 · US
US9600509B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9600509-B2 |
| Application number | US-34191308-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 22, 2008 |
| Priority date | Dec 21, 2007 |
| Publication date | Mar 21, 2017 |
| Grant date | Mar 21, 2017 |
Official abstract text for this publication.
To facilitate access to public records, the present inventors devised, among other things, an entity resolution system. The exemplary system includes a master records database of 300 million entities, which is partitioned into multiple distinct portions. The exemplary system extracts entity information from input public records and constructs one or more blocking queries against specific portions of the master records database to identify one or more sets of candidate records. Feature vectors are defined for the candidate records, and machine learning techniques, such as a Support Vector Machine, are used to determine which of the candidate records from the master records database match the input public records. Candidate records that match are logically associated with public records, enabling ready access via direct or indirect queries.
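The flow described in the abstract (blocking queries to gather candidates, then per-field comparison to decide on a match) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the in-memory `MASTER` list, the field names, the narrow-to-broad query order, and the 0.9 threshold. A mean of `difflib.SequenceMatcher` ratios stands in for the trained classifier (the abstract names a Support Vector Machine), so this shows the shape of the pipeline rather than the patented method.

```python
from difflib import SequenceMatcher

# Hypothetical in-memory stand-in for the master records database.
MASTER = [
    {"id": 1, "first": "John", "last": "Smith", "city": "Austin", "ssn": "111-22-3333"},
    {"id": 2, "first": "Jon", "last": "Smith", "city": "Austin", "ssn": ""},
    {"id": 3, "first": "John", "last": "Smith", "city": "Dallas", "ssn": ""},
    {"id": 4, "first": "Mary", "last": "Jones", "city": "Austin", "ssn": ""},
]

# Blocking queries, tried from narrowest to broadest (assumed ordering).
BLOCKING_QUERIES = [("ssn",), ("last", "first", "city"), ("last", "first")]
COMPARE_FIELDS = ("first", "last", "city")


def run_blocking_query(record, fields):
    """Return master records that agree exactly with `record` on all `fields`."""
    if any(not record.get(f) for f in fields):
        return []  # the input record lacks a value this query needs
    return [m for m in MASTER
            if all(m[f].lower() == record[f].lower() for f in fields)]


def field_similarity(a, b):
    """Similarity in [0.0, 1.0]; 1.0 means the two strings are identical."""
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def feature_vector(record, candidate):
    """One similarity score per field that is present in the input record."""
    return [field_similarity(record[f], candidate.get(f, ""))
            for f in COMPARE_FIELDS if record.get(f)]


def score(record, candidate):
    """Mean similarity: a simple stand-in for a trained classifier."""
    fv = feature_vector(record, candidate)
    return sum(fv) / len(fv) if fv else 0.0


def resolve(record, threshold=0.9):
    """Try each blocking query in turn; stop at the first one that yields matches."""
    for fields in BLOCKING_QUERIES:
        candidates = run_blocking_query(record, fields)
        matches = [c["id"] for c in candidates if score(record, c) >= threshold]
        if matches:
            return fields, matches
    return None, []
```

For example, `resolve({"first": "John", "last": "Smith", "city": "Austin", "ssn": ""})` skips the social security number query because the input has no value for it, then finds a match with the name-and-city query. Blocking keeps the expensive comparison step limited to a small candidate set instead of the whole master database.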
Opening claim text (preview).
What is claimed is:
1. A system comprising: one or more processors; an entity resolution database (“ERD”) resolution engine adapted to retrieve, responsive to a first set of data in one or more data fields in a public record, a set of candidate named entity records from a master named entity database based on one of a set of two or more blocking queries, wherein each blocking query in the set of two or more blocking queries comprises a query for a last name and a first name, and a city name, all extracted from the public record, and a query for a last name and a first name, all from the public record; the ERD resolution engine further adapted to automatically determine a permutation for each blocking query in the set of two or more blocking queries and an order of execution for the set of two or more blocking queries based on the first set of data; the ERD resolution engine further adapted to calculate similarity scores for the first set of data in the one or more of the data fields in the public record and a second set of data in a set of data fields in the set of candidate named entity records by comparing the second set of data in the set of data fields in the set of candidate named entity records retrieved by the set of blocking queries with the first set of data in the one or more data fields in the public record; and the ERD resolution engine further adapted to determine a confidence rating for one or more of the set of similarity scores between the public record and the candidate named entity record.
2. The system of claim 1, wherein the ERD resolution engine is further adapted to, responsive to the confidence rating, determine whether to retrieve another set of candidate named entity records from the master named entity database based on another of the set of two or more blocking queries.
3. The system of claim 2, wherein the other of the set of two or more blocking queries is broader in scope that the one blocking query.
4. The system of claim 1, wherein the set of blocking queries includes: a query for a social security number from the public record; a query for a last name and a first name, and a city name, all extracted from the public record; and a query for a last name and a first name, all from the public record.
5. The system of claim 4, wherein the system is implemented as a client-server architecture and one or more of the processors is a component of a web server and wherein one or more client access devices interface with the web server via a wide or local area network to request and receive public record information.
6. The system of claim 1 wherein the master named entity database is partitioned into a number of blocks based on corresponding hashes of a name field associated with each record in the master named entity database.
7. The system of claim 1 wherein each similarity score ranges from 0 and 1.0, wherein 0 indicates a non-match and 1.0 indicates an identical match.
8. The system of claim 1 further comprising a lookup table for determining whether one or more of the blocking queries will return a number of candidate named entity records in excess of a threshold.
9. The system of claim 1, wherein one or more of the recited means is implemented using in combination machine-executable instruction sets stored on a machine-readable magnetic, electrical, or optical medium, with the instruction sets executed using one or more processors.
10. A method comprising: retrieving a set of candidate named entity records from a master named entity database based on one of a set of two or more blocking queries, with each blocking query based on one or more data fields in a public record, and wherein each blocking query comprises a query for a last name and a first name, and a city name, all extracted from the public record, and a query for a last name and a first name, all from the public record, and wherein a permutation for each blocking query in the set of two or more blocking queries and an order of execution for the set of two or more blocking queries is automatically determined based on the one or more data fields in the public record; calculating similarity scores for one or more of the data fields in the public record and a set of data fields in the set of candidate named entity records by comparing the set of data fields in the set of candidate named entity records retrieved by the set of blocking queries with the one or more data fields in the public record; and determining a confidence rating for one or more of the set of similarity scores between the public record and the candidate named entity record.
11. The method of claim 10, further comprising: determining whether to retrieve another set of candidate named entity records from the master named entity database based on another of the set of two or more blocking queries.
12. The method of claim 11, wherein the other of the set of two or more blocking queries is broader in scope that the one blocking query.
13. The method of claim 10, wherein the set of blocking queries includes: a query for a social security number extracted from the public record; a query for a last name and a first name, and a city name, all extracted from the public record; and a query for a last name and a first name, all extracted from the public record.
14. The method of claim 10 wherein the master named entity database is partitioned into a number of blocks based on corresponding hashes of a name field associated with each record in the master named entity database.
15. The method of claim 10 wherein each similarity score ranges from 0 and 1.0, wherein 0 indicates a non-match and 1.0 indicates an identical match.
16. The method of claim 10 further comprising: using a lookup table to determine whether the one of the blocking queries will return a number of candidate named entity records in excess of a threshold.
17. An entity resolution system comprising: a computer based system comprising an input adapted to receive user-defined inputs, a processor adapted to process executable code and user-defined inputs and a memory adapted to store the executable code and user-defined inputs, the executable code comprising: a retrieval code set stored on the memory, when executed by the processor, being responsive to a first set of data in one or more data fields in a public record and adapted to retrieve a set of candidate named entity records from a master named entity database based on one of a set of two or more blocking queries, wherein each blocking query in the set of two or more blocking queries includes a query for a last name and a first name, and a city name, all extracted from the public record, and a query for a last name and a first name, all from the public record; the retrieval set of code further adapted to automatically determine a permutation for each blocking query in the set of two or more blocking queries and an order of execution for the set of two or more blocking queries based on the first set of data; a matching code set stored on the memory and being adapted to, when executed by the processor, calculate similarity scores for the first set of data in the one or more of the data fields in the public record and a second set of data from a set of data fields in the set of candidate named entity records by comparing the second set of data from the set of data fields in the set of candidate named entity records retrieved by the set of blocking queries with the first set of data from the one or more data fields in the public record; and a confidence code set stored on the memory and being
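Two of the dependent claims in this preview lend themselves to a short illustration: partitioning the master database into blocks by a hash of a name field, and consulting a lookup table to predict whether a blocking query would return more candidates than a threshold allows. The Python sketch below is a rough illustration under stated assumptions, not the patented implementation. The block count, the choice of SHA-1, the `(last, first)` key, and all function names are hypothetical.

```python
import hashlib
from collections import Counter

NUM_BLOCKS = 8  # assumed number of partitions, chosen only for this example


def block_for(name):
    """Map a name to a partition using a stable hash of its normalized form."""
    normalized = name.strip().lower().encode("utf-8")
    digest = hashlib.sha1(normalized).hexdigest()
    return int(digest, 16) % NUM_BLOCKS


def build_frequency_table(records):
    """Count master records per (last, first) key; serves as the lookup table."""
    return Counter((r["last"].lower(), r["first"].lower()) for r in records)


def exceeds_threshold(table, last, first, threshold):
    """Predict, without running the query, whether it would return too many rows."""
    return table[(last.lower(), first.lower())] > threshold
```

A caller could use `exceeds_threshold` to skip a broad name-only query for a very common name and prefer a narrower query instead, while `block_for` ensures that a query for a given name only needs to touch one partition.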
Updating · CPC title
Physics · mapped topic
Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title