Augmenting match indices
US-2018165354-A1 · Jun 14, 2018 · US
US10810233B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10810233-B2 |
| Application number | US-201715844311-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 15, 2017 |
| Priority date | Dec 15, 2017 |
| Publication date | Oct 20, 2020 |
| Grant date | Oct 20, 2020 |
Official abstract text for this publication.
A method for linking records from different datasets based on record similarities is described. The method includes ingesting a first dataset, including a first set of records with a first set of fields, wherein the first dataset is associated with a first vendor and a first type of data, and a second dataset, including a second set of records with a second set of fields, wherein the second dataset is associated with a second vendor and a second type of data; determining that a first record from the first set of records is similar to a second record from the second set of records based on similarities between fields in the first and second sets of fields; and linking the first and second records in response to determining the similarity, wherein the first and second vendors are different and/or the first and second types of data are different.
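As a rough illustration of the abstract, the sketch below links records from two datasets whose field sets only partly overlap. The field names, the sample records, the `difflib`-based similarity score, and the 0.85 cutoff are all assumptions made for illustration; the abstract does not prescribe a particular similarity measure.

```python
from difflib import SequenceMatcher

# Hypothetical datasets from two different vendors holding different types of data.
# "company_name" is the only field the two record layouts share.
firmographic = [  # vendor A: company data
    {"id": "A1", "company_name": "Acme Corporation", "revenue": 5_000_000},
    {"id": "A2", "company_name": "Globex Inc", "revenue": 12_000_000},
]
contacts = [  # vendor B: contact data
    {"id": "B1", "company_name": "ACME Corporation", "email": "info@acme.example"},
    {"id": "B2", "company_name": "Initech", "email": "hello@initech.example"},
]


def field_similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1] (an illustrative measure)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def link_records(first, second, shared_field, threshold=0.85):
    """Return (first_id, second_id) pairs whose shared field is similar enough."""
    links = []
    for r1 in first:
        for r2 in second:
            if field_similarity(r1[shared_field], r2[shared_field]) >= threshold:
                links.append((r1["id"], r2["id"]))
    return links


links = link_records(firmographic, contacts, "company_name")
print(links)
```

The nested loop compares every pair of records, which is fine for a toy example but would likely need blocking or indexing at realistic dataset sizes.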
Opening claim text (preview).
What is claimed is:

1. A method for linking records from different datasets based on similarities of the records, wherein the method comprises: ingesting, by a data as a service (DAAS) system, a first dataset including a first set of records with a first set of fields, wherein the first dataset is associated with a first vendor and a first type of data; ingesting, by the DAAS system, a second dataset including a second set of records with a second set of fields, wherein the second dataset is associated with a second vendor and a second type of data; determining, by the DAAS system, that a first record from the first set of records is similar to a second record from the second set of records based on similarities between fields in the first set of fields and fields in the second set of fields; linking, by the DAAS system, the first record to the second record in response to determining that the first record is similar to the second record, wherein the first vendor is different than the second vendor and the first type of data is different than the second type of data such that at least one field in the first set of fields is not a field in the second set of fields or at least one field in the second set of fields is not a field in the first set of fields; performing a query for a customer of the DAAS system to generate an initial query result, wherein the customer has access to the first and second datasets and the initial query result includes the first record; and adding, following performance of the query, the second record to the initial query result to generate a final query result based on the link between the first record and the second record that indicates that the second record is similar to the first record.

2. The method of claim 1, wherein determining that the first record is similar to the second record comprises: generating one or more match keys for the first record based on one or more fields in the first set of fields; and generating one or more match keys for the second record based on one or more fields in the second set of fields.

3. The method of claim 2, wherein determining that the first record is similar to the second record further comprises: determining that the one or more match keys for the first record are identical or within a threshold of the one or more match keys for the second record.

4. The method of claim 3, further comprising: making the final query result available to the customer.

5. The method of claim 4, wherein the query is a match query that updates records previously provided to the customer.

6. The method of claim 4, wherein the final query result indicates that the second record is a recommended record for the customer based on the link to the first record.

7. The method of claim 3, wherein each match key in the one or more match keys for the first record is a combination of two or more fields in the first set of fields or a single field in the first set of fields and each match key in the one or more match keys for the second record is a combination of two or more fields in the second set of fields or a single field in the second set of fields.

8. The method of claim 1, wherein the ingesting of the first dataset is based on first ingestion metadata that maps one or more fields in the first set of fields to a set of fields defined in the data as a service system, and wherein the ingesting of the second dataset is based on second ingestion metadata that maps one or more fields in the second set of fields to the set of fields defined in the data as a service system.

9. A non-transitory machine readable medium that stores instructions that, when executed by a processor of an electronic device, cause the electronic device to: ingest a first dataset including a first set of records with a first set of fields, wherein the first dataset is associated with a first vendor and a first type of data; ingest a second dataset including a second set of records with a second set of fields, wherein the second dataset is associated with a second vendor and a second type of data; determine that a first record from the first set of records is similar to a second record from the second set of records based on similarities between fields in the first set of fields and fields in the second set of fields; link the first record to the second record in response to determining that the first record is similar to the second record, wherein the first vendor is different than the second vendor and the first type of data is different than the second type of data such that at least one field in the first set of fields is not a field in the second set of fields or at least one field in the second set of fields is not a field in the first set of fields; perform a query for a customer of a data as a service system to generate an initial query result, wherein the customer has access to the first dataset and the second dataset and the initial query result includes the first record; and add, following performance of the query, the second record to the initial query result to generate a final query result based on the link between the first record and the second record that indicates that the second record is similar to the first record.

10. The non-transitory machine readable medium of claim 9, wherein determining that the first record is similar to the second record comprises: generating one or more match keys for the first record based on one or more fields in the first set of fields; and generating one or more match keys for the second record based on one or more fields in the second set of fields.

11. The non-transitory machine readable medium of claim 10, wherein determining that the first record is similar to the second record further comprises: determining that the one or more match keys for the first record are identical or within a threshold of the one or more match keys for the second record.

12. The non-transitory machine readable medium of claim 11, wherein the instructions further cause the electronic device to: make the final query result available to the customer.

13. The non-transitory machine readable medium of claim 12, wherein the query is a match query that updates records previously provided to the customer.

14. The non-transitory machine readable medium of claim 12, wherein the final query result indicates that the second record is a recommended record for the customer based on the link to the first record.

15. The non-transitory machine readable medium of claim 11, wherein each match key in the one or more match keys for the first record is a combination of two or more fields in the first set of fields or a single field in the first set of fields and each match key in the one or more match keys for the second record is a combination of two or more fields in the second set of fields or a single field in the second set of fields.

16. The non-transitory machine readable medium of claim 9, wherein the ingesting of the first dataset is based on first ingestion metadata that maps one or more fields in the first set of fields to a set of fields defined in a data as a service system, and wherein the ingesting of the second dataset is based on second ingestion metadata that maps one or more fields in the second set of fields to the set of fields defined in the data as a service system.

17. A data as a service system for linking records from different datasets based on similarities of the records, wherein the data as a service system comprises: a set of memory devices; and a processor coupled to the set of memory dev
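The claimed flow (ingestion metadata that maps vendor fields to system-defined fields, match keys built from one field or a combination of fields, links between matching records, and a query result augmented with linked records) can be sketched as follows. This is a minimal sketch, not the patented implementation: every field name, mapping, record, and helper function is hypothetical, and only the "identical match keys" case is shown, with the "within a threshold" comparison left out.

```python
import re


def ingest(records, field_map):
    """Rename vendor-specific fields to system-defined names (hypothetical metadata)."""
    return [{field_map.get(k, k): v for k, v in rec.items()} for rec in records]


def normalize(value):
    """Lowercase and drop non-alphanumerics so cosmetic differences vanish."""
    return re.sub(r"[^a-z0-9]", "", str(value).lower())


def match_keys(record, key_specs):
    """Build one key per spec; a spec names a single field or a combination of fields."""
    keys = set()
    for spec in key_specs:
        if all(record.get(f) for f in spec):
            keys.add("|".join(normalize(record[f]) for f in spec))
    return keys


def build_links(first, second, key_specs):
    """Map each first-dataset id to the second-dataset ids sharing an identical key."""
    index = {}
    for rec in second:
        for key in match_keys(rec, key_specs):
            index.setdefault(key, []).append(rec["id"])
    links = {}
    for rec in first:
        for key in match_keys(rec, key_specs):
            for other_id in index.get(key, []):
                links.setdefault(rec["id"], set()).add(other_id)
    return links


def augmented_query(first, second_by_id, links, predicate):
    """Run the query, then append linked records to form the final result."""
    initial = [r for r in first if predicate(r)]
    final = list(initial)
    for rec in initial:
        for other_id in sorted(links.get(rec["id"], ())):
            final.append(second_by_id[other_id])
    return initial, final


# Two vendors, two data types, and only partly overlapping fields (all invented).
vendor_a = ingest(
    [{"id": "A1", "Name": "Acme Corp.", "Zip": "94105", "Revenue": 5_000_000},
     {"id": "A2", "Name": "Globex", "Zip": "10001", "Revenue": 12_000_000}],
    {"Name": "company", "Zip": "postal_code"},
)
vendor_b = ingest(
    [{"id": "B1", "org": "ACME Corp", "postal": "94105", "Email": "info@acme.example"},
     {"id": "B2", "org": "Initech", "postal": "73301", "Email": "hi@initech.example"}],
    {"org": "company", "postal": "postal_code"},
)

specs = [("company", "postal_code")]  # one match key combining two fields
links = build_links(vendor_a, vendor_b, specs)
initial, final = augmented_query(
    vendor_a, {r["id"]: r for r in vendor_b}, links,
    lambda r: r["Revenue"] < 10_000_000,
)
print([r["id"] for r in initial], [r["id"] for r in final])
```

Computing the links once at ingestion time, as this sketch does, keeps the query step to a simple lookup; whether the claimed system precomputes links this way is an assumption here.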
Distributed queries · CPC title
Integrating or interfacing systems involving database management systems · CPC title
Search customisation based on user profiles and personalisation · CPC title
Updates performed during online database operations; commit processing · CPC title
Clustering or classification · CPC title