Linking records between datasets to augment query results

US10810233B2 · US · B2

Patent metadata
Publication number: US-10810233-B2
Application number: US-201715844311-A
Country: US
Kind code: B2
Filing date: Dec 15, 2017
Priority date: Dec 15, 2017
Publication date: Oct 20, 2020
Grant date: Oct 20, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for linking records from different datasets based on record similarities is described. The method includes ingesting a first dataset, including a first set of records with a first set of fields, wherein the first dataset is associated with a first vendor and a first type of data, and a second dataset, including a second set of records with a second set of fields, wherein the second dataset is associated with a second vendor and a second type of data; determining that a first record from the first set of records is similar to a second record from the second set of records based on similarities between fields in the first and second set of fields; and linking the first and second records in response to determining that the similarity, wherein the first and second vendors are different and/or the first and second types of data are different.
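The linking step described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the field names, the sample records, the averaging of per-field scores, and the 0.85 threshold are all assumptions made for illustration, and `difflib.SequenceMatcher` is simply one readily available way to score string similarity.

```python
from difflib import SequenceMatcher


def field_similarity(a: str, b: str) -> float:
    """Score two field values between 0.0 and 1.0 after light normalization."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def records_similar(rec_a: dict, rec_b: dict, threshold: float = 0.85) -> bool:
    """Compare only the fields both records share; average the per-field scores."""
    shared = [field for field in rec_a if field in rec_b]
    if not shared:
        return False
    total = sum(field_similarity(str(rec_a[f]), str(rec_b[f])) for f in shared)
    return total / len(shared) >= threshold


def link_records(first_dataset: list, second_dataset: list, threshold: float = 0.85) -> list:
    """Return (first_index, second_index) pairs for records judged similar."""
    return [
        (i, j)
        for i, rec_a in enumerate(first_dataset)
        for j, rec_b in enumerate(second_dataset)
        if records_similar(rec_a, rec_b, threshold)
    ]


# Hypothetical data: two vendors, two data types, partly different fields.
first_dataset = [
    {"company_name": "Acme Corp", "city": "San Francisco", "revenue": "10M"},
    {"company_name": "Globex", "city": "Springfield", "revenue": "5M"},
]
second_dataset = [
    {"company_name": "ACME Corp", "city": "San Francisco", "phone": "555-0100"},
    {"company_name": "Initech", "city": "Austin", "phone": "555-0199"},
]

links = link_records(first_dataset, second_dataset)
print(links)  # [(0, 0)]
```

The pairwise loop is quadratic in the number of records, which is fine for a sketch but would likely need blocking or indexing for datasets of any real size.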

First claim

Opening claim text (preview).

What is claimed is:

1. A method for linking records from different datasets based on similarities of the records, wherein the method comprises: ingesting, by a data as a service (DAAS) system, a first dataset including a first set of records with a first set of fields, wherein the first dataset is associated with a first vendor and a first type of data; ingesting, by the DAAS system, a second dataset including a second set of records with a second set of fields, wherein the second dataset is associated with a second vendor and a second type of data; determining, by the DAAS system, that a first record from the first set of records is similar to a second record from the second set of records based on similarities between fields in the first set of fields and fields in the second set of fields; linking, by the DAAS system, the first record to the second record in response to determining that the first record is similar to the second record, wherein the first vendor is different than the second vendor and the first type of data is different than the second type of data such that at least one field in the first set of fields is not a field in the second set of fields or at least one field in the second set of fields is not a field in the first set of fields; performing a query for a customer of the DAAS system to generate an initial query result, wherein the customer has access to the first and second datasets and the initial query result includes the first record; and adding, following performance of the query, the second record to the initial query result to generate a final query result based on the link between the first record and the second record that indicates that the second record is similar to the first record.

2. The method of claim 1, wherein determining that the first record is similar to the second record comprises: generating one or more match keys for the first record based on one or more fields in the first set of fields; and generating one or more match keys for the second record based on one or more fields in the second set of fields.

3. The method of claim 2, wherein determining that the first record is similar to the second record further comprises: determining that the one or more match keys for the first record are identical or within a threshold of the one or more match keys for the second record.

4. The method of claim 3, further comprising: making the final query result available to the customer.

5. The method of claim 4, wherein the query is a match query that updates records previously provided to the customer.

6. The method of claim 4, wherein the final query result indicates that the second record is a recommended record for the customer based on the link to the first record.

7. The method of claim 3, wherein each match key in the one or more match keys for the first record is a combination of two or more fields in the first set of fields or a single field in the first set of fields and each match key in the one or more match keys for the second record is a combination of two or more fields in the second set of fields or a single field in the second set of fields.

8. The method of claim 1, wherein the ingesting of the first dataset is based on first ingestion metadata that maps one or more fields in the first set of fields to a set of fields defined in the data as a service system, and wherein the ingesting of the second dataset is based on second ingestion metadata that maps one or more fields in the second set of fields to the set of fields defined in the data as a service system.

9. A non-transitory machine readable medium that stores instructions that, when executed by a processor of an electronic device, cause the electronic device to: ingest a first dataset including a first set of records with a first set of fields, wherein the first dataset is associated with a first vendor and a first type of data; ingest a second dataset including a second set of records with a second set of fields, wherein the second dataset is associated with a second vendor and a second type of data; determine that a first record from the first set of records is similar to a second record from the second set of records based on similarities between fields in the first set of fields and fields in the second set of fields; link the first record to the second record in response to determining that the first record is similar to the second record, wherein the first vendor is different than the second vendor and the first type of data is different than the second type of data such that at least one field in the first set of fields is not a field in the second set of fields or at least one field in the second set of fields is not a field in the first set of fields; perform a query for a customer of a data as a service system to generate an initial query result, wherein the customer has access to the first dataset and the second dataset and the initial query result includes the first record; and add, following performance of the query, the second record to the initial query result to generate a final query result based on the link between the first record and the second record that indicates that the second record is similar to the first record.

10. The non-transitory machine readable medium of claim 9, wherein determining that the first record is similar to the second record comprises: generating one or more match keys for the first record based on one or more fields in the first set of fields; and generating one or more match keys for the second record based on one or more fields in the second set of fields.

11. The non-transitory machine readable medium of claim 10, wherein determining that the first record is similar to the second record further comprises: determining that the one or more match keys for the first record are identical or within a threshold of the one or more match keys for the second record.

12. The non-transitory machine readable medium of claim 11, wherein the instructions further cause the electronic device to: make the final query result available to the customer.

13. The non-transitory machine readable medium of claim 12, wherein the query is a match query that updates records previously provided to the customer.

14. The non-transitory machine readable medium of claim 12, wherein the final query result indicates that the second record is a recommended record for the customer based on the link to the first record.

15. The non-transitory machine readable medium of claim 11, wherein each match key in the one or more match keys for the first record is a combination of two or more fields in the first set of fields or a single field in the first set of fields and each match key in the one or more match keys for the second record is a combination of two or more fields in the second set of fields or a single field in the second set of fields.

16. The non-transitory machine readable medium of claim 9, wherein the ingesting of the first dataset is based on first ingestion metadata that maps one or more fields in the first set of fields to a set of fields defined in a data as a service system, and wherein the ingesting of the second dataset is based on second ingestion metadata that maps one or more fields in the second set of fields to the set of fields defined in the data as a service system.

17. A data as a service system for linking records from different datasets based on similarities of the records, wherein the data as a service system comprises: a set of memory devices; and a processor coupled to the set of memory dev
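As a rough illustration of how the claimed steps fit together (ingestion metadata that maps vendor fields to system fields, match keys built from one field or a combination of fields, a threshold comparison of keys, and adding linked records to a query result), here is a minimal Python sketch. Every name in it is hypothetical: the vendor field names, the mappings, `KEY_SPECS`, the 0.9 threshold, and the use of `difflib.SequenceMatcher` as the closeness measure are assumptions, not details taken from the patent.

```python
from difflib import SequenceMatcher

# Hypothetical ingestion metadata: vendor field name -> system-defined field name.
VENDOR_A_MAPPING = {"CompanyName": "name", "HQCity": "city", "AnnualRevenue": "revenue"}
VENDOR_B_MAPPING = {"org": "name", "town": "city", "tech_stack": "technologies"}

# Each match key spec is a single field or a combination of fields.
KEY_SPECS = [("name",), ("name", "city")]


def ingest(raw_records, mapping):
    """Rename vendor fields to system fields; unmapped fields are dropped."""
    return [{mapping[k]: v for k, v in rec.items() if k in mapping} for rec in raw_records]


def match_keys(record, key_specs=KEY_SPECS):
    """Build one normalized key per spec whose fields are all present and non-empty."""
    keys = {}
    for spec in key_specs:
        if all(record.get(field) for field in spec):
            keys[spec] = "|".join(str(record[field]).lower().strip() for field in spec)
    return keys


def keys_match(keys_a, keys_b, threshold=0.9):
    """True if any pair of keys built from the same spec is identical or close enough."""
    for spec, key_a in keys_a.items():
        key_b = keys_b.get(spec)
        if key_b is not None and SequenceMatcher(None, key_a, key_b).ratio() >= threshold:
            return True
    return False


def build_links(first, second, threshold=0.9):
    """Map each index in `first` to the indices of linked records in `second`."""
    links = {}
    for i, rec_a in enumerate(first):
        keys_a = match_keys(rec_a)
        for j, rec_b in enumerate(second):
            if keys_match(keys_a, match_keys(rec_b), threshold):
                links.setdefault(i, []).append(j)
    return links


def augmented_query(first, second, links, predicate):
    """Run the query on `first`, then append records linked to each hit."""
    final = []
    for i, rec in enumerate(first):
        if predicate(rec):
            final.append(rec)
            final.extend(second[j] for j in links.get(i, []))
    return final


# Hypothetical raw data from two vendors with different field sets.
raw_a = [
    {"CompanyName": "Acme Corp", "HQCity": "Austin", "AnnualRevenue": 1000000},
    {"CompanyName": "Globex", "HQCity": "Boston", "AnnualRevenue": 5000000},
]
raw_b = [
    {"org": "Acme Corp.", "town": "Austin", "tech_stack": "Python"},
    {"org": "Initech", "town": "Dallas", "tech_stack": "Java"},
]

first = ingest(raw_a, VENDOR_A_MAPPING)
second = ingest(raw_b, VENDOR_B_MAPPING)
links = build_links(first, second)
final_result = augmented_query(first, second, links, lambda r: r["city"] == "Austin")
print(final_result)
```

In this sketch the links are computed once, ahead of the query, so the query itself only needs a dictionary lookup to pull in the linked records. Comparing keys only when they come from the same spec avoids matching a name-only key against a name-and-city key by accident.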

Assignees

  • Salesforce Com Inc

Classifications

  • Distributed queries · CPC title

  • Integrating or interfacing systems involving database management systems · CPC title

  • Search customisation based on user profiles and personalisation · CPC title

  • Updates performed during online database operations; commit processing · CPC title

  • G06F16/285 · Primary

    Clustering or classification · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10810233B2 cover?
A method for linking records from different datasets based on record similarities is described. The method includes ingesting a first dataset, including a first set of records with a first set of fields, wherein the first dataset is associated with a first vendor and a first type of data, and a second dataset, including a second set of records with a second set of fields, wherein the second dat…
Who is the assignee on this patent?
Salesforce Com Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/2471. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 20, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).