What technology area does this patent fall under?

Primary CPC classification G06F16/906. Mapped technology areas include Physics.

When was this patent published?

Publication date Oct 8, 2024 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Automatic discovery of related data records

US12111870B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12111870-B2
Application number	US-202117213946-A
Country	US
Kind code	B2
Filing date	Mar 26, 2021
Priority date	Mar 26, 2021
Publication date	Oct 8, 2024
Grant date	Oct 8, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are provided for automatic discovery of data records. One method comprises obtaining data records each corresponding to a different item and comprising features extracted from a data source, wherein the data records identify related items identified using a collaborative filter that relates items based on user preferences; generating an item network comprising multiple nodes each corresponding to a different item, where two nodes are connected by an edge based on: (i) an item type of the two nodes, (ii) a ratio of numerical values associated with the two nodes, and/or (iii) a pairwise configuration similarity score for the two nodes; clustering the nodes into node clusters based on topological properties of the item network; and identifying items related to a given item that (i) share an edge with the given item and (ii) are in a node cluster comprising a node of the given item.
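
The flow described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the `Item` fields, the threshold values (0.8 and 0.5), the token-set Jaccard score, and the triangle-based clustering are all hypothetical choices made for the example. The abstract only says that clustering uses topological properties of the item network; the clustering below is a simple stand-in.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Item:
    # Hypothetical record layout; the patent does not prescribe field names.
    item_id: str
    item_type: str
    price: float
    config: frozenset  # configuration tokens, e.g. {"i7", "16gb"}


def price_ratio(a, b):
    """Ratio of the smaller price to the larger one, in (0, 1]."""
    return min(a.price, b.price) / max(a.price, b.price)


def config_similarity(a, b):
    """Jaccard overlap of configuration tokens (one possible score)."""
    union = a.config | b.config
    return len(a.config & b.config) / len(union) if union else 0.0


def build_network(items, min_ratio=0.8, min_similarity=0.5):
    """Connect two items only when type, price ratio and similarity all pass."""
    adjacency = {item.item_id: set() for item in items}
    for a, b in combinations(items, 2):
        if (a.item_type == b.item_type
                and price_ratio(a, b) >= min_ratio
                and config_similarity(a, b) >= min_similarity):
            adjacency[a.item_id].add(b.item_id)
            adjacency[b.item_id].add(a.item_id)
    return adjacency


def cluster_by_triangles(adjacency):
    """Stand-in clustering (an assumption): keep edges whose endpoints share
    a neighbour, then label the connected components of what remains."""
    strong = {n: {m for m in nbrs if adjacency[n] & adjacency[m]}
              for n, nbrs in adjacency.items()}
    cluster_of, next_id = {}, 0
    for start in sorted(strong):
        if start in cluster_of:
            continue
        cluster_of[start] = next_id
        stack = [start]
        while stack:
            node = stack.pop()
            for nbr in strong[node]:
                if nbr not in cluster_of:
                    cluster_of[nbr] = next_id
                    stack.append(nbr)
        next_id += 1
    return cluster_of


def related_items(item_id, adjacency, cluster_of):
    """Items that share an edge with item_id AND sit in the same cluster."""
    return sorted(n for n in adjacency[item_id]
                  if cluster_of[n] == cluster_of[item_id])


items = [
    Item("A", "laptop", 1000.0, frozenset({"i7", "16gb", "512ssd", "15in"})),
    Item("B", "laptop", 1050.0, frozenset({"i7", "16gb", "512ssd", "14in"})),
    Item("C", "laptop", 1100.0, frozenset({"i7", "16gb", "512ssd", "13in"})),
    Item("D", "laptop", 1150.0, frozenset({"i7", "16gb", "1tbssd", "13in"})),
    Item("E", "monitor", 1000.0, frozenset({"15in", "4k"})),
]
adjacency = build_network(items)
cluster_of = cluster_by_triangles(adjacency)
print(related_items("C", adjacency, cluster_of))
```

In this made-up data set, item D shares an edge with C but falls outside C's cluster, so the query for C leaves it out. That is the effect of requiring both conditions (shared edge and shared cluster) rather than either one alone.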

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising:
obtaining a plurality of data records, wherein each data record corresponds to a different one of a plurality of items and comprises a plurality of features extracted from at least one data source, wherein at least one data record associated with a first item identifies at least one related item that is related to the first item, wherein the at least one related item is identified in the plurality of data records using a collaborative filter that relates at least some of the items of the plurality of items based at least in part on preferences of a plurality of users, and wherein the collaborative filter identifies, for a given item, one or more additional items obtained or researched by one or more users that also obtained or researched, respectively, the given item;
generating, using the plurality of data records, an item network comprising a plurality of nodes, wherein each node in the item network corresponds to a different one of the plurality of items, wherein two nodes in the item network are selectively connected by an edge in response to an evaluation of: (i) an item type of the items associated with the two nodes, (ii) a ratio of numerical values associated with the two nodes, and (iii) a pairwise configuration similarity score for the two nodes, wherein the pairwise configuration similarity score for the two nodes is based at least in part on a similarity analysis of a textual description of a configuration of each of the items associated with the two nodes, extracted from the at least one data source, for each of the two nodes, wherein the two nodes in the item network are selectively connected by the edge in response to the evaluation determining that: (i) the respective item types of the items associated with the two nodes satisfy one or more similarity criteria, (ii) the ratio of the numerical values associated with the two nodes satisfies a first designated threshold, and (iii) the pairwise configuration similarity score for the two nodes satisfies a second designated threshold, wherein the first designated threshold and the second designated threshold are distinct and wherein the ratio of the numerical values is distinct from the pairwise configuration similarity score;
clustering the plurality of nodes in the item network into a plurality of node clusters based at least in part on an analysis of one or more topological properties of the item network;
identifying one or more items related to a given item by querying the item network to return the one or more identified related items having a corresponding node in the item network that (i) shares an edge with a node in the item network corresponding to the given item and (ii) are in at least one node cluster comprising a node corresponding to the given item; and
initiating an automated processing of at least a given one of the plurality of data records associated with the given item using at least some of the identified one or more items related to the given item;
wherein the method is performed by at least one processing device comprising a processor coupled to a memory.

2. The method of claim 1, wherein the plurality of items comprises a plurality of products and wherein the features extracted from the at least one data source comprise one or more of a product type, a product name, a product price, a product configuration and a product family.

3. The method of claim 1, wherein the plurality of items comprises a plurality of products and wherein the plurality of features is extracted from the at least one data source for one or more additional products provided by competitors of a provider of a given product.

4. The method of claim 1, wherein the plurality of items comprises a plurality of products and wherein the collaborative filter identifies, for a given product, one or more additional products purchased or researched by customers that also purchased or researched, respectively, the given product.

5. The method of claim 1, wherein the plurality of items comprises a plurality of products and wherein the two nodes in the item network are connected by the edge in response to the two corresponding products having a same product type and having a price ratio that satisfies one or more pricing criteria.

6. The method of claim 1, further comprising adding one or more edges to the item network using a prediction model trained using one or more features of the item network extracted from the item network, wherein the trained prediction model identifies topological link patterns in the item network to predict at least one missing edge to add to the item network, wherein a weight of the at least one added edge is based at least in part on the pairwise configuration similarity score for the two nodes connected by the at least one added edge.

7. The method of claim 1, wherein the nodes in a given cluster are more closely related to the nodes in the given cluster than to the nodes in other clusters.

8. The method of claim 1, wherein the similarity analysis of the textual description of the configuration of each of the items associated with the two nodes comprises one or more of determining a Jaccard similarity and determining a cosine similarity of the configuration of each of the items associated with the two nodes.

9. The method of claim 1, wherein the plurality of items comprises a plurality of products and wherein the identifying one or more items related to the given item comprises identifying, for a given product, one or more additional products that: (i) are associated with nodes in the item network that share an edge with the node associated with the given product and (ii) are found in the same cluster as the given product.

10. The method of claim 1, further comprising querying the item network for a particular item of interest to a particular organization, wherein the query returns one or more items that (i) share an edge with a node in the item network corresponding to the particular item, wherein the one or more items that share the edge with the node corresponding to the particular item comprise items competing with the particular item of interest and (ii) are in at least one node cluster comprising a node corresponding to the particular item, wherein the one or more items in the at least one node cluster comprise similar items competing with the particular item of interest.

11. The method of claim 10, further comprising identifying whether the one or more items returned by the query are provided by one or more of the particular organization and a different organization.

12. The method of claim 1, wherein the evaluation of the pairwise configuration similarity score for the two nodes is performed in response to the evaluation determining that the respective item types of the items associated with the two nodes satisfy the one or more similarity criteria and the ratio of the numerical values associated with the two nodes satisfies the first designated threshold.

13. An apparatus comprising: at least one processing device comprising a processor coupled to a memory; the at least one processing device being configured to implement the following steps: obtaining a plurality of data records, wherein each data record corresponds to a different one of a plurality of items and comprises a plurality of features extracted from at least one data source, wherein at least one data record associated with a first item identifies at least one related item that is related to the first item, wherein the at least one related item is identified in the plurality of data records using a collaborative filter that relates at least some of the items of the plurality of ite…
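
Claim 8 names Jaccard similarity and cosine similarity as options for comparing the textual configuration descriptions of two items. A minimal sketch of both measures follows. The tokenizer, the function names, and the sample configuration strings are assumptions made for illustration; the claims do not specify how the text is tokenized or weighted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumerics (an assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard_similarity(text_a, text_b):
    """Size of the shared token set divided by the size of the combined set."""
    a, b = set(tokenize(text_a)), set(tokenize(text_b))
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the two token-count vectors."""
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


# Hypothetical configuration descriptions for two products.
config_a = "Intel i7, 16GB RAM, 512GB SSD"
config_b = "Intel i7, 32GB RAM, 512GB SSD"

jaccard = jaccard_similarity(config_a, config_b)  # 5 shared of 7 distinct tokens
cosine = cosine_similarity(config_a, config_b)    # dot product 5, norms sqrt(6) each
print(round(jaccard, 3), round(cosine, 3))
```

Either score could then be compared against the second designated threshold of claim 1. Which measure and which threshold value are appropriate is left open by the claims.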

Assignees

EMC IP Holding Co LLC

Inventors

Classifications

G06F16/906 · Primary
Clustering; Classification · CPC title
H04L45/02 · Primary
Topology update or discovery · CPC title
G06F16/9035
Filtering based on additional data, e.g. user or group profiles · CPC title
G06F16/9038
Presentation of query results · CPC title
G06F16/9024
Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title

Patent family

Related publications grouped by family.

View patent family 83364689

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12111870B2 cover?: Techniques are provided for automatic discovery of data records. One method comprises obtaining data records each corresponding to a different item and comprising features extracted from a data source, wherein the data records identify related items identified using a collaborative filter that relates items based on user preferences; generating an item network comprising multiple nodes each cor…
Who is the assignee on this patent?: EMC IP Holding Co LLC