Multi-service business platform system having entity resolution systems and methods

US2023418793A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2023418793-A1
Application number: US-202318244042-A
Country: US
Kind code: A1
Filing date: Sep 8, 2023
Priority date: May 12, 2020
Publication date: Dec 28, 2023
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosure is directed to various ways of improving the functioning of computer systems, information networks, data stores, search engine systems and methods, and other advantages. Among other things, provided herein are methods, systems, components, processes, modules, blocks, circuits, sub-systems, articles, and other elements (collectively referred to in some cases as the “platform” or the “system”) that collectively enable, in one or more datastores (e.g., where each datastore may include one or more databases) and systems, the creation, development, maintenance, and use of a set of custom objects for use in a wide range of activities, including sales activities, marketing activities, service activities, content development activities, and others, as well as improved methods and systems for sales, marketing and services that make use of such entity resolution systems and methods as well as custom objects.

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: processing, by a neural network, a set of entity feature encodings of entities for deduplication to generate a first reduced vector and a second reduced vector; generating representations of likelihoods that entity pairs of the entities are duplicates based upon a dot product of the first reduced vector and the second reduced vector; training a machine learning model using the likelihoods that entity pairs of the entities are duplicates and control values; and utilizing the machine learning model to deduplicate objects within a database.

2. The method of claim 1, wherein the utilizing the machine learning model comprises: matching, by artificial intelligence implemented as a proxy for a deduplication process having computation demand exceed a threshold, strings associated with the objects.

3. The method of claim 1, wherein the utilizing the machine learning model comprises: performing, by artificial intelligence implemented as a proxy for a deduplication process having computation demand exceed a threshold, heuristics upon the objects.

4. The method of claim 1, wherein the utilizing the machine learning model comprises: utilizing a front-end text encoder as a proxy for a deduplication process having computation demand exceed a threshold.

5. The method of claim 1, wherein the utilizing the machine learning model comprises: utilizing a middle stage trained neural network as a proxy for a deduplication process having computation demand exceed a threshold.

6. The method of claim 1, wherein the utilizing the machine learning model comprises: utilizing a back-end merge indicator function as a proxy for a deduplication process having computation demand exceed a threshold.

7. The method of claim 1, wherein the neural network is implemented as a dimension-reducing neural network.

8. The method of claim 1, wherein the control values comprise preconfigured p-merge values derived from string matching of features of the entity pairs and heuristics applied to comparisons of the features of the entity pairs.

9. The method of claim 1, wherein list segmentation is performed to filter the objects.

10. A system comprising: a memory comprising machine executable code; and a processor coupled to the memory, the processor configured to execute the machine executable code to cause the processor to perform operation comprising: processing, by a neural network, a set of entity feature encodings of entities for deduplication to generate a first reduced vector and a second reduced vector; generating representations of likelihoods that entity pairs of the entities are duplicates based upon a dot product of the first reduced vector and the second reduced vector; training a machine learning model using the likelihoods that entity pairs of the entities are duplicates and control values; and utilizing the machine learning model to deduplicate objects within a database.

11. The system of claim 10, wherein the operations comprise: matching, by artificial intelligence implemented as a proxy for a deduplication process having computation demand exceed a threshold, strings associated with the objects.

12. The system of claim 10, wherein the operations comprise: performing, by artificial intelligence implemented as a proxy for a deduplication process having computation demand exceed a threshold, heuristics upon the objects.

13. The system of claim 10, wherein the operations comprise: utilizing a front-end text encoder as a proxy for a deduplication process having computation demand exceed a threshold.

14. The system of claim 10, wherein the operations comprise: utilizing a middle stage trained neural network as a proxy for a deduplication process having computation demand exceed a threshold.

15. The system of claim 10, wherein the operations comprise: utilizing a back-end merge indicator function as a proxy for a deduplication process having computation demand exceed a threshold.

16. A non-transitory machine-readable storage medium comprising instructions that when executed by a machine, causes the machine to perform operations comprising: processing, by a neural network, a set of entity feature encodings of entities for deduplication to generate a first reduced vector and a second reduced vector; generating representations of likelihoods that entity pairs of the entities are duplicates based upon a dot product of the first reduced vector and the second reduced vector; training a machine learning model using the likelihoods that entity pairs of the entities are duplicates and control values; and utilizing the machine learning model to deduplicate objects within a database.

17. The non-transitory machine-readable storage medium of claim 16, wherein the operations comprise: matching, by artificial intelligence implemented as a proxy for a deduplication process having computation demand exceed a threshold, strings associated with the objects.

18. The non-transitory machine-readable storage medium of claim 16, wherein the operations comprise: performing, by artificial intelligence implemented as a proxy for a deduplication process having computation demand exceed a threshold, heuristics upon the objects.

19. The non-transitory machine-readable storage medium of claim 16, wherein the operations comprise: utilizing a front-end text encoder as a proxy for a deduplication process having computation demand exceed a threshold.

20. The non-transitory machine-readable storage medium of claim 16, wherein the operations comprise: utilizing a middle stage trained neural network as a proxy for a deduplication process having computation demand exceed a threshold.
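The independent claims describe a pipeline: encode entity features, reduce them with a neural network, score a pair by the dot product of the two reduced vectors, and train a model against control values. The sketch below is one possible reading of that flow, not the implementation disclosed in the patent. Everything in it is an assumption made for illustration: the hashed-trigram `encode` function, the single random tanh layer in `reduce_vector` (standing in for a trained dimension-reducing network), the use of `difflib.SequenceMatcher` as the string-matching source of control "p-merge" values, and the two-parameter calibrator in `train_calibrator`.

```python
import math
import random
import zlib
from difflib import SequenceMatcher

def encode(text, dim=32):
    """Hypothetical feature encoding: unit-length hashed character-trigram counts."""
    vec = [0.0] * dim
    padded = f"  {text.lower()} "
    for i in range(len(padded) - 2):
        vec[zlib.crc32(padded[i:i + 3].encode()) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def reduce_vector(vec, weights):
    """One tanh layer standing in for the dimension-reducing neural network."""
    return [math.tanh(sum(w * v for w, v in zip(row, vec))) for row in weights]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def p_merge(a, b):
    """Stand-in control value derived from plain string matching (assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def train_calibrator(scores, controls, lr=0.1, epochs=500):
    """Fit p = sigmoid(a * score + b) to the control values by gradient descent
    on the cross-entropy loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, c in zip(scores, controls):
            err = sigmoid(a * s + b) - c
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

# Untrained random projection from 32 to 8 dimensions (illustrative only).
rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(32)] for _ in range(8)]

pairs = [
    ("Acme Corporation", "ACME Corp."),
    ("Acme Corporation", "Globex Industries"),
    ("Jane Doe, jane@example.com", "Jane Doe jane@example.com"),
    ("Jane Doe, jane@example.com", "Initech LLC"),
]

# Dot product of the first and second reduced vectors for each entity pair.
scores = [
    dot(reduce_vector(encode(x), weights), reduce_vector(encode(y), weights))
    for x, y in pairs
]
controls = [p_merge(x, y) for x, y in pairs]

# Train against the control values, then read off duplicate likelihoods.
a, b = train_calibrator(scores, controls)
likelihoods = [sigmoid(a * s + b) for s in scores]

for (x, y), p, c in zip(pairs, likelihoods, controls):
    print(f"{x!r} vs {y!r}: likelihood={p:.2f} control={c:.2f}")
```

In this reading, pairwise string matching is needed only to produce control values for training; once the model is fit, scoring a pair costs one dot product. That is one way to interpret the dependent claims' language about a "proxy" for a deduplication process whose computation demand exceeds a threshold.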

Assignees

Hubspot Inc

Inventors

Classifications

  • G06F16/215 (Primary)

    Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2023418793A1 cover?
The disclosure is directed to various ways of improving the functioning of computer systems, information networks, data stores, search engine systems and methods, and other advantages. Among other things, provided herein are methods, systems, components, processes, modules, blocks, circuits, sub-systems, articles, and other elements (collectively referred to in some cases as the “platform” or t…
Who is the assignee on this patent?
Hubspot Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/215. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 28, 2023 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).