Monitoring and visualization of model-based clustering definition performance

US12412103B1 · US · B1

Patent metadata
Publication number: US-12412103-B1
Application number: US-202217589445-A
Country: US
Kind code: B1
Filing date: Jan 31, 2022
Priority date: Jan 29, 2021
Publication date: Sep 9, 2025
Grant date: Sep 9, 2025

Abstract

This document discloses methods and systems for cohort identification. The methods and systems include improved calculations to perform cohort identification and practical applications of the improved calculations. Specifically, the systems and methods described herein may utilize key components that include enhancements of existing cohort clustering techniques with regard to selecting a number of cohort input dimensions, normalizing input data using a logarithm kernel-function, treatment of categorical data with mutually exclusive and not-mutually exclusive values, methods and visualization tool to determine appropriate number of cohorts, methods and visualization tool to compare cohorts extracted from different input dimensions, and methods to quantify the difference in cohorts. Beyond improvements to the cohort clustering techniques, also disclosed are ancillary tools to prepare input data by joining CRM and product usage data and facilitate subsequent automated action via an API to retrieve cohort results.
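
As an illustration of the input preparation named in the abstract, the following Python sketch applies a logarithmic transform to numeric inputs and encodes categorical inputs with mutually exclusive and non-mutually exclusive values. The abstract does not give the exact kernel function or encoding, so `log1p`, the one-hot and multi-hot encodings, and all function names here are assumptions made for illustration only.

```python
import numpy as np


def log_normalize(values):
    """Compress heavy-tailed, non-negative inputs with log(1 + x).

    Assumption: log1p stands in for the unspecified logarithm kernel function.
    """
    x = np.asarray(values, dtype=float)
    return np.log1p(x)


def one_hot(labels, categories):
    """Encode mutually exclusive categories: exactly one 1 per row."""
    index = {c: i for i, c in enumerate(categories)}
    out = np.zeros((len(labels), len(categories)))
    for row, label in enumerate(labels):
        out[row, index[label]] = 1.0
    return out


def multi_hot(label_sets, categories):
    """Encode non-mutually-exclusive categories: any number of 1s per row."""
    index = {c: i for i, c in enumerate(categories)}
    out = np.zeros((len(label_sets), len(categories)))
    for row, labels in enumerate(label_sets):
        for label in labels:
            out[row, index[label]] = 1.0
    return out


# Hypothetical example: two accounts with a usage count, a plan tier
# (mutually exclusive), and a set of enabled features (not exclusive).
usage = log_normalize([0, 999])
tier = one_hot(["free", "paid"], ["free", "paid"])
features = multi_hot([{"alerts", "dashboards"}, set()],
                     ["alerts", "dashboards", "search"])
matrix = np.column_stack([usage, tier, features])  # shape (2, 6)
```

The resulting matrix has one row per account and one column per input dimension, which is the shape a clustering step would typically consume.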

First claim

Opening claim text (preview).

The invention claimed is:

1. A computing device, comprising: one or more hardware processors; and a non-transitory computer-readable medium having stored thereon instructions that, when executed by the one or more hardware processors, cause the one or more hardware processors to perform operations comprising: generating a cohort definition using a machine learning model based on a combination of a number (D) of dimensions selected from a first data set comprising first data points, wherein the cohort definition clusters the first data points into a number (K) of first clusters, each of the first clusters having a first center point; applying the machine learning model to a second data set comprising second data points to determine a second center point for each of K of second clusters; causing a user interface to be presented on a display device, the user interface comprising an indication of a difference measure between the first clusters and the second clusters, wherein the different measure is generated based on a difference vector determined for each second center point and a nearest first center point.

2. The computing device of claim 1, the operations further comprising: determining, for each second cluster: a first cluster having a first center point nearest to the second center point of the second cluster; and a difference scalar between a number of second data points assigned to the second cluster and a number of first data points assigned to the first cluster; wherein the determining of the difference measure between the first clusters and the second clusters is further based on the determined difference scalars.

3. The computing device of claim 1, the operations further comprising: based on the difference measure between the first clusters and the second clusters and a predetermined threshold, the user interface further comprises a recommendation to select different dimensions for clustering.

4. The computing device of claim 1, the operations further comprising: applying the cohort definition to a third data set comprising third data points to determine a third center point for each of K of third clusters and assign each data point in the third data set to a third cluster; determining, for each third center point, a second difference vector from a nearest second center point to the third center point; and based on the determined second difference vectors, determining a second difference measure between the second clusters and the third clusters; wherein the user interface further comprises an indication of the second difference measure.

5. The computing device of claim 1, wherein: the first data points comprise data for a first period of time; and the second data points comprise data for a second period of time.

6. The computing device of claim 1, wherein the applying of the cohort definition to the first data set comprising the first data points comprises applying K-means clustering to the first data points.

7. The computing device of claim 1, wherein the operations further comprise: generating the first data points by linking customer relationship management (CRM) and product usage data using shared identifiers.

8. The computing device of claim 7, wherein the operations further comprise: accessing CRM data that indicates a parent-subsidiary relationship between a parent account and a subsidiary account; and accessing first product usage data that is linked to both the parent account and the subsidiary account; wherein the generating of the first data points comprises generating a data point that links the first product usage data to the subsidiary account.

9. A non-transitory computer-readable medium having stored thereon instructions that, when executed by one or more processors, cause the one or more processors to perform operations including: generating a cohort definition using a machine learning model based on a combination of a number (D) of dimensions selected from a first data set comprising first data points, wherein the cohort definition clusters the first data points into a number (K) of first clusters, each of the first clusters having a first center point; applying the machine learning model to a second data set comprising second data points to determine a second center point for each of K of second clusters; causing a user interface to be presented on a display device, the user interface comprising an indication of a difference measure between the first clusters and the second clusters, wherein the different measure is generated based on a difference vector determined for each second center point and a nearest first center point.

10. The non-transitory computer-readable medium of claim 9, the operations further comprising: determining, for each second cluster: a first cluster having a first center point nearest to the second center point of the second cluster; and a difference scalar between a number of second data points assigned to the second cluster and a number of first data points assigned to the first cluster; wherein the determining of the difference measure between the first clusters and the second clusters is further based on the determined difference scalars.

11. The non-transitory computer-readable medium of claim 9, the operations further comprising: based on the difference measure between the first clusters and the second clusters and a predetermined threshold, the user interface further comprises a recommendation to select different dimensions for clustering.

12. The non-transitory computer-readable medium of claim 9, the operations further comprising: applying the cohort definition to a third data set comprising third data points to determine a third center point for each of K of third clusters and assign each data point in the third data set to a third cluster; determining, for each third center point, a second difference vector from a nearest second center point to the third center point; and based on the determined second difference vectors, determining a second difference measure between the second clusters and the third clusters; wherein the user interface further comprises an indication of the second difference measure.

13. The non-transitory computer-readable medium of claim 9, wherein: the first data points comprise data for a first period of time; and the second data points comprise data for a second period of time.

14. The non-transitory computer-readable medium of claim 9, wherein the applying of the cohort definition to the first data set comprising the first data points comprises applying K-means clustering to the first data points.

15. The non-transitory computer-readable medium of claim 9, wherein the operations further comprise: generating the first data points by linking customer relationship management (CRM) and product usage data using shared identifiers.

16. The non-transitory computer-readable medium of claim 15, wherein the operations further comprise: accessing CRM data that indicates a parent-subsidiary relationship between a parent account and a subsidiary account; and accessing first product usage data that is linked to both the parent account and the subsidiary account; wherein the generating of the first data points comprises generating a data point that links the first product usage data to the subsidiary account.

17. A computer-implemented method, comprising: generating, by one or more processors, a cohort definition using a machine learning model based on a combination of a number (D) of dimensions selected from a first data set
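
A minimal Python sketch of the comparison recited in claims 1 to 3 may help. It assumes the first and second center points and the per-cluster counts are already available (for example from K-means, as in claim 6). The Euclidean distance, the mean-of-norms aggregation, the `count_weight` parameter, and all function names are assumptions; the claims do not specify how the difference vectors and difference scalars are combined.

```python
import numpy as np


def center_differences(first_centers, second_centers):
    """For each second center, find the nearest first center and the
    difference vector pointing from that first center to the second one."""
    first = np.asarray(first_centers, dtype=float)    # shape (K, D)
    second = np.asarray(second_centers, dtype=float)  # shape (K, D)
    # dists[i, j] = Euclidean distance from second[i] to first[j]
    dists = np.linalg.norm(second[:, None, :] - first[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    vectors = second - first[nearest]
    return nearest, vectors


def count_differences(first_counts, second_counts, nearest):
    """Difference scalar per second cluster: its size minus the size of
    the matched (nearest) first cluster."""
    first = np.asarray(first_counts, dtype=float)
    second = np.asarray(second_counts, dtype=float)
    return second - first[nearest]


def difference_measure(vectors, count_diffs=None, count_weight=0.0):
    """Assumed aggregation: mean length of the difference vectors, plus an
    optional weighted mean absolute count difference."""
    measure = float(np.linalg.norm(vectors, axis=1).mean())
    if count_diffs is not None:
        measure += count_weight * float(np.abs(count_diffs).mean())
    return measure


# Hypothetical centers for K = 2 clusters in D = 2 dimensions.
nearest, vectors = center_differences([[0, 0], [10, 0]], [[1, 0], [10, 2]])
drift = difference_measure(vectors)

THRESHOLD = 1.0  # hypothetical predetermined threshold
recommend_new_dimensions = drift > THRESHOLD
```

With first centers (0, 0) and (10, 0) and second centers (1, 0) and (10, 2), the difference vectors are (1, 0) and (0, 2), so the measure under these assumptions is 1.5. Because that exceeds the hypothetical threshold of 1.0, a user interface built on this sketch would show a recommendation to select different dimensions.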

Assignees

Splunk Inc, Splunk LLC

Inventors

Classifications

  • Clustering or classification · CPC title

  • Machine learning · CPC title

  • G06N5/022 (Primary)

    Knowledge engineering; Knowledge acquisition · CPC title

  • Market modelling; Market analysis; Collecting market data · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Splunk Inc, Splunk LLC
What technology area does this patent fall under?
Primary CPC classification G06N5/022. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 9, 2025 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).