What technology area does this patent fall under?

Primary CPC classification G06N5/022. Mapped technology areas include Physics.

When was this patent published?

Publication date Sep 9, 2025 (B1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Monitoring and visualization of model-based clustering definition performance

US12412103B1 · US · B1

Patent metadata
Field	Value
Publication number	US-12412103-B1
Application number	US-202217589445-A
Country	US
Kind code	B1
Filing date	Jan 31, 2022
Priority date	Jan 29, 2021
Publication date	Sep 9, 2025
Grant date	Sep 9, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

This document discloses methods and systems for cohort identification. The methods and systems include improved calculations to perform cohort identification and practical applications of the improved calculations. Specifically, the systems and methods described herein may utilize key components that include enhancements of existing cohort clustering techniques with regard to selecting a number of cohort input dimensions, normalizing input data using a logarithm kernel-function, treatment of categorical data with mutually exclusive and not-mutually exclusive values, methods and visualization tool to determine appropriate number of cohorts, methods and visualization tool to compare cohorts extracted from different input dimensions, and methods to quantify the difference in cohorts. Beyond improvements to the cohort clustering techniques, also disclosed are ancillary tools to prepare input data by joining CRM and product usage data and facilitate subsequent automated action via an API to retrieve cohort results.
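The abstract mentions normalizing input data with a logarithm kernel function and handling categorical data whose values are either mutually exclusive or not. The sketch below illustrates one plausible reading of those steps. It is not taken from the patent: the use of log(1 + x) and the helper names log_normalize, one_hot, and multi_hot are assumptions made for illustration.

```python
import math


def log_normalize(values):
    """Compress a heavy-tailed, non-negative feature with log(1 + x).

    Assumption: the patent only says "logarithm kernel-function";
    log1p is one common choice that also maps 0 to 0.
    """
    return [math.log1p(v) for v in values]


def one_hot(value, categories):
    """Encode a mutually exclusive categorical value (exactly one applies)."""
    return [1.0 if value == c else 0.0 for c in categories]


def multi_hot(values, categories):
    """Encode a categorical field where several values may apply at once."""
    present = set(values)
    return [1.0 if c in present else 0.0 for c in categories]


# Hypothetical example data, not from the patent.
usage_counts = [0, math.e - 1]
normalized = log_normalize(usage_counts)
plan = one_hot("b", ["a", "b", "c"])
features_used = multi_hot(["a", "c"], ["a", "b", "c"])
```

A log transform of this kind keeps a few very large accounts from dominating the distance calculations that clustering relies on.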

First claim

Opening claim text (preview).

The invention claimed is:

1. A computing device, comprising: one or more hardware processors; and a non-transitory computer-readable medium having stored thereon instructions that, when executed by the one or more hardware processors, cause the one or more hardware processors to perform operations comprising: generating a cohort definition using a machine learning model based on a combination of a number (D) of dimensions selected from a first data set comprising first data points, wherein the cohort definition clusters the first data points into a number (K) of first clusters, each of the first clusters having a first center point; applying the machine learning model to a second data set comprising second data points to determine a second center point for each of K of second clusters; causing a user interface to be presented on a display device, the user interface comprising an indication of a difference measure between the first clusters and the second clusters, wherein the different measure is generated based on a difference vector determined for each second center point and a nearest first center point.

2. The computing device of claim 1, the operations further comprising: determining, for each second cluster: a first cluster having a first center point nearest to the second center point of the second cluster; and a difference scalar between a number of second data points assigned to the second cluster and a number of first data points assigned to the first cluster; wherein the determining of the difference measure between the first clusters and the second clusters is further based on the determined difference scalars.

3. The computing device of claim 1, the operations further comprising: based on the difference measure between the first clusters and the second clusters and a predetermined threshold, the user interface further comprises a recommendation to select different dimensions for clustering.

4. The computing device of claim 1, the operations further comprising: applying the cohort definition to a third data set comprising third data points to determine a third center point for each of K of third clusters and assign each data point in the third data set to a third cluster; determining, for each third center point, a second difference vector from a nearest second center point to the third center point; and based on the determined second difference vectors, determining a second difference measure between the second clusters and the third clusters; wherein the user interface further comprises an indication of the second difference measure.

5. The computing device of claim 1, wherein: the first data points comprise data for a first period of time; and the second data points comprise data for a second period of time.

6. The computing device of claim 1, wherein the applying of the cohort definition to the first data set comprising the first data points comprises applying K-means clustering to the first data points.

7. The computing device of claim 1, wherein the operations further comprise: generating the first data points by linking customer relationship management (CRM) and product usage data using shared identifiers.

8. The computing device of claim 7, wherein the operations further comprise: accessing CRM data that indicates a parent-subsidiary relationship between a parent account and a subsidiary account; and accessing first product usage data that is linked to both the parent account and the subsidiary account; wherein the generating of the first data points comprises generating a data point that links the first product usage data to the subsidiary account.

9. A non-transitory computer-readable medium having stored thereon instructions that, when executed by one or more processors, cause the one or more processors to perform operations including: generating a cohort definition using a machine learning model based on a combination of a number (D) of dimensions selected from a first data set comprising first data points, wherein the cohort definition clusters the first data points into a number (K) of first clusters, each of the first clusters having a first center point; applying the machine learning model to a second data set comprising second data points to determine a second center point for each of K of second clusters; causing a user interface to be presented on a display device, the user interface comprising an indication of a difference measure between the first clusters and the second clusters, wherein the different measure is generated based on a difference vector determined for each second center point and a nearest first center point.

10. The non-transitory computer-readable medium of claim 9, the operations further comprising: determining, for each second cluster: a first cluster having a first center point nearest to the second center point of the second cluster; and a difference scalar between a number of second data points assigned to the second cluster and a number of first data points assigned to the first cluster; wherein the determining of the difference measure between the first clusters and the second clusters is further based on the determined difference scalars.

11. The non-transitory computer-readable medium of claim 9, the operations further comprising: based on the difference measure between the first clusters and the second clusters and a predetermined threshold, the user interface further comprises a recommendation to select different dimensions for clustering.

12. The non-transitory computer-readable medium of claim 9, the operations further comprising: applying the cohort definition to a third data set comprising third data points to determine a third center point for each of K of third clusters and assign each data point in the third data set to a third cluster; determining, for each third center point, a second difference vector from a nearest second center point to the third center point; and based on the determined second difference vectors, determining a second difference measure between the second clusters and the third clusters; wherein the user interface further comprises an indication of the second difference measure.

13. The non-transitory computer-readable medium of claim 9, wherein: the first data points comprise data for a first period of time; and the second data points comprise data for a second period of time.

14. The non-transitory computer-readable medium of claim 9, wherein the applying of the cohort definition to the first data set comprising the first data points comprises applying K-means clustering to the first data points.

15. The non-transitory computer-readable medium of claim 9, wherein the operations further comprise: generating the first data points by linking customer relationship management (CRM) and product usage data using shared identifiers.

16. The non-transitory computer-readable medium of claim 15, wherein the operations further comprise: accessing CRM data that indicates a parent-subsidiary relationship between a parent account and a subsidiary account; and accessing first product usage data that is linked to both the parent account and the subsidiary account; wherein the generating of the first data points comprises generating a data point that links the first product usage data to the subsidiary account.

17. A computer-implemented method, comprising: generating, by one or more processors, a cohort definition using a machine learning model based on a combination of a number (D) of dimensions selected from a first data set
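Claims 1 and 3 describe a difference measure built from a difference vector between each second center point and its nearest first center point, plus a threshold that triggers a recommendation. A minimal sketch of that idea follows. The claims do not say how the vectors are combined into one number, so averaging their Euclidean lengths is an assumption, as are the function names, the sample centers, and the THRESHOLD value.

```python
import math


def nearest_center(point, centers):
    """Return the index of the center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def difference_measure(first_centers, second_centers):
    """Mean length of the difference vectors from each second center
    to its nearest first center.

    Assumption: the mean of vector lengths is an illustrative
    aggregation; the claims do not specify one.
    """
    lengths = []
    for c2 in second_centers:
        c1 = first_centers[nearest_center(c2, first_centers)]
        diff = [b - a for a, b in zip(c1, c2)]  # difference vector
        lengths.append(math.hypot(*diff))
    return sum(lengths) / len(lengths)


# Hypothetical center points for K = 2 clusters in D = 2 dimensions.
first = [(0.0, 0.0), (10.0, 10.0)]   # e.g. centers from a first period
second = [(3.0, 4.0), (10.0, 10.0)]  # e.g. centers from a second period

drift = difference_measure(first, second)

THRESHOLD = 2.0  # hypothetical "predetermined threshold"
recommend_new_dimensions = drift > THRESHOLD
```

With these sample centers one cluster has moved by a distance of 5 and the other has not moved, so the averaged drift exceeds the hypothetical threshold and the recommendation flag is set.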

Assignees

Inventors

Classifications

G06F16/285
Clustering or classification · CPC title
G06N20/00
Machine learning · CPC title
G06N5/022 · Primary
Knowledge engineering; Knowledge acquisition · CPC title
G06Q30/0201 · Primary
Market modelling; Market analysis; Collecting market data · CPC title

Patent family

Related publications grouped by family.

View patent family 96950467

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12412103B1 cover?: This document discloses methods and systems for cohort identification. The methods and systems include improved calculations to perform cohort identification and practical applications of the improved calculations. Specifically, the systems and methods described herein may utilize key components that include enhancements of existing cohort clustering techniques with regard to selecting a number…
Who is the assignee on this patent?: Splunk Inc., Splunk LLC

Related patents

Method of Determining at least one tolerance band limit value for a technical variable under test and corresponding calculation device

Identifying content items in response to a text-based request

Functional object-oriented networks for manipulation learning

Computer architecture for generating hierarchical clusters in a correlithm object processing system

Method and system to predict and interpret conceptual knowledge in the brain

Automated entity-resolution methods and systems

Incremental Generation of Models with Dynamic Clustering
