What technology area does this patent fall under?

Primary CPC classification G06F7/08. Mapped technology areas include Physics.

When was this patent published?

Publication date: Nov 11, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Method, apparatus, and computer-readable medium for efficiently classifying a data object of unknown type

US12468736B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12468736-B2
Application number	US-202318543550-A
Country	US
Kind code	B2
Filing date	Dec 18, 2023
Priority date	Nov 3, 2021
Publication date	Nov 11, 2025
Grant date	Nov 11, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus, computer-readable medium, and computer-implemented method for efficiently classifying a data object, including representing the data object as a data object vector in a vector space, each dimension of the data object vector corresponding to a different feature of the data object, determining a distance between the data object vector and centroids of data domain clusters in the vector space, each data domain cluster comprising data domain vectors representing data domains, sorting the data domain clusters according to their respective distances to the data object vector, and iteratively applying data domain classifiers corresponding to data domains represented in a closest data domain cluster in the sorted data domain clusters to the data object.
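The flow in the abstract can be illustrated with a small sketch. This is not the patented implementation, only one reading of the abstract under stated assumptions: Euclidean distance is assumed as the distance measure, the two-dimensional feature space (letter fraction, digit fraction) is made up, and the names `CLUSTERS`, `CLASSIFIERS`, `rank_clusters`, and `classify` are hypothetical.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence

Vector = Sequence[float]
Cluster = Dict[str, Vector]  # data domain name -> data domain vector

# Hypothetical feature space: [fraction of letters, fraction of digits].
CLUSTERS: Dict[str, Cluster] = {
    "textual": {"email": [0.9, 0.0], "name": [1.0, 0.0]},
    "numeric": {"zip_code": [0.0, 1.0], "phone": [0.1, 0.9]},
}

# Hypothetical per-domain classifiers; real ones would be far more expensive,
# which is why the order in which they are tried matters.
CLASSIFIERS: Dict[str, Callable[[str], bool]] = {
    "email": lambda s: "@" in s,
    "name": lambda s: s.isalpha(),
    "zip_code": lambda s: len(s) == 5 and s.isdigit(),
    "phone": lambda s: len(s) == 10 and s.isdigit(),
}


def centroid(cluster: Cluster) -> List[float]:
    """Component-wise mean of the data domain vectors in one cluster."""
    vectors = list(cluster.values())
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def rank_clusters(obj_vec: Vector, clusters: Dict[str, Cluster]) -> List[str]:
    """Sort cluster names by distance from the object vector to each centroid."""
    return sorted(clusters, key=lambda name: math.dist(obj_vec, centroid(clusters[name])))


def classify(obj: str, obj_vec: Vector) -> Optional[str]:
    """Apply only the classifiers of the closest cluster, one at a time."""
    closest = rank_clusters(obj_vec, CLUSTERS)[0]
    for domain in CLUSTERS[closest]:
        if CLASSIFIERS[domain](obj):
            return domain
    return None
```

For example, `classify("90210", [0.0, 1.0])` tries only the two classifiers in the `numeric` cluster and never runs the `email` or `name` classifiers, which is the cost saving the abstract is aiming at.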

First claim

Opening claim text (preview).

The invention claimed is:

1. A method executed by one or more computing devices for efficiently classifying a data object of unknown type, the method comprising: storing a plurality of data domain vectors corresponding to a plurality of data domain models, each data domain model corresponding to a data object class and each data domain vector comprising a multidimensional vector having a plurality of dimensions, the plurality of dimensions corresponding to a plurality of features of a corresponding data domain model; generating a data object vector corresponding to the data object, the data object vector comprising a multidimensional vector, with each dimension of the data object vector corresponding to a feature of the data object; clustering the plurality of data domain vectors into a plurality of data domain clusters; determining a classification query order corresponding to the data object based at least in part on a distance between the data object vector and one or more data domain clusters in the plurality of data domain clusters, the classification query order specifying an optimal sequence for applying one or more data domain classifiers corresponding to one or more data domain models in the plurality of data domain models to the data object, the optimal sequence being configured minimize a computational cost for classification of the data object.

2. The method of claim 1, wherein determining a classification query order corresponding to the data object based at least in part on a distance between the data object vector and each of one or more data domain clusters in the plurality of data domain clusters comprises: determining a distance between the data object vector and each of one or more centroids of the one or more data domain clusters in a vector space corresponding to the data object vector and the plurality of data domain vectors; and ranking the one or more data domain clusters based at least in part on the determined distance.

3. The method of claim 2, wherein determining a classification query order corresponding to the data object based at least in part on a distance between the data object vector and each of one or more data domain clusters in the plurality of data domain clusters further comprises: identifying a closest data domain cluster based at least in part on the ranking of the one or more data domain clusters; determining a distance between the data object vector and each of one or more data domain vectors in the closest data domain cluster; and ranking the one or more data domain vectors based at least in part on the determined distance.

4. The method of claim 1, further comprising: iteratively applying the one or more data domain classifiers to the data object based at least in part on the classification query order.

5. The method of claim 4, wherein iteratively applying the one or more data domain classifiers to the data object based at least in part on the classification query order comprises: determining whether one or more termination conditions are true; and iteratively applying the one or more data domain classifiers to the data object based at least in part on the classification query order and a determination that none of the one or more termination conditions are true.

6. The method of claim 5, wherein the one or more termination conditions comprise: the data object being classified by a data domain classifier in the one or more data domain classifiers; or a subsequent data domain classifier in the one or more data domain classifiers having a probability of successful classification of the data object below a predetermined threshold.

7. An apparatus for efficiently classifying a data object of unknown type, the apparatus comprising: one or more processors; and one or more memories operatively coupled to at least one of the one or more processors and having instructions stored thereon that, when executed by at least one of the one or more processors, cause at least one of the one or more processors to: store a plurality of data domain vectors corresponding to a plurality of data domain models, each data domain model corresponding to a data object class and each data domain vector comprising a multidimensional vector having a plurality of dimensions, the plurality of dimensions corresponding to a plurality of features of a corresponding data domain model; generate a data object vector corresponding to the data object, the data object vector comprising a multidimensional vector, with each dimension of the data object vector corresponding to a feature of the data object; cluster the plurality of data domain vectors into a plurality of data domain clusters; and determine a classification query order corresponding to the data object based at least in part on a distance between the data object vector and one or more data domain clusters in the plurality of data domain clusters, the classification query order specifying an optimal sequence for applying one or more data domain classifiers corresponding to one or more data domain models in the plurality of data domain models to the data object, the optimal sequence being configured minimize a computational cost for classification of the data object.

8. The apparatus of claim 7, wherein the instructions that, when executed by at least one of the one or more processors, cause at least one of the one or more processors to determine a classification query order corresponding to the data object based at least in part on a distance between the data object vector and each of one or more data domain clusters in the plurality of data domain clusters further cause at least one of the one or more processors to: determine a distance between the data object vector and each of one or more centroids of the one or more data domain clusters in a vector space corresponding to the data object vector and the plurality of data domain vectors; and rank the one or more data domain clusters based at least in part on the determined distance.

9. The apparatus of claim 8, wherein the instructions that, when executed by at least one of the one or more processors, cause at least one of the one or more processors to determine a classification query order corresponding to the data object based at least in part on a distance between the data object vector and each of one or more data domain clusters in the plurality of data domain clusters further cause at least one of the one or more processors to: identifying a closest data domain cluster based at least in part on the ranking of the one or more data domain clusters; determining a distance between the data object vector and each of one or more data domain vectors in the closest data domain cluster; and ranking the one or more data domain vectors based at least in part on the determined distance.

10. The apparatus of claim 7, further storing computer-readable instructions that, when executed by at least one of the one or more computing devices, cause at least one of the one or more computing devices to: identify a closest data domain cluster based at least in part on the ranking of the one or more data domain clusters; determine a distance between the data object vector and each of one or more data domain vectors in the closest data domain cluster; and rank the one or more data domain vectors based at least in part on the determined distance.

11. The apparatus of claim 10, wherein the instructions that, when executed by at least one of the one or more processors, cause at least one of the one or more processors to iteratively apply the one or more data domain classifiers to the data object based at least in part on the clas…
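Claims 2, 3, 5, and 6 together describe a two-level ranking followed by a guarded loop. The sketch below is one possible reading, not the method as implemented by the assignee. The claims do not say how the "probability of successful classification" is estimated, so `math.exp(-distance)` is used here purely as an assumed stand-in, and the feature space (digit fraction, normalized length), the threshold `0.7`, and all names are hypothetical.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]
Cluster = Dict[str, Vector]  # data domain name -> data domain vector

# Hypothetical feature space: [fraction of digits, normalized length].
CLUSTERS: Dict[str, Cluster] = {
    "numeric": {"zip_code": [1.0, 0.5], "year": [1.0, 0.4], "phone": [1.0, 1.0]},
    "textual": {"email": [0.0, 1.0], "name": [0.0, 0.6]},
}

CLASSIFIERS: Dict[str, Callable[[str], bool]] = {
    "zip_code": lambda s: len(s) == 5 and s.isdigit(),
    "year": lambda s: len(s) == 4 and s.isdigit() and 1900 <= int(s) <= 2100,
    "phone": lambda s: len(s) == 10 and s.isdigit(),
    "email": lambda s: "@" in s,
    "name": lambda s: s.isalpha(),
}


def centroid(cluster: Cluster) -> List[float]:
    """Component-wise mean of the data domain vectors in one cluster."""
    vectors = list(cluster.values())
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def query_order(obj_vec: Vector) -> List[Tuple[str, float]]:
    """Return (domain, distance) pairs for the closest cluster, nearest first."""
    # Claim 2: compare the object vector with each cluster centroid and
    # keep the top of that ranking, i.e. the closest cluster.
    closest = min(CLUSTERS.values(), key=lambda c: math.dist(obj_vec, centroid(c)))
    # Claim 3: rank the data domain vectors inside the closest cluster.
    pairs = [(name, math.dist(obj_vec, vec)) for name, vec in closest.items()]
    return sorted(pairs, key=lambda pair: pair[1])


def classify(obj: str, obj_vec: Vector,
             min_probability: float = 0.7) -> Tuple[Optional[str], List[str]]:
    """Apply classifiers in query order until a termination condition holds."""
    tried: List[str] = []
    for domain, distance in query_order(obj_vec):
        # ASSUMPTION: exp(-distance) stands in for the success probability.
        if math.exp(-distance) < min_probability:
            break  # claim 6, second condition: next classifier is unlikely to match
        tried.append(domain)
        if CLASSIFIERS[domain](obj):
            return domain, tried  # claim 6, first condition: object classified
    return None, tried
```

With these made-up vectors, `classify("1987", [1.0, 0.4])` succeeds on the first classifier it tries, while `classify("123456", [1.0, 0.6])` stops after two failed attempts because the third candidate falls below the threshold.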

Assignees

Informatica LLC

Inventors

Igor Balabine

Classifications

G06F16/2264
Multidimensional index structures · CPC title
G06F16/2237
Vectors, bitmaps or matrices · CPC title
G06F7/08 (Primary)
Sorting, i.e. grouping record carriers in numerical or other ordered sequence according to the classification of at least some of the information they carry (by merging two or more sets of carriers in ordered sequence G06F7/16) · CPC title
G06F16/285 (Primary)
Clustering or classification · CPC title

Patent family

Related publications grouped by family.

View patent family 86144778

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12468736B2 cover?: An apparatus, computer-readable medium, and computer-implemented method for efficiently classifying a data object, including representing the data object as a data object vector in a vector space, each dimension of the data object vector corresponding to a different feature of the data object, determining a distance between the data object vector and centroids of data domain clusters in the vec…
Who is the assignee on this patent?: Informatica LLC