US12468736B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12468736-B2 |
| Application number | US-202318543550-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 18, 2023 |
| Priority date | Nov 3, 2021 |
| Publication date | Nov 11, 2025 |
| Grant date | Nov 11, 2025 |
Official abstract text for this publication.
An apparatus, computer-readable medium, and computer-implemented method for efficiently classifying a data object, including representing the data object as a data object vector in a vector space, each dimension of the data object vector corresponding to a different feature of the data object, determining a distance between the data object vector and centroids of data domain clusters in the vector space, each data domain cluster comprising data domain vectors representing data domains, sorting the data domain clusters according to their respective distances to the data object vector, and iteratively applying data domain classifiers corresponding to data domains represented in a closest data domain cluster in the sorted data domain clusters to the data object.
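The cluster-ranking step in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the Euclidean metric, the three-dimensional feature layout, and every name in it (`rank_clusters`, `numeric_ids`, `free_text`, the domain labels and their vectors) are assumptions made for illustration only. The abstract does not fix a distance measure or a feature set.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def rank_clusters(object_vector, clusters):
    """Return cluster names sorted by centroid distance, closest first.

    clusters maps a cluster name to {domain name: domain vector}.
    """
    distances = {
        name: euclidean(object_vector, centroid(list(members.values())))
        for name, members in clusters.items()
    }
    return sorted(distances, key=distances.get)

# Hypothetical data domain clusters; each vector has three made-up features.
clusters = {
    "numeric_ids": {"ssn": [0.9, 0.1, 0.0], "phone": [0.8, 0.2, 0.0]},
    "free_text": {"name": [0.0, 0.1, 0.9], "city": [0.1, 0.0, 0.8]},
}

# Hypothetical vector for the data object of unknown type.
object_vector = [0.85, 0.1, 0.05]

cluster_order = rank_clusters(object_vector, clusters)
print(cluster_order)
```

Classifiers for the domains in the first cluster of `cluster_order` would be tried before any others, which is where the intended saving in classification cost comes from.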
Opening claim text (preview).
The invention claimed is:

1. A method executed by one or more computing devices for efficiently classifying a data object of unknown type, the method comprising: storing a plurality of data domain vectors corresponding to a plurality of data domain models, each data domain model corresponding to a data object class and each data domain vector comprising a multidimensional vector having a plurality of dimensions, the plurality of dimensions corresponding to a plurality of features of a corresponding data domain model; generating a data object vector corresponding to the data object, the data object vector comprising a multidimensional vector, with each dimension of the data object vector corresponding to a feature of the data object; clustering the plurality of data domain vectors into a plurality of data domain clusters; determining a classification query order corresponding to the data object based at least in part on a distance between the data object vector and one or more data domain clusters in the plurality of data domain clusters, the classification query order specifying an optimal sequence for applying one or more data domain classifiers corresponding to one or more data domain models in the plurality of data domain models to the data object, the optimal sequence being configured minimize a computational cost for classification of the data object.

2. The method of claim 1, wherein determining a classification query order corresponding to the data object based at least in part on a distance between the data object vector and each of one or more data domain clusters in the plurality of data domain clusters comprises: determining a distance between the data object vector and each of one or more centroids of the one or more data domain clusters in a vector space corresponding to the data object vector and the plurality of data domain vectors; and ranking the one or more data domain clusters based at least in part on the determined distance.

3. The method of claim 2, wherein determining a classification query order corresponding to the data object based at least in part on a distance between the data object vector and each of one or more data domain clusters in the plurality of data domain clusters further comprises: identifying a closest data domain cluster based at least in part on the ranking of the one or more data domain clusters; determining a distance between the data object vector and each of one or more data domain vectors in the closest data domain cluster; and ranking the one or more data domain vectors based at least in part on the determined distance.

4. The method of claim 1, further comprising: iteratively applying the one or more data domain classifiers to the data object based at least in part on the classification query order.

5. The method of claim 4, wherein iteratively applying the one or more data domain classifiers to the data object based at least in part on the classification query order comprises: determining whether one or more termination conditions are true; and iteratively applying the one or more data domain classifiers to the data object based at least in part on the classification query order and a determination that none of the one or more termination conditions are true.

6. The method of claim 5, wherein the one or more termination conditions comprise: the data object being classified by a data domain classifier in the one or more data domain classifiers; or a subsequent data domain classifier in the one or more data domain classifiers having a probability of successful classification of the data object below a predetermined threshold.

7. An apparatus for efficiently classifying a data object of unknown type, the apparatus comprising: one or more processors; and one or more memories operatively coupled to at least one of the one or more processors and having instructions stored thereon that, when executed by at least one of the one or more processors, cause at least one of the one or more processors to: store a plurality of data domain vectors corresponding to a plurality of data domain models, each data domain model corresponding to a data object class and each data domain vector comprising a multidimensional vector having a plurality of dimensions, the plurality of dimensions corresponding to a plurality of features of a corresponding data domain model; generate a data object vector corresponding to the data object, the data object vector comprising a multidimensional vector, with each dimension of the data object vector corresponding to a feature of the data object; cluster the plurality of data domain vectors into a plurality of data domain clusters; and determine a classification query order corresponding to the data object based at least in part on a distance between the data object vector and one or more data domain clusters in the plurality of data domain clusters, the classification query order specifying an optimal sequence for applying one or more data domain classifiers corresponding to one or more data domain models in the plurality of data domain models to the data object, the optimal sequence being configured minimize a computational cost for classification of the data object.

8. The apparatus of claim 7, wherein the instructions that, when executed by at least one of the one or more processors, cause at least one of the one or more processors to determine a classification query order corresponding to the data object based at least in part on a distance between the data object vector and each of one or more data domain clusters in the plurality of data domain clusters further cause at least one of the one or more processors to: determine a distance between the data object vector and each of one or more centroids of the one or more data domain clusters in a vector space corresponding to the data object vector and the plurality of data domain vectors; and rank the one or more data domain clusters based at least in part on the determined distance.

9. The apparatus of claim 8, wherein the instructions that, when executed by at least one of the one or more processors, cause at least one of the one or more processors to determine a classification query order corresponding to the data object based at least in part on a distance between the data object vector and each of one or more data domain clusters in the plurality of data domain clusters further cause at least one of the one or more processors to: identifying a closest data domain cluster based at least in part on the ranking of the one or more data domain clusters; determining a distance between the data object vector and each of one or more data domain vectors in the closest data domain cluster; and ranking the one or more data domain vectors based at least in part on the determined distance.

10. The apparatus of claim 7, further storing computer-readable instructions that, when executed by at least one of the one or more computing devices, cause at least one of the one or more computing devices to: identify a closest data domain cluster based at least in part on the ranking of the one or more data domain clusters; determine a distance between the data object vector and each of one or more data domain vectors in the closest data domain cluster; and rank the one or more data domain vectors based at least in part on the determined distance.

11. The apparatus of claim 10, wherein the instructions that, when executed by at least one of the one or more processors, cause at least one of the one or more processors to iteratively apply the one or more data domain classifiers to the data object based at least in part on the clas
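Claims 3 through 6 describe ranking the data domain vectors inside the closest cluster and then applying classifiers in that order until a termination condition holds. The Python sketch below is one possible reading, not the claimed implementation. The claims do not say how the "probability of successful classification" is computed, so the estimate `1 / (1 + distance)` is a hypothetical stand-in, as are the regex classifiers, the feature vectors, the `min_probability` threshold, and all function and variable names.

```python
import math
import re

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify_in_order(value, object_vector, domain_vectors, classifiers,
                      min_probability):
    """Apply classifiers nearest-domain first; return (match, domains tried).

    Stops when a classifier accepts the value, or when the next classifier's
    estimated success probability falls below min_probability.
    """
    # Classification query order: closest data domain vector first.
    order = sorted(
        domain_vectors,
        key=lambda name: euclidean(object_vector, domain_vectors[name]),
    )
    tried = []
    for name in order:
        distance = euclidean(object_vector, domain_vectors[name])
        # Hypothetical probability estimate; the source does not define one.
        probability = 1.0 / (1.0 + distance)
        if probability < min_probability:
            break  # termination: next classifier is unlikely to succeed
        tried.append(name)
        if classifiers[name](value):
            return name, tried  # termination: the data object was classified
    return None, tried

# Hypothetical domains in the closest cluster, with made-up feature vectors.
domain_vectors = {
    "ssn": [0.9, 0.1, 0.0],
    "phone": [0.8, 0.2, 0.0],
    "zip": [0.2, 0.8, 0.0],
}

# Hypothetical classifiers: each returns True when the value fits its domain.
classifiers = {
    "ssn": lambda v: re.fullmatch(r"\d{3}-\d{2}-\d{4}", v) is not None,
    "phone": lambda v: re.fullmatch(r"\d{3}-\d{3}-\d{4}", v) is not None,
    "zip": lambda v: re.fullmatch(r"\d{5}", v) is not None,
}

object_vector = [0.82, 0.18, 0.0]

matched = classify_in_order("123-45-6789", object_vector, domain_vectors,
                            classifiers, min_probability=0.6)
unmatched = classify_in_order("hello", object_vector, domain_vectors,
                              classifiers, min_probability=0.6)
print(matched)
print(unmatched)
```

In the second call no classifier accepts the value, and the loop stops before the `zip` classifier because its estimated probability is under the threshold, so that classifier is never run.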
Multidimensional index structures · CPC title
Vectors, bitmaps or matrices · CPC title
Sorting, i.e. grouping record carriers in numerical or other ordered sequence according to the classification of at least some of the information they carry (by merging two or more sets of carriers in ordered sequence G06F7/16) · CPC title
Clustering or classification · CPC title