Transductive lasso for high-dimensional data regression problems

US9704105B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9704105-B2
Application number: US-201615155151-A
Country: US
Kind code: B2
Filing date: May 16, 2016
Priority date: Jan 18, 2013
Publication date: Jul 11, 2017
Grant date: Jul 11, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Various embodiments select features from a feature space. In one embodiment, a set of training samples and a set of test samples are received. A first centered Gram matrix of a given dimension is determined for each of a set of feature vectors that include at least one of the set of training samples and at least one of the set of test samples. A second centered Gram matrix of the given dimension is determined for a target value vector that includes target values from the set of training samples. A set of columns and rows associated with the at least one of the test samples in the second centered Gram matrix is set to 0. A subset of features is selected from a set of features based on the first and second centered Gram matrices.
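
The Gram-matrix construction summarized in the abstract can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the patented implementation: the function names (`gaussian_gram`, `center`, `feature_grams`, `target_gram`) and the kernel width `sigma` are hypothetical, and the unknown test targets are filled with a zero placeholder because their rows and columns are overwritten anyway. The zero-then-center order follows the wording of claim 3.

```python
import numpy as np

def gaussian_gram(v, sigma=1.0):
    """Gaussian kernel evaluated on every pair of entries of a 1-D vector."""
    v = np.asarray(v, dtype=float)
    diff = v[:, None] - v[None, :]
    return np.exp(-diff ** 2 / (2.0 * sigma ** 2))

def center(K):
    """Multiply the centering matrix H = I - (1/m) 11^T on each side of K."""
    m = K.shape[0]
    H = np.eye(m) - np.ones((m, m)) / m
    return H @ K @ H

def feature_grams(X_train, X_test, sigma=1.0):
    """One centered Gram matrix per feature, built over training AND test rows."""
    X = np.vstack([X_train, X_test])  # shape (n + n', d)
    return [center(gaussian_gram(X[:, k], sigma)) for k in range(X.shape[1])]

def target_gram(y_train, n_test, sigma=1.0):
    """Centered Gram matrix of the targets with the test rows/columns zeroed."""
    n = len(y_train)
    # Assumption: placeholder zeros stand in for the unknown test targets.
    y = np.concatenate([np.asarray(y_train, dtype=float), np.zeros(n_test)])
    L = gaussian_gram(y, sigma)
    L[n:, :] = 0.0  # rows belonging to test samples
    L[:, n:] = 0.0  # columns belonging to test samples
    return center(L)
```

Every matrix returned here has dimension (n+n′)×(n+n′), so the feature-side and target-side matrices can be compared entry by entry in the selection step.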

First claim

Opening claim text (preview).

What is claimed is:

1. A computer implemented method for reducing computation time of an information processing system when selecting features from a feature space to train one or more algorithms, the computer implemented method comprising: receiving, by a feature selection circuit, a set of training samples and a set of test samples, wherein each of the set of training samples comprises a set of features and a target value, and wherein the set of test samples comprises the set of features absent the target value; training, by the feature selection circuit, one or more learning algorithms based on subset of the set of features; and reducing, by the feature selection circuit, computation time of the information processing system during the training of the one or more learning algorithms, wherein reducing the computation time comprises: determining a first centered Gram matrix of a given dimension for each of a set of feature vectors comprising at least one of the set of training samples and at least one of the set of test samples; determining a second centered Gram matrix of the given dimension for a target value vector comprising the target values from the set of training samples; and selecting a subset of features from the set of features based on the first and second centered Gram matrices.

2. The computer implemented method of claim 1, wherein determining each of the first centered Gram matrices comprises: determining, for each of the set of feature vectors, a Gram matrix based on computing, a Gaussian kernel function on each pair of vector elements in the feature vector; and multiplying a centering matrix on each side of the Gram matrix.

3. The computer implemented method of claim 1, wherein determining the second centered Gram matrix comprises: generating the target value vector with a first n values being the target values from the set of training samples, and a remaining n′ values being set to infinity, where n′ is a number of test samples in the set of test samples; determining a Gram matrix based on computing, a Gaussian kernel function of size (n+n′)×(n+n′) on each pair of vector elements in the target value vector; setting a set of columns and rows in the Gram matrix with index [n+1, ..., n+n′] to 0; and multiplying, after the setting, a centering matrix on each side of the Gram matrix.

4. The computer implemented method of claim 1, further comprising: concatenating each column in the second centered Gram matrix into a vector of size (n+n′)×(n+n′), where n corresponds to a number of target values in the set of training samples and n′ corresponds to a number of test samples in the set of test samples.

5. The computer implemented method of claim 4, further comprising: concatenating each column in each of the first centered Gram matrices into one of a set of d vectors of size (n+n′)×(n+n′), where n corresponds to a number of target values in the set of training samples and n′ corresponds to a number of test samples in the set of test samples.

6. The computer implemented method of claim 5, further comprising: generating a single matrix based on each of the set of d vectors, where each column of the single matrix is one of the set of d vectors, and where the single matrix is of a size (n+n′)×(n+n′)×d.

7. The computer implemented method of claim 6, wherein the subset of features are selected from the single matrix and the single vector.

8. The computer implemented method of claim 1, wherein the selecting is based on:

$$\min_{\alpha \in \mathbb{R}^{d}} \; \frac{1}{2}\left\lVert \bar{L}' - \sum_{k=1}^{d} \alpha_k \bar{K}'^{(k)} \right\rVert_{\mathrm{Frob}}^{2} + \lambda \lVert \alpha \rVert_{1}, \quad \text{s.t. } \alpha_1, \ldots, \alpha_d \ge 0,$$

where $\frac{1}{2}\lVert \bar{L}' - \ldots$
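
Claims 4 through 8 describe flattening the centered Gram matrices column by column, stacking the d feature vectors into a single matrix, and solving a non-negative lasso problem. The sketch below illustrates that flow under stated assumptions: the helper names (`stack_grams`, `nonneg_lasso`, `select_features`), the iteration count, and the choice of a projected proximal-gradient solver are illustrative and are not taken from the patent, which does not name a solver in the text shown here.

```python
import numpy as np

def stack_grams(K_list, L_bar):
    """Concatenate the columns of each matrix into one long vector (claims 4-6)."""
    # order="F" flattens column by column, matching "concatenating each column".
    A = np.column_stack([K.flatten(order="F") for K in K_list])  # ((n+n')^2, d)
    l = L_bar.flatten(order="F")                                 # ((n+n')^2,)
    return A, l

def nonneg_lasso(A, l, lam, n_iter=500):
    """Minimize 0.5*||l - A @ alpha||^2 + lam*||alpha||_1 subject to alpha >= 0."""
    lipschitz = np.linalg.norm(A, 2) ** 2  # largest eigenvalue of A^T A
    step = 1.0 / max(lipschitz, 1e-12)     # guard against an all-zero A
    alpha = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ alpha - l)
        # With alpha >= 0 the L1 penalty is linear, so the proximal step is
        # a shifted gradient step followed by clipping at zero.
        alpha = np.maximum(0.0, alpha - step * (grad + lam))
    return alpha

def select_features(K_list, L_bar, lam):
    """Return the indices of features with a strictly positive coefficient."""
    A, l = stack_grams(K_list, L_bar)
    alpha = nonneg_lasso(A, l, lam)
    return np.flatnonzero(alpha > 0), alpha
```

Because the Frobenius norm of a matrix equals the Euclidean norm of its flattened form, the vectorized problem has the same objective as the matrix expression in claim 8. A larger `lam` drives more coefficients to exactly zero and so selects fewer features.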

Assignees

Inventors

Classifications

  • G06N20/00 (Primary)

    Machine learning · CPC title

  • using kernel methods, e.g. support vector machines [SVM] · CPC title

  • based on approximation criteria, e.g. principal component analysis · CPC title

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • G06N99/005 (Primary)

    Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 11, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).