Records Access and Management
US-2024419838-A1 · Dec 19, 2024 · US
US2016283735A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016283735-A1 |
| Application number | US-201514667163-A |
| Country | US |
| Kind code | A1 |
| Filing date | Mar 24, 2015 |
| Priority date | Mar 24, 2015 |
| Publication date | Sep 29, 2016 |
| Grant date | — |
Official abstract text for this publication.
A system, method and computer program product for generating a classification model using original data that is sensitive or private to a data owner. The method includes: receiving, from one or more entities, a masked data set having masked data corresponding to the original sensitive data, and further including a masked feature label set for use in classifying the masked data contents; forming a shared data collection of the masked data and the masked feature label sets received; and training, by a second entity, a classification model from the shared masked data and feature label sets, wherein the classification model learned from the shared masked data and feature label sets is the same as a classification model learned from the original sensitive data. The sensitive features and labels cannot be reliably recovered even when both the masked data and the learning algorithm are known.
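The data flow in the abstract (several entities submit masked data and masked labels, the submissions are pooled, and a second entity trains on the pool) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the masking step is not implemented, the numbers are hypothetical stand-ins for already-masked rows, and `pool_submissions`, `train_logistic`, and `predict` are names invented here. Logistic regression is also an assumption, since the abstract names no specific classifier. The sketch does not demonstrate the abstract's claim that the trained model equals one learned from the original data.

```python
import math


def pool_submissions(submissions):
    """Form the shared collection: concatenate (features, labels) pairs from every entity."""
    features, labels = [], []
    for entity_features, entity_labels in submissions:
        if len(entity_features) != len(entity_labels):
            raise ValueError("each submission needs exactly one label per row")
        features.extend(entity_features)
        labels.extend(entity_labels)
    return features, labels


def train_logistic(features, labels, lr=0.1, epochs=500):
    """Batch gradient descent on the logistic loss; labels are -1 or +1."""
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, label in zip(features, labels):
            score = sum(w * x for w, x in zip(weights, row)) + bias
            # Derivative of log(1 + exp(-label * score)) with respect to score.
            coeff = -label / (1.0 + math.exp(label * score))
            for k in range(n_features):
                grad_w[k] += coeff * row[k]
            grad_b += coeff
        weights = [w - lr * g / len(features) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(features)
    return weights, bias


def predict(model, row):
    """Return +1 or -1 for one row of (masked) features."""
    weights, bias = model
    return 1 if sum(w * x for w, x in zip(weights, row)) + bias >= 0 else -1


# Hypothetical, already-masked submissions from two first entities.
entity_a = ([[2.0, 1.0], [1.5, 2.0]], [1, 1])
entity_b = ([[-1.0, -2.0], [-2.0, -1.5]], [-1, -1])

shared_features, shared_labels = pool_submissions([entity_a, entity_b])
model = train_logistic(shared_features, shared_labels)
```

The second entity only ever sees the pooled rows, which is the property the abstract relies on: training needs no access to the original records.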
Opening claim text (preview).
1-7. (canceled)

8. A system for generating a classification model using original data that is sensitive or private to a data owner comprising: a memory storage device; a hardware processor in communication with said memory storage device, the hardware processor configured to perform a method to: receive, from one or more first entities, a masked data set, each data set from an entity having masked data corresponding to the original sensitive data, the masked data set further including a masked feature label set for use in classifying the masked data contents; form a shared data collection of the masked data and the masked feature label sets received from the first entities; and train, by a second entity, a classification model from the shared masked data and feature label sets, the model being a classification model configured to classify original sensitive data contained in masked data sets received from the entities, wherein the classification model learned from the shared masked data and feature label sets is the same as the model learned from the original sensitive data.

9. The system of claim 8, wherein said processor device is further configured to generate said masked data by: access, from a computing device associated with a first entity, one or more records having original data sensitive to a data owner; generate an original data matrix of original data content including sensitive features and a corresponding feature label set for use in classifying said feature data; generate a random feature matrix sharing the same subspace as said sensitive features of original data matrix; compute an intermediate data structure as a product of said original data feature set matrix and said generated random feature matrix; compute one or more further intermediate data structures; form a convex optimization problem having an objective function based on said intermediate data structure, said original data matrix of original data content, said corresponding feature label set, and said one or more further intermediate data structures; and solve said convex optimization problem, said solving generating said masked matrix data feature set and masked feature label set.

10. The system of claim 9, wherein to compute said one or more further intermediate data structures, said processor device is further configured to: compute a low-rank soft feature data matrix and a corresponding soft class labels vector, the formed low-rank soft feature data matrix having denoised features and class labels of the sensitive data, said low-rank soft feature data matrix having entries that include the original data matrix of original data content having an added noise component.

11. The system of claim 9, wherein to compute said one or more further intermediate data structures, said processor device is further configured to: compute a first loss function for a feature according to:

$$\mathcal{L}_A(A, \tilde{A}) = \sum_i \sum_j \left( A_{ij} - \tilde{A}_{ij} \right)^2$$

where $A_{ij}$ and $\tilde{A}_{ij}$ represent the feature matrix $A$ and low-ranked feature matrix, respectively, and $i$ and $j$ are indices into the respective feature set matrices; and compute a second loss function according to:

$$\mathcal{L}_b(b, \tilde{b}) = \sum_i \sum_j \frac{1}{\gamma} \log\left\{ 1 + \exp\left[ -\gamma \left( b_{ij} \tilde{b}_{ij} \right) \right] \right\}$$

where $b_{ij}$ and $\tilde{b}_{ij}$ represent a class and low-rank class label vector set, and $\gamma$ is a variable.

12. The system of claim 11, wherein said processor device is configured to form said convex optimization problem as an objective function according to: $\min_{C,\, d,\, \tilde{A},\, \tilde{b}} \; \mu\, [$ …
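
The two loss functions recited in claim 11 can be evaluated directly. The sketch below is a plain-Python illustration, not the patent's implementation: the function names `feature_loss` and `label_loss` are hypothetical, matrices are assumed to be nested lists of equal shape, and the label set is indexed by both i and j as in the claim text. It does not attempt the convex optimization problem of claims 9 and 12, whose objective is truncated in this preview.

```python
import math


def feature_loss(A, A_tilde):
    """Sum of squared differences between a feature matrix and its low-rank counterpart."""
    return sum(
        (a - a_t) ** 2
        for row, row_t in zip(A, A_tilde)
        for a, a_t in zip(row, row_t)
    )


def label_loss(b, b_tilde, gamma):
    """Scaled logistic loss, summed over all label entries: (1/gamma) * log(1 + exp(-gamma * b * b_tilde))."""
    if gamma <= 0:
        raise ValueError("gamma is assumed positive in this sketch")
    total = 0.0
    for row, row_t in zip(b, b_tilde):
        for b_ij, bt_ij in zip(row, row_t):
            total += math.log(1.0 + math.exp(-gamma * b_ij * bt_ij)) / gamma
    return total


# Hypothetical toy inputs: two records, two features, one label column.
A = [[1.0, 2.0], [3.0, 4.0]]
A_tilde = [[1.0, 1.5], [2.5, 4.0]]
b = [[1.0], [-1.0]]
b_tilde = [[0.8], [-0.6]]

loss_A = feature_loss(A, A_tilde)
loss_b = label_loss(b, b_tilde, gamma=1.0)
```

The first loss penalizes entry-wise deviation of the low-rank features from the originals, while the second shrinks toward zero when each soft label agrees in sign with the original label and grows when they disagree.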
by anonymising data, e.g. decorrelating personal data from the owner's identification · CPC title
Physics · mapped topic
Protecting personal data, e.g. for financial or medical purposes · CPC title
Machine learning · CPC title