Who is the assignee on this patent?

Microsoft Technology Licensing, LLC

What technology area does this patent fall under?

Primary CPC classification H04L63/0428. Mapped technology areas include Electricity.

When was this patent published?

Publication date Oct 17, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Flexible data security and machine learning system for merging third-party data

US11792167B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11792167-B2
Application number	US-202117219482-A
Country	US
Kind code	B2
Filing date	Mar 31, 2021
Priority date	Mar 31, 2021
Publication date	Oct 17, 2023
Grant date	Oct 17, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques for a flexible data security and machine learning system for merging third-party data are provided. In one technique, the system receives a data set from a third-party entity and receives selection data that indicates that the third-party entity selected a set of data security policies that includes an encryption option and a data mixing option from among multiple data mixing options. In response to receiving the selection data, the system stores data that associates the set of data security policies with the data set, encrypts the data set according to the encryption option, and persistently stores the encrypted data set. Later, the system decrypts the encrypted data set in volatile memory, generates, based on the data mixing option, training data based on the decrypted version of the data set, trains a machine-learned model based on the training data, and stores the machine-learned model in association with the data set.
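The order of operations in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the class and method names (`SecurityPolicies`, `Platform`, `ingest`, `train`) are hypothetical, the dictionaries stand in for persistent storage, the "model" is just a mean, and the cipher is passed in as two callables because the abstract does not name an algorithm. The byte-reversal used in the demo is a placeholder and is not encryption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class SecurityPolicies:
    """Hypothetical record of what the third-party entity selected."""
    encryption_option: str   # e.g. "encryption-on-storage"
    data_mixing_option: str  # e.g. "complete separation"


class Platform:
    """Hypothetical platform; dicts stand in for persistent storage."""

    def __init__(self, encrypt: Callable[[bytes], bytes],
                 decrypt: Callable[[bytes], bytes]) -> None:
        self._encrypt = encrypt
        self._decrypt = decrypt
        self.policies: Dict[str, SecurityPolicies] = {}
        self.owners: Dict[str, str] = {}
        self.persistent: Dict[str, bytes] = {}
        self.models: Dict[Tuple[str, str], float] = {}

    def ingest(self, entity_id: str, data_set_id: str,
               rows: List[float], policies: SecurityPolicies) -> None:
        # Associate the selected policies with the data set.
        self.policies[data_set_id] = policies
        self.owners[data_set_id] = entity_id
        # Encrypt before anything is written to "persistent storage".
        plaintext = ",".join(str(r) for r in rows).encode()
        self.persistent[data_set_id] = self._encrypt(plaintext)

    def train(self, data_set_id: str) -> float:
        policies = self.policies[data_set_id]
        # Decrypt into a local variable only (the "volatile memory" step).
        plaintext = self._decrypt(self.persistent[data_set_id])
        rows = [float(x) for x in plaintext.decode().split(",")]
        if policies.data_mixing_option != "complete separation":
            raise NotImplementedError("only one mixing option is sketched")
        model = sum(rows) / len(rows)  # stand-in for real model training
        # Store the model in association with the entity and the data set.
        self.models[(self.owners[data_set_id], data_set_id)] = model
        return model


# Demo with a placeholder transform (NOT encryption).
reverse = lambda data: data[::-1]
platform = Platform(encrypt=reverse, decrypt=reverse)
platform.ingest("entity-a", "ds1", [1.0, 2.0, 3.0],
                SecurityPolicies("encryption-on-storage", "complete separation"))
model = platform.train("ds1")
```

In a real system the two callables would wrap an authenticated cipher from a vetted library, and the mixing option would select among several training-data builders rather than a single branch.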

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving, from a third-party entity, a data set; receiving selection data that indicates that the third-party entity selected a set of data security policies that includes an encryption option and a data mixing option from among a plurality of data mixing options; in response to receiving the selection data: storing data that associates the set of data security policies with the data set; encrypting the data set according to the encryption option to generate an encrypted data set; storing the encrypted data set in persistent storage; after storing the encrypted data set: reading the encrypted data set from the persistent storage into volatile memory; based on the data mixing option that is associated with the data set, generating training data based on the encrypted data set and training a machine-learned model based on the training data; storing the machine-learned model in association with the data set and the third-party entity; wherein the method is performed by one or more computing devices.

2. The method of claim 1, wherein the encryption option is an encryption-on-storage option that is one of a plurality of data accessibility options that includes one or more of: end-to-end encryption or no encryption.

3. The method of claim 1, wherein the set of data security policies includes a data storage option from among a plurality of data storage options; the plurality of data storage options including at least one storage option selected from a group comprising: a no storage separation policy; a logically-separated storage policy; or a physically-separate storage policy; and combinations thereof.

4. The method of claim 1, wherein the plurality of data mixing options include two or more of: complete separation, sharing of coefficients, combinable with public data, mergeable with data sets from similar entities, sample-based mergeable, or mergeable-with-any-data set.

5. The method of claim 1, wherein: the data set is a first data set and the third-party entity is a first third-party entity; the data mixing option is a sharing of coefficients option; the machine-learned model is a third machine-learned model; the method further comprising: generating first training data based on the data set and no other data set; using one or more machine learning techniques to generate a first machine-learned model based on the first training data; generating second training data based on a second data set from a second third-party entity that is different than the first third-party entity; using the one or more machine learning techniques to generate a second machine-learned model based on the second training data; the first machine-learned model comprises a first set of coefficients; the second machine-learned model comprises a second set of coefficients that is different than the first set of coefficients; generating the third machine-learned model comprises aggregating the first set of coefficients and the second set of coefficients to generate a third set of coefficients; the third machine-learned model comprises the third set of coefficients.

6. The method of claim 1, wherein: the data mixing option is a combinable-with-public data option; the machine-learned model is a third machine-learned model; the method further comprising: generating first training data based on profile data that was uploaded to a content sharing platform by a plurality of users of the content sharing platform; generating a first machine-learned model based on the first training data; generating second training data based on the data set; generating a second machine-learned model based on the second training data; the third machine-learned model is based on the first machine-learned model and the second machine-learned model.

7. The method of claim 1, wherein: the data mixing option is a mergeable-with-data sets-from-similar-entities option; the method further comprising: identifying a plurality of data sets that includes the data set and that are similar in size with each other; generating training data based on the plurality of data sets; generating the machine-learned model is based on the training data using one or more machine learning techniques; storing the machine-learned model in association with the data set and the third-party entity comprises storing the machine-learned model in associated with each data set in the plurality of data sets and with each third-party entity that provided a data set in the plurality of data sets.

8. The method of claim 1, wherein: the data mixing option is a sample-based mergeable option; the method further comprising: identifying a plurality of data sets that includes the data set and one or more other data sets; for each data set in the plurality of data sets: retrieving a sample from the data set; adding the sample to a sample set, wherein a size of each sample in the sample set is approximately the same; generating training data based on the sample set; generating the machine-learned model is based on the training data using one or more machine learning techniques; storing the machine-learned model in association with the data set and the third-party entity comprises storing the machine-learned model in associated with each data set in the plurality of data sets and with each third-party entity that provided a data set in the plurality of data sets.

9. The method of claim 1, wherein: the data mixing option is a mergeable-with-any-data set option; the method further comprising: identifying a plurality of data sets that includes the data set and one or more other data sets that are also associated with the mergeable-with-any-data set option; generating training data based on the plurality of data sets; generating the machine-learned model is based on the training data using one or more machine learning techniques; storing the machine-learned model in association with the data set and the third-party entity comprises storing the machine-learned model in associated with each data set in the plurality of data sets and with each third-party entity that provided a data set in the plurality of data sets.

10. The method of claim 1, further comprising: causing a user interface to be presented on a screen of a computing device; wherein the user interface indicates (1) a plurality of data accessibility options that includes the encryption option and (2) the plurality of data mixing options; wherein a user of the computing device selects, through the user interface, the encryption option and the data mixing option.

11. The method of claim 1, further comprising: receiving second selection data that indicates that a second third-party entity selected a second set of data security policies that includes a data accessibility option from among a plurality of data accessibility options and a second data mixing option from among the plurality of data mixing options; determining whether the data accessibility option conflicts with the second data mixing option; in response to determining that the data accessibility option conflicts with the second data mixing option, generating a notification that indicates that a conflict exists and causing the notification to be presented on a computing device.

12. The method of claim 1, wherein the data mixing option is a first data mixing option, the method further comprising: receiving input that indicates the third-party entity selects a second data mixing option that is different than the first data mixing option; in response to receiving the input: updating the set of data security policies to indicate th
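Two of the data mixing options in the claims can be illustrated with a short Python sketch: the sharing-of-coefficients option (claim 5) and the sample-based mergeable option (claim 8). The function names are hypothetical, and the specifics are assumptions: the claim says only "aggregating", so the element-wise mean below is one possible choice, and a fixed per-data-set sample size with a seeded random generator is one way to keep samples "approximately the same" size.

```python
import random
from typing import Dict, List, Sequence


def aggregate_coefficients(first: Sequence[float],
                           second: Sequence[float]) -> List[float]:
    """Combine two models' coefficient sets into a third set.

    Assumption: aggregation is an element-wise mean. The claim does not
    specify the aggregation function.
    """
    if len(first) != len(second):
        raise ValueError("coefficient sets must have the same length")
    return [(a + b) / 2 for a, b in zip(first, second)]


def build_sample_set(data_sets: Dict[str, List[int]],
                     sample_size: int, seed: int = 0) -> List[int]:
    """Draw a similarly sized sample from every data set and merge them.

    Assumption: each data set contributes at most `sample_size` rows,
    so a large data set cannot dominate the merged training data.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    sample_set: List[int] = []
    for name in sorted(data_sets):
        rows = data_sets[name]
        k = min(sample_size, len(rows))
        sample_set.extend(rng.sample(rows, k))
    return sample_set


# Coefficients learned separately per entity, then aggregated.
third = aggregate_coefficients([1.0, 3.0], [3.0, 5.0])

# A large and a small data set each contribute five rows.
merged = build_sample_set(
    {"large": list(range(100)), "small": list(range(1000, 1010))},
    sample_size=5,
)
```

With coefficient sharing, only model parameters cross the boundary between entities; with sample-based merging, raw rows are combined, but in bounded and roughly equal amounts per data set.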

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

H04L63/0428 (Primary) · wherein the data content is protected, e.g. by encrypting or encapsulating the payload
G06N20/00 · Machine learning
H04L63/20 · for managing network security; network security policies in general (filtering policies H04L63/0227)

Patent family

Related publications grouped by family.

View patent family 83448494

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11792167B2 cover?: Techniques for a flexible data security and machine learning system for merging third-party data are provided. In one technique, the system receives a data set from a third-party entity and receives selection data that indicates that the third-party entity selected a set of data security policies that includes an encryption option and a data mixing option from among multiple data mixing options…

Related patents

Secure storage and processing of data for generating training data

Systems and methods for executing data protection policies specific to a classified organizational structure

Systems and methods for computing with private healthcare data

Inferring user demographics from user behavior using Bayesian inference

Generating and training machine learning systems using stored training datasets

Input processing for machine learning
