Secure storage and processing of data for generating training data
US-2022253540-A1 · Aug 11, 2022 · US
US11792167B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11792167-B2 |
| Application number | US-202117219482-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 31, 2021 |
| Priority date | Mar 31, 2021 |
| Publication date | Oct 17, 2023 |
| Grant date | Oct 17, 2023 |
Official abstract text for this publication.
Techniques for a flexible data security and machine learning system for merging third-party data are provided. In one technique, the system receives a data set from a third-party entity and receives selection data that indicates that the third-party entity selected a set of data security policies that includes an encryption option and a data mixing option from among multiple data mixing options. In response to receiving the selection data, the system stores data that associates the set of data security policies with the data set, encrypts the data set according to the encryption option, and persistently stores the encrypted data set. Later, the system decrypts the encrypted data set in volatile memory, generates, based on the data mixing option, training data based on the decrypted version of the data set, trains a machine-learned model based on the training data, and stores the machine-learned model in association with the data set.
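The step in which the system "stores data that associates the set of data security policies with the data set" can be pictured with a small data model. The following is a minimal sketch, not the patented implementation: the option names are taken from claims 2 and 4 of this publication, while `SecurityPolicies`, `PolicyRegistry`, the method names, and the data set identifiers are hypothetical names introduced only for illustration. Encryption itself is deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum


class AccessibilityOption(Enum):
    # Option names as listed in claim 2.
    ENCRYPTION_ON_STORAGE = "encryption-on-storage"
    END_TO_END_ENCRYPTION = "end-to-end encryption"
    NO_ENCRYPTION = "no encryption"


class MixingOption(Enum):
    # Option names as listed in claim 4.
    COMPLETE_SEPARATION = "complete separation"
    SHARING_OF_COEFFICIENTS = "sharing of coefficients"
    COMBINABLE_WITH_PUBLIC_DATA = "combinable with public data"
    MERGEABLE_WITH_SIMILAR_ENTITIES = "mergeable with data sets from similar entities"
    SAMPLE_BASED_MERGEABLE = "sample-based mergeable"
    MERGEABLE_WITH_ANY = "mergeable-with-any-data set"


@dataclass(frozen=True)
class SecurityPolicies:
    """Hypothetical record of the policies one entity selected."""
    accessibility: AccessibilityOption
    mixing: MixingOption


class PolicyRegistry:
    """Hypothetical in-memory store associating data set ids with policies."""

    def __init__(self):
        self._by_dataset = {}

    def associate(self, dataset_id, policies):
        # Record which policies govern the given data set.
        self._by_dataset[dataset_id] = policies

    def policies_for(self, dataset_id):
        return self._by_dataset[dataset_id]

    def mergeable_with_any(self):
        # Ids of data sets whose owners selected the mergeable-with-any option.
        return sorted(
            dataset_id
            for dataset_id, policies in self._by_dataset.items()
            if policies.mixing is MixingOption.MERGEABLE_WITH_ANY
        )


# Illustrative data set ids; none of these appear in the publication.
registry = PolicyRegistry()
registry.associate(
    "acme-2021",
    SecurityPolicies(AccessibilityOption.ENCRYPTION_ON_STORAGE, MixingOption.MERGEABLE_WITH_ANY),
)
registry.associate(
    "globex-2021",
    SecurityPolicies(AccessibilityOption.ENCRYPTION_ON_STORAGE, MixingOption.COMPLETE_SEPARATION),
)
registry.associate(
    "initech-2021",
    SecurityPolicies(AccessibilityOption.NO_ENCRYPTION, MixingOption.MERGEABLE_WITH_ANY),
)
```

With a record like this, the training stage could look up a data set's mixing option before deciding which other data sets, if any, may contribute to its training data.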
Opening claim text (preview).
What is claimed is:

1. A method comprising: receiving, from a third-party entity, a data set; receiving selection data that indicates that the third-party entity selected a set of data security policies that includes an encryption option and a data mixing option from among a plurality of data mixing options; in response to receiving the selection data: storing data that associates the set of data security policies with the data set; encrypting the data set according to the encryption option to generate an encrypted data set; storing the encrypted data set in persistent storage; after storing the encrypted data set: reading the encrypted data set from the persistent storage into volatile memory; based on the data mixing option that is associated with the data set, generating training data based on the encrypted data set and training a machine-learned model based on the training data; storing the machine-learned model in association with the data set and the third-party entity; wherein the method is performed by one or more computing devices.

2. The method of claim 1, wherein the encryption option is an encryption-on-storage option that is one of a plurality of data accessibility options that includes one or more of: end-to-end encryption or no encryption.

3. The method of claim 1, wherein the set of data security policies includes a data storage option from among a plurality of data storage options; the plurality of data storage options including at least one storage option selected from a group comprising: a no storage separation policy; a logically-separated storage policy; or a physically-separate storage policy; and combinations thereof.

4. The method of claim 1, wherein the plurality of data mixing options include two or more of: complete separation, sharing of coefficients, combinable with public data, mergeable with data sets from similar entities, sample-based mergeable, or mergeable-with-any-data set.

5. The method of claim 1, wherein: the data set is a first data set and the third-party entity is a first third-party entity; the data mixing option is a sharing of coefficients option; the machine-learned model is a third machine-learned model; the method further comprising: generating first training data based on the data set and no other data set; using one or more machine learning techniques to generate a first machine-learned model based on the first training data; generating second training data based on a second data set from a second third-party entity that is different than the first third-party entity; using the one or more machine learning techniques to generate a second machine-learned model based on the second training data; the first machine-learned model comprises a first set of coefficients; the second machine-learned model comprises a second set of coefficients that is different than the first set of coefficients; generating the third machine-learned model comprises aggregating the first set of coefficients and the second set of coefficients to generate a third set of coefficients; the third machine-learned model comprises the third set of coefficients.

6. The method of claim 1, wherein: the data mixing option is a combinable-with-public data option; the machine-learned model is a third machine-learned model; the method further comprising: generating first training data based on profile data that was uploaded to a content sharing platform by a plurality of users of the content sharing platform; generating a first machine-learned model based on the first training data; generating second training data based on the data set; generating a second machine-learned model based on the second training data; the third machine-learned model is based on the first machine-learned model and the second machine-learned model.

7. The method of claim 1, wherein: the data mixing option is a mergeable-with-data sets-from-similar-entities option; the method further comprising: identifying a plurality of data sets that includes the data set and that are similar in size with each other; generating training data based on the plurality of data sets; generating the machine-learned model is based on the training data using one or more machine learning techniques; storing the machine-learned model in association with the data set and the third-party entity comprises storing the machine-learned model in associated with each data set in the plurality of data sets and with each third-party entity that provided a data set in the plurality of data sets.

8. The method of claim 1, wherein: the data mixing option is a sample-based mergeable option; the method further comprising: identifying a plurality of data sets that includes the data set and one or more other data sets; for each data set in the plurality of data sets: retrieving a sample from the data set; adding the sample to a sample set, wherein a size of each sample in the sample set is approximately the same; generating training data based on the sample set; generating the machine-learned model is based on the training data using one or more machine learning techniques; storing the machine-learned model in association with the data set and the third-party entity comprises storing the machine-learned model in associated with each data set in the plurality of data sets and with each third-party entity that provided a data set in the plurality of data sets.

9. The method of claim 1, wherein: the data mixing option is a mergeable-with-any-data set option; the method further comprising: identifying a plurality of data sets that includes the data set and one or more other data sets that are also associated with the mergeable-with-any-data set option; generating training data based on the plurality of data sets; generating the machine-learned model is based on the training data using one or more machine learning techniques; storing the machine-learned model in association with the data set and the third-party entity comprises storing the machine-learned model in associated with each data set in the plurality of data sets and with each third-party entity that provided a data set in the plurality of data sets.

10. The method of claim 1, further comprising: causing a user interface to be presented on a screen of a computing device; wherein the user interface indicates (1) a plurality of data accessibility options that includes the encryption option and (2) the plurality of data mixing options; wherein a user of the computing device selects, through the user interface, the encryption option and the data mixing option.

11. The method of claim 1, further comprising: receiving second selection data that indicates that a second third-party entity selected a second set of data security policies that includes a data accessibility option from among a plurality of data accessibility options and a second data mixing option from among the plurality of data mixing options; determining whether the data accessibility option conflicts with the second data mixing option; in response to determining that the data accessibility option conflicts with the second data mixing option, generating a notification that indicates that a conflict exists and causing the notification to be presented on a computing device.

12. The method of claim 1, wherein the data mixing option is a first data mixing option, the method further comprising: receiving input that indicates the third-party entity selects a second data mixing option that is different than the first data mixing option; in response to receiving the input: updating the set of data security policies to indicate th
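
Two of the data mixing options in the claims above lend themselves to a short illustration: the sharing-of-coefficients option of claim 5 and the sample-based mergeable option of claim 8. The sketch below rests on stated assumptions. Claim 5 only says the coefficient sets are "aggregated", so the element-wise mean used here is an assumed aggregation rule, not one the claim specifies. Likewise, claim 8 only requires samples of approximately the same size, so uniform random sampling with a fixed cap is an assumption. The function names `aggregate_coefficients` and `build_sample_set` and their parameters are hypothetical.

```python
import random


def aggregate_coefficients(coefficient_sets):
    """Combine per-entity coefficient lists into one list.

    Assumption: aggregation is an element-wise mean. The claim text does
    not say which aggregation is used.
    """
    if not coefficient_sets:
        raise ValueError("need at least one coefficient set")
    length = len(coefficient_sets[0])
    if any(len(coeffs) != length for coeffs in coefficient_sets):
        raise ValueError("coefficient sets must have the same length")
    count = len(coefficient_sets)
    # zip(*...) groups the i-th coefficient of every model together.
    return [sum(column) / count for column in zip(*coefficient_sets)]


def build_sample_set(datasets, sample_size, seed=0):
    """Draw a similarly sized random sample from each data set.

    `datasets` maps a hypothetical data set id to a list of rows. A data
    set with fewer than `sample_size` rows contributes all of its rows.
    """
    rng = random.Random(seed)
    sample_set = []
    for rows in datasets.values():
        take = min(sample_size, len(rows))
        sample_set.extend(rng.sample(rows, take))
    return sample_set


# Coefficients of two separately trained models (made-up numbers).
third_model_coefficients = aggregate_coefficients([[1.0, 2.0], [3.0, 4.0]])

# One large and one small data set (made-up rows).
samples = build_sample_set(
    {"large": list(range(10)), "small": list(range(100, 103))},
    sample_size=3,
)
```

Capping each contribution at `sample_size` keeps a large data set from dominating the merged training data, which appears to be the purpose of requiring similarly sized samples.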
wherein the data content is protected, e.g. by encrypting or encapsulating the payload · CPC title
Machine learning · CPC title
for managing network security; network security policies in general (filtering policies H04L63/0227) · CPC title