Secure Training of Multi-Party Deep Neural Network

US2017372201A1 · US · A1

Patent metadata
Publication number: US-2017372201-A1
Application number: US-201715630944-A
Country: US
Kind code: A1
Filing date: Jun 22, 2017
Priority date: Jun 22, 2016
Publication date: Dec 28, 2017
Grant date: (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A deep neural network may be trained on the data of one or more entities, also known as Alices. An outside computing entity, also known as a Bob, may assist in these computations, without receiving access to Alices' data. Data privacy may be preserved by employing a “split” neural network. The network may comprise an Alice part and a Bob part. The Alice part may comprise at least three neural layers, and the Bob part may comprise at least two neural layers. When training on data of an Alice, that Alice may input her data into the Alice part, perform forward propagation through the Alice part, and then pass output activations for the final layer of the Alice part to Bob. Bob may then forward propagate through the Bob part. Similarly, backpropagation may proceed backwards through the Bob part, and then through the Alice part of the network.
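
The exchange described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the patent's implementation. The layer sizes, the tanh activations, the squared-error loss, the learning rate, the toy dataset, and every function name here are assumptions made for illustration. The sketch also assumes Bob is given the training labels so that he can calculate the loss; only Alice's raw inputs stay with her.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """One dense layer: an n_out x n_in weight matrix and a zero bias vector."""
    std = 1.0 / math.sqrt(n_in)
    return {"w": [[rng.gauss(0.0, std) for _ in range(n_in)] for _ in range(n_out)],
            "b": [0.0] * n_out}


def forward(layers, x, linear_output=False):
    """Run x through the layers; return every activation, input first."""
    acts = [x]
    for i, layer in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, acts[-1])) + b
             for row, b in zip(layer["w"], layer["b"])]
        is_last = i == len(layers) - 1
        acts.append(z if (linear_output and is_last) else [math.tanh(v) for v in z])
    return acts


def backward(layers, acts, grad_out, lr, linear_output=False):
    """Backpropagate grad_out (dLoss/dOutput) through the layers, apply one
    SGD update, and return dLoss/dInput for this part of the network."""
    grad = grad_out
    for i in reversed(range(len(layers))):
        layer, a_in, a_out = layers[i], acts[i], acts[i + 1]
        if linear_output and i == len(layers) - 1:
            delta = list(grad)
        else:
            delta = [g * (1.0 - a * a) for g, a in zip(grad, a_out)]  # tanh'
        # Gradient for the layer below, computed before the weights change.
        grad = [sum(layer["w"][j][k] * delta[j] for j in range(len(delta)))
                for k in range(len(a_in))]
        for j, d in enumerate(delta):
            layer["b"][j] -= lr * d
            for k, a in enumerate(a_in):
                layer["w"][j][k] -= lr * d * a
    return grad


def train_step(alice_part, bob_part, x, y, lr=0.05):
    """One split training step; returns (loss, activations sent, gradients returned)."""
    # Alice: forward propagation on her private input x.
    alice_acts = forward(alice_part, x)
    sent_to_bob = alice_acts[-1]  # only these activations leave Alice
    # Bob: forward propagation, loss, and backpropagation through his part.
    bob_acts = forward(bob_part, sent_to_bob, linear_output=True)
    pred = bob_acts[-1][0]
    loss = 0.5 * (pred - y) ** 2
    sent_to_alice = backward(bob_part, bob_acts, [pred - y], lr, linear_output=True)
    # Alice: backpropagation, starting from the gradients Bob returned.
    backward(alice_part, alice_acts, sent_to_alice, lr)
    return loss, sent_to_bob, sent_to_alice


def mean_loss(alice_part, bob_part, data):
    """Average squared-error loss of the full split network over a dataset."""
    total = 0.0
    for x, y in data:
        hidden = forward(alice_part, x)[-1]
        pred = forward(bob_part, hidden, linear_output=True)[-1][0]
        total += 0.5 * (pred - y) ** 2
    return total / len(data)


rng = random.Random(0)
# Hypothetical sizes: three layers on Alice's side, two on Bob's side.
alice_part = [make_layer(2, 8, rng), make_layer(8, 8, rng), make_layer(8, 4, rng)]
bob_part = [make_layer(4, 8, rng), make_layer(8, 1, rng)]
# Toy dataset (assumption): target is the mean of the two inputs.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.5), ([1.0, 0.0], 0.5), ([1.0, 1.0], 1.0)]

loss_before = mean_loss(alice_part, bob_part, data)
for _ in range(300):
    for x, y in data:
        train_step(alice_part, bob_part, x, y)
loss_after = mean_loss(alice_part, bob_part, data)
print(f"mean loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

In this sketch the only values that cross the boundary are `sent_to_bob` (the activations of Alice's last layer) and `sent_to_alice` (the gradients at the input of Bob's first layer). Both parts run in one process here; in a real deployment they would sit on separate machines with a network transport between them.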

First claim

Opening claim text (preview).

What is claimed is:

1. A method of training a neural network, partially on a first set of one or more computers and partially on a second set of one or more computers, wherein: (a) the network comprises a first part and a second part, the first and second parts of the network being denoted in this claim as the “Bob part” and “Alice part”, respectively, and the first and second sets of computers being denoted in this claim as the “Bob computer” and “Alice computer”, respectively; (b) the Alice part of the network comprises three or more neural layers; (c) the Bob part of the network comprises two or more neural layers; (d) a first dataset is inputted into an input layer of the Alice part of the network; (e) the Alice computer performs forward propagation through the Alice part of the network; (f) output activations of an output layer of the Alice part of the network are sent to the Bob computer and are inputted into an input layer of the Bob part of the network; (g) the Bob computer performs forward propagation through the Bob part of the network; (h) the Bob computer calculates losses and gradients; (i) the Bob computer performs backpropagation through the Bob part of the network; (j) gradients of the input layer of the Bob part of the network are sent to the Alice computer; (k) the Alice computer performs backpropagation through the Alice part of the network; and (l) the Bob computer does not have access to the first dataset.

2. The method of claim 1, wherein, for one or more layers of the Bob part of the network, the Alice computer does not have access to any data that specifies topology of the one or more layers.

3. The method of claim 1, wherein the Alice computer does not have access to data that specifies any hyperparameter of the network that is in the group of hyperparameters consisting of learning rate, learning rate decay, weight decay, and momentum.

4. The method of claim 1, wherein: (a) an ensemble includes the neural network of claim 1 and also includes one or more additional neural networks; (b) for each respective additional network in the ensemble, (i) the respective network comprises a first portion and a second portion, the first and second portions of the respective network being denoted in this claim as the “Bob portion” and “Alice portion”, respectively, (ii) the Alice portion of the respective network comprises three or more neural layers, (iii) the Bob portion of the respective network comprises two or more neural layers, (iv) the first dataset is inputted into an input layer of the Alice portion of the respective network, (v) the Alice computer performs forward propagation through the Alice portion of the respective network, (vi) output activations of an output layer of the Alice portion of the respective network are sent to the Bob computer and inputted into an input layer of the Bob portion of the respective network, (vii) the Bob computer performs forward propagation through the Bob portion of the respective network, (viii) the Bob computer calculates losses and gradients, (ix) the Bob computer performs backpropagation through the Bob portion of the respective network, (x) gradients of the input layer of the Bob portion of the respective network are sent to the Alice computer, and (xi) the Alice computer performs backpropagation through the Alice portion of the respective network; and (c) during test mode, each network in the ensemble, respectively, outputs a classification, such that the networks in the ensemble collectively output a set of classifications; (d) based on the set of classifications, a classification is determined according to a voting function; and (e) none of the networks in the ensemble is identical to any other network in the ensemble.

5. The method of claim 1, wherein: (a) during test mode after the network is trained, the network takes, as input, a second dataset and outputs labels regarding the second dataset; (b) the labels are shared with an additional set of one or more computers; (c) the additional set of computers performs forward and back propagation in a second network while training the second network; (d) the additional set of computers trains the second network on the second dataset, by a training that includes employing the labels that were shared; (e) the additional set of computers does not have access to the first dataset; and (f) the first dataset is not identical to the second dataset.

6. The method of claim 5, wherein: (a) the network mentioned in claim 1 has a first topology; (b) the second network has a second topology; and (c) the first topology is different than the second topology.

7. A method of training a neural network, partially on a first set of one or more computers and partially on other sets of one or more computers each, wherein: (a) the network comprises a first part and a second part, the first and second parts of the network being denoted in this claim as the “Bob part” and “Alice part”, respectively, the first set of computers being denoted in this claim as the “Bob computer”, and each of the other sets of computers, respectively, being denoted in this claim as an “Alice computer”; (b) the Alice part of the network comprises three or more neural layers; (c) the Bob part of the network comprises two or more neural layers; (d) for each respective Alice computer (i) a dataset is inputted by the respective Alice computer into an input layer of the Alice part of the network; (ii) the respective Alice computer performs forward propagation through the Alice part of the network; (iii) output activations of an output layer of the Alice part of the network are sent to the Bob computer and inputted into an input layer of the Bob part of the network; (iv) the Bob computer performs forward propagation through the Bob part of the network; (v) the Bob computer calculates losses and gradients; (vi) the Bob computer performs backpropagation through the Bob part of the network; (vii) gradients of the input layer of the Bob part of the network are sent to the respective Alice computer, and (viii) the respective Alice computer performs backpropagation through the Alice part of the network; and (e) the Bob computer has access to no database inputted in clause (d)(i) of this claim.

8. The method of claim 7, wherein, for one or more layers of the Bob part of the network, none of the Alice computers have access to any data that specifies topology of the one or more layers.

9. The method of claim 7, wherein, for a set of hyperparameters of the network, none of the Alice computers have access to data that specifies any hyperparameter of the network that is in the group of hyperparameters consisting of learning rate, learning rate decay, weight decay, and momentum.

10. The method of claim 7, wherein each Alice computer has access to no database inputted in clause (d)(i) of claim 7 by any other Alice computer.

11. The method of claim 7, wherein, after a first Alice computer performs steps (d)(ii) and (d)(viii) of claim 7: (a) the first Alice computer uploads encrypted weights of the Alice part of the network to a server; and (b) a second Alice computer downloads the encrypted weights from the server.

12. The method of claim 7, wherein: (a) after a first Alice computer performs steps (d)(ii) and (d)(viii) of claim 7 (i) the first Alice computer uploads encrypted weight updates for the Alice part of the network to a server, and (ii) a second Alice computer downloads the encrypted weight updates from the server; and (b) each weight update, respectively, denotes a change in a given weig
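
The test-mode step of claim 4, where each network in the ensemble outputs a classification and a voting function determines the result, can be sketched as follows. The claim does not say which voting function is used. Plurality voting, the tie-breaking rule, and the name `majority_vote` are assumptions chosen for illustration.

```python
from collections import Counter


def majority_vote(classifications):
    """Return the most common label; ties go to the label that appears first."""
    if not classifications:
        raise ValueError("need at least one classification to vote on")
    counts = Counter(classifications)
    best = max(counts.values())
    for label in classifications:
        if counts[label] == best:
            return label


ensemble_outputs = ["cat", "dog", "cat"]  # hypothetical outputs of three networks
print(majority_vote(ensemble_outputs))  # prints "cat"
```

Breaking ties by ensemble order keeps the result deterministic; a weighted vote would be an equally plausible reading of the claim.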

Assignees

Massachusetts Inst Technology

Inventors

Classifications

  • Combinations of networks

  • G06N3/084 (Primary): Backpropagation, e.g. using gradient descent

  • G06N3/098 (Primary): Distributed learning, e.g. federated learning

  • Supervised learning

  • Convolutional networks [CNN, ConvNet]

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2017372201A1 cover?
A deep neural network may be trained on the data of one or more entities, also known as Alices. An outside computing entity, also known as a Bob, may assist in these computations, without receiving access to Alices' data. Data privacy may be preserved by employing a “split” neural network. The network may comprise an Alice part and a Bob part. The Alice part may comprise at least three neural la…
Who is the assignee on this patent?
Massachusetts Inst Technology
What technology area does this patent fall under?
Primary CPC classification G06N3/084. Mapped technology areas include Physics.
When was this patent published?
Published on Dec 28, 2017 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).