Pattern processing apparatus and method, and program
US-9117111-B2 · Aug 25, 2015 · US
US9400918B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9400918-B2 |
| Application number | US-201414375668-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 29, 2014 |
| Priority date | May 29, 2014 |
| Publication date | Jul 26, 2016 |
| Grant date | Jul 26, 2016 |
Official abstract text for this publication.
A deep learning framework jointly optimizes the compactness and discriminative ability of face representations. The compact representation can be as compact as 32 bits and still produce highly discriminative performance. In another aspect, based on the extreme compactness, traditional face analysis tasks (e.g. gender analysis) can be effectively solved by a Look-Up-Table approach given a large-scale face data set.
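The Look-Up-Table idea in the abstract can be illustrated with a minimal sketch. It assumes one bit per dimension obtained by thresholding at zero, a dictionary in place of a dense table, and a majority vote per code; the helper names `pack_bits`, `build_lookup_table`, and `classify` are hypothetical and do not come from the patent, and the sample values are made up.

```python
from collections import Counter, defaultdict


def pack_bits(features):
    """Quantize each dimension to one bit (1 if value > 0) and pack into an int.

    Thresholding at zero is an assumption of this sketch.
    """
    if len(features) > 32:
        raise ValueError("this sketch assumes at most 32 dimensions")
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def build_lookup_table(codes, labels):
    """Map each observed code to the label most often seen with that code."""
    votes = defaultdict(Counter)
    for code, label in zip(codes, labels):
        votes[code][label] += 1
    return {code: counter.most_common(1)[0][0] for code, counter in votes.items()}


def classify(table, code, default=None):
    """Classify by direct lookup; codes never seen in training return `default`."""
    return table.get(code, default)


# Made-up 3-dimensional example: three training faces share code 0b101.
example_code = pack_bits([0.5, -1.0, 2.0])  # -> 5
table = build_lookup_table([5, 5, 5, 2], ["a", "a", "b", "b"])
prediction = classify(table, example_code)  # -> "a"
```

A dictionary keeps the sketch small; a dense array indexed by the code would also work when the code space is small enough to hold in memory.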
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for training a deep learning neural network for compact face representations, the method comprising: presenting face images to the neural network, wherein the neural network is a pyramid convolutional neural network (CNN) comprising at least N shared layers where N≧2 and at least one unshared network coupled to the Nth shared layer; the neural network processing the face images to produce compact representations of the face images, wherein the compact representations have not more than 64 dimensions; processing the compact representations to produce estimates of a metric, for which actual values of the metric are known; and training the neural network based on the estimates of the metric compared to the actual values of the metric, wherein training the pyramid CNN comprises: training CNN levels 1 to N in that order, wherein CNN level n comprises an input for receiving the face images, the first n shared layers of the pyramid CNN, the unshared network of the pyramid CNN, and an output producing the compact representations of the face images; wherein the input is coupled to a first of the n shared layers; each shared layer includes convolution, non-linearity and down-sampling; an nth of the n shared layers is coupled to the unshared network; and the unshared network is coupled to the output, wherein training CNN level n comprises: presenting face images to the input, each face image producing the corresponding compact representation at the output, processing the compact representations to produce estimates of a metric, for which actual values of the metric are known, and adapting the nth shared layer and the unshared network based on the estimates of the metric compared to the actual values of the metric.
2. The computer-implemented method of claim 1 wherein the compact representations have a predetermined number of dimensions.
3. The computer-implemented method of claim 1 wherein the compact representations are quantized.
4. The computer-implemented method of claim 3 wherein training the neural network comprises modeling quantization of the compact representations as a rounding error term.
5. The computer-implemented method of claim 3 wherein training the neural network comprises modeling quantization of the compact representations as random variables and using dynamic programming to compute an expected value based on the random variables.
6. The computer-implemented method of claim 3 wherein training the neural network comprises training the neural network based on an objective function that includes a standard deviation term.
7. The computer-implemented method of claim 3 wherein the compact representations are quantized to a single bit for each dimension of the compact representation.
8. The computer-implemented method of claim 3 wherein the compact representations are quantized to multiple bits for each dimension of the compact representation.
9. The computer-implemented method of claim 3 wherein the compact representations are quantized to different numbers of bits for different dimensions of the compact representation.
10. The computer-implemented method of claim 1 wherein the compact representations are not more than 64 bytes each.
11. The computer-implemented method of claim 10 wherein the neural network is trained subject to the compact representations having not more than 64 bytes each.
12. The computer-implemented method of claim 1 wherein: presenting face images to the input comprises presenting pairs of face images to the input, where it is known whether the two faces images in each pair are for a same person; the metric is whether the two face images of each pair are for a same person; and adapting comprises adapting the nth shared layer and the unshared network based on the estimated metric of whether two face images of each pair are for a same person compared to the known value of whether the two face images are actually for the same person, and further comprises not adapting the first (n−1) shared layers.
13. A computer-implemented method for processing a face image, the method comprising: presenting a face image to the deep learning neural network trained according to the computer-implemented method of claim 1; and obtaining a compact representation of the face image at an output of the deep learning neural network.
14. The computer-implemented method of claim 13 further comprising: using the compact representation of the face image to perform face recognition.
15. The computer-implemented method of claim 14 further comprising: using the compact representation of the face image to prioritize a large scale face search; and performing further search for the face image based on the prioritization.
16. The computer-implemented method of claim 13 further comprising: using the compact representation of the face image to classify the face image.
17. The computer-implemented method of claim 16 wherein using the compact representation comprises: using the compact representation as an index into a lookup table for the classification.
18. A non-transitory computer readable medium configured to store program code, the program code comprising instructions for training a deep learning neural network for compact face representations, the instructions when executed by a processor cause the processor to execute a method comprising: presenting face images to the neural network, wherein the neural network is a pyramid convolutional neural network (CNN) comprising at least N shared layers where N≧2 and at least one unshared network coupled to the Nth shared layer; the neural network processing the face images to produce compact representations of the face images, wherein the compact representations have not more than 64 dimensions; processing the compact representations to produce estimates of a metric, for which actual values of the metric are known; and training the neural network based on the estimates of the metric compared to the actual values of the metric, wherein training the pyramid CNN comprises: training CNN levels 1 to N in that order, wherein CNN level n comprises an input for receiving the face images, the first n shared layers of the pyramid CNN, the unshared network of the pyramid CNN, and an output producing the compact representations of the face images; wherein the input is coupled to a first of the n shared layers; each shared layer includes convolution, non-linearity and down-sampling; an nth of the n shared layers is coupled to the unshared network; and the unshared network is coupled to the output, wherein training CNN level n comprises: presenting face images to the input, each face image producing the corresponding compact representation at the output, processing the compact representations to produce estimates of a metric, for which actual values of the metric are known, and adapting the nth shared layer and the unshared network based on the estimates of the metric compared to the actual values of the metric.
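The level-by-level training order recited in claims 1 and 18 can be sketched as a plain schedule. The sketch below only enumerates which parts of the pyramid CNN take part in each level and which are adapted; it performs no actual training. The function name `pyramid_training_schedule` and the layer labels (`shared_1`, `unshared`) are hypothetical, and holding the first n−1 shared layers fixed follows dependent claim 12 rather than claim 1 itself.

```python
def pyramid_training_schedule(num_shared_layers):
    """List, for CNN levels 1..N, the layers used and the layers adapted.

    Level n runs the first n shared layers followed by the unshared network,
    and adapts only the nth shared layer and the unshared network.
    """
    if num_shared_layers < 2:
        raise ValueError("the claims recite at least N >= 2 shared layers")
    schedule = []
    for n in range(1, num_shared_layers + 1):
        schedule.append({
            "level": n,
            # Forward path: input -> shared_1 .. shared_n -> unshared -> output
            "forward_path": [f"shared_{i}" for i in range(1, n + 1)] + ["unshared"],
            # Adapted at this level (claims 1 and 18)
            "adapted": [f"shared_{n}", "unshared"],
            # Not adapted at this level (claim 12)
            "frozen": [f"shared_{i}" for i in range(1, n)],
        })
    return schedule


schedule = pyramid_training_schedule(3)
for step in schedule:
    print(step["level"], "adapt:", step["adapted"], "frozen:", step["frozen"])
```

In a real framework each step would correspond to one training run in which only the parameters named under `adapted` receive gradient updates.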
Human faces, e.g. facial parts, sketches or expressions · CPC title
Classification, e.g. identification · CPC title
References adjustable by an adaptive method, e.g. learning · CPC title
Physics · mapped topic