Convolution neural network training apparatus and method thereof

US2016162782A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2016162782-A1
Application number	US-201514960942-A
Country	US
Kind code	A1
Filing date	Dec 7, 2015
Priority date	Dec 9, 2014
Publication date	Jun 9, 2016
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus and method of training a convolutional neural network (CNN) are provided. A method of training a CNN including a plurality of convolution layers stored in a memory involves approximating, using a processor, a convolution layer among the plurality of convolution layers using a low-rank approximation; reducing the number of output reconstruction filters of the approximated convolution layer; and modifying a structure of the CNN based on an approximation result and the reduced number of output reconstruction filters.
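The abstract names a low-rank approximation but this page does not say which factorization is used. As a rough illustration only, the sketch below assumes a truncated SVD over the flattened filter bank. The function name `low_rank_approx`, the layer shape, and the chosen rank are hypothetical and are not taken from the patent.

```python
import numpy as np

def low_rank_approx(weight, rank):
    """Truncated-SVD factorization of a conv weight of shape (out_ch, in_ch, kh, kw).

    Assumption: SVD is one common way to obtain a low-rank approximation;
    the patent text on this page does not specify the factorization.
    """
    out_ch = weight.shape[0]
    mat = weight.reshape(out_ch, -1)          # one flattened filter per row
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    basis = s[:rank, None] * vt[:rank]        # (rank, in_ch*kh*kw): a smaller set of filters
    recon = u[:, :rank]                       # (out_ch, rank): recombines the basis responses
    return recon, basis

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 32, 3, 3))      # hypothetical layer: 64 filters, 32 channels, 3x3
recon, basis = low_rank_approx(w, rank=16)

approx = (recon @ basis).reshape(w.shape)
rel_err = np.linalg.norm(approx - w) / np.linalg.norm(w)
print(recon.shape, basis.shape, round(float(rel_err), 3))
```

Lowering `rank` shrinks both factors and raises the approximation error, which is one reason a method like the one described would retrain the network after changing its structure.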

First claim

Opening claim text (preview).

What is claimed is:

1. A method of training a convolutional neural network (CNN) comprising a plurality of convolution layers stored in a non-transitory memory, the method comprising: approximating, using a processor, a convolution layer among the plurality of convolution layers using a low-rank approximation; reducing the number of output reconstruction filters of the approximated convolution layer; modifying a structure of the CNN based on an approximation result and the reduced number of output reconstruction filters; and training the modified CNN.

2. The method of claim 1, after the training of the modified CNN, further comprising: performing the operations of: sequentially approximating convolution layers that follow the approximated convolution layer; reducing the number of output reconstruction filters of the currently approximated convolution layer and modifying the structure of the CNN; and training the modified CNN.

3. The method of claim 1, after the training of the modified CNN, further comprising: classifying image data using the trained CNN; and in response to an accuracy of the classification not satisfying a designated criteria, performing the operations of: reducing the number of output reconstruction filters; modifying the structure of the CNN; and training the modified CNN.

4. The method of claim 1, further comprising: modifying a structure of at least one convolution layer that follows the approximated convolution layer, wherein the structure of the CNN is modified based on a result of the modifying of the structure of said at least one convolution layer.

5. The method of claim 4, wherein the structure of the CNN is modified by changing the number of convolution filters of the at least one convolution layer that follows the approximated convolution layer.

6. The method of claim 4, after the training of the modified CNN, further comprising: classifying image data using the trained CNN; and in response to an accuracy of the classification not satisfying a designated criteria, performing the operations of: modifying the structure of at least one convolution layer that follows the approximated convolution layer; modifying the structure of the CNN; and training the modified CNN.

7. The method of claim 1, wherein one convolution layer among the plurality of convolution layers is approximated into one or more input conversion filters, one or more convolution filters, and one or more output reconstruction filters.

8. The method of claim 7, wherein the input conversion filter is configured to reduce the number of channels of input data, the convolution filter is configured to perform a convolution operation on input data having a reduced number of channels, and the output reconstruction filter is configured to restore a convolution operation result to have the same number of channels as the number of channels of the convolution layer.

9. A non-transitory computer-readable medium storing instructions that, when executed by a computer processor, cause the computer processor to train a convolution neural network stored in a non-transitory memory according to the method of claim 1.

10. An apparatus for training a convolution neural network (CNN) comprising a plurality of convolution layers stored in a non-transitory memory, the apparatus comprising: an approximation processor configured to approximate a convolution layer among the plurality of convolution layers using a low-rank approximation; a filter count changer configured to reduce the number of output reconstruction filters of the approximated convolution layer; and a training processor configured to modify a structure of the CNN based on an approximation result and the reduced number of output reconstruction filters and to train the modified CNN.

11. The apparatus of claim 10, wherein the approximation processor is configured to sequentially approximate another convolution layer that follows the approximated convolution layer in response to the modified CNN being trained.

12. The apparatus of claim 10, further comprising: a classifier configured to classify image data using the trained CNN, wherein the filter count changer re-changes the number of output reconstruction filters in response to an accuracy of the classification not satisfying a designated criteria.

13. The apparatus of claim 10, further comprising: a layer structure modifier configured to modify a structure of at least one convolution layer that follows the approximated convolution layer, wherein the training processor modifies the structure of the CNN based on a result of modifying the structure of said convolution layer that follows the approximated convolution layer.

14. The apparatus of claim 13, wherein the layer structure modifier is configured to modify the structure of at least one convolution layer by changing the number of convolution filters of said convolution layer.

15. The apparatus of claim 13, further comprising: a classifier configured to classify image data using the trained CNN, wherein the layer structure modifier is configured to re-modify the structure of the at least one convolution layer that follows the approximated convolution layer in response to an accuracy of the classification not satisfying a designated criteria.

16. The apparatus of claim 10, wherein the approximation processor is configured to approximate the convolution layer among the plurality of convolution layers into one or more input conversion filters, one or more convolution filters, and one or more output reconstruction filters.

17. The apparatus of claim 16, wherein the input conversion filter is configured to reduce the number of channels of input data, the convolution filter is configured to perform a convolution operation on input data having a reduced number of channels, and the output reconstruction filter is configured to restore a convolution operation result to have the same number of channels as the number of channels of the convolution layer.

18. An apparatus for training a neural network, the apparatus comprising: a non-transitory memory storing a convolution neural network (CNN) comprising a plurality of convolution layers; and a processor configured to approximate a convolution layer among the plurality of convolution layers using a low-rank approximation, reduce a number of output reconstruction filters of the approximated convolution layer, modify a structure of the CNN based on a result of the approximating of the convolution layer and the reduced number of output reconstruction filters, and train the modified CNN stored in the non-transitory memory.

19. The apparatus of claim 18, wherein the apparatus trains the modified CNN by retrieving input image data from a training data memory storage.

20. The apparatus of claim 18, wherein the processor is further configured to classify input image data retrieved from a training data memory storage by using the modified CNN, and further modify the modified CNN in response to an accuracy of the classification not satisfying a designated criteria.
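Claims 7, 8, 16, and 17 describe one convolution layer being replaced by three stages: input conversion filters that reduce the channel count, convolution filters that operate on the reduced channels, and output reconstruction filters that restore the channel count. The sketch below shows how such a three-stage structure might look and how it changes the parameter count. It assumes 1x1 kernels for the conversion and reconstruction stages, random weights, and made-up channel sizes (`R_IN`, `R_OUT`); none of these details come from the patent text on this page, and `conv2d` is a naive helper written for illustration.

```python
import numpy as np

def conv2d(x, w):
    """Naive 'valid' convolution (cross-correlation). x: (C, H, W), w: (N, C, kh, kw)."""
    n, c, kh, kw = w.shape
    h_out, w_out = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    out = np.zeros((n, h_out, w_out))
    for i in range(kh):
        for j in range(kw):
            # Add the contribution of kernel tap (i, j), summed over input channels.
            out += np.einsum("nc,chw->nhw", w[:, :, i, j],
                             x[:, i:i + h_out, j:j + w_out])
    return out

def param_count(*weights):
    """Total number of weights across the given filter banks."""
    return sum(w.size for w in weights)

rng = np.random.default_rng(0)
C, N, K = 32, 64, 3      # hypothetical original layer: 32 channels in, 64 filters of 3x3
R_IN, R_OUT = 8, 16      # hypothetical reduced channel counts (assumptions)

original = rng.standard_normal((N, C, K, K))
input_conversion = rng.standard_normal((R_IN, C, 1, 1))           # reduces channels C -> R_IN
convolution = rng.standard_normal((R_OUT, R_IN, K, K))            # convolves on reduced channels
output_reconstruction = rng.standard_normal((N, R_OUT, 1, 1))     # restores N output channels

x = rng.standard_normal((C, 10, 10))
y_original = conv2d(x, original)
y_factored = conv2d(conv2d(conv2d(x, input_conversion), convolution),
                    output_reconstruction)

print(y_original.shape, y_factored.shape)
print(param_count(original),
      param_count(input_conversion, convolution, output_reconstruction))
```

With these example sizes the single layer holds 18432 weights and the three-stage replacement holds 2432, while both produce an output of the same shape. The weights here are random, so the outputs do not match numerically; in practice the factors would come from an approximation of the original layer followed by retraining.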

Assignees

Samsung Electronics Co Ltd

Inventors

Park Hyoung Min

Classifications

G06N3/045
Combinations of networks · CPC title
G06N3/082 · Primary
modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title
G06N3/0495
Quantised networks; Sparse networks; Compressed networks · CPC title
G06N3/09
Supervised learning · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title

Patent family

Related publications grouped by family.

View patent family 56094617

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016162782A1 cover?: An apparatus and method of training a convolutional neural network (CNN) are provided. A method of training a CNN including a plurality of convolution layers stored in a memory involves approximating, using a processor, a convolution layer among the plurality of convolution layers using a low-rank approximation; reducing the number of output reconstruction filters of the approximated convolutio…
Who is the assignee on this patent?: Samsung Electronics Co Ltd
What technology area does this patent fall under?: Primary CPC classification G06N3/082. Mapped technology areas include Physics.
When was this patent published?: Publication date Jun 9, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Rank-constrained neural networks

Convolutional Neural Network Using a Binarized Convolution Layer

Object-centric Fine-grained Image Classification
