What technology area does this patent fall under?

The primary CPC classification is G06N3/08 (Learning methods). Mapped technology areas include Physics.

When was this patent published?

Publication date Sep 6, 2018 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Apparatus and method

Patent metadata
Field	Value
Publication number	US-2018253401-A1
Application number	US-201815903290-A
Country	US
Kind code	A1
Filing date	Feb 23, 2018
Priority date	Mar 2, 2017
Publication date	Sep 6, 2018
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus comprising circuitry that implements an artificial neural network training algorithm that uses weight tying.
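
The abstract does not define weight tying. Claim 3 in the claim text on this page describes a weight-tied weight matrix computed from an index matrix and a value vector. One plausible reading, sketched here in Python with NumPy, is a lookup in which every weight takes one of a few shared values. The function name `tied_weights`, the 0-based indices, and the example numbers are illustrative assumptions and are not taken from the patent.

```python
import numpy as np

def tied_weights(index_matrix, value_vector):
    """Look up each weight in the shared value vector: W[i, j] = v[I[i, j]]."""
    return value_vector[index_matrix]

# Hypothetical 2x3 layer with K = 2 shared values (indices are 0-based here).
v = np.array([-0.5, 0.25])
I = np.array([[0, 1, 1],
              [1, 0, 0]])
W_tied = tied_weights(I, v)
print(W_tied)
```

Under this reading, the layer stores K values plus one small index per weight rather than one full-precision number per weight.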

First claim

Opening claim text (preview).

1. An apparatus comprising circuitry that implements an artificial neural network training algorithm that uses weight tying.

2. The apparatus of claim 1, wherein the circuitry is configured to update the weight tying using a predefined number of iterations of a clustering algorithm.

3. The apparatus of claim 1, wherein the circuitry is configured to compute a weight-tied weight matrix based on an index matrix and based on a value vector.

4. The apparatus of claim 2, wherein the predefined number of iterations of the clustering algorithm used to update the weight tying is one.

5. The apparatus of claim 2, wherein the circuitry is configured to, in each iteration of the clustering algorithm, update a value vector according to

$$[v^{(l)}]_k = \frac{1}{\#\{I^{(l)} = k\}} \sum_{ij,\; I^{(l)} = k} [W^{(l)}]_{ij}$$

where $W^{(l)}$ is a full-precision weight matrix for layer $l$ of the neural network, and $I^{(l)}$ is the index matrix.

6. The apparatus of claim 2, wherein the circuitry is configured to update, in each iteration of the clustering algorithm, an index matrix according to

$$[I^{(l)}]_{ij} = \arg\min_{k = 1, \ldots, K^{(l)}} \left| [W^{(l)}]_{ij} - [v^{(l)}]_k \right|$$

7. The apparatus of claim 3, wherein the circuitry is configured to quantize the values of the value vector.

8. The apparatus of claim 3, wherein the circuitry is configured to quantize the values of the value vector after updating the weight tying.

9. The apparatus of claim 7, wherein the circuitry is configured to quantize the values of the value vector to the nearest power-of-two.

10. The apparatus of claim 9, wherein the circuitry is configured to quantize the values of the value vector according to the quantization scheme:

$$x_q = \begin{cases} s \cdot 2^{\lfloor b \rfloor} & b - \lfloor b \rfloor \le \log_2 1.5 \\ s \cdot 2^{\lceil b \rceil} & b - \lfloor b \rfloor > \log_2 1.5 \end{cases}$$

where $s = \operatorname{sign}(x)$ and $b = \log_2 |x|$, and where $x$ is the value which is to be quantized and $x_q$ is the quantized value.

11. The apparatus of claim 3, wherein a value vector comprises more than three values.

12. The apparatus of claim 1, wherein the circuitry is configured to update full precision weights based on gradients.

13. The apparatus of claim 12, wherein the circuitry is configured to compute the gradients based on a cost function and based on the weight-tied weight matrix.

14. The apparatus of claim 12, wherein the circuitry is configured to compute the cost function based on a loss function and based on a forward pass function.

15. The apparatus of claim 12, wherein the circuitry is configured to compute the gradients based on a backward pass function.

16. The apparatus of claim 1, wherein the training algorithm is a stochastic gradient descent training algorithm.

17. The apparatus of claim 1, wherein the artificial neural network is a deep convolutional neural network.

18. An apparatus comprising circuitry that implements an artificial neural network, the artificial neural network having been trained by a neural network training algorithm that uses weight tying.

19. The apparatus of claim 18, wherein in the circuitry implements the artificial neural network multiplierless.

20. A method of training an artificial neural network, the method comprising performing an artificial neural network training algorithm that uses weight tying.
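
The update rules in claims 5, 6, and 10 can be illustrated with a short Python/NumPy sketch. It is a reading of the claim formulas under stated assumptions, not the applicant's implementation. The function names, the order of the two updates inside `cluster_step`, the 0-based cluster indices, the handling of empty clusters, and the mapping of zero to zero are all assumptions; the claims do not specify them.

```python
import numpy as np

def update_indices(W, v):
    """Claim 6: assign each weight to the nearest entry of the value vector."""
    return np.argmin(np.abs(W[..., None] - v), axis=-1)

def update_values(W, I, v):
    """Claim 5: set each value to the mean of the weights assigned to it.

    Assumption: a cluster with no assigned weights keeps its previous value.
    """
    new_v = np.asarray(v, dtype=float).copy()
    for k in range(new_v.size):
        mask = (I == k)
        if mask.any():
            new_v[k] = W[mask].mean()
    return new_v

def cluster_step(W, v):
    """One clustering iteration (assumed order: indices first, then values)."""
    I = update_indices(W, v)
    return I, update_values(W, I, v)

def quantize_pow2(x):
    """Claim 10: round an array of values to a signed power of two.

    Assumption: zero is mapped to zero, since log2(0) is undefined.
    """
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    nz = x != 0
    b = np.log2(np.abs(x[nz]))
    floor_b = np.floor(b)
    # Round the exponent down when the fractional part is at most log2(1.5).
    exponent = np.where(b - floor_b <= np.log2(1.5), floor_b, np.ceil(b))
    out[nz] = np.sign(x[nz]) * 2.0 ** exponent
    return out

# Hypothetical full-precision weights and initial value vector.
W = np.array([[0.1, 0.9],
              [0.2, 1.1]])
v0 = np.array([0.0, 1.0])
I1, v1 = cluster_step(W, v0)
v1_q = quantize_pow2(v1)
```

The threshold $\log_2 1.5$ corresponds to the midpoint between two adjacent powers of two, which is why claim 9 can describe the scheme as rounding to the nearest power-of-two. Inputs that sit exactly on that midpoint (for example 3.0) may round either way in floating-point arithmetic.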

Assignees

Sony Corp

Inventors

Classifications

G06N3/045
Combinations of networks · CPC title
G06N3/063
using electronic means · CPC title
G06N3/08 (Primary)
Learning methods · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06N3/082
modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title

Patent family

Related publications grouped by family.

View patent family 58213027

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2018253401A1 cover?

An apparatus comprising circuitry that implements an artificial neural network training algorithm that uses weight tying.

Who is the assignee on this patent?

Sony Corp