Apparatus and method

US2018253401A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2018253401-A1
Application number  US-201815903290-A
Country             US
Kind code           A1
Filing date         Feb 23, 2018
Priority date       Mar 2, 2017
Publication date    Sep 6, 2018
Grant date          (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus comprising circuitry that implements an artificial neural network training algorithm that uses weight tying.

First claim

Opening claim text (preview).

1. An apparatus comprising circuitry that implements an artificial neural network training algorithm that uses weight tying.

2. The apparatus of claim 1, wherein the circuitry is configured to update the weight tying using a predefined number of iterations of a clustering algorithm.

3. The apparatus of claim 1, wherein the circuitry is configured to compute a weight-tied weight matrix based on an index matrix and based on a value vector.

4. The apparatus of claim 2, wherein the predefined number of iterations of the clustering algorithm used to update the weight tying is one.

5. The apparatus of claim 2, wherein the circuitry is configured to, in each iteration of the clustering algorithm, update a value vector according to

$$[v^{(l)}]_k = \frac{1}{\#\{I^{(l)} = k\}} \sum_{ij,\; I^{(l)} = k} [W^{(l)}]_{ij}$$

where $W^{(l)}$ is a full-precision weight matrix for layer $l$ of the neural network, and $I^{(l)}$ is the index matrix.

6. The apparatus of claim 2, wherein the circuitry is configured to update, in each iteration of the clustering algorithm, an index matrix according to

$$[I^{(l)}]_{ij} = \arg\min_{k = 1, \ldots, K^{(l)}} \left| [W^{(l)}]_{ij} - [v^{(l)}]_k \right|$$

7. The apparatus of claim 3, wherein the circuitry is configured to quantize the values of the value vector.

8. The apparatus of claim 3, wherein the circuitry is configured to quantize the values of the value vector after updating the weight tying.

9. The apparatus of claim 7, wherein the circuitry is configured to quantize the values of the value vector to the nearest power-of-two.

10. The apparatus of claim 9, wherein the circuitry is configured to quantize the values of the value vector according to the quantization scheme:

$$x_q = \begin{cases} s \cdot 2^{\lfloor b \rfloor} & b - \lfloor b \rfloor \le \log_2 1.5 \\ s \cdot 2^{\lceil b \rceil} & b - \lfloor b \rfloor > \log_2 1.5 \end{cases}$$

where $s = \operatorname{sign}(x)$ and $b = \log_2 |x|$, and where $x$ is the value which is to be quantized and $x_q$ is the quantized value.

11. The apparatus of claim 3, wherein a value vector comprises more than three values.

12. The apparatus of claim 1, wherein the circuitry is configured to update full precision weights based on gradients.

13. The apparatus of claim 12, wherein the circuitry is configured to compute the gradients based on a cost function and based on the weight-tied weight matrix.

14. The apparatus of claim 12, wherein the circuitry is configured to compute the cost function based on a loss function and based on a forward pass function.

15. The apparatus of claim 12, wherein the circuitry is configured to compute the gradients based on a backward pass function.

16. The apparatus of claim 1, wherein the training algorithm is a stochastic gradient descent training algorithm.

17. The apparatus of claim 1, wherein the artificial neural network is a deep convolutional neural network.

18. An apparatus comprising circuitry that implements an artificial neural network, the artificial neural network having been trained by a neural network training algorithm that uses weight tying.

19. The apparatus of claim 18, wherein in the circuitry implements the artificial neural network multiplierless.

20. A method of training an artificial neural network, the method comprising performing an artificial neural network training algorithm that uses weight tying.
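The formulas in claims 3, 5, 6 and 10 can be illustrated with a minimal NumPy sketch. It is not the patent's implementation: the function names, the 0-based cluster indices (the claims count from 1 to K), the handling of empty clusters and of zero values, and the order of the two updates inside one iteration are all assumptions made for illustration. The `sgd_step` helper reflects one possible reading of claims 12 and 13, in which a gradient evaluated with the weight-tied matrix is applied to the full-precision weights.

```python
import numpy as np

def tie_weights(index, values):
    """Claim 3: build the weight-tied matrix by looking up each index in the value vector."""
    return values[index]

def update_index(w_full, values):
    """Claim 6: assign every full-precision weight to its nearest value (argmin over k)."""
    dist = np.abs(w_full[..., None] - values)  # shape (rows, cols, K)
    return dist.argmin(axis=-1)

def update_values(w_full, index, values):
    """Claim 5: each value becomes the mean of the weights currently assigned to it."""
    new_values = values.copy()
    for k in range(values.size):
        mask = index == k
        if mask.any():  # assumption: an empty cluster keeps its previous value
            new_values[k] = w_full[mask].mean()
    return new_values

def quantize_pow2(x):
    """Claim 10: round to a signed power of two, switching exponent at log2(1.5)."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    nz = x != 0  # assumption: zero stays zero, since log2(0) is undefined
    b = np.log2(np.abs(x[nz]))
    exponent = np.where(b - np.floor(b) <= np.log2(1.5), np.floor(b), np.ceil(b))
    out[nz] = np.sign(x[nz]) * 2.0 ** exponent
    return out

def sgd_step(w_full, grad_wrt_tied, lr):
    """Hypothetical reading of claims 12-13: apply the gradient to the full-precision weights."""
    return w_full - lr * grad_wrt_tied

# Toy example: one clustering iteration (claim 4), then quantization (claim 8).
w = np.array([[0.9, 0.1], [-0.4, 1.1]])
values = np.array([-0.5, 0.0, 1.0])
index = update_index(w, values)              # assumed order: indices first, then values
values = update_values(w, index, values)
w_tied = tie_weights(index, quantize_pow2(values))
print(index)
print(w_tied)
```

Because every entry of `w_tied` is a signed power of two in this sketch, a multiplication by a weight could in principle be replaced by a bit shift, which is one way to read the "multiplierless" wording of claim 19.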

Assignees

  • Sony Corp

Inventors

Classifications

  • Combinations of networks · CPC title

  • using electronic means · CPC title

  • G06N3/08 · Primary

    Learning methods · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2018253401A1 cover?
An apparatus comprising circuitry that implements an artificial neural network training algorithm that uses weight tying.
Who is the assignee on this patent?
Sony Corp
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Published Sep 6, 2018 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).