Tool for facilitating efficiency in machine learning

US2018314936A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2018314936-A1
Application number	US-201715581152-A
Country	US
Kind code	A1
Filing date	Apr 28, 2017
Priority date	Apr 28, 2017
Publication date	Nov 1, 2018
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A mechanism is described for facilitating smart distribution of resources for deep learning autonomous machines. A method of embodiments, as described herein, includes detecting one or more sets of data from one or more sources over one or more networks, and introducing a library to a neural network application to determine optimal point at which to apply frequency scaling without degrading performance of the neural network application at a computing device.
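The abstract does not say how the "optimal point" for frequency scaling is found. As a rough illustration only, the sketch below assumes a simple model in which one training step takes as long as the slower of its compute phase and its communication phase, and picks the lowest core frequency that does not lengthen the step. The function name, parameters, and timing model are hypothetical and are not taken from the publication.

```python
def pick_core_frequency(freqs_mhz, compute_cycles, comm_time_s, tolerance=0.0):
    """Hypothetical helper: lowest frequency that does not slow a training step.

    Assumed model (not from the patent): a step lasts as long as the slower of
    the compute phase (cycles / frequency) and the communication phase.
    """
    if not freqs_mhz:
        raise ValueError("need at least one candidate frequency")

    def step_time(f_mhz):
        compute_s = compute_cycles / (f_mhz * 1e6)
        return max(compute_s, comm_time_s)

    # Baseline is the step time at the highest available frequency.
    limit = step_time(max(freqs_mhz)) * (1.0 + tolerance)
    # Any frequency whose step time stays within the limit is acceptable;
    # the lowest such frequency is the cheapest one in this model.
    acceptable = [f for f in freqs_mhz if step_time(f) <= limit]
    return min(acceptable)
```

In this model, when communication dominates the step, the core can run slower without the application noticing; when compute dominates, only the highest frequency qualifies.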

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: detection/observation logic, as facilitated by or at least partially incorporated into a processor, to detect one or more sets of data from one or more sources over one or more networks; and energy/communication efficiency logic, as facilitated by or at least partially incorporated into the processor, to introduce a library to a neural network application to determine optimal point at which to apply frequency scaling without degrading performance of the neural network application.

2. The apparatus of claim 1, wherein the optimal point is determined through gradient synchronization using a tree-like structure such that local weight vectors start the one or more nodes represented as leaves of the tree-like structure and communicate up to a root of the tree-like structure, wherein the library accounts for skew characteristics associated with the gradient synchronization to decide a core frequency.

3. The apparatus of claim 1, wherein the energy/communication logic is further to introduce sparse matrix representation for weights to overlap communication and computation across multiple nodes associated with neural network application to reduce communication costs.

4. The apparatus of claim 1, further comprising debugging logic, as facilitated by or at least partially incorporated into the processor, to automatically analyze failed execution of programs including or relevant to the neural network application to obtain insights on one or more faults of hardware performance counters.

5. The apparatus of claim 4, wherein the debugging logic is further to provide one or more of successful execution information obtained from successful execution of programs and failed execution information obtained from failed execution of programs to a trained network model to seek out one or more of the hardware performance counters that are regarded as faulty or outside a range of approval.

6. The apparatus of claim 1, further comprising error propagation logic, as facilitated by or at least partially incorporated into the processor, to perform local error propagation by computing high precision and low precision for local weights and compute local errors at each of the multiple nodes, wherein performing local error propagation includes facilitating weight synchronization across the multiple nodes to track the local errors for accuracy and reduced communication.

7. The apparatus of claim 1, wherein the apparatus comprises an autonomous machine including one or more of a vehicle, a device, and an equipment, wherein the autonomous machine comprises one or more processors including the processor having a graphics processor, wherein the graphics processor is co-located with an application processor on a common semiconductor package.

8. A method comprising: detecting one or more sets of data from one or more sources over one or more networks; and introducing a library to a neural network application to determine optimal point at which to apply frequency scaling without degrading performance of the neural network application at a computing device.

9. The method of claim 8, wherein the optimal point is determined through gradient synchronization using a tree-like structure such that local weight vectors start the one or more nodes represented as leaves of the tree-like structure and communicate up to a root of the tree-like structure, wherein the library accounts for skew characteristics associated with the gradient synchronization to decide a core frequency.

10. The method of claim 8, further comprising introducing sparse matrix representation for weights to overlap communication and computation across multiple nodes associated with neural network application to reduce communication costs.

11. The method of claim 8, further comprising automatically analyzing failed execution of programs including or relevant to the neural network application to obtain insights on one or more faults of hardware performance counters.

12. The method of claim 11, further comprising providing one or more of successful execution information obtained from successful execution of programs and failed execution information obtained from failed execution of programs to a trained network model to seek out one or more of the hardware performance counters that are regarded as faulty or outside a range of approval.

13. The method of claim 8, further comprising performing local error propagation by computing high precision and low precision for local weights and compute local errors at each of the multiple nodes, wherein performing local error propagation includes facilitating weight synchronization across the multiple nodes to track the local errors for accuracy and reduced communication.

14. The method of claim 8, wherein the computing device comprises an autonomous machine including one or more of a vehicle, a device, and an equipment, wherein the autonomous machine comprises one or more processors including a graphics processor, wherein the graphics processor is co-located with an application processor on a common semiconductor package.

15. At least one machine-readable medium comprising instructions that when executed by a local computing device, cause the local computing device to perform operations comprising: detecting one or more sets of data from one or more sources over one or more networks; and introducing a library to a neural network application to determine optimal point at which to apply frequency scaling without degrading performance of the neural network application.

16. The machine-readable medium of claim 15, wherein the optimal point is determined through gradient synchronization using a tree-like structure such that local weight vectors start the one or more nodes represented as leaves of the tree-like structure and communicate up to a root of the tree-like structure, wherein the library accounts for skew characteristics associated with the gradient synchronization to decide a core frequency.

17. The machine-readable medium of claim 15, wherein the operations further comprise introducing sparse matrix representation for weights to overlap communication and computation across multiple nodes associated with neural network application to reduce communication costs.

18. The machine-readable medium of claim 15, wherein the operations further comprise automatically analyzing failed execution of programs including or relevant to the neural network application to obtain insights on one or more faults of hardware performance counters.

19. The machine-readable medium of claim 18, wherein the operations further comprise providing one or more of successful execution information obtained from successful execution of programs and failed execution information obtained from failed execution of programs to a trained network model to seek out one or more of the hardware performance counters that are regarded as faulty or outside a range of approval.

20. The machine-readable medium of claim 15, wherein the operations further comprise performing local error propagation by computing high precision and low precision for local weights and compute local errors at each of the multiple nodes, wherein performing local error propagation includes facilitating weight synchronization across the multiple nodes to track the local errors for accuracy and reduced communication, wherein the computing device comprises an autonomous machine including one or more of a vehicle, a device,
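Claims 2, 9, and 16 describe gradient synchronization over a tree-like structure, with local vectors starting at the leaves and communicating up to a root, and a library that accounts for skew. The claim preview gives no algorithm, so the following is only a minimal sketch of a generic pairwise tree reduction plus a simple skew measure. The names `tree_reduce` and `arrival_skew` are hypothetical, and defining skew as the spread of leaf ready times is an assumption rather than the patent's definition.

```python
def tree_reduce(leaf_vectors):
    """Sum equal-length vectors pairwise, level by level, up to a single root.

    Returns (root_vector, levels), where levels is the number of reduction
    rounds. Generic illustration; not the method described in the patent.
    """
    if not leaf_vectors:
        raise ValueError("need at least one leaf vector")
    level = [list(v) for v in leaf_vectors]
    levels = 0
    while len(level) > 1:
        parents = []
        # Combine neighbouring nodes into their parent by element-wise sum.
        for i in range(0, len(level) - 1, 2):
            parents.append([a + b for a, b in zip(level[i], level[i + 1])])
        # An unpaired node is passed up to the next level unchanged.
        if len(level) % 2 == 1:
            parents.append(level[-1])
        level = parents
        levels += 1
    return level[0], levels


def arrival_skew(ready_times_s):
    """Assumed skew measure: gap between the first and last leaf to be ready."""
    return max(ready_times_s) - min(ready_times_s)
```

With n leaves the reduction finishes in about log2(n) rounds. A large skew means early leaves sit idle waiting for late ones, which is the kind of slack a frequency-selection library could in principle exploit.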

Assignees

Intel Corp

Inventors

Classifications

G06N3/045
Combinations of networks · CPC title
G06N5/01
Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title
G06N3/044
Recurrent networks, e.g. Hopfield networks · CPC title
G06N3/063 (Primary)
using electronic means · CPC title
G06F9/505
considering the load · CPC title

Patent family

Related publications grouped by family.

View patent family 63797237

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2018314936A1 cover?: A mechanism is described for facilitating smart distribution of resources for deep learning autonomous machines. A method of embodiments, as described herein, includes detecting one or more sets of data from one or more sources over one or more networks, and introducing a library to a neural network application to determine optimal point at which to apply frequency scaling without degrading per…
Who is the assignee on this patent?: Intel Corp
What technology area does this patent fall under?: Primary CPC classification G06N3/063. Mapped technology areas include Physics.
When was this patent published?: Publication date Nov 1, 2018 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).