What technology area does this patent fall under?

Primary CPC classification G06N3/10. Mapped technology areas include Physics.

When was this patent published?

Publication date: Mar 26, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method for adapting deep learning framework to hardware device based on unified backend engine

US11941532B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11941532-B2
Application number	US-202217726563-A
Country	US
Kind code	B2
Filing date	Apr 22, 2022
Priority date	Nov 25, 2021
Publication date	Mar 26, 2024
Grant date	Mar 26, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed is a method for adapting a deep learning framework to a hardware device based on a unified backend engine, which comprises the following steps: S1, adding the unified backend engine to the deep learning framework; S2, adding the unified backend engine to the hardware device; S3, converting a computational graph, wherein the computational graph compiled and generated by the deep learning framework is converted into an intermediate representation of the unified backend engine; S4, compiling the intermediate representation, wherein the unified backend engine compiles the intermediate representation on the hardware device to generate an executable object; S5, running the executable object, wherein the deep learning framework runs the executable object on the hardware device; S6: managing memory of the unified backend engine.
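Steps S3 to S5 of the abstract (convert the graph, compile the intermediate representation, run the executable object) can be pictured with a toy pipeline. The sketch below is only an illustration under stated assumptions, not the patented implementation: `ToyBackendEngine`, `IRInstr`, `run_pipeline`, and the tuple-based graph format are hypothetical names invented here, the "hardware device" is simply the Python interpreter, and steps S1, S2, and S6 are not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple
import operator

# Hypothetical framework graph: (output_name, op_name, input_names),
# assumed to be listed in a valid execution order.
FrameworkGraph = List[Tuple[str, str, Tuple[str, ...]]]


@dataclass(frozen=True)
class IRInstr:
    """One instruction of the toy backend's intermediate representation."""
    dst: str
    op: str
    srcs: Tuple[str, ...]


class ToyBackendEngine:
    # Operators this toy backend can lower (an assumption for the example).
    _KERNELS: Dict[str, Callable[[float, float], float]] = {
        "add": operator.add,
        "mul": operator.mul,
    }

    def convert(self, graph: FrameworkGraph) -> List[IRInstr]:
        """S3: convert the framework graph into backend IR."""
        return [IRInstr(dst, op, srcs) for dst, op, srcs in graph]

    def compile(self, ir: List[IRInstr]) -> Callable[[Dict[str, float]], Dict[str, float]]:
        """S4: compile the IR into an executable object (here, a closure)."""
        # Raises KeyError if the IR uses an operator the backend lacks.
        steps = [(i.dst, self._KERNELS[i.op], i.srcs) for i in ir]

        def executable(inputs: Dict[str, float]) -> Dict[str, float]:
            env = dict(inputs)  # per-run table of named values
            for dst, kernel, srcs in steps:
                env[dst] = kernel(*(env[s] for s in srcs))
            return env

        return executable


def run_pipeline(graph: FrameworkGraph, inputs: Dict[str, float]) -> Dict[str, float]:
    engine = ToyBackendEngine()
    ir = engine.convert(graph)      # S3
    exe = engine.compile(ir)        # S4
    return exe(inputs)              # S5: run the executable object


graph = [("t", "add", ("x", "y")), ("out", "mul", ("t", "x"))]
result = run_pipeline(graph, {"x": 2.0, "y": 3.0})
print(result["out"])  # (2.0 + 3.0) * 2.0 -> 10.0
```

Keeping conversion, compilation, and execution as separate calls mirrors the way the abstract separates S3, S4, and S5; a real backend would emit device code rather than a closure.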

First claim

Opening claim text (preview).

What is claimed is:

1. A method for adapting a deep learning framework to a hardware device based on a unified backend engine, comprising the following steps:
S1: adding the unified backend engine to the deep learning framework;
S2: adding the unified backend engine to the hardware device;
S3: converting a computational graph, wherein the computational graph compiled and generated by the deep learning framework is converted into an intermediate representation of the unified backend engine, which comprises the following substeps:
S31: creating a graph launcher of the unified backend engine, and adding the graph launcher of the unified backend engine to the deep learning framework, wherein the graph launcher inherits from operators of the computational graph of the framework and realizes a forward propagation interface, when the graph computation enters a run-time stage, the framework selects a route compiled and run by the unified backend engine when starting to run a kernel function of each operator in the computational graph by a runner;
S32: registering the graph launcher of the unified backend engine, and using the graph launcher to receive the computational graph compiled and generated by the framework, which comprises:
S321: creating a global static dictionary, wherein a key value is an element of an enumeration type, listing all optional graph launcher variables, wherein a value value is to implement the graph launcher;
S322: adding enumeration members of the graph launcher of the unified backend engine to a key value list of the enumeration type;
S323: transmitting a key value of the graph launcher of the unified backend engine to a registry by means of a front end of the framework by using the unified backend engine, and a graph executor of the computational graph of the framework choosing to use a corresponding value value of the graph launcher of the unified backend engine to start a graph computation process;
S33: converting the computational graph into the intermediate representation of the unified backend engine, which comprises the following substeps:
S331: the graph executor of the framework loading a computational graph of the framework into the backend engine through the graph launcher of the unified backend engine and executing the forward propagation interface when the graph computation enters a running period;
S332: creating a computational graph conversion interface in the forward propagation interface, wherein the computational graph conversion interface is responsible for converting the computational graph of the framework into the intermediate representation of the unified backend engine;
S333: the computational graph conversion interface first traversing all nodes of the computational graph according to a topological order of the computational graph of the framework, then creating the corresponding intermediate representation of the unified backend engine for operators of each node, and finally, performing the computational graph conversion of the kernel function of each operator to generate the intermediate representation of the unified backend engine;
S4: compiling the intermediate representation, wherein the unified backend engine compiles the intermediate representation on the hardware device to generate an executable object;
S5: running the executable object, wherein the deep learning framework runs the executable object on the hardware device;
S6: managing memory of the unified backend engine.

2. The method for adapting a deep learning framework to a hardware device based on a unified backend engine according to claim 1, wherein the step S1 comprises the following substeps:
S11: the deep learning framework registering the hardware device, adding a device field corresponding to the hardware device to a source code of the deep learning framework, creating an enumeration type of a device type for a hardware targeted by the unified backend engine, and adding the device field corresponding to the hardware in the device type;
S12: the deep learning framework registering the unified backend engine and adding a unified backend engine field to the deep learning framework;
S13: adding a compiler of the unified backend engine to the deep learning framework;
S14: the deep learning framework registering the compiler of the unified backend engine, and registering the newly added compiler in the unified backend engine;
S15: adding a computational graph executable object of the unified backend engine to the deep learning framework, adding a corresponding computational graph executable object for the unified backend engine, and implementing a running interface.

3. The method for adapting a deep learning framework to a hardware device based on a unified backend engine according to claim 2, wherein in order to enable the computational graph generated by the deep learning framework to be compiled and run on a specified hardware device registered by the unified backend engine through a device field object of the specified hardware device, the unified backend engine must acquire the hardware specified by a user at a front end of the framework through the deep learning framework by means of constructing a dictionary in which a hardware type object specified by the user at the front end of the framework and a device ID field object of the unified backend engine are mapped one by one.

4. The method for adapting a deep learning framework to a hardware device based on a unified backend engine according to claim 2, wherein in the step S13, the compiler of the unified backend engine inherits from unified compilers and implements a corresponding compilation interface; an input of the compiler of the unified backend engine is the computational graph of the framework, each node in a subgraph is traversed in a topological order, and a node of the computational graph is sequentially compiled into a specific executable object to be output as the executable object of the unified backend engine.

5. The method for adapting a deep learning framework to a hardware device based on a unified backend engine according to claim 2, wherein the compiler of the unified backend engine comprises a following step such that different types of operators are processed: constructing two data structure types, namely, an operator context information type of the unified backend engine and a kernel function type of an operator of the unified backend engine, which comprises:
compiling a single operator, wherein the kernel function type of the operator of the unified backend engine is inherited from the computational graph of the framework, the compilation process of a single operator is completed according to a type of the operator, the kernel function of the operator of the unified backend engine is compiled to generate function codes, intermediate caches, and parameters corresponding to input and output of the function codes, the kernel function type of the operator is registered in a kernel function factory of a corresponding operator to the unified backend engine, and a factory registration mode is adopted so that the backend engine can judge whether the engine supports a certain type of operator when dividing subgraphs;
storing meta-information and compilation results, wherein the operator context information type of the unified backend engine temporarily stores meta-information and compilation results required by compilation, and provides necessary interfaces for the kernel function type of the operator, the operator context information type accepts two inputs, namely, currently computational graph nodes and all created parameters, and fills the function codes, the intermediate caches, and the parameters corresponding to the input and output of the function codes generated by compiling the kernel function type of the operato
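Substeps S321 to S323 and S333 of claim 1 describe a registry keyed by an enumeration type and a conversion pass that visits graph nodes in topological order. A minimal sketch of that pattern follows. It is an assumption-laden illustration rather than the claimed implementation: `LauncherKind`, `LAUNCHER_REGISTRY`, `register_launcher`, `UnifiedBackendLauncher`, `launch`, and the dict-based graph format are all hypothetical, and the "intermediate representation" is reduced to one text line per node.

```python
from enum import Enum, auto
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, List, Tuple, Type

# Hypothetical graph format: node name -> (operator type, predecessor names).
Graph = Dict[str, Tuple[str, List[str]]]


class LauncherKind(Enum):
    """S321/S322: enumeration whose members serve as registry keys."""
    DEFAULT = auto()
    UNIFIED_BACKEND = auto()


class GraphLauncher:
    """Base launcher exposing a forward-propagation entry point."""

    def forward(self, graph: Graph) -> List[str]:
        raise NotImplementedError


# S321: global dictionary mapping an enum key to a launcher implementation.
LAUNCHER_REGISTRY: Dict[LauncherKind, Type[GraphLauncher]] = {}


def register_launcher(kind: LauncherKind):
    """Class decorator that stores the launcher class under its enum key."""
    def decorator(cls: Type[GraphLauncher]) -> Type[GraphLauncher]:
        LAUNCHER_REGISTRY[kind] = cls
        return cls
    return decorator


@register_launcher(LauncherKind.UNIFIED_BACKEND)
class UnifiedBackendLauncher(GraphLauncher):
    def forward(self, graph: Graph) -> List[str]:
        # S332: the conversion interface lives inside forward().
        return self.convert(graph)

    def convert(self, graph: Graph) -> List[str]:
        # S333: visit nodes in topological order, emitting one IR line each.
        deps = {name: preds for name, (_, preds) in graph.items()}
        order = TopologicalSorter(deps).static_order()
        return [
            f"{name} = {graph[name][0]}({', '.join(graph[name][1])})"
            for name in order
        ]


def launch(kind: LauncherKind, graph: Graph) -> List[str]:
    """S323: the executor looks up the launcher by key and starts it."""
    return LAUNCHER_REGISTRY[kind]().forward(graph)


graph = {
    "c": ("matmul", ["a", "b"]),
    "a": ("input", []),
    "b": ("input", []),
    "d": ("relu", ["c"]),
}
ir = launch(LauncherKind.UNIFIED_BACKEND, graph)
for line in ir:
    print(line)
```

The nodes are deliberately declared out of order: the topological sort guarantees that `a` and `b` are emitted before `c`, and `c` before `d`, regardless of how the dictionary was built.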

Assignees

Zhejiang Lab

Inventors

Classifications

G06N3/10 (Primary)
Interfaces, programming languages or software development kits, e.g. for simulating neural networks · CPC title
G06N3/04
Architecture, e.g. interconnection topology · CPC title
G06F8/34 (Primary)
Graphical or visual programming · CPC title
G06N3/063 (Primary)
using electronic means · CPC title
G06F8/37
Compiler construction; Parser generation · CPC title

Patent family

Related publications grouped by family.

View patent family 78971620

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11941532B2 cover?: Disclosed is a method for adapting a deep learning framework to a hardware device based on a unified backend engine, which comprises the following steps: S1, adding the unified backend engine to the deep learning framework; S2, adding the unified backend engine to the hardware device; S3, converting a computational graph, wherein the computational graph compiled and generated by the deep learni…
Who is the assignee on this patent?: Zhejiang Lab