US2018267784A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2018267784-A1 |
| Application number | US-201815988225-A |
| Country | US |
| Kind code | A1 |
| Filing date | May 24, 2018 |
| Priority date | Nov 25, 2015 |
| Publication date | Sep 20, 2018 |
| Grant date | — |
Official abstract text for this publication.
A method for generating an accelerator program is disclosed, to help increase utilization of an accelerator and increase program development efficiency. In some feasible implementations of the present invention, the method includes: obtaining an accelerator program description that is based on a state machine, where the accelerator program description includes multiple state machines separately configured to implement an application program, and the multiple state machines form a pipeline according to a data dependency in a directed acyclic graph (DAG) corresponding to the application program; and performing state machine splicing on the state machines in the accelerator program description by using an accelerator compilation tool, to generate an accelerator program.
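To make the pipeline idea concrete, the following is a minimal Python sketch of how start times for per-node state machines might be derived from a DAG whose edges carry delays. The node names, the delay values, and the `pipeline_schedule` helper are assumptions made for illustration; the publication does not specify a data format or an algorithm at this level of detail.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical DAG: each node maps to {predecessor: delay in cycles on that edge}.
dag = {
    "load": {},
    "scale": {"load": 1},
    "add": {"load": 2, "scale": 1},
    "store": {"add": 1},
}


def pipeline_schedule(dag):
    """Return {node: start_cycle}, honoring every edge delay in the DAG."""
    # TopologicalSorter expects node -> iterable of predecessors.
    order = TopologicalSorter({n: set(p) for n, p in dag.items()}).static_order()
    start = {}
    for node in order:
        # A node's state machine starts once all predecessors plus delays are done.
        start[node] = max(
            (start[pred] + delay for pred, delay in dag[node].items()),
            default=0,
        )
    return start


schedule = pipeline_schedule(dag)
print(schedule)
```

In this sketch each node stands in for one state machine, and the computed start cycle is one possible way to express the pipeline relationship between them.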
Opening claim text (preview).
What is claimed is:

1. A method for generating an accelerator program, comprising: obtaining, by a computer device, an accelerator program description, wherein the accelerator program description comprises multiple state machines configured to implement an application program, and the multiple state machines form a pipeline according to a data dependency in a directed acyclic graph (DAG) corresponding to the application program; and performing state machine splicing on the multiple state machines in the accelerator program description by using an accelerator compilation tool of the computer device, to generate an accelerator program.

2. The method according to claim 1, wherein the performing state machine splicing on the state machines in the accelerator program description by using the accelerator compilation tool of the computer device, to generate the accelerator program comprises: establishing an intermediate expression of each state machine in the accelerator program description, and splicing intermediate expressions that have a same structure and that are of different state machines, to generate a combined state machine and obtain the accelerator program.

3. The method according to claim 2, wherein the establishing the intermediate expression of each state machine in the accelerator program description comprises: analyzing a basic structure of each state machine in the accelerator program description by using the accelerator program description as basic input and based on an accelerator microarchitecture characteristic, and converting each state machine into the intermediate expression, wherein the intermediate expression comprises one or more of a sequential block, a cyclical block, or a repetitive block.

4. The method according to claim 3, wherein the splicing intermediate expressions that have the same structure and that are of different state machines comprises: splicing sequential blocks in the different state machines, splicing cyclical blocks in the different state machines, or splicing repetitive blocks in the different state machines.

5. The method according to claim 3, wherein the splicing intermediate expressions that have the same structure and that are of different state machines comprises: determining, according to a timing constraint relationship between the different state machines, intermediate expressions that are of the different state machines and that have the same structure and that can be concurrently executed; splicing the intermediate expressions so as to concurrently execute instructions in the different state machines.

6. The method according to claim 1, further comprising: performing program correctness detection on the generated accelerator program.

7. The method according to claim 6, wherein the performing program correctness detection on the generated accelerator program comprises: performing program correctness detection on the accelerator program by detecting whether the generated accelerator program satisfies a constraint of an accelerator instruction set.

8. The method according to claim 7, wherein the performing correctness detection on the accelerator program by detecting whether the generated accelerator program satisfies the constraint of the accelerator instruction set comprises: performing syntax structure detection, resource conflict detection, and data correlation detection on the generated accelerator program.

9. The method according to claim 1, wherein the obtaining an accelerator program description comprises: describing a data flow diagram of the application program by using the DAG, creating a state machine of each node in the DAG by using the accelerator instruction set, and establishing the pipeline between the state machines according to a time delay indicated by a side in the DAG, to obtain the accelerator program description.

10. The method according to claim 9, wherein each node in the DAG represents a functional component of a processor or an accelerator that executes the application program.

11. A system for generating an accelerator program, comprising: at least one processor; and a non-transitory computer-readable storage medium coupled to the at least one processor and storing programming instructions for execution by the at least one processor, wherein the programming instructions instruct the at least one processor to: obtain an accelerator program description, wherein the accelerator program description comprises multiple state machines configured to implement an application program, and the multiple state machines form a pipeline according to a data dependency in a directed acyclic graph DAG corresponding to the application program; and perform state machine splicing on the multiple state machines in the accelerator program description by using an accelerator compilation tool, to generate an accelerator program.

12. The system according to claim 11, wherein the programming instructions instruct the at least one processor to: establish an intermediate expression of each state machine in the accelerator program description, and splice intermediate expressions that have a same structure and that are of different state machines, to generate a combined state machine and obtain the accelerator program.

13. The system according to claim 12, wherein the programming instructions instruct the at least one processor to: analyze a basic structure of each state machine in the accelerator program description by using the accelerator program description as basic input and based on an accelerator microarchitecture characteristic, and convert each state machine into the intermediate expression, wherein the intermediate expression comprises one or more of a sequential block, a cyclical block, or a repetitive block.

14. The system according to claim 13, wherein the programming instructions instruct the at least one processor to: splice sequential blocks in the different state machines, splice cyclical blocks in the different state machines, or splice repetitive blocks in the different state machines.

15. The system according to claim 14, wherein the programming instructions instruct the at least one processor to: determine, according to a timing constraint relationship between the different state machines, intermediate expressions that are of the different state machines and that have the same structure and that can be concurrently executed; splice the intermediate expressions so as to concurrently execute instructions in the different state machines.

16. The system according to claim 11, wherein the programming instructions instruct the at least one processor to: perform program correctness detection on the generated accelerator program.

17. The system according to claim 16, wherein the programming instructions instruct the at least one processor to: perform correctness detection on the accelerator program by detecting whether the generated accelerator program satisfies a constraint of an accelerator instruction set.

18. The system according to claim 17, wherein the programming instructions instruct the at least one processor to: perform syntax structure detection, resource conflict detection, and data correlation detection on the generated accelerator program.

19. The system according to claim 11, further comprising: the programming instructions instruct the at least one processor to: describe a data flow diagram of the application program by using the DAG
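The splicing and correctness-detection steps recited in the claims can be illustrated with a rough Python sketch. Everything in it is an assumption made for illustration: the `Instr` and `Block` types, the position-by-position matching rule in `splice`, and the two checks in `check` (which stand in for resource conflict detection and data correlation detection only). The claims do not prescribe this representation, and the timing-constraint analysis they mention is not modeled here.

```python
from dataclasses import dataclass
from itertools import zip_longest


@dataclass(frozen=True)
class Instr:
    unit: str    # functional unit the instruction occupies (hypothetical)
    dest: str    # register written
    srcs: tuple  # registers read


@dataclass(frozen=True)
class Block:
    kind: str      # "sequential", "cyclical", or "repetitive"
    instrs: tuple  # instructions issued together in this block


def splice(sm_a, sm_b):
    """Merge two state machines block by block.

    Blocks at the same position with the same kind are combined so their
    instructions issue together; mismatched blocks are kept one after another.
    """
    combined = []
    for a, b in zip_longest(sm_a, sm_b):
        if a is not None and b is not None and a.kind == b.kind:
            combined.append(Block(a.kind, a.instrs + b.instrs))
        else:
            combined.extend(x for x in (a, b) if x is not None)
    return combined


def check(program):
    """Return (block_index, problem) tuples; an empty list means nothing was flagged."""
    problems = []
    for i, block in enumerate(program):
        units = [ins.unit for ins in block.instrs]
        if len(units) != len(set(units)):
            problems.append((i, "resource conflict"))
        # Flag an instruction that reads a register written by a different
        # instruction issued in the same block.
        clash = any(
            src == other.dest
            for j, ins in enumerate(block.instrs)
            for k, other in enumerate(block.instrs)
            if j != k
            for src in ins.srcs
        )
        if clash:
            problems.append((i, "data dependency"))
    return problems


sm_load = [Block("sequential", (Instr("dma", "r0", ()),)),
           Block("cyclical", (Instr("dma", "r1", ()),))]
sm_mac = [Block("sequential", (Instr("alu", "r2", ("r3",)),)),
          Block("cyclical", (Instr("alu", "r4", ("r0",)),))]

spliced = splice(sm_load, sm_mac)
bad = [Block("sequential", (Instr("alu", "r0", ()), Instr("alu", "r1", ("r0",))))]
print(check(spliced), check(bad))
```

Here `spliced` combines the two sequential blocks and the two cyclical blocks, while `bad` shows a block that the checks would reject because both instructions need the same unit and one reads what the other writes.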
Syntactic analysis · CPC title
Software pipelining · CPC title
Dependency analysis; Data or control flow analysis · CPC title
Finite state machines · CPC title