Hardware acceleration wait time awareness in central processing units with multi-thread architectures

US9870255B2 · US · B2

Patent metadata
Publication number: US-9870255-B2
Application number: US-201213557211-A
Country: US
Kind code: B2
Filing date: Jul 25, 2012
Priority date: Jul 29, 2011
Publication date: Jan 16, 2018
Grant date: Jan 16, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Provided is a hardware accelerator, central processing unit, and computing device. A hardware accelerator includes a task accelerating unit configured to, in response to a request for a new task issued by a hardware thread, accelerate the processing of the new task and produce a processing result for the task; a task time prediction unit configured to predict the total waiting time of the new task for returning to a specified address associated with the hardware thread. One aspect of this disclosure makes the hardware thread aware of the time to be waited for before getting a processing result, facilitating its task planning accordingly.
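The abstract says the predicted waiting time lets the hardware thread plan its tasks, but it does not say how. The following is a minimal sketch of one possible planning policy, not the patent's method: given a predicted wait, the thread greedily picks local work that fits in that window. The function name `plan_while_waiting`, the `switch_overhead_ns` parameter, and the task tuples are all hypothetical.

```python
def plan_while_waiting(predicted_wait_ns, ready_tasks, switch_overhead_ns=200):
    """Pick local tasks that fit inside the predicted accelerator wait.

    predicted_wait_ns: waiting time reported by the accelerator (assumed unit: ns).
    ready_tasks: list of (name, estimated_cost_ns) pairs, in priority order.
    switch_overhead_ns: assumed fixed cost of switching away and back.
    """
    budget = predicted_wait_ns - switch_overhead_ns
    chosen = []
    for name, cost_ns in ready_tasks:
        # Greedy choice: take a task only if it still fits in the remaining budget.
        if cost_ns <= budget:
            chosen.append(name)
            budget -= cost_ns
    return chosen


if __name__ == "__main__":
    tasks = [("parse", 500), ("checksum", 400), ("log", 100)]
    print(plan_while_waiting(1000, tasks))
    print(plan_while_waiting(100, tasks))
```

When the predicted wait is shorter than the assumed switch overhead, the function returns an empty list, which corresponds to the thread simply waiting for the result.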

First claim

Opening claim text (preview).

What is claimed is:

1. A hardware accelerator, comprising: a task accelerating unit configured to, in response to a request for a new task issued by a hardware thread, accelerate processing of the new task and produce a processing result for the task; and a task time prediction unit configured to predict total waiting time of the new task for returning to a specified address associated with the hardware thread based upon a sum of time required by the hardware accelerator to complete execution of the new task itself and time required by the hardware accelerator to complete those tasks that have not been completed therein when the new task is received; and wherein the request issued to the hardware accelerator from the hardware thread comprises an address of an inner register of the hardware thread for specifying the specified address associated with the hardware thread to which the total waiting time of the new task will be returned by the hardware accelerator.

2. The hardware accelerator of claim 1, wherein the task time prediction unit comprises: a task model engine configured to in response to reception of the new task, evaluate task execution time for the new task based on a task model; and an accumulator, in which time required to complete all tasks that have not yet been completed in the task accelerating unit is stored as an accumulation result, and which accumulates the task execution time evaluated by the task model engine to the accumulation result for notifying the hardware thread of the newly accumulated result as total waiting time of the new task, and subtracts the corresponding task execution time from the accumulation result stored in the accumulator after the completion of a task.

3. The hardware accelerator of claim 1, wherein the task time prediction unit comprises: a task model engine configured to evaluate a task execution time for the new task based on a task model; and a task framer, in which time required to complete all tasks that have not yet been completed in the task accelerating unit is stored as an accumulation result, and which, in response to receiving the new task, appends a field of task execution time for the new task evaluated by the task module engine to the new task before putting it into a queue within the task accelerating unit, accumulates the task execution time within the task execution time field to the accumulation result stored in the task framer, so as to notify the hardware thread of the newly accumulated result as the total waiting time of the new task, and then subtracting the task execution time within a corresponding task execution time field from the accumulation result stored in the task framer after the completion of a task.

4. The hardware accelerator of claim 1, wherein the task time prediction unit comprises: an import detector configured to detect the new task imported into the queue within the task accelerating unit; a task model engine configured to, in response to the new task being detected by the import detector, evaluate a task execution time for the new task based on a task model; and a calculation writer, in which time required to complete all tasks that have not yet been completed in the task accelerating unit is stored as an accumulation result, and which, based on the task execution time evaluated by the task model engine, appends a field of task execution time for the new task to the new task, accumulates the task execution time within the task execution time field to the accumulation result stored in the calculation writer, so as to notify the hardware thread of the newly accumulated result as the total waiting time of the new task, and subtracts the task execution time within a corresponding task execution time field from the accumulation result stored in the calculation writer after the completion of a task.

5. The hardware accelerator of claim 1, wherein the specified address associated with the hardware thread is an inner and predetermined address of the hardware thread.

6. The hardware accelerator of claim 1, wherein the specified address associated with the hardware thread is a memory address outside of the hardware thread.

7. The hardware accelerator of claim 1, wherein the task model engine evaluates the task execution time for the new task based on one or more of the following items: processing frequency, input data size, and average cache hit ratio.

8. The hardware accelerator of claim 1, wherein the total waiting time of the new task is sent to the task model engine for adjusting the task model.

9. The hardware accelerator of claim 1, further comprising a storage unit configured to store the total waiting time of the new task at a register of the hardware thread.

10. A central processing unit, comprising: a hardware thread; a task accelerating unit configured to, in response to a request for a new task issued by the hardware thread, accelerate processing of the new task and produce a processing result for the task, the task accelerating unit configured to offload and execute the new task to a hardware accelerator, the hardware accelerator shared by multiple hardware threads, the hardware accelerator including a processing speed higher than that of the hardware threads; and a task time prediction unit configured to predict total waiting time of the new task for returning to a specified address associated with the hardware thread based upon a sum of time required by the hardware accelerator to complete execution of the new task itself and time required by the hardware accelerator to complete those tasks that have not been completed therein when the new task is received; and wherein the request issued to the hardware accelerator from the hardware thread comprises an address of an inner register of the hardware thread for specifying the specified address associated with the hardware thread to which the total waiting time of the new task will be returned by the hardware accelerator.

11. The central processing unit of claim 10, wherein the task time prediction unit comprises: a task model engine configured to in response to reception of the new task, evaluate task execution time for the new task based on a task model; and an accumulator, in which time required to complete all tasks that have not yet been completed in the task accelerating unit is stored as an accumulation result, and which accumulates the task execution time evaluated by the task model engine to the accumulation result for notifying the hardware thread of the newly accumulated result as total waiting time of the new task, and subtracts the corresponding task execution time from the accumulation result stored in the accumulator after the completion of a task.

12. The central processing unit of claim 10, wherein the task time prediction unit comprises: a task model engine configured to evaluate a task execution time for the new task based on a task model; and a task framer, in which time required to complete all tasks that have not yet been completed in the task accelerating unit is stored as an accumulation result, and which, in response to receiving the new task, appends a field of task execution time for the new task evaluated by the task module engine to the new task before putting it into a queue within the task accelerating unit, accumulates the task execution time within the task execution time field to the accumulation result stored in the task framer, so as to notify the hardware thread of the newly accumulated result as the total waiting time of the new task, and then subtracting the task execution time within a corresponding task execution time field from the accumulation result stored in the task framer afte
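Claims 1, 2, and 7 together describe a running total: the estimated execution time of each new task is added to an accumulation result on arrival, the new total is reported as that task's waiting time, and the task's estimate is subtracted again on completion. The sketch below models that bookkeeping in software under stated assumptions. The class and function names, the cost formula in `estimate_cycles`, and its default parameters are all hypothetical; the claims only name processing frequency, input data size, and average cache hit ratio as possible inputs to the task model.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    task_id: int
    input_bytes: int
    est_cycles: int = 0  # filled in on submit, like an appended execution-time field


def estimate_cycles(input_bytes, cycles_per_byte=4.0,
                    cache_hit_ratio=0.9, miss_penalty_cycles=100.0):
    """Hypothetical task model: cost grows with input size and cache misses."""
    per_byte = cycles_per_byte + (1.0 - cache_hit_ratio) * miss_penalty_cycles
    return round(input_bytes * per_byte)


def cycles_to_seconds(cycles, freq_hz=1e9):
    """Convert a cycle count to time at an assumed processing frequency."""
    return cycles / freq_hz


class WaitTimeAccumulator:
    """Tracks the estimated time needed to finish all uncompleted tasks."""

    def __init__(self):
        self.pending = deque()        # FIFO queue of uncompleted tasks
        self.accumulated_cycles = 0   # the "accumulation result"

    def submit(self, task):
        # Evaluate the new task, add it to the total, and report the new
        # total as this task's predicted waiting time.
        task.est_cycles = estimate_cycles(task.input_bytes)
        self.accumulated_cycles += task.est_cycles
        self.pending.append(task)
        return self.accumulated_cycles

    def complete(self):
        # On completion, subtract the finished task's own estimate.
        task = self.pending.popleft()
        self.accumulated_cycles -= task.est_cycles
        return task


if __name__ == "__main__":
    acc = WaitTimeAccumulator()
    for t in (Task(1, 1000), Task(2, 4000)):
        wait = acc.submit(t)
        print(t.task_id, wait, cycles_to_seconds(wait))
    acc.complete()
    print(acc.accumulated_cycles)
```

Cycle counts are kept as integers so that adding and later subtracting the same estimate returns the total exactly to its earlier value. The sketch assumes tasks complete in FIFO order, which the claims do not require.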

Assignees

Inventors

Classifications

  • Interprogram communication · CPC title

  • G06F9/4843 · Primary

    by program, e.g. task dispatcher, supervisor, operating system · CPC title

  • Offload · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9870255B2 cover?
Provided is a hardware accelerator, central processing unit, and computing device. A hardware accelerator includes a task accelerating unit configured to, in response to a request for a new task issued by a hardware thread, accelerate the processing of the new task and produce a processing result for the task; a task time prediction unit configured to predict the total waiting time of the new t…
Who is the assignee on this patent?
Hou Rui, Ge Yi, Wang Kun, and 2 more
What technology area does this patent fall under?
Primary CPC classification G06F9/4843. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 16, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).