US9588771B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9588771-B2 |
| Application number | US-201313791298-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 8, 2013 |
| Priority date | Dec 29, 2005 |
| Publication date | Mar 7, 2017 |
| Grant date | Mar 7, 2017 |
Official abstract text for this publication.
In one embodiment, the present invention includes a method for directly communicating between an accelerator and an instruction sequencer coupled thereto, where the accelerator is a heterogeneous resource with respect to the instruction sequencer. An interface may be used to provide the communication between these resources. Via such a communication mechanism a user-level application may directly communicate with the accelerator without operating system support. Further, the instruction sequencer and the accelerator may perform operations in parallel. Other embodiments are described and claimed.
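The communication pattern in the abstract can be pictured with a small software model. What follows is a minimal sketch, not the patented mechanism: the patent describes hardware and instruction-level communication, whereas here a Python thread stands in for the accelerator and a queue stands in for the completion signal. The names `Accelerator`, `InterfaceLogic`, `transfer_state`, `transfer_command`, `register_handler`, and `dispatch_event` are hypothetical and chosen only for illustration. The two transfer steps and the completion handler follow the wording of claim 1.

```python
import queue
import threading


class Accelerator:
    """Hypothetical stand-in for the heterogeneous resource."""

    def __init__(self):
        self.state = {}

    def execute(self, function, data):
        # Apply the requested accelerator function to the supplied data.
        return function(data)


class InterfaceLogic:
    """Hypothetical software model of the interface between the two sides."""

    def __init__(self, accelerator):
        self._accelerator = accelerator
        self._events = queue.Queue()   # models signals sent back to the sequencer
        self._handlers = {}

    def transfer_state(self, registers, config):
        # First transfer: register values and configuration information.
        self._accelerator.state = {
            "registers": dict(registers),
            "config": dict(config),
        }

    def register_handler(self, signal, handler):
        # Associate a user-level handler with a signal from the accelerator.
        self._handlers[signal] = handler

    def transfer_command(self, function, data):
        # Second transfer: the data plus the function to apply to it.
        def work():
            result = self._accelerator.execute(function, data)
            self._events.put(("done", result))   # completion notification

        threading.Thread(target=work).start()

    def dispatch_event(self):
        # Runs on the caller's thread: wait for a signal, then run its handler.
        signal, payload = self._events.get()
        self._handlers[signal](payload)


def demo():
    iface = InterfaceLogic(Accelerator())
    iface.transfer_state({"r0": 0}, {"mode": "sum"})
    results = []
    iface.register_handler("done", results.append)
    iface.transfer_command(sum, [1, 2, 3, 4])
    independent = max([7, 3, 9])   # the sequencer's own, unrelated work
    iface.dispatch_event()         # handler receives the accelerator's result
    return independent, results
```

In this model no operating-system service sits between the caller and the accelerator object, which loosely mirrors the "without operating system support" language; the caller computes `independent` while the worker thread runs, then processes the result in its handler.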
Opening claim text (preview).
What is claimed is:

1. A processor comprising: a first instruction sequencer to perform instructions; and a second instruction sequencer coupled the first instruction sequencer, including: an accelerator to perform at least one operation on data received from the first instruction sequencer, wherein the accelerator comprises a heterogeneous resource with respect to the first instruction sequencer comprising next instruction pointer logic to determine a next instruction to be executed by the accelerator and further comprising a disabled system resource that is adapted to perform a graphics function but is disabled due to presence of an add-in graphics device that is adapted to perform the graphics function; and an interface logic coupled to the accelerator to enable inter-sequencer communication of a user-level shred of a user-level application between the first instruction sequencer and the accelerator, responsive to a monitor signal from the accelerator, by translation of the inter-sequencer communication, the inter-sequencer communication including a first shred transfer instruction to transfer architectural state information including register values and configuration information and a second shred transfer instruction to transfer command information including the data and an accelerator function to be applied to the data, wherein the first instruction sequencer is to associate an event handler with a signal to be received from the accelerator to notify the first instruction sequencer upon completion of the at least one operation, and the first instruction sequencer is to execute operations independent of and in parallel with the accelerator, and to thereafter execute the event handler responsive to the notification, wherein the event handler is to receive and process result data from the at least one operation.

2. The processor of claim 1, wherein the accelerator comprises a fixed function unit and the interface logic comprises a finite state machine (FSM) coupled to the fixed function unit.

3. The processor of claim 1, wherein the first instruction sequencer comprises a processing engine of a native instruction set architecture (ISA), and the accelerator comprises a processing engine of a non-native ISA.

4. The processor of claim 1, wherein the interface logic comprises: first logic to enable ingress communication of the architectural state information to the accelerator and egress communication of status information to the first instruction sequencer; and second logic to virtualize the accelerator for use under the user-level application and under an operating system.

5. The processor of claim 1, wherein the processor comprises a plurality of first instruction sequencers and a plurality of accelerators.

6. The processor of claim 5, wherein the plurality of first instruction sequencers each comprise a processing engine of a native instruction set architecture (ISA), and the plurality of accelerators comprise a processing engine of a non-native ISA.

7. The processor of claim 1, wherein the accelerator includes: a first portion to be adapted for use by the first instruction sequencer under the user-level application and without operating system (OS) support, and a second portion to be adapted for use via an operating system-enabled application.

8. A non-transitory machine-readable medium having stored thereon instructions, which if performed by a machine cause the machine to perform a method comprising: communicating architectural state information including register values and configuration information from a user-level shred of a user-level application executing on a first instruction sequencer of a multi-core processor to an accelerator of the multi-core processor via a first user-level instruction to configure the accelerator, wherein the accelerator comprises a heterogeneous resource with respect to the first instruction sequencer, the heterogeneous resource comprising a graphics processor having a plurality of processing engines, at least some of the plurality of processing engines to be disabled for execution of a graphics function by configuration of a system including a graphics component to perform the graphics function according to a disable indicator; communicating a request from the user-level shred of the user-level application to the accelerator via a second user-level instruction, including data and an accelerator function to be applied to the data; providing the request to the accelerator via an interface logic associated with the accelerator, wherein the interface logic translates the request; and performing a first function in the accelerator responsive to the request in parallel with a second function in the first instruction sequencer, the second function independent of the first function and comprising a task unrelated to the graphics function.

9. The non-transitory machine-readable medium of claim 8, wherein communicating the request comprises sending the request to the interface logic and passing the request from the interface logic to the accelerator according to a private protocol between the interface logic and the accelerator.

10. The non-transitory machine-readable medium of claim 9, wherein the method further comprises sending the request to the interface logic via a first instruction set architecture and wherein the accelerator comprises a resource of a second instruction set architecture.

11. The non-transitory machine-readable medium of claim 8, wherein the method further comprises communicating the request without operating system (OS) support, wherein the accelerator is transparent to the OS.

12. The non-transitory machine-readable medium of claim 8, wherein the method further comprises performing a third user-level instruction in the first instruction sequencer to monitor for a signal for an event from the accelerator and associating the signal with a user-level handler.

13. The non-transitory machine-readable medium of claim 8, wherein the method further comprises asynchronously receiving an event in the first instruction sequencer from the accelerator to indicate task status and responsive to the event initiating a user-level handler to receive and process result data from the accelerator.

14. The non-transitory machine-readable medium of claim 8, wherein the method further comprises configuring the at least some of the plurality of processing engines using the user-level application and without operating system (OS) support.

15. The non-transitory machine-readable medium of claim 8, wherein the method further comprises virtualizing the accelerator via the interface logic so that a first subset of functionality within the accelerator is visible to the user-level application and a second subset of functionality within the accelerator is visible to an operating system (OS), wherein the interface logic includes storages each to store the architectural state of the accelerator for a corresponding context.

16. A system comprising: a processor including a first instruction sequencer to perform instructions, a second instruction sequencer coupled the first instruction sequencer via an interface logic, the second instruction sequencer including an accelerator to perform at least one operation on data and according to an accelerator function received from the first instruction sequencer via a second user-level transfer instruction, wherein the accelerator comprises a heterogeneous resource with respect to the first instruction sequencer, the interface logic to translate an inter-sequencer communication of a user-level application, the inter-sequencer commu
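
Claims 4 and 15 describe interface logic that virtualizes the accelerator: one subset of its functionality is visible to the user-level application, another to the operating system, with a storage holding the accelerator's architectural state for each context. The sketch below is one possible software reading of that idea, under stated assumptions. The class `VirtualizingInterface`, its methods, the context labels `"user"` and `"os"`, and the function names `"fft"` and `"blit"` are all hypothetical; the claims do not specify how the hardware stores or switches state.

```python
class VirtualizingInterface:
    """Hypothetical model: one accelerator, one saved state per context."""

    def __init__(self, user_functions, os_functions):
        # Which accelerator functions each context is allowed to see.
        self._visible = {"user": set(user_functions), "os": set(os_functions)}
        # One storage per context for the accelerator's architectural state.
        self._saved_state = {"user": {}, "os": {}}
        self._active = None
        self._live_state = {}   # state currently loaded in the accelerator

    def switch_to(self, context):
        # Save the outgoing context's state, then restore the incoming one.
        if self._active is not None:
            self._saved_state[self._active] = dict(self._live_state)
        self._live_state = dict(self._saved_state[context])
        self._active = context

    def write_state(self, key, value):
        self._live_state[key] = value

    def read_state(self, key):
        return self._live_state.get(key)

    def invoke(self, function_name):
        # Reject calls made before any context is active or outside its subset.
        if self._active is None:
            raise RuntimeError("no active context")
        if function_name not in self._visible[self._active]:
            raise PermissionError(
                f"{function_name!r} is not visible to context {self._active!r}"
            )
        return f"{self._active}:{function_name}"
```

A short usage example: after `iface = VirtualizingInterface({"fft"}, {"blit"})` and `iface.switch_to("user")`, a call to `iface.invoke("fft")` succeeds while `iface.invoke("blit")` raises `PermissionError`; state written in the user context is hidden after `iface.switch_to("os")` and reappears on switching back.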
Processor architectures; Processor configuration, e.g. pipelining · CPC title
for non-native instruction execution, e.g. executing a command; for Java instruction set · CPC title
Instruction operation extension or modification · CPC title
Instruction analysis, e.g. decoding, instruction word fields · CPC title
Arrangements for executing machine instructions, e.g. instruction decode (for executing microinstructions G06F9/22) · CPC title