What technology area does this patent fall under?

Primary CPC classification G06F9/30007. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jun 28, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Scalable computing array

US9378181B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9378181-B2
Application number	US-201213672828-A
Country	US
Kind code	B2
Filing date	Nov 9, 2012
Priority date	Nov 9, 2012
Publication date	Jun 28, 2016
Grant date	Jun 28, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and apparatus for providing a scalable computing array are provided herein. The method includes determining a width of a processor based on a software program, and a specified policy. The processor may be configured to comprise a number of lanes based on the width, and a thread of the software program may be executed using the configured processor.
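The three steps named in the abstract (determine a width, configure lanes, execute a thread) can be pictured with a minimal sketch. It is an illustration only, not the patented implementation: the names `Policy`, `Processor`, `determine_width`, and `execute` are hypothetical. It also assumes that the program is summarized as a list of instruction widths, and that the policy is reduced to a single lane cap.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Policy:
    # Hypothetical policy: an upper bound on lanes, e.g. derived from a power target.
    max_lanes: int


@dataclass
class Processor:
    total_lanes: int       # lanes physically available
    active_lanes: int = 0  # lanes currently configured for use

    def configure(self, width: int) -> None:
        """Configure the processor to use `width` lanes."""
        if not 1 <= width <= self.total_lanes:
            raise ValueError("width must be between 1 and total_lanes")
        self.active_lanes = width


def determine_width(instruction_widths: List[int], policy: Policy) -> int:
    """Assumed rule: take the widest instruction in the program, capped by the policy."""
    return min(max(instruction_widths), policy.max_lanes)


def execute(processor: Processor, data: List[int], op: Callable[[int], int]) -> List[int]:
    """Apply `op` to the data in chunks of `active_lanes` elements, SIMD style."""
    lanes = processor.active_lanes
    result: List[int] = []
    for start in range(0, len(data), lanes):
        # Each chunk stands in for one step across all active lanes.
        result.extend(op(x) for x in data[start:start + lanes])
    return result


if __name__ == "__main__":
    cpu = Processor(total_lanes=16)
    width = determine_width([4, 8, 16], Policy(max_lanes=8))
    cpu.configure(width)
    print(width, execute(cpu, [1, 2, 3, 4, 5], lambda x: x * 2))
```

In this sketch the width is chosen before execution begins. Claim 1 additionally requires that the thread itself specifies the width through a function call, which would correspond to the thread calling something like `configure` while it runs.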

First claim

Opening claim text (preview).

What is claimed is:
1. A method for providing a scalable computing array, comprising: determining a width of a processor based on a software program, and a specified policy; configuring the processor to comprise a number of lanes based on the width; and executing a thread of the software program using the configured processor, wherein determining the width of the processor comprises invoking a function call from within the thread, and wherein the thread specifies the width.
2. The method of claim 1, wherein determining the width of the processor comprises: a compiler compiling the software program; and the compiler determining the width of the processor based on a width of an instruction of the program.
3. The method of claim 1, wherein determining the width of the processor comprises: score boarding the thread, wherein score boarding comprises: determining a power performance of the thread; and identifying a memory cache hierarchy behavior; and determining the width based on the power performance of the thread, or the memory cache hierarchy behavior.
4. The method of claim 1, comprising adjusting a clock frequency based on the specified policy.
5. The method of claim 4, comprising adjusting a clock frequency based on: a compile time directive; or a function call invoked from within the thread.
6. The method of claim 3, wherein a voltage is adjusted based on score boarding.
7. The method of claim 1, wherein a voltage is adjusted based on: a compile time directive; or a function call invoked from within the thread.
8. The method of claim 1, wherein a clock is adjusted to a preset rate based on: a compile time directive; or a function call invoked from within the thread.
9. The method of claim 3, wherein a clock is adjusted at run time based on score boarding.
10. The method of claim 3, wherein a voltage is adjusted based on score boarding or compile time hints.
11. The method of claim 1, wherein one or more lanes of the processor are powered off in response to a determination by the controller that the one or more lanes are inactive.
12. The method of claim 1, wherein a lane comprises: one byte wide arithmetic and logic unit (ALU); and a register of the one byte wide ALU.
13. The method of claim 1, wherein the policy specifies determining the width based on one or more of: thread priority; balancing stalls; power targets; performance targets; thread resource use priority; and thread memory hierarchy preferences for pinning pages.
14. The method of claim 1, wherein the software program comprises very long instruction words (VLIW), or single instruction multiple data (SIMD) instructions.
15. An apparatus, comprising: a plurality of arithmetic and logic units (ALUs); a plurality of registers of the ALUs; and a plurality of single instruction multiple data (SIMD) controllers; and a controller, wherein the controller: configures one or more processors, each processor comprising: one of the SIMD controllers; a specified number of the ALUs; and a specified number of the registers; and modifies the specified number of the ALUs during runtime of a thread executing on one of the one or more process, based on an instruction of the thread and a specified policy, wherein determining the specified number of ALUs comprises invoking a function call from within the thread, and wherein the thread specifies the number of ALUs.
16. The apparatus of claim 15, wherein the specified number of the ALUs is modified by: powering on one or more of the ALUs; and powering on one or more of the registers.
17. The apparatus of claim 15, wherein the specified number of the ALUs is modified by: powering off one or more of the ALUs; and powering off one or more of the registers.
18. The apparatus of claim 15, wherein the number of ALUs in each processor of the one or more processors is configured using a machine instruction.
19. The apparatus of claim 15, wherein at least one of the one or more processors is a VLIW processor, and the number of ALUs in the VLIW processor is configured using a machine instruction.
20. The apparatus of claim 15, wherein a power policy is configured for each processor of the one or more processors using a machine instruction.
21. The apparatus of claim 15, wherein the number of ALUs in each processor of the one or more processors is configured using a context control register.
22. The apparatus of claim 15, wherein at least one of the one or more processors is a VLIW processor, and the number of ALUs in the VLIW processor is configured using a context control register.
23. The apparatus of claim 15, wherein a power policy is configured for each processor of the one or more processors using a context control register.
24. The apparatus of claim 15, wherein a policy state may comprise at least one of a power off state, a low power state, a normal power state, a high power state, a power burst state, or any combination thereof.
25. The apparatus of claim 24, wherein each policy state comprises corresponding voltage and frequency levels that are predetermined or set manually.
26. At least one machine readable medium comprising a plurality of instructions that, in response to being executed on a computing device, cause the computing device to: configure one or more processors to execute a thread of a software program, each of the processors, comprising: a SIMD controller; a specified number of arithmetic logic units (ALUs); and a specified number of registers; and modify the specified number of ALUs during runtime of the thread executing on one or more of the processors based on an instruction of the thread and a specified policy, wherein determining the specified number of ALUs comprises invoking a function call from within the thread, and wherein the thread specifies the number of ALUs.
27. The machine readable medium of claim 26, comprise an instruction that, in response to being executed on the computing device, cause the computing device to modify the specified number of ALUs at runtime.
28. The machine readable medium of claim 26, comprise an instruction that, in response to being executed on the computing device, cause the computing device to determine an initial number of ALUs for the thread based on a compilation of the software program.
29. A printing device to print a workload processed using a scalable computing array, comprising a print object module configured to: determine a width of a workload to be printed; adjust a width of an SIMD processing unit based on the width of the printing workload; and process the printing workload using the SIMD processing unit.
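Claims 11, 24, and 25 describe powering off inactive lanes and policy states that each carry voltage and frequency levels. The sketch below shows one way those ideas might fit together. It is a hedged illustration, not the claimed apparatus: `PowerState`, `LEVELS`, `Lane`, and `Controller` are hypothetical names, and the voltage and frequency numbers are placeholders that do not come from the patent.

```python
from enum import Enum
from typing import Tuple


class PowerState(Enum):
    # The five policy states listed in claim 24.
    OFF = "off"
    LOW = "low"
    NORMAL = "normal"
    HIGH = "high"
    BURST = "burst"


# Placeholder (volts, MHz) pairs per state; the values are illustrative assumptions.
LEVELS = {
    PowerState.OFF: (0.0, 0),
    PowerState.LOW: (0.7, 400),
    PowerState.NORMAL: (0.9, 800),
    PowerState.HIGH: (1.1, 1200),
    PowerState.BURST: (1.2, 1600),
}


class Lane:
    """One lane: per claim 12, an ALU plus its register (modelled here as flags only)."""

    def __init__(self) -> None:
        self.powered = True
        self.busy = False


class Controller:
    def __init__(self, lane_count: int) -> None:
        self.lanes = [Lane() for _ in range(lane_count)]
        self.state = PowerState.NORMAL

    def set_state(self, state: PowerState) -> Tuple[float, int]:
        """Switch policy state and return its (voltage, frequency) levels."""
        self.state = state
        return LEVELS[state]

    def power_off_inactive(self) -> int:
        """Power off every powered lane that is not busy; return how many were switched off."""
        switched_off = 0
        for lane in self.lanes:
            if lane.powered and not lane.busy:
                lane.powered = False
                switched_off += 1
        return switched_off


if __name__ == "__main__":
    ctrl = Controller(lane_count=4)
    ctrl.lanes[0].busy = True
    print(ctrl.power_off_inactive(), ctrl.set_state(PowerState.LOW))
```

A lookup table keeps the "predetermined" levels of claim 25 in one place. The "set manually" alternative would amount to overwriting an entry in `LEVELS`.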

Assignees

Intel Corp

Inventors

Krig Scott

Classifications

G06F9/30007 · Primary
to perform operations on data operands · CPC title
G06F15/8007 · Primary
single instruction multiple data [SIMD] multiprocessors · CPC title
G06F9/30014
with variable precision · CPC title
G06F9/30036
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
G06F9/3887
controlled by a single instruction for multiple data lanes [SIMD] · CPC title

Patent family

Related publications grouped by family.

View patent family 50682883

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9378181B2 cover?: A method and apparatus for providing a scalable computing array are provided herein. The method includes determining a width of a processor based on a software program, and a specified policy. The processor may be configured to comprise a number of lanes based on the width, and a thread of the software program may be executed using the configured processor.
Who is the assignee on this patent?: Intel Corp