What technology area does this patent fall under?

Primary CPC classification G06F30/33. Mapped technology areas include Physics.

When was this patent published?

Publication date Aug 3, 2017 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Multi-core compact executable trace processor

US2017220719A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2017220719-A1
Application number	US-201615011724-A
Country	US
Kind code	A1
Filing date	Feb 1, 2016
Priority date	Feb 1, 2016
Publication date	Aug 3, 2017
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Described herein are a processor and a method of operating the processor to simulate a many-core target machine. The processor includes a plurality of processing cores arranged in a predetermined manner and a global target clock counter (GTCC) configured to count a number of simulated clock cycles in the target machine. A global stall controller (GSC) configured to halt execution of all the processing cores based on a determination of at least one processing core being in a fault condition; and wherein the processor acquires a base clock per instruction (CPI) of a target machine, the CPI corresponding to an average number of clock cycles required by the target machine to execute a single instruction, translates an application of the target machine to a compact executable trace to be executed by the processor, and adjusts a speed of simulation by adjusting an update rate of the global target clock counter.
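The interplay of the global target clock counter (GTCC), the base CPI, and the global stall controller (GSC) described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the patent's implementation: the class names `Core` and `Simulator`, the idea that the GTCC advances by `base_cpi * update_rate` per host step, and the lock-step execution of cores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Core:
    """One simulated processing core (hypothetical model, not from the patent)."""
    core_id: int
    fault: bool = False
    executed: int = 0  # instructions executed so far

    def step(self) -> None:
        self.executed += 1


@dataclass
class Simulator:
    cores: list
    base_cpi: float           # average target clock cycles per instruction
    update_rate: float = 1.0  # assumed knob: scales how fast the GTCC advances
    gtcc: float = 0.0         # global target clock counter
    stalled: bool = False

    def tick(self) -> None:
        # GSC behaviour: halt all cores if at least one core is in a fault condition.
        self.stalled = any(core.fault for core in self.cores)
        if self.stalled:
            return
        for core in self.cores:
            core.step()
        # Assumption: cores run in lock-step, so one host step covers one
        # instruction per core and advances target time by base_cpi cycles,
        # scaled by the current update rate.
        self.gtcc += self.base_cpi * self.update_rate


sim = Simulator(cores=[Core(0), Core(1)], base_cpi=2.0)
for _ in range(3):
    sim.tick()
sim.cores[1].fault = True   # one faulty core stalls every core
sim.tick()
```

In this sketch, lowering `update_rate` makes the simulated clock advance more slowly per host step, which is one plausible reading of "adjusting an update rate of the global target clock counter".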

First claim

Opening claim text (preview).

What is claimed is:
1. A device for simulating a many-core target machine, the device comprising: a processor including: a plurality of processing cores arranged in a predetermined manner; a global target clock counter (GTCC) configured to count a number of simulated clock cycles in the target machine; a global stall controller (GSC) configured to halt execution of all the processing cores based on a determination of at least one processing core being in a fault condition; and wherein the processor is configured to: acquire a base clock per instruction (CPI) of a target machine, the CPI corresponding to an average number of clock cycles required by the target machine to execute a single instruction, translate an application of the target machine to a compact executable trace to be executed by the processor, determine whether to query an off-chip memory based on detecting a cache miss event, determine whether to adjust a simulation speed based on receiving a control signal from a router, and adjust dynamically, a speed of simulation of the processor by adjusting an update rate of the global target clock counter.
2. The device of claim 1, wherein the CPI of the target machine is acquired by simulating on a timing simulator, a benchmark of the target machine, the simulation being performed by ignoring the cache miss event.
3. The device of claim 1, wherein the processor is further configured to profile each instruction of the target application to generate a profiled image of the application, the profiled image including an object for each unique instruction of the target application, and wherein each instruction of the target application is mapped to a unique address in the profiled image via a hash function.
4. The device of claim 3, wherein the processor is further configured to refine the profiled image to generate instructions for the processor to execute.
5. The device of claim 1, wherein a first core of the plurality of cores is a master core configured to execute a master thread of the application of the target machine.
6. The device of claim 5, wherein the other cores of the plurality of cores are worker cores configured to execute parallel portions of the application of the target machine.
7. The device of claim 6, wherein the plurality of cores are arranged in a ring-network.
8. The device of claim 1, wherein each processing core of the plurality of cores is configured to evaluate an amount of time required by the target machine to execute an instruction.
9. The device of claim 1, wherein the processor is further configured to set the update rate of the GTCC to an initial value that is based on a number of target clock cycles required to execute a predetermined number of instructions, and a number of host cycles required to execute a single instruction.
10. The device of claim 9, wherein the processor is further configured to: reduce the simulation speed by half the initial value, based on the GSC receiving the request, and increase the simulation speed two-folds the initial value based on the GSC not receiving the request in a predetermined amount of time.
11. A method for simulating a many-core target machine, the method being performed by a processor, the method comprising: acquiring a base clock per instruction (CPI) of a target machine, the CPI corresponding to an average number of clock cycles required by the target machine to execute a single instruction, translating an application of the target machine to a compact executable trace to be executed by the processor, determining whether to query an off-chip memory based on detecting a cache miss event, determining, by the processor whether to adjust a simulation speed based on receiving a control signal from a router, and adjusting dynamically, by the processor, a speed of simulation by adjusting an update rate of a global target clock counter (GTCC).
12. The method of claim 11, further comprising: profiling each instruction of the target application to generate a profiled image of the application, the profiled image including an object for each unique instruction of the target application, and wherein each instruction of the target application is mapped to a unique address in the profiled image via a hash function.
13. The method of claim 12, further comprising: refining the profiled image to generate instructions for the processor to execute.
14. The method of claim 11, further comprising: setting by the processor, the update rate of the GTCC to an initial value that is based on a number of target clock cycles required to execute a predetermined number of instructions, and a number of host cycles required to execute a single instruction.
15. The method of claim 14, further comprising: reducing the simulation speed by half the initial value, based on a global stall controller (GSC) included in the processor, receiving the control signal, and increasing the simulation speed two-folds the initial value based on the GSC not receiving the control signal in a predetermined amount of time.
16. A non-transitory computer readable medium having stored thereon a program that when executed by a computer, causes the computer to execute a method of simulating a many-core target machine, the method comprising: acquiring a base clock per instruction (CPI) of a target machine, the CPI corresponding to an average number of clock cycles required by the target machine to execute a single instruction, translating an application of the target machine to a compact executable trace to be executed by the processor, determining whether to query an off-chip memory based on detecting a cache miss event, determining, whether to adjust a simulation speed based on receiving a control signal from a router, and adjusting dynamically, a speed of simulation by adjusting an update rate of a global target clock counter (GTCC).
17. The non-transitory computer readable medium of claim 16, the method further comprising: profiling each instruction of the target application to generate a profiled image of the application, the profiled image including an object for each unique instruction of the target application, and wherein each instruction of the target application is mapped to a unique address in the profiled image via a hash function.
18. The non-transitory computer readable medium of claim 16, the method further comprising: refining the profiled image to generate instructions for the processor to execute.
19. The non-transitory computer readable medium of claim 16, the method further comprising: setting the update rate of the GTCC to an initial value that is based on a number of target clock cycles required to execute a predetermined number of instructions, and a number of host cycles required to execute a single instruction.
20. The non-transitory computer readable medium of claim 16, the method further comprising: reducing the simulation speed by half the initial value, based on a global stall controller (GSC) included in the processor, receiving the control signal, and increasing the simulation speed two-folds the initial value based on the GSC not receiving the control signal in a predetermined amount of time.
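The speed adaptation recited in claims 9, 10, 14, 15, 19, and 20 can be sketched roughly as follows. This is an illustrative reading only. The claims do not give a formula for the initial value, so `initial_update_rate` uses a hypothetical one (target cycles per instruction divided by host cycles per instruction). The names `RateController`, `on_cycle`, and `timeout` are likewise assumptions, as is the interpretation that the rate becomes half or twice the initial value.

```python
from dataclasses import dataclass


def initial_update_rate(target_cycles: int, instructions: int,
                        host_cycles_per_instruction: int) -> float:
    """Hypothetical formula: target CPI divided by host cycles per instruction."""
    return (target_cycles / instructions) / host_cycles_per_instruction


@dataclass
class RateController:
    initial_rate: float
    timeout: int        # host cycles without a control signal before speeding up
    rate: float = 0.0   # current GTCC update rate
    idle: int = 0       # consecutive cycles with no control signal

    def __post_init__(self) -> None:
        self.rate = self.initial_rate

    def on_cycle(self, control_signal: bool) -> float:
        if control_signal:
            # Control signal received: drop to half the initial value.
            self.rate = self.initial_rate / 2
            self.idle = 0
        else:
            self.idle += 1
            if self.idle >= self.timeout:
                # No signal for the predetermined time: twice the initial value.
                self.rate = self.initial_rate * 2
        return self.rate


start = initial_update_rate(target_cycles=1500, instructions=1000,
                            host_cycles_per_instruction=3)
controller = RateController(initial_rate=start, timeout=3)
slowed = controller.on_cycle(control_signal=True)
```

A real controller might scale the current rate rather than the initial one. The claim wording ("half the initial value", "two-folds the initial value") is ambiguous on this point, so the sketch takes the literal reading.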

Assignees

Univ King Fahd Pet & Minerals

Inventors

Classifications

G06F2115/10
Processors · CPC title
G06F2119/12
Timing analysis or timing optimisation · CPC title
G06F30/33 · Primary
Design verification, e.g. functional simulation or model checking · CPC title
G06F13/124
where hardware is a sequential transfer control unit, e.g. microprocessor, peripheral processor or state-machine · CPC title
G06F11/3648
using additional hardware · CPC title

Patent family

Related publications grouped by family.

View patent family 59386801

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2017220719A1 cover?: Described herein are a processor and a method of operating the processor to simulate a many-core target machine. The processor includes a plurality of processing cores arranged in a predetermined manner and a global target clock counter (GTCC) configured to count a number of simulated clock cycles in the target machine. A global stall controller (GSC) configured to halt execution of all the pro…
Who is the assignee on this patent?: Univ King Fahd Pet & Minerals