Hardware-based edge profiling

US9703667B2 · US · B2

Patent metadata
Publication number: US-9703667-B2
Application number: US-201514628253-A
Country: US
Kind code: B2
Filing date: Feb 22, 2015
Priority date: Feb 22, 2015
Publication date: Jul 11, 2017
Grant date: Jul 11, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method comprising: counting each occurrence of a hardware event by a Performance Monitoring Counter of a hardware processor during the execution of a target program code; orderly and continuously storing in a buffer of a Taken Branch Trace (TBT) Facility of said hardware processor a predefined TBT size of last taken branches of said target program code during its execution; every time said counting equals a sampling rate, triggering sampling of said buffer, to receive a TBT comprising current said predefined TBT size of last taken branches; constructing a full branch trace for each said TBT based on said target program code; extracting a predefined Chopped Branch Trace (CBT) size of last branches from each said full branch trace, to receive a chopped branch trace for said each TBT; and incrementally storing each said chopped branch trace to generate an edge profile of said target program code.
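The steps in the abstract can be sketched in software as follows. This is a minimal illustration, not the patented implementation: the buffer is modelled with a bounded `deque`, the "target program code" is reduced to a sorted list of branch addresses, and an edge is recorded as a `(branch address, taken?)` pair. The names `TBT_SIZE`, `CBT_SIZE`, `full_branch_trace`, and `record_sample`, and all addresses, are hypothetical. The sketch assumes that execution is sequential between one taken branch's target and the next taken branch's source, so every branch instruction in that address range was not taken.

```python
from collections import Counter, deque

TBT_SIZE = 4  # assumed capacity of the taken-branch buffer (hypothetical value)
CBT_SIZE = 3  # assumed length of each chopped trace (hypothetical value)


def full_branch_trace(tbt, branch_addrs):
    """Expand a taken-branch trace into a trace of all completed branches.

    tbt: list of (source, target) pairs for taken branches, oldest first.
    branch_addrs: sorted addresses of every branch instruction in the code.
    Returns a list of (branch address, taken?) pairs.
    """
    full = [(tbt[0][0], True)]
    for (_, prev_target), (src, _) in zip(tbt, tbt[1:]):
        # Branches passed between the previous target and this source fell through.
        full.extend((addr, False) for addr in branch_addrs
                    if prev_target <= addr < src)
        full.append((src, True))
    return full


def record_sample(profile, tbt_buffer, branch_addrs, cbt_size=CBT_SIZE):
    """Process one sample: rebuild the full trace, chop it, update the profile."""
    full = full_branch_trace(list(tbt_buffer), branch_addrs)
    chopped = full[-cbt_size:]   # keep only the last cbt_size branches
    profile.update(chopped)      # incrementally accumulate edge counts
    return chopped


# Toy program: branch instructions at addresses 10, 20, 30, 40.
branch_addrs = [10, 20, 30, 40]

# A bounded deque drops the oldest entry on overflow, like a cyclic buffer.
tbt_buffer = deque(maxlen=TBT_SIZE)
for taken_branch in [(20, 25), (40, 5), (20, 25), (40, 5), (20, 25), (40, 5)]:
    tbt_buffer.append(taken_branch)

profile = Counter()
chopped = record_sample(profile, tbt_buffer, branch_addrs)
print(chopped)
print(profile)
```

Full traces differ in length from sample to sample because the number of not-taken branches between taken branches varies; keeping a fixed number of trailing branches gives every sample the same weight in the accumulated profile, which appears to be what the claims mean by a "uniform" chopped trace.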

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: counting each occurrence of a hardware event by a Performance Monitoring Counter (PMC) of at least one hardware processor during the execution of a target program code, wherein each occurrence of each hardware event comprises any of: a taken and a not taken branch; orderly and continuously storing in a buffer of a Taken Branch Trace Facility (TBTC) of said at least one hardware processor a predefined Taken Branch Trace (TBT) size of last taken branches of said target program code during its execution; every time said counting of any of said taken and said not taken branches equals a sampling rate, triggering sampling of said buffer, to receive a taken branch trace comprising current said predefined TBT size of last taken branches, wherein: (a) said sampling rate is variable and fixed in average, (b) said sampling rate value equals a predefined number summed with a randomly chosen delta, and (c) the values of said randomly chosen delta are substantially lower than the value of said predefined number; constructing a full branch trace for each said taken branch trace based on said target program code, wherein said full branch trace comprises all of the completed branches of said target program code between the first and last taken branches of said taken branch trace and including the first and last taken branches of said taken branch trace; extracting a predefined Chopped Branch Trace (CBT) size of last branches from each said full branch trace, to receive a uniform chopped branch trace for said each taken branch trace; and incrementally storing each said uniform chopped branch trace to generate a uniform edge profile of said target program code.

2. The method according to claim 1, wherein the storing of said predefined TBT size of last taken branches is performed in a cyclic manner.

3. The method according to claim 1, wherein, when a nontaken branch is completed, said counting equals said sampling rate, and wherein said method further comprises appending the nontaken branch to the full branch trace.

4. The method according to claim 1, wherein the edge profile is a call-graph profile and the branch is a function call.

5. The method of claim 1, wherein the occurrence of a hardware event is a completion of a branch of the target program code.

6. The method of claim 1, wherein the occurrence of a hardware event is a completion of an instruction of the target program code.

7. A computer program product comprising a non-transitory computer-readable storage medium having operating program code embodied therewith, the operating program code executable by at least one hardware processor, wherein the at least one hardware processor is configured to: count each occurrence of a hardware event during the execution of a target program code, wherein each occurrence of each hardware event comprises any of: a taken and a not taken branch; orderly and continuously store in a buffer a predefined Taken Branch Trace (TBT) size of last taken branches of said target program code during its execution; and every time said counting of any of said taken and said not taken branches equals a sampling rate, trigger sampling of said buffer, to receive a taken branch trace comprising current said predefined TBT size of last taken branches, wherein: (a) said sampling rate is variable and fixed in average, (b) said sampling rate value equals a predefined number summed with a randomly chosen delta, and (c) the values of said randomly chosen delta are substantially lower than the value of said predefined number, and wherein the operating program code is executable by the at least one hardware processor to: construct an full branch trace for each said taken branch trace based on said target program code, wherein said full branch trace comprises all of the completed branches of said target program code between the first and last taken branches of said taken branch trace and including the first and last taken branches of said taken branch trace; extract a predefined, Chopped Branch Trace (CBT) size of last branches from each said full branch trace, to receive a uniform chopped branch trace for said each taken branch trace; and incrementally store each said uniform chopped branch trace to generate a uniform edge profile of said target program code.

8. The computer program product of claim 7, wherein the storing of said predefined TBT size of last taken branches is performed in a cyclic manner.

9. The computer program product of claim 7, wherein, when a nontaken branch is completed, said counting equals said sampling rate, and wherein said operating program code is further executable by the at least one hardware processor to append the nontaken branch to the full branch trace.

10. The computer program product of claim 7, wherein the edge profile is a call-graph profile and the branch is a function call.

11. The computer program product of claim 7, wherein the occurrence of a hardware event is selected from a group consisting of: a completion of a branch of the target program code and a completion of an instruction of the target program code.

12. A system comprising at least one hardware processor, the at least one hardware processor comprising: a Taken Branch Trace Facility (TBTC) comprising a buffer, the TBTC configured to orderly and continuously store in said buffer a predefined TBT size of last taken branches of multiple branches of a target program code during its executing; a Performance Monitoring Counter (PMC) configured, during the executing of said target program code, to: a. count each occurrence of a hardware event, comprising any of: a taken and a not taken branch, and b. every time said count of any of said taken and said not taken branches equals a sampling rate, trigger sampling of said BTF, to receive a taken branch trace comprising current said predefined TBT size of last taken branches, wherein: (a) said sampling rate is variable and fixed in average, (b) said sampling rate value equals a predefined number summed with a randomly chosen delta, and (c) the values of said randomly chosen delta are substantially lower than the value of said predefined number, wherein the hardware processor is configured, by executing an operating program code, to: construct a full branch trace for each taken branch trace based on said target program code, wherein said full branch trace comprises all of the completed branches of said target program code between and the first and last taken branches of said taken branch trace and including the first and last taken branches of said taken branch trace, extract a predefined CBT size of last branches from each said full branch trace, to receive a uniform chopped branch trace for said each taken branch trace, and incrementally store in a storage device each said uniform chopped branch trace to generate a uniform edge profile of said target program code.

13. The system of claim 12, wherein the buffer is cyclic.

14. The system of claim 12, wherein, when a nontaken branch is completed, said counting equals said sampling rate, and wherein said at least one hardware processor is further configured, by executing said operating program code, to append the nontaken branch to the full branch trace.

15. The system of claim 12, wherein the edge profile is a call-graph profile and the multiple branches are function calls.

16. The system of claim 12, wherein the occurrence of a hardware event is selected from a group consisting of: a completion of a branch of the target program code and a completion of an instruction of the target program co
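Conditions (a) to (c) on the sampling rate in the independent claims can be illustrated with a short sketch. It assumes the delta is drawn uniformly from a small symmetric range, which keeps the average period equal to the predefined number; the claims do not specify the distribution, and the names `next_sampling_period` and `sample_points` and the values 1000 and 16 are hypothetical.

```python
import random


def next_sampling_period(base, max_delta, rng):
    """Return the predefined number plus a small randomly chosen delta.

    The delta is uniform on [-max_delta, max_delta] (an assumption), so the
    period varies from sample to sample but averages to `base`.
    """
    if not 0 <= max_delta < base:
        raise ValueError("max_delta must be non-negative and smaller than base")
    return base + rng.randint(-max_delta, max_delta)


def sample_points(num_events, base, max_delta, seed=0):
    """Simulate a counter over `num_events` branch events.

    Returns the event indices at which the counter reaches the current
    period and a sample of the buffer would be triggered.
    """
    rng = random.Random(seed)
    points = []
    remaining = next_sampling_period(base, max_delta, rng)
    for event_index in range(num_events):
        remaining -= 1
        if remaining == 0:
            points.append(event_index)
            remaining = next_sampling_period(base, max_delta, rng)
    return points


points = sample_points(num_events=100_000, base=1000, max_delta=16)
gaps = [later - earlier for earlier, later in zip(points, points[1:])]
print(len(points), min(gaps), max(gaps))
```

Every gap between consecutive samples stays within `max_delta` of `base`, so the overall sampling density is unchanged while the exact trigger points are not locked to a fixed stride.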

Assignees

Inventors

Classifications

  • using additional hardware · CPC title

  • by tracing the execution of the program · CPC title

  • where the computing system component is a central processing unit [CPU] · CPC title

  • G06F8/443 · Primary

    Optimisation · CPC title

  • Monitoring involving counting · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9703667B2 cover?
A method comprising: counting each occurrence of a hardware event by a Performance Monitoring Counter of a hardware processor during the execution of a target program code; orderly and continuously storing in a buffer of a Taken Branch Trace (TBT) Facility of said hardware processor a predefined TBT size of last taken branches of said target program code during its execution; every time said co…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F11/3024. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 11, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).