Instruction level tracing for analyzing processor failure

US11119890B2 · US · B2

Patent metadata
Publication number: US-11119890-B2
Application number: US-201916553991-A
Country: US
Kind code: B2
Filing date: Aug 28, 2019
Priority date: Aug 28, 2019
Publication date: Sep 14, 2021
Grant date: Sep 14, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method for instruction-level tracing for analyzing processor failure includes detecting a failure during operation of a processor circuit. The method further includes parsing a miscompare trace to determine a plurality of opcodes executed by the processor prior to the failure. The method further includes generating a workload comprising a set of opcodes by filtering the set of opcodes from the miscompare trace. The method further includes performing a consistency check of the workload to determine a commit ratio of the workload, the commit ratio indicative of a number of times the failure occurs when the workload is executed a predetermined number of times. The method further includes using the workload for debugging the failure based on the commit ratio being above a predetermined threshold.
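The consistency check described in the abstract can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes the commit ratio is computed as the number of failing runs divided by the total number of runs, and the names `commit_ratio`, `usable_for_debug`, `make_stub_runner`, and the `run_once` callback are hypothetical. In practice `run_once` would execute the workload on the processor under test; here a deterministic stub stands in for it.

```python
from typing import Callable, Sequence


def commit_ratio(
    workload: Sequence[str],
    run_once: Callable[[Sequence[str]], bool],
    runs: int,
) -> float:
    """Run the workload `runs` times and return the fraction of runs that failed.

    Assumption: the ratio is failures / runs. The abstract only says the ratio
    is "indicative of" how often the failure occurs over a fixed number of runs.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    failures = sum(1 for _ in range(runs) if run_once(workload))
    return failures / runs


def usable_for_debug(ratio: float, threshold: float) -> bool:
    """Keep the workload for debugging only if the ratio is above the threshold."""
    return ratio > threshold


def make_stub_runner() -> Callable[[Sequence[str]], bool]:
    """Hypothetical stand-in for real hardware: fails on 3 of every 4 runs."""
    calls = 0

    def run_once(workload: Sequence[str]) -> bool:
        nonlocal calls
        calls += 1
        return calls % 4 != 0  # True means the failure reproduced

    return run_once


if __name__ == "__main__":
    ratio = commit_ratio(["lwz", "fadd"], make_stub_runner(), runs=8)
    print(ratio, usable_for_debug(ratio, threshold=0.5))
```

A workload that reproduces the failure only rarely would fall below the threshold and be discarded, which matches the abstract's condition that the workload is used for debugging only when the commit ratio is above a predetermined threshold.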

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for instruction-level tracing for analyzing processor failure to debug a circuit of a processor, the computer-implemented method comprising: detecting, by the processor, a failure during operation of the processor; parsing, by the processor, a miscompare trace to determine a plurality of opcodes executed by the processor prior to the failure, the miscompare trace comprising an instruction stream executed by the processor, wherein the instruction stream comprises the opcodes, which indicate machine instructions executed by the processor; generating, by the processor, a workload comprising a set of opcodes by filtering the set of opcodes from the miscompare trace; performing, by the processor, a consistency check of the workload to determine a commit ratio of the workload, the commit ratio indicative of a number of times the failure occurs when the workload is executed a predetermined number of times; and executing, by the processor, the workload for debugging the failure based on the commit ratio being above a predetermined threshold.

2. The computer-implemented method of claim 1, wherein generating the workload comprises determining a type of the failure and filtering the set of opcodes based on the type of the failure.

3. The computer-implemented method of claim 2, wherein, based on the type of the failure being core/thread hang, filtering the set of opcodes comprises searching for the last opcode that caused the failure.

4. The computer-implemented method of claim 2, wherein, based on the type of the failure being one associated with a specific component circuit of the processor, filtering the set of opcodes comprises searching for opcodes that are specifically executed by the specific component circuit.

5. The computer-implemented method of claim 4, wherein the specific component circuit is one from a group of component circuits comprising a load-store unit, an instruction-fetch unit, and a vector/scalar execution unit.

6. The computer-implemented method of claim 2, wherein, based on the type of the failure being a CPU/FPU miscompare, filtering the set of opcodes comprises identifying one or more opcodes that are repeated at least a predetermined number of times and including the one or more opcodes in the workload.

7. The computer-implemented method of claim 6, wherein the method further comprises, optimizing the workload by selecting an optimizing algorithm from a group comprising static optimization, dynamic optimization, and elimination optimization.

8. A system comprising: a memory; and a processor coupled with the memory, the processor configured to perform a method for instruction-level tracing for analyzing a processor failure, the method comprising: detecting a failure during operation of the processor circuit; parsing a miscompare trace to determine a plurality of opcodes executed by the processor prior to the failure, the miscompare trace comprising an instruction stream executed by the processor, wherein the instruction stream comprises the opcodes, which indicate machine instructions executed by the processor; generating a workload comprising a set of opcodes by filtering the set of opcodes from the miscompare trace; performing a consistency check of the workload to determine a commit ratio of the workload, the commit ratio indicative of a number of times the failure occurs when the workload is executed a predetermined number of times; and executing the workload for debugging the failure based on the commit ratio being above a predetermined threshold.

9. The system of claim 8, wherein generating the workload comprises determining a type of the failure and filtering the set of opcodes based on the type of the failure.

10. The system of claim 9, wherein, based on the type of the failure being core/thread hang, filtering the set of opcodes comprises searching for the last opcode that caused the failure.

11. The system of claim 9, wherein, based on the type of the failure being one associated with a specific component circuit of the processor, filtering the set of opcodes comprises searching for opcodes that are specifically executed by the specific component circuit.

12. The system of claim 11, wherein the specific component circuit is one from a group of component circuits comprising a load-store unit, an instruction-fetch unit, and a vector/scalar execution unit.

13. The system of claim 9, wherein, based on the type of the failure being a CPU/FPU miscompare, filtering the set of opcodes comprises identifying one or more opcodes that are repeated at least a predetermined number of times and including the one or more opcodes in the workload.

14. The system of claim 13, wherein the method further comprises, optimizing the workload by selecting an optimizing algorithm from a group comprising static optimization, dynamic optimization, and elimination optimization.

15. A computer program product comprising a computer-readable storage medium having program instructions embodied therewith, the program instructions executable by a processing circuit to perform a method for instruction-level tracing for analyzing a processor failure, the method comprising: detecting a failure during operation of a processor circuit; parsing a miscompare trace to determine a plurality of opcodes executed by the processor prior to the failure, the miscompare trace comprising an instruction stream executed by the processor, wherein the instruction stream comprises the opcodes, which indicate machine instructions executed by the processor; generating a workload comprising a set of opcodes by filtering the set of opcodes from the miscompare trace; performing a consistency check of the workload to determine a commit ratio of the workload, the commit ratio indicative of a number of times the failure occurs when the workload is executed a predetermined number of times; and executing the workload for debugging the failure based on the commit ratio being above a predetermined threshold.

16. The computer program product of claim 15, wherein generating the workload comprises determining a type of the failure and filtering the set of opcodes based on the type of the failure.

17. The computer program product of claim 16, wherein, based on the type of the failure being core/thread hang, filtering the set of opcodes comprises searching for the last opcode that caused the failure.

18. The computer program product of claim 16, wherein, based on the type of the failure being one associated with a specific component circuit of the processor, filtering the set of opcodes comprises searching for opcodes that are specifically executed by the specific component circuit.

19. The computer program product of claim 16, wherein, based on the type of the failure being a CPU/FPU miscompare, filtering the set of opcodes comprises identifying one or more opcodes that are repeated at least a predetermined number of times and including the one or more opcodes in the workload.

20. The computer program product of claim 19, wherein the method further comprises, optimizing the workload by selecting an optimizing algorithm from a group comprising static optimization, dynamic optimization, and elimination optimization.
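The failure-type-dependent filtering in the dependent claims (hang, component-specific failure, CPU/FPU miscompare) might look roughly like the Python sketch below. Everything here is an illustrative assumption: the `TraceEntry` record, the `unit` tag on each entry, the failure-type strings, and the function name `filter_opcodes` do not come from the patent. In particular, the hang case is simplified to "take the last opcode in the trace", whereas the claims speak of searching for the last opcode that caused the failure.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TraceEntry:
    """One parsed line of a miscompare trace (hypothetical layout)."""
    opcode: str  # mnemonic of the executed instruction
    unit: str    # component circuit that executed it, e.g. "LSU" (assumed tag)


def filter_opcodes(
    trace: List[TraceEntry],
    failure_type: str,
    unit: Optional[str] = None,
    min_repeats: int = 2,
) -> List[str]:
    """Select the opcodes that go into the workload, based on the failure type."""
    if failure_type == "hang":
        # Simplification: treat the final traced opcode as the one of interest.
        return [trace[-1].opcode] if trace else []
    if failure_type == "component":
        # Keep only opcodes executed by the suspected component circuit.
        return [e.opcode for e in trace if e.unit == unit]
    if failure_type == "miscompare":
        # Keep each opcode that repeats at least `min_repeats` times,
        # listed once, in first-seen order.
        counts = Counter(e.opcode for e in trace)
        selected: List[str] = []
        for e in trace:
            if counts[e.opcode] >= min_repeats and e.opcode not in selected:
                selected.append(e.opcode)
        return selected
    raise ValueError(f"unknown failure type: {failure_type!r}")


# Small illustrative trace; mnemonics and unit tags are example values only.
SAMPLE_TRACE = [
    TraceEntry("lwz", "LSU"),
    TraceEntry("fadd", "VSU"),
    TraceEntry("stw", "LSU"),
    TraceEntry("fadd", "VSU"),
    TraceEntry("add", "FXU"),
]

if __name__ == "__main__":
    for kind in ("hang", "component", "miscompare"):
        print(kind, filter_opcodes(SAMPLE_TRACE, kind, unit="LSU"))
```

The resulting opcode list is what the independent claims call the workload; it would then go through the consistency check before being used for debugging.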

Assignees

Inventors

Classifications

  • by tracing the execution of the program

  • Error or fault detection not based on redundancy (power supply failures G06F1/30; network fault management H04L41/06)

  • within a central processing unit [CPU]

  • Circuit details, i.e. tracer hardware

  • Tester hardware, i.e. output processing circuits {(G06F11/263 takes precedence)}

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11119890B2 cover?
A computer-implemented method for instruction-level tracing for analyzing processor failure includes detecting a failure during operation of a processor circuit. The method further includes parsing a miscompare trace to determine a plurality of opcodes executed by the processor prior to the failure. The method further includes generating a workload comprising a set of opcodes by filtering the s…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F11/3636. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 14, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).