Remotely debugging an operating system

US10078576B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10078576-B2
Application number  US-201615083375-A
Country             US
Kind code           B2
Filing date         Mar 29, 2016
Priority date       Mar 29, 2016
Publication date    Sep 18, 2018
Grant date          Sep 18, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Remotely debugging a non-responsive operating system (OS) of a computer system. Central processing units (CPUs) in a computer system are bound to receive queues of a network adapter. Interrupts for a CPU are disabled, wherein the CPU is not available to process hardware interrupt requests queued in the bound receive queues. A debugging message including debugging commands is received by the network adapter, wherein the debugging message is stored in a first receive queue of the network adapter bound to a first CPU. If the first CPU is available, the debugging commands in the debugging message stored in the first of the one or more receive queues of the network adapter are identified by a debugger of the computer system. The identified debugging commands are executed by the CPU to debug the non-responsive OS of the computer system.
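
The flow in the abstract can be illustrated with a small model. This is a minimal sketch and not the patented implementation: the class names `Cpu` and `ReceiveQueue`, the function `deliver`, and the in-order fallback across queues are all assumptions made for illustration. It shows a debugging message being stored in a queue bound to a CPU with interrupts disabled, and then being handled through a second queue whose bound CPU is available.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cpu:
    cpu_id: int
    interrupts_enabled: bool = True  # False models a CPU that cannot service interrupts


@dataclass
class ReceiveQueue:
    queue_id: int
    bound_cpu: Cpu  # in this sketch each queue is bound to exactly one CPU
    pending: List[bytes] = field(default_factory=list)


def deliver(queues: List[ReceiveQueue], message: bytes) -> Optional[int]:
    """Try each queue in order; return the id of the CPU that handled the message."""
    for queue in queues:
        queue.pending.append(message)  # the adapter stores the message in this queue
        if queue.bound_cpu.interrupts_enabled:
            queue.pending.pop()  # the bound CPU services the interrupt request
            return queue.bound_cpu.cpu_id
        # The bound CPU is unavailable: the message stays queued, try the next queue.
    return None


# Hypothetical setup: CPU 0 has interrupts disabled, CPU 1 is still available.
cpus = [Cpu(0, interrupts_enabled=False), Cpu(1)]
queues = [ReceiveQueue(i, cpu) for i, cpu in enumerate(cpus)]
handled_by = deliver(queues, b"process-list")
print(handled_by)  # 1
```

In this model the message remains pending in the first queue, which mirrors the situation where the bound CPU never processes its hardware interrupt request.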

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising:
  binding each of a plurality of central processing units (CPUs) in a computer system to one or more receive queues of a network adapter, wherein each of the one or more receive queues is configured to queue a hardware interrupt request of the network adapter;
  disabling interrupts for one or more of the plurality of CPUs, wherein the one or more of the plurality of CPUs are not available to process hardware interrupt requests queued in the one or more bound receive queues, and an operating system (OS) of the computer system becomes non-responsive;
  receiving, by the network adapter, a debugging message including debugging commands, wherein the debugging message is stored in a first of the one or more receive queues of the network adapter bound to a first of the plurality of CPUs, and wherein the debugging message corresponds to a first hardware interrupt request queued in the first of the one or more receive queues;
  responsive to determining that a first of the plurality of CPUs is available to process the first hardware interrupt request, identifying, by a debugger of the computer system, the debugging commands in the debugging message stored in the first of the one or more receive queues of the network adapter;
  executing the identified debugging commands, by the first of the plurality of CPUs to debug the non-responsive OS of the computer system;
  responsive to determining that the first of the plurality of CPUs is not available to process the first hardware interrupt, receiving, by the network adapter, the debugging message including debugging commands, wherein the debugging message is stored in a second of the one or more receive queues of the network adapter bound to a second of the plurality of CPUs, and wherein the debugging message corresponds to a second hardware interrupt request queued in the second of the one or more receive queues;
  responsive to determining that the second of the plurality of CPUs is available to process the second hardware interrupt, identifying, by the debugger of the computer system, the debugging commands in the debugging message stored in the second of the one or more receive queues of the network adapter; and
  executing the identified debugging commands, by the second of the plurality of CPUs to debug the non-responsive OS of the computer system.

2. The method of claim 1, further comprising: binding, by the network adapter, the one or more receive queues to a remote node using a destination vector based on one or more of: Internet Protocol (IP) address, InfiniBand address, InfiniBand connection, Fibre Channel (FC) address, FC connection, vendor defined management attribute in a data payload, or another network interface identification and location addressing mechanism.

3. The method of claim 1, further comprising: generating, by an interrupt handler of the network adapter, the first hardware interrupt for the debugging message received by the network adapter; and queueing, by the network adapter, the first hardware interrupt request in the first of the one or more receive queues of the network adapter.

4. The method of claim 1, wherein the debugging message includes: a back-trace of one of the plurality of CPUs, a register state of one of the plurality of CPUs, a process list, a process back-trace, one or more operational steps for enabling or disabling hardware interrupts for one of the plurality of CPUs, one or more operational steps for hardware interrupt debugging, or a list back-trace of one or more application programs that disabled one or more hardware interrupts.

5. The method of claim 1, wherein the debugging message received by the network adapter is a network packet including a payload.

6. The method of claim 5, wherein identifying, by the debugger of the computer system, the debugging commands in the received debugging message, comprises: reading, by the debugger of the computer system, the payload of the received network packet to identify the debugging commands in the debugging message stored in the first of the one or more receive queues of the network adapter.

7. A computer program product comprising: one or more computer readable storage media and program instructions stored on the one or more computer readable storage media, the program instructions comprising:
  program instructions to bind each of a plurality of central processing units (CPUs) in a computer system to one or more receive queues of a network adapter, wherein each of the one or more receive queues is configured to queue a hardware interrupt request of the network adapter;
  program instructions to disable interrupts for one or more of the plurality of CPUs, wherein the one or more of the plurality of CPUs are not available to process hardware interrupt requests queued in the one or more bound receive queues, and an operating system (OS) of the computer system becomes non-responsive;
  program instructions to receive by the network adapter, a debugging message including debugging commands, wherein the debugging message is stored in a first of the one or more receive queues of the network adapter bound to a first of the plurality of CPUs, and wherein the debugging message corresponds to a first hardware interrupt request queued in the first of the one or more receive queues;
  program instructions to, responsive to determining that a first of the plurality of CPUs is available to process the first hardware interrupt request, identify by a debugger of the computer system, the debugging commands in the debugging message stored in the first of the one or more receive queues of the network adapter;
  program instructions to execute the identified debugging commands, by the first of the plurality of CPUs to debug the non-responsive OS of the computer system;
  program instructions to, responsive to determining that the first of the plurality of CPUs is not available to process the first hardware interrupt, receive by the network adapter, the debugging message including debugging commands, wherein the debugging message is stored in a second of the one or more receive queues of the network adapter bound to a second of the plurality of CPUs, and wherein the debugging message corresponds to a second hardware interrupt request queued in the second of the one or more receive queues;
  program instructions to, responsive to determining that the second of the plurality of CPUs is available to process the second hardware interrupt, identify by the debugger of the computer system, the debugging commands in the debugging message stored in the second of the one or more receive queues of the network adapter; and
  program instructions to execute the identified debugging commands, by the second of the plurality of CPUs to debug the non-responsive OS of the computer system.

8. The computer program product of claim 7, wherein the program instructions stored on the one or more computer readable storage media further comprise: program instructions to bind by the network adapter, the one or more receive queues to a remote node using a destination vector based on one or more of: Internet Protocol (IP) address, InfiniBand address, InfiniBand connection, Fibre Channel (FC) address, FC connection, vendor defined management attribute in a data payload, or another network interface identification and location addressing mechanism.

9. The computer program product of claim 7, wherein the program instructions stored on the one or more computer readable storage media further comprise: program instructions to generate by an interrupt handler of the network adapter, the first hardware interrupt for the debugging message received by the network adapter; and program instructions to queue by the network adapter
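
Claims 5 and 6 describe the debugging message as a network packet whose payload the debugger reads to identify the debugging commands. The patent text shown here does not specify a payload format, so the following is only a sketch under assumed conventions: the `DBG1` marker, the newline-separated layout, the command names, and the function `identify_commands` are all hypothetical. The command names loosely follow the kinds of items listed in claim 4.

```python
from typing import List

# Hypothetical marker that distinguishes debugging payloads from ordinary traffic.
MAGIC = b"DBG1\n"

# Hypothetical command names, loosely modelled on the items listed in claim 4.
KNOWN_COMMANDS = {
    "cpu-backtrace",
    "cpu-registers",
    "process-list",
    "process-backtrace",
}


def identify_commands(payload: bytes) -> List[str]:
    """Return the recognised debugging commands found in a packet payload."""
    if not payload.startswith(MAGIC):
        return []  # not a debugging message in this assumed format
    text = payload[len(MAGIC):].decode("ascii", errors="replace")
    commands = []
    for line in text.splitlines():
        name = line.strip()
        if name in KNOWN_COMMANDS:  # unrecognised lines are ignored
            commands.append(name)
    return commands


print(identify_commands(b"DBG1\nprocess-list\ncpu-registers\n"))
```

Ignoring unrecognised lines rather than rejecting the whole payload is a design choice of this sketch, not something the claims require.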

Assignees

IBM

Inventors

Classifications

  • Debugging of software · CPC title

  • Remedial or corrective actions (recovery from an exception in an instruction pipeline G06F9/3861; by retry G06F11/1402; for recovering from a failure of a protocol instance or entity H04L69/40) · CPC title

  • in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems · CPC title

  • in a virtual computing platform, e.g. logically partitioned systems · CPC title

  • G06F11/366 · Primary

    using diagnostics (G06F11/0703 takes precedence) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10078576B2 cover?
Remotely debugging a non-responsive operating system (OS) of a computer system. Central processing units (CPUs) in a computer system are bound to receive queues of a network adapter. Interrupts for a CPU are disabled, wherein the CPU is not available to process hardware interrupt requests queued in the bound receive queues. A debugging message including debugging commands is received by the netw…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F11/366. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 18, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).