Identifying patterns in event logs to predict and prevent cloud service outages
US-2021383206-A1 · Dec 9, 2021 · US
US11809264B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11809264-B2 |
| Application number | US-202217656268-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 24, 2022 |
| Priority date | Mar 24, 2022 |
| Publication date | Nov 7, 2023 |
| Grant date | Nov 7, 2023 |
Official abstract text for this publication.
Embodiments of systems and methods for exothermic event prediction engine are described. In an embodiment, an Information Handling System (IHS) may include: a processor, a Remote Access Controller (RAC) coupled to the processor, and a memory coupled to the RAC, the memory having program instructions stored thereon that, upon execution by the RAC, cause the RAC to collect telemetry data from the IHS and predict an exothermic failure in the IHS based, at least in part, upon the telemetry data.
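The abstract leaves the prediction method open, while claims 10, 15, and 18 name a Very Fast Decision Tree (VFDT). The patent text shown here does not spell the algorithm out, so the following is only a minimal sketch of the split test used in the standard VFDT (Hoeffding tree) formulation: a leaf is split once the gap between the two best attribute gains exceeds the Hoeffding bound. The function names, the `tie_threshold` parameter, and the example gains are illustrative assumptions, not taken from the patent.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2 * n)).

    value_range (R) is the range of the split metric, delta is the allowed
    probability of choosing the wrong attribute, n is the number of samples
    seen at the leaf.
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, n_classes: int,
                 delta: float, n: int, tie_threshold: float = 0.05) -> bool:
    """Decide whether a leaf has seen enough telemetry samples to split.

    Assumes information gain as the split metric, whose range is
    log2(n_classes). tie_threshold is a hypothetical tie-break value:
    when epsilon falls below it, the top candidates are treated as
    equally good and the leaf splits anyway.
    """
    eps = hoeffding_bound(math.log2(n_classes), delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold


# Hypothetical gains for two telemetry attributes at one leaf,
# with two classes (failure / no failure).
print(should_split(0.30, 0.15, n_classes=2, delta=1e-7, n=100))
print(should_split(0.30, 0.15, n_classes=2, delta=1e-7, n=1000))
```

Because the bound shrinks as `n` grows, the tree can commit to a split after a bounded number of samples instead of storing the whole stream, which is why this family of algorithms suits a controller with limited memory.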
Opening claim text (preview).
The invention claimed is:
1. An Information Handling System (IHS), comprising: a processor; a Remote Access Controller (RAC) coupled to the processor; and a memory coupled to the RAC, the memory having program instructions stored thereon that, upon execution by the RAC, cause the RAC to: collect telemetry data from the IHS; predict an exothermic failure in the IHS based, at least in part, upon the telemetry data; compress the telemetry data into a Hardware Event Snapshot (HES); and move the HES to lower-tier storage a selected amount of time after the collection.
2. The IHS of claim 1, wherein the telemetry data comprises live telemetry.
3. The IHS of claim 1, wherein the telemetry data comprises historical telemetry.
4. The IHS of claim 1, wherein the HES excludes duplicated telemetry data.
5. The IHS of claim 1, wherein the telemetry data is assembled into a usage matrix comprising voltage and thermal readings.
6. The IHS of claim 5, wherein the voltage readings comprise: a processor voltage, a memory voltage, and a motherboard voltage.
7. The IHS of claim 5, wherein the thermal readings comprise: an inlet temperature, an outlet temperature, a target outlet temperature limit, and an airflow reading.
8. The IHS of claim 5, wherein the usage matrix comprises an indication of at least one of: an IHS posture, or an IHS location.
9. The IHS of claim 1, wherein the telemetry data is collected from one or more of: the processor, a Network Interface Card (NIC), a fiber channel Host Bus Adapter (FC-HBA), a system memory, a Graphics Processing Unit (GPU), a storage drive, a Power Supply Unit (PSU), or a fan.
10. The IHS of claim 1, wherein to predict the exothermic failure, the program instructions, upon execution by the RAC, further cause the RAC to apply a Very Fast Decision Tree (VFDT) algorithm to the telemetry data.
11. The IHS of claim 1, wherein the exothermic failure comprises an over-voltage or high temperature condition that causes: a processor failure, a Dual In-line Memory Module (DIMM) slot failure, a power supply failure, a battery failure, or a fan failure.
12. The IHS of claim 1, wherein the program instructions, upon execution by the RAC, further cause the RAC to scale a fan speed or modify a thermal profile of the IHS to prevent the exothermic failure.
13. The IHS of claim 1, wherein the program instructions, upon execution by the RAC, further cause the RAC to trigger an alert in response to the prediction.
14. The IHS of claim 13, wherein the alert is classified in a category selected from the group consisting of: a first level alert that indicates a reconfiguration of the IHS, a second level alert that indicates immediate manual intervention, and a third level alert that powers off the IHS.
15. A method, comprising: collecting telemetry data in an Information Handling System (IHS); and predicting an exothermic event in the IHS based, at least in part, by applying a Very Fast Decision Tree (VFDT) algorithm to the telemetry data.
16. The method of claim 15, wherein the exothermic event comprises a failure due to an over-voltage or high temperature condition.
17. The method of claim 15, further comprising: compressing the telemetry data into a Hardware Event Snapshot (HES); and moving the HES to lower-tier storage a selected amount of time after the collection.
18. A hardware memory device having program instructions stored thereon that, upon execution by a Chassis Management Controller (CMC) of a chassis comprising a plurality of Information Handling Systems (IHSs), cause the CMC to: collect telemetry data from the plurality of IHSs; and predict an exothermic event in at least one of the plurality of IHSs based, at least in part, upon application of a Very Fast Decision Tree (VFDT) algorithm to the telemetry data.
19. The hardware memory device of claim 18, wherein the exothermic event comprises a Dual In-line Memory Module (DIMM) slot failure due to an over-voltage or high temperature condition.
20. The hardware memory device of claim 18, wherein the program instructions, upon execution by the CMC, further cause the CMC to: compress the telemetry data into a Hardware Event Snapshot (HES); and move the HES to lower-tier storage a selected amount of time after the collection.
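Claims 1, 4, 17, and 20 describe compressing telemetry into a Hardware Event Snapshot (HES) that excludes duplicates and moving it to lower-tier storage a selected amount of time after collection. The claims do not specify a format, a compression scheme, or a storage interface, so the sketch below is one possible reading under stated assumptions: records are JSON-serializable dictionaries, compression uses `zlib`, and the storage tier is modeled as a plain label. `Snapshot`, `build_snapshot`, `age_out`, and `read_snapshot` are hypothetical names.

```python
import json
import zlib
from dataclasses import dataclass


@dataclass
class Snapshot:
    collected_at: float   # collection time, seconds since epoch
    payload: bytes        # zlib-compressed JSON list of telemetry records
    tier: str = "hot"     # "hot" or "cold"; stands in for real storage tiers


def build_snapshot(records: list, collected_at: float) -> Snapshot:
    """Drop exact duplicate records, then compress what is left."""
    seen = set()
    unique = []
    for rec in records:
        key = json.dumps(rec, sort_keys=True)  # canonical form for comparison
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    payload = zlib.compress(json.dumps(unique, sort_keys=True).encode("utf-8"))
    return Snapshot(collected_at=collected_at, payload=payload)


def age_out(snapshot: Snapshot, now: float, max_age_s: float) -> Snapshot:
    """Mark the snapshot as lower-tier once it is at least max_age_s old."""
    if snapshot.tier == "hot" and now - snapshot.collected_at >= max_age_s:
        snapshot.tier = "cold"
    return snapshot


def read_snapshot(snapshot: Snapshot) -> list:
    """Decompress a snapshot back into its list of records."""
    return json.loads(zlib.decompress(snapshot.payload).decode("utf-8"))


# Hypothetical readings; the sensor names are illustrative only.
records = [
    {"sensor": "inlet_temp_c", "value": 24.5},
    {"sensor": "inlet_temp_c", "value": 24.5},   # exact duplicate, dropped
    {"sensor": "cpu_voltage_v", "value": 1.1},
]
snap = build_snapshot(records, collected_at=0.0)
print(len(read_snapshot(snap)), age_out(snap, now=3600.0, max_age_s=1800.0).tier)
```

Deduplicating before compression keeps the snapshot small on a controller with limited storage, and the age check isolates the "selected amount of time" as a single tunable parameter.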
by exceeding limits · CPC title
comprising thermal management · CPC title
within a central processing unit [CPU] · CPC title
Monitoring arrangements for monitoring environmental properties or parameters of the computing system or of the computing system component, e.g. monitoring of power, currents, temperature, humidity, position, vibrations (thermal management in cooling arrangements of a computing system G06F1/206) · CPC title
Error avoidance (G06F11/07 and subgroups take precedence) · CPC title