Collection and aggregation of device health metrics
US-9842017-B1 · Dec 12, 2017 · US
US11914458B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11914458-B2 |
| Application number | US-202117392838-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 3, 2021 |
| Priority date | Apr 27, 2018 |
| Publication date | Feb 27, 2024 |
| Grant date | Feb 27, 2024 |
Official abstract text for this publication.
Systems and methods are disclosed herein for monitoring, detecting, and mitigating hardware and software failures. An error detection module monitors the execution of software processes and detects failures of the monitored processes. The error detection module may monitor reboot events and correlate reboot events with failures of the monitored software processes. If a monitored process fails, the error detection module may log the failure and its cause. If the same process has failed numerous times, causing the user device to experience a reboot loop, remedial action may be taken based on the cause of the failure.
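The reboot-loop check described in the abstract can be illustrated with a small sketch. This is not the patented implementation: the class name `RebootLoopDetector`, the function `choose_action`, the cause labels, the default threshold and window values, and the mapping from cause to action are all assumptions made for illustration. The sketch counts reboots inside a sliding time window and picks a remedial action only once the count exceeds the threshold.

```python
from collections import deque


class RebootLoopDetector:
    """Hypothetical helper: counts reboots inside a sliding time window."""

    def __init__(self, failure_threshold=3, window_seconds=600.0):
        # Both defaults are illustrative; the source gives no concrete values.
        self.failure_threshold = failure_threshold
        self.window_seconds = window_seconds
        self._reboots = deque()

    def record_reboot(self, timestamp):
        """Store a reboot time and return how many reboots remain in the window."""
        self._reboots.append(timestamp)
        # Discard reboots that fall outside the predefined time duration.
        while self._reboots and timestamp - self._reboots[0] > self.window_seconds:
            self._reboots.popleft()
        return len(self._reboots)

    def in_reboot_loop(self):
        """True once the windowed reboot count exceeds the failure threshold."""
        return len(self._reboots) > self.failure_threshold


def choose_action(detector, cause):
    """Pick an action from the logged cause (assumed labels and mapping)."""
    if not detector.in_reboot_loop():
        return "run_normally"
    if cause == "prerequisite_incomplete":
        # The claims describe delaying the process so a prerequisite can finish.
        return "delay_process"
    # Fallback modeled on the limited boot and update request in the claims.
    return "limited_boot_and_request_update"
```

With the defaults above, three reboots within ten minutes leave the device below the threshold; a fourth pushes it over, and `choose_action` then returns an action that depends on the logged cause.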
Opening claim text (preview).
What is claimed is:
1. A method for handling software execution failures on a user device, the method comprising: monitoring execution of a software process on the user device; detecting a reboot of the user device; determining a number of reboots of the user device within a predefined time duration; in response to determining that the number of reboots of the user device within the predefined time duration exceeds a failure threshold, performing a remedial action prior to execution of the software process; detecting, within a threshold period of time after the reboot, a second reboot of the user device; determining whether the second reboot was caused by a failure of the remedial action; executing a limited boot process; and transmitting, to a server, a request for a software update.
2. The method of claim 1, further comprising: determining whether the reboot was caused by a prerequisite process not having been completed; wherein execution of the software process is delayed in response to determining that the reboot was caused by a prerequisite process not having been completed.
3. The method of claim 1, wherein the remedial action comprises slowing a speed at which the software process is executed to allow a prerequisite process to complete prior to execution of the software process.
4. The method of claim 1, wherein determining a number of reboots of the user device within a predefined time duration further comprises: incrementing a counter in response to detecting the reboot; and determining whether the value of the counter exceeds the failure threshold.
5. The method of claim 4, wherein the counter is one of a plurality of counters, each counter being associated with a respective cause of a reboot, and wherein incrementing the counter further comprises identifying a respective counter of the plurality of counters.
6. The method of claim 1, wherein the software process comprises a plurality of steps, and wherein delaying execution of the software process comprises pausing execution of the software process for a period of time after each step.
7. The method of claim 1, wherein the software process comprises a plurality of steps, the method further comprising: identifying a step of the plurality of steps at which the reboot occurred; and monitoring execution of a prerequisite process; wherein slowing the speed at which the software process is executed comprises preventing execution of the identified step until the prerequisite process has been completed.
8. The method of claim 1, wherein the software process stores status information related to the execution and failure of the software process in a data structure.
9. The method of claim 8, wherein determining a cause of the reboot further comprises: determining a time at which the reboot occurred; accessing the data structure; determining status information having a timestamp at or near the determined time at which the reboot occurred; and determining, based on the status information, the cause of the reboot.
10. A system for handling software execution failures on a user device, the system comprising: control circuitry configured to: monitor execution of a software process on the user device; detect a reboot of the user device; determine a number of reboots of the user device within a predefined time duration; in response to determining that the number of reboots of the user device within the predefined time duration exceeds a failure threshold, performing a remedial action prior to execution of the software process; detect, within a threshold period of time after the reboot, a second reboot of the user device; determine whether the second reboot was caused by a failure of the remedial action; execute a limited boot process; and transmit, to a server, a request for a software update.
11. The system of claim 10, the control circuitry further configured to: determine whether the reboot was caused by a prerequisite process not having been completed; wherein execution of the software process is delayed in response to determining that the reboot was caused by a prerequisite process not having been completed.
12. The system of claim 10, wherein the remedial action comprises slowing a speed at which the software process is executed to allow a prerequisite process to complete prior to execution of the software process.
13. The system of claim 10, wherein the control circuitry configured to determine a number of reboots of the user device within a predefined time duration is further configured to: increment a counter in response to detecting the reboot; and determine whether the value of the counter exceeds the failure threshold.
14. The system of claim 13, wherein the counter is one of a plurality of counters, each counter being associated with a respective cause of a reboot, and wherein the control circuitry configured to increment the counter is further configured to identify a respective counter of the plurality of counters.
15. The system of claim 10, wherein the software process comprises a plurality of steps, and wherein the control circuitry configured to delay execution of the software process is further configured to pause execution of the software process for a period of time after each step.
16. The system of claim 10, wherein the software process comprises a plurality of steps, and wherein the control circuitry is further configured to: identify a step of the plurality of steps at which the reboot occurred; and monitor execution of a prerequisite process; wherein the control circuitry configured to slow the speed at which the software process is executed is further configured to prevent execution of the identified step until the prerequisite process has been completed.
17. The system of claim 10, wherein the software process stores status information related to the execution and failure of the software process in a data structure.
18. The system of claim 17, wherein the control circuitry configured to determine a cause of the reboot is further configured to: determine a time at which the reboot occurred; access the data structure; determine status information having a timestamp at or near the determined time at which the reboot occurred; and determine, based on the status information, the cause of the reboot.
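Two ideas from the dependent claims lend themselves to a short sketch: looking up the cause of a reboot from status information with a timestamp at or near the reboot time, and keeping one counter per cause. The code below is illustrative only. The `StatusEntry` fields, the `tolerance` parameter, the cause labels, and the `PerCauseCounters` class are assumptions, since the claims do not specify the layout of the data structure or what "near" means numerically.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StatusEntry:
    """Hypothetical record a monitored process writes as it runs."""
    timestamp: float
    step: str
    cause: str


def cause_of_reboot(status_log: List[StatusEntry], reboot_time: float,
                    tolerance: float = 5.0) -> Optional[str]:
    """Return the cause from the entry nearest the reboot time, if close enough."""
    nearby = [e for e in status_log
              if abs(e.timestamp - reboot_time) <= tolerance]
    if not nearby:
        return None  # nothing logged at or near the reboot
    nearest = min(nearby, key=lambda e: abs(e.timestamp - reboot_time))
    return nearest.cause


class PerCauseCounters:
    """One counter per reboot cause, each compared against a shared threshold."""

    def __init__(self, failure_threshold: int = 3):
        self.failure_threshold = failure_threshold  # illustrative default
        self._counts = Counter()

    def record(self, cause: str) -> bool:
        """Increment the counter for this cause; True if it now exceeds the threshold."""
        self._counts[cause] += 1
        return self._counts[cause] > self.failure_threshold
```

Keeping the counters separate means a burst of reboots from one cause does not trigger remediation aimed at another; whether the patent's embodiments share a single threshold across causes is not stated in the claims.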
by exceeding a count or rate limit, e.g. word- or bit count limit · CPC title
Updates (security arrangements therefor G06F21/57) · CPC title
the processing taking place on a specific hardware platform or in a specific software environment · CPC title
Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36) · CPC title
Remedial or corrective actions (recovery from an exception in an instruction pipeline G06F9/3861; by retry G06F11/1402; for recovering from a failure of a protocol instance or entity H04L69/40) · CPC title