US9594620B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9594620-B2 |
| Application number | US-201615088377-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 1, 2016 |
| Priority date | Apr 4, 2011 |
| Publication date | Mar 14, 2017 |
| Grant date | Mar 14, 2017 |
Official abstract text for this publication.
Embodiments are directed to predicting the health of a computer node using health report data and to proactively handling failures in computer network nodes. In an embodiment, a computer system monitors various health indicators for multiple nodes in a computer network. The computer system accesses stored health indicators that provide a health history for the computer network nodes. The computer system then generates a health status based on the monitored health factors and the health history. The generated health status indicates the likelihood that the node will be healthy within a specified future time period. The computer system then leverages the generated health status to handle current or predicted failures. The computer system also presents the generated health status to a user or other entity.
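The abstract does not say how the health status is computed. As a minimal sketch only, the following assumes each monitored indicator is already normalized to a score in [0, 1] and blends the mean of those scores with the fraction of past checks that were healthy. The names `NodeHealth`, `predict_healthy_probability`, `nodes_below_threshold`, and the `history_weight` parameter are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class NodeHealth:
    """Hypothetical container for one node's monitored and stored health data."""
    node_id: str
    current_indicators: Sequence[float]  # assumed scale: 0.0 = failing, 1.0 = fully healthy
    history: Sequence[bool]              # stored past checks, True = node was healthy


def predict_healthy_probability(node: NodeHealth, history_weight: float = 0.5) -> float:
    """Blend current indicators with health history into one probability.

    The weighted average is an illustrative assumption, not the patented method.
    """
    if not 0.0 <= history_weight <= 1.0:
        raise ValueError("history_weight must be in [0, 1]")
    if not node.current_indicators:
        raise ValueError("at least one monitored indicator is required")
    current = sum(node.current_indicators) / len(node.current_indicators)
    if not node.history:
        # No stored history yet: fall back to the monitored indicators alone.
        return current
    past = sum(node.history) / len(node.history)
    return (1.0 - history_weight) * current + history_weight * past


def nodes_below_threshold(nodes: List[NodeHealth], threshold: float) -> Dict[str, float]:
    """Return the nodes whose predicted probability of being healthy is below threshold."""
    scores = {n.node_id: predict_healthy_probability(n) for n in nodes}
    return {node_id: p for node_id, p in scores.items() if p < threshold}
```

A caller could present the returned scores to an operator, or use `nodes_below_threshold` to pick nodes whose data should be moved elsewhere before a predicted failure.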
Opening claim text (preview).
We claim:
1. A computer system for predicting the health of a plurality of data processing systems by using health report data, the computer system comprising one or more processors executing computer executable instructions which cause the computer system to perform the following: monitors one or more health indicators for a plurality of data processing systems; accesses one or more stored health indicators that provide a health history for one or more of the monitored plurality of data processing systems; based on both the monitored health indicators and the stored health history, predicts a health status, wherein the predicted health status indicates for at least one of the monitored plurality of data processing systems a probability that the at least one monitored data processing system will be healthy or unhealthy in the future; and presents the predicted health status to a specified entity.
2. The computer system of claim 1, wherein the computer system further performs the following: makes a determination that the probability that the at least one monitored data processing system will be healthy is below a threshold level; and transfers one or more portions of data stored on the at least one monitored data processing system to one or more other data processing systems in the database system.
3. The computer system of claim 2, wherein the computer system further performs the following: prevents the at least one monitored data processing system from storing new data.
4. The computer system of claim 2, wherein making the determination that the probability that the at least one monitored data processing system will be healthy is below a threshold level comprises determining that the at least one monitored data processing system is in a critical state.
5. The computer system of claim 2, wherein making the determination that the probability that the at least one monitored data processing system will be healthy is below a threshold level comprises determining that the at least one monitored data processing system has experienced one or more failures within a specified time period.
6. The computer system of claim 5, wherein the computer system further performs the following: assigns an error level to each of the failures.
7. The computer system of claim 6, wherein the computer system further performs the following: determines that a threshold number of the failures are beyond a specified error level, such that the at least one monitored data processing system is blacklisted.
8. The computer system of claim 7, wherein monitored data processing systems that have a threshold number of the failures beyond a specified error level are blacklisted, regardless of a monitored data processing system's health history.
9. The computer system of claim 7, wherein blacklisted data processing systems are put on probation for a specified amount of time to determine whether errors occur during probation.
10. The computer system of claim 9, wherein the computer system further performs the following: upon determining for a monitored data processing system that was blacklisted that the probationary period is complete and that no further errors have occurred, allowing the monitored data processing system that was blacklisted to continue storing new data and removing the monitored data processing system from the blacklist.
11. The computer system of claim 9, wherein the computer system further performs the following: upon determining for a monitored data processing system that was blacklisted that the probationary period is complete and that one or more further errors have occurred, preventing the monitored data processing system that was blacklisted from storing new data.
12. The computer system of claim 11, wherein the computer system further performs the following: relocates data portions that are hosted on the monitored data processing system that is prevented from storing new data.
13. A computer-implemented method for proactively handling failures in a database system comprising a plurality of data processing systems each of which corresponds to a node of the database system, the computer-implemented method being performed by one or more processors executing computer executable instructions for the computer-implemented method, and the computer-implemented method comprising: monitoring one or more health indicators for a plurality of data processing systems each of which corresponds to a node of a database system; accessing at the first data processing system one or more stored health indicators that provide a health history for the one or more of the monitored data processing systems; predicting a health status based on the monitored factors indicators and the health history, wherein the predicted health status indicates the probability that the one or more monitored data processing systems will be healthy or unhealthy in the future; determining, for at least one of the monitored data processing systems, that a threshold number of failures have occurred that are beyond a specified error level; based on the determination, blacklisting the monitored data processing system for which the determination was made; transferring one or more portions of data stored from the monitored data processing system that is blacklisted to one or more of other data processing systems; and preventing the data processing system that was blacklisted from storing new data.
14. The computer-implemented method of claim 13, wherein the data processing system that is blacklisted is categorized as up and blacklisted, such that the blacklisted data processing system remains used for storing data while the data is transferred to other data processing systems, and no new data is stored on the data processing system that is up and blacklisted.
15. The computer-implemented method of claim 13, wherein the data processing system that is blacklisted is categorized as down and blacklisted, such that the blacklisted data processing system is no longer used for storing data, data is transferred from the blacklisted data processing system to other data processing systems, and no new data is stored on the data processing system this is down and blacklisted.
16. The computer-implemented method of claim 15, wherein the data is transferred without waiting for a probationary period.
17. A database system comprising: a plurality of data processing systems; one or more computer-readable storage hardware media, excluding transmission media, and having stored thereon computer-executable instructions that, when executed by one or more processors, cause the database system to be configured with an architecture that proactively handles failures in the plurality of data processing systems by using health report data, and wherein the architecture is configured to perform the following: monitor one or more health indicators for a plurality of data processing systems of the database system; access one or more stored health indicators that provide a health history for one or more of the monitored plurality of data processing systems; based on both the monitored health indicators and the stored health history, predict a health status, wherein the predicted health status indicates for at least one of the monitored plurality of data processing systems a probability that the at least one monitored data processing system will be healthy or unhealthy in the future; and present the predicted health status to a specified entity.
18. The database system of claim 17, wherein the architecture is fur
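The blacklist and probation flow described in claims 6 through 12 can be read as a small state machine. The sketch below is one hedged interpretation, not the patented implementation: the state names, the numeric error levels, the `severe_level` and `max_severe` parameters, and the choice to keep accepting writes during probation are all assumptions made for illustration. Timing is left to the caller, who invokes `end_probation` once the probationary period has elapsed.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class NodeState(Enum):
    HEALTHY = "healthy"
    PROBATION = "blacklisted, on probation"
    NO_NEW_DATA = "blacklisted, no new data"


@dataclass
class Node:
    """Hypothetical record of one data processing system's failure state."""
    node_id: str
    state: NodeState = NodeState.HEALTHY
    failure_levels: List[int] = field(default_factory=list)  # error level assigned to each failure
    probation_errors: int = 0


def record_failure(node: Node, error_level: int,
                   severe_level: int = 3, max_severe: int = 2) -> None:
    """Assign an error level to a failure and blacklist the node if needed.

    A node is blacklisted once `max_severe` failures exceed `severe_level`,
    regardless of its earlier health history (both thresholds are assumed values).
    """
    node.failure_levels.append(error_level)
    if node.state is NodeState.PROBATION:
        # Any error during probation counts against the node.
        node.probation_errors += 1
        return
    severe = sum(1 for level in node.failure_levels if level > severe_level)
    if node.state is NodeState.HEALTHY and severe >= max_severe:
        node.state = NodeState.PROBATION
        node.probation_errors = 0


def end_probation(node: Node) -> None:
    """Resolve a completed probationary period."""
    if node.state is not NodeState.PROBATION:
        return
    if node.probation_errors == 0:
        # No further errors: remove from the blacklist.
        node.state = NodeState.HEALTHY
        node.failure_levels.clear()
    else:
        # Further errors: stop new writes; hosted data would be relocated here.
        node.state = NodeState.NO_NEW_DATA


def accepts_new_data(node: Node) -> bool:
    """Assumption: only nodes that failed probation are closed to new data."""
    return node.state is not NodeState.NO_NEW_DATA
```

Claims 13 through 16 describe a stricter variant in which a blacklisted node stops receiving new data immediately and, when categorized as down, has its data transferred without waiting for probation. That variant would need an extra state and is not modeled here.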
Error avoidance (G06F11/07 and subgroups take precedence) · CPC title
Monitoring arrangements determined by the means or processing involved in reporting the monitored data (error or fault reporting or logging G06F11/0766) · CPC title
where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems (multiprogramming arrangements G06F9/46; allocation of resources G06F9/50) · CPC title
related to network devices · CPC title
Display of status information · CPC title