Operation management apparatus, operation management method and program
US-2015046123-A1 · Feb 12, 2015 · US
US9575828B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9575828-B2 |
| Application number | US-201514794676-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 8, 2015 |
| Priority date | Jul 8, 2015 |
| Publication date | Feb 21, 2017 |
| Grant date | Feb 21, 2017 |
Official abstract text for this publication.
A method for assisting evaluation of anomalies in a distributed storage system is disclosed. The method includes a step of monitoring at least one system metric of the distributed storage system. The method further includes steps of maintaining a listing of patterns of the monitored system metric comprising patterns which previously did not result in a failure within one or more nodes of the distributed storage system, and, based on the monitoring, identifying a pattern (i.e., a time series motif) of the monitored system metric as a potential anomaly in the distributed storage system. The method also includes steps of automatically (i.e. without user input) performing a similarity search to determine whether the identified pattern satisfies one or more predefined similarity criteria with at least one pattern of the listing, and, upon positive determination, excepting the identified pattern from being identified as the potential anomaly.
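The exception step in the abstract can be illustrated with a minimal sketch. The publication text shown here does not specify a similarity measure or threshold, so the z-normalized Euclidean distance, the `max_distance` value, and all function names below are assumptions chosen for illustration only, not the patented implementation.

```python
import math
from typing import List, Sequence


def z_normalize(values: Sequence[float]) -> List[float]:
    """Rescale a pattern to zero mean and unit variance (assumed preprocessing)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0.0:
        # A flat pattern has no shape; map it to all zeros.
        return [0.0] * len(values)
    return [(v - mean) / std for v in values]


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length, z-normalized patterns."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length")
    za, zb = z_normalize(a), z_normalize(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(za, zb)))


def is_excepted(
    candidate: Sequence[float],
    benign_listing: Sequence[Sequence[float]],
    max_distance: float = 1.0,  # hypothetical similarity criterion
) -> bool:
    """Return True if the candidate resembles any pattern in the listing of
    patterns that previously did not result in a failure."""
    return any(
        len(p) == len(candidate) and distance(candidate, p) <= max_distance
        for p in benign_listing
    )


# Listing of patterns that previously did not result in a failure.
benign = [[10, 12, 30, 12, 10]]

# Same shape at twice the amplitude: excepted from the potential anomalies.
same_shape = is_excepted([20, 24, 60, 24, 20], benign)

# Spike at a different position: remains a potential anomaly.
different_shape = is_excepted([10, 30, 12, 10, 12], benign)
```

Z-normalization makes the comparison depend on the shape of the pattern rather than its absolute level, which is one common choice for time series motif matching. Other measures, such as dynamic time warping, would fit the same structure.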
Opening claim text (preview).
What is claimed is:
1. A method for assisting evaluation of anomalies in a distributed storage system, the method comprising: monitoring at least one system metric of the distributed storage system; maintaining a listing of patterns of the at least one system metric which previously did not result in a failure within one or more nodes of the distributed storage system; identifying a pattern of the at least one system metric as a potential anomaly in the distributed storage system; automatically performing a similarity search to determine whether the identified pattern satisfies one or more similarity criteria with at least one pattern of the listing; and upon positive determination, excepting the identified pattern from being identified as the potential anomaly.
2. The method according to claim 1, wherein the pattern of the at least one system metric is identified as a potential anomaly based on comparison of values of the at least one system metric within a duration of the pattern with values of the at least one system metric within an earlier time interval of the duration of the pattern.
3. The method according to claim 1, wherein: monitoring the at least one system metric of the distributed storage system comprises monitoring the at least one system metric for a first node of the distributed storage system, the method further comprises monitoring of the at least one system metric for a second node of the distributed storage system, and the pattern of the at least one system metric for the first node is identified as a potential anomaly based on comparison of values of the at least one system metric for the first node with values of the at least one system metric for the second node.
4. The method according to claim 1, wherein the pattern of the at least one system metric is identified as a potential anomaly based on comparison of values of the at least one system metric with values of at least one other system metric.
5. The method according to claim 1, further comprising: upon negative determination, determining whether the identified pattern is associated with the failure within one or more nodes of the distributed storage system.
6. The method according to claim 5, wherein the determination of whether the identified pattern is associated with the failure is based on one or more logs generated by one or more services associated with the at least one system metric.
7. The method according to claim 1, wherein the patterns of the listing were previously identified as potential anomalies.
8. A system for assisting evaluation of anomalies in a distributed storage system, the system comprising: at least one memory configured to store computer executable instructions, and at least one processor coupled to the at least one memory and configured, when executing the instructions, to: monitor at least one system metric of the distributed storage system; maintain a listing of patterns of the at least one system metric which previously did not result in a failure within one or more nodes of the distributed storage system; identify a pattern of the at least one system metric as a potential anomaly in the distributed storage system; automatically perform a similarity search to determine whether the identified pattern satisfies one or more similarity criteria with at least one pattern of the listing; and upon positive determination, except the identified pattern from being identified as the potential anomaly.
9. The system according to claim 8, wherein the pattern of the at least one system metric is identified as a potential anomaly based on comparison of values of the at least one system metric within a duration of the pattern with values of the at least one system metric within an earlier time interval of the duration of the pattern.
10. The system according to claim 8, wherein: monitoring the at least one system metric of the distributed storage system comprises monitoring the at least one system metric for a first node of the distributed storage system, the method further comprises monitoring of the at least one system metric for a second node of the distributed storage system, and the pattern of the at least one system metric for the first node is identified as a potential anomaly based on comparison of values of the at least one system metric for the first node with values of the at least one system metric for the second node.
11. The system according to claim 8, wherein the pattern of the at least one system metric is identified as a potential anomaly based on comparison of values of the at least one system metric with values of at least one other system metric.
12. The system according to claim 8, wherein the at least one processor is further configured to, upon negative determination, determine whether the identified pattern is associated with the failure within one or more nodes of the distributed storage system.
13. The system according to claim 12, wherein the determination of whether the identified pattern is associated with the failure is based on one or more logs generated by one or more services associated with the at least one system metric.
14. The system according to claim 8, wherein the patterns of the listing were previously identified as potential anomalies.
15. One or more computer readable storage media encoded with software comprising computer executable instructions and when the software is executed operable to perform a method for assisting evaluation of anomalies in a distributed storage system, the method comprising: monitoring at least one system metric of the distributed storage system; maintaining a listing of patterns of the at least one system metric which previously did not result in a failure within one or more nodes of the distributed storage system; identifying a pattern of the at least one system metric as a potential anomaly in the distributed storage system; automatically performing a similarity search to determine whether the identified pattern satisfies one or more similarity criteria with at least one pattern of the listing; and upon positive determination, excepting the identified pattern from being identified as the potential anomaly.
16. The one or more computer readable media according to claim 15, wherein the pattern of the at least one system metric is identified as a potential anomaly based on comparison of values of the at least one system metric within a duration of the pattern with values of the at least one system metric within an earlier time interval of the duration of the pattern.
17. The one or more computer readable media according to claim 15, wherein: monitoring the at least one system metric of the distributed storage system comprises monitoring the at least one system metric for a first node of the distributed storage system, the method further comprises monitoring of the at least one system metric for a second node of the distributed storage system, and the pattern of the at least one system metric for the first node is identified as a potential anomaly based on comparison of values of the at least one system metric for the first node with values of the at least one system metric for the second node.
18. The one or more computer readable media according to claim 15, wherein the pattern of the at least one system metric is identified as a potential anomaly based on comparison of values of the at least one system metric with values of at least one other system metric.
19. The one or more computer readable media according to claim 15, where
for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS] · CPC title
in a storage system, e.g. in a DASD or network based storage system (drivers for digital recording or reproducing units G06F3/06; circuits for error detection or correction within digital recording or reproducing units G11B20/18; for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS], H04L67/1097) · CPC title
Error or fault detection not based on redundancy (power supply failures G06F1/30; network fault management H04L41/06) · CPC title
Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36) · CPC title
Reaction to server failures by a load balancer · CPC title