Triggering the increased collection and distribution of monitoring information in a distributed processing system

US10318401B2 · US · B2

Patent metadata
Publication number: US-10318401-B2
Application number: US-201815957809-A
Country: US
Kind code: B2
Filing date: Apr 19, 2018
Priority date: Apr 20, 2017
Publication date: Jun 11, 2019
Grant date: Jun 11, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A facility comprising systems and methods for automatically triggering the collection of comprehensive monitoring information in a distributed processing system. The facility compares the overall performance of the distributed processing system to one or more performance metrics and, in response to determining that one or more performance metrics are not satisfied, triggers one or more of the nodes within the distributed processing system to increase one or more of its monitoring rate or its distribution rate. The facility collects and analyzes the collected information to provide resources that can be used to assess and diagnose failures within the distributed processing system. In this manner, the facility reacts to performance anomalies by triggering nodes within the system to provide comprehensive performance information over a trigger period for diagnostic purposes.
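The trigger behavior summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `MetricRule`, `NodeMonitor`, `observe`, the rate values, and the use of seconds are all assumptions made for illustration. The choice of trigger-period duration (the longest time period among the metrics that exceeded their trigger levels) follows the wording of the first claim.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class MetricRule:
    """Hypothetical per-metric rule: a trigger level and an associated time period."""
    trigger_level: float
    period_s: float  # time period associated with this metric (assumed unit: seconds)


@dataclass
class NodeMonitor:
    """Tracks one node's monitor rate and any active trigger period (illustrative only)."""
    original_rate_hz: float
    boosted_rate_hz: float
    rate_hz: float = field(init=False)
    trigger_expires_at: Optional[float] = None

    def __post_init__(self) -> None:
        self.rate_hz = self.original_rate_hz

    def observe(self, samples: Dict[str, float],
                rules: Dict[str, MetricRule], now: float) -> None:
        """Compare samples to trigger levels, then boost or restore the monitor rate."""
        exceeded = [rules[name] for name, value in samples.items()
                    if name in rules and value > rules[name].trigger_level]
        if exceeded:
            # Duration is the longest period among the exceeded metrics.
            # Assumption: a new exceedance restarts the trigger period.
            duration = max(rule.period_s for rule in exceeded)
            self.rate_hz = self.boosted_rate_hz
            self.trigger_expires_at = now + duration
        elif self.trigger_expires_at is not None and now >= self.trigger_expires_at:
            # Trigger period expired: restore the original monitor rate.
            self.rate_hz = self.original_rate_hz
            self.trigger_expires_at = None
```

Passing the clock value in as `now` keeps the sketch deterministic. A real system would presumably read a monotonic clock and apply the same pattern to the distribution rate.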

First claim

Opening claim text (preview).

What is claimed as new and desired to be protected by Letters Patent of the United States is:

1. A method for managing data in a file system over a network using one or more processors that execute instructions to perform actions, comprising: instantiating a monitoring engine to perform actions including: monitoring one or more metrics to collect data that is associated with one or more nodes that are part of the file system, wherein each node is a computer that separately provides computing resources over the network that are characterized by the one or more metrics; determining the one or more nodes that are associated with the one or more metrics that exceed one or more trigger levels based on the monitoring; modifying an original monitor rate associated with the one or more determined nodes, wherein the modified monitor rate is associated with a trigger time period, wherein a duration of the trigger time period is selected based on a longest time period that is associated with the one or more metrics that exceed the one or more trigger levels; in response to an expiration of the trigger time period, restoring the modified monitor rate to the original monitor rate; and employing a file system engine to provide one or more reports that include the data associated with the one or more metrics, wherein the one or more reports improve identifying the one or more computers having computing resources characterized by the one or metrics that exceed the one or more trigger levels during the trigger time period.

2. The method of claim 1, wherein the monitoring engine performs actions, further comprising: distributing the data associated with the one or more metrics and the one or more nodes to the file system engine; modifying an original distribution rate associated with the one or more determined nodes to another distribution rate, wherein the other distribution rate is associated with another trigger time period; and in response to an expiration of the other trigger time period, restoring the original distribution rate.

3. The method of claim 1, wherein the data for the one or more nodes includes one or more of a lock graph, a task stack, or a backtrace.

4. The method of claim 1, wherein the monitoring engine performs actions, further comprising: identifying one or more tasks that are associated with a locked resource; identifying the one or more tasks that are waiting for the locked resource; associating the one or more tasks with one or more time values that correspond to one or more attempts to access the locked resource; and generating a lock graph based on the one or more tasks, wherein the lock graph includes a directed graph based on the association with the one or more time values.

5. The method of claim 1, wherein the monitoring engine performs actions, further comprising, truncating the data associated with the one or more nodes to include data that corresponds to an overlapping time period and to omit data that corresponds to one or more non-overlapping time periods.

6. The method of claim 1, wherein the monitoring of one or more metrics to collect data further comprises, assigning a separate original monitor rate or a separate modified monitor rate to one or more of the metrics based on the one or more metrics and the one or more nodes.

7. The method of claim 1, wherein the one or more metrics include one or more of data throughput, latency, processor utilization, disk utilization, a count of dropped network packets, a count of disk inputs over a period of time, or a count of disk outputs over a period of time.

8. A system for managing data in a file system comprising: a network computer, comprising: a transceiver that communicates over the network; a memory that stores at least instructions; and one or more processors that execute instructions that perform actions, including: instantiating a monitoring engine to perform actions including: monitoring one or more metrics to collect data that is associated with one or more nodes that are part of the file system, wherein each node is a computer that separately provides computing resources over the network that are characterized by the one or more metrics; determining the one or more nodes that are associated with the one or more metrics that exceed one or more trigger levels based on the monitoring; modifying an original monitor rate associated with the one or more determined nodes, wherein the modified monitor rate is associated with a trigger time period, wherein a duration of the trigger time period is selected based on a longest time period that is associated with the one or more metrics that exceed the one or more trigger levels; in response to an expiration of the trigger time period, restoring the modified monitor rate to the original monitor rate; and employing a file system engine to provide one or more reports that include the data associated with the one or more metrics, wherein the one or more reports improve identifying the one or more computers having computing resources characterized by the one or metrics that exceed the one or more trigger levels during the trigger time period; and a client computer, comprising: a transceiver that communicates over the network; a memory that stores at least instructions; and one or more processors that execute instructions that perform actions, including: receiving, the one or more reports.

9. The system of claim 8, wherein the monitoring engine performs actions, further comprising: distributing the data associated with the one or more metrics and the one or more nodes to the file system engine; modifying an original distribution rate associated with the one or more determined nodes to another distribution rate, wherein the other distribution rate is associated with another trigger time period; and in response to an expiration of the other trigger time period, restoring the original distribution rate.

10. The system of claim 8, wherein the data for the one or more nodes includes one or more of a lock graph, a task stack, or a backtrace.

11. The system of claim 8, wherein the monitoring engine performs actions, further comprising: identifying one or more tasks that are associated with a locked resource; identifying the one or more tasks that are waiting for the locked resource; associating the one or more tasks with one or more time values that correspond to one or more attempts to access the locked resource; and generating a lock graph based on the one or more tasks, wherein the lock graph includes a directed graph based on the association with the one or more time values.

12. The system of claim 8, wherein the monitoring engine performs actions, further comprising, truncating the data associated with the one or more nodes to include data that corresponds to an overlapping time period and to omit data that corresponds to one or more non-overlapping time periods.

13. The system of claim 8, wherein the monitoring of one or more metrics to collect data further comprises, assigning a separate original monitor rate or a separate modified monitor rate to one or more of the metrics based on the one or more metrics and the one or more nodes.

14. The system of claim 8, wherein the one or more metrics include one or more of data throughput, latency, processor utilization, disk utilization, a count of dropped network packets, a count of disk inputs over a period of time, or a count of disk outputs over a period of time.

15. A processor readable non-transitory storage media that includes instructions for managing data in a file system over a network, wherein execu
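The lock graph described in claims 4 and 11 can be pictured with a small sketch. The claims do not specify a data layout, so everything here is an assumption made for illustration: the `LockAttempt` record, the `build_lock_graph` function, and the choice to draw edges from each waiting task to each task holding the resource, annotated with the time of the waiter's attempt.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class LockAttempt:
    """Hypothetical record of one task's attempt to access a locked resource."""
    task: str
    resource: str
    time: float     # time value of the attempt
    acquired: bool  # True if the task holds the lock, False if it is waiting


# Edge format (assumed): waiter -> [(holder, resource, attempt_time), ...]
LockGraph = Dict[str, List[Tuple[str, str, float]]]


def build_lock_graph(attempts: List[LockAttempt]) -> LockGraph:
    """Build a directed graph from waiting tasks to the tasks holding each resource."""
    # Step 1: identify the tasks associated with (holding) each locked resource.
    holders: Dict[str, List[str]] = {}
    for attempt in attempts:
        if attempt.acquired:
            holders.setdefault(attempt.resource, []).append(attempt.task)

    # Step 2: for each waiting task, in order of attempt time, add an edge
    # to every holder of the resource, labeled with the attempt's time value.
    graph: LockGraph = {}
    waiting = sorted((a for a in attempts if not a.acquired), key=lambda a: a.time)
    for attempt in waiting:
        for holder in holders.get(attempt.resource, []):
            graph.setdefault(attempt.task, []).append(
                (holder, attempt.resource, attempt.time))
    return graph
```

Under these assumptions, a cycle in the resulting graph (task A waits on B while B waits on A) would suggest a deadlock, and the time values indicate how long each task has been blocked.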

Assignees

Qumulo Inc

Inventors

Classifications

  • for systems

  • Monitoring arrangements determined by the means or processing involved in reporting the monitored data (error or fault reporting or logging G06F11/0766)

  • Metering

  • to a system of files or objects, e.g. local or distributed file system or database

  • by assessing time

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10318401B2 cover?
A facility comprising systems and methods for automatically triggering the collection of comprehensive monitoring information in a distributed processing system. The facility compares the overall performance of the distributed processing system to one or more performance metrics and, in response to determining that one or more performance metrics are not satisfied, triggers one or more of the nodes w…
Who is the assignee on this patent?
Qumulo Inc
What technology area does this patent fall under?
Primary CPC classification G06F11/3495. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 11, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).