US9942631B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9942631-B2 |
| Application number | US-201514866567-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 25, 2015 |
| Priority date | Sep 25, 2015 |
| Publication date | Apr 10, 2018 |
| Grant date | Apr 10, 2018 |
Official abstract text for this publication.
Devices and techniques for out-of-band platform tuning and configuration are described herein. A device can include a telemetry interface to a telemetry collection system and a network interface to network adapter hardware. The device can receive platform telemetry metrics from the telemetry collection system, and network adapter silicon hardware statistics over the network interface, to gather collected statistics. The device can apply a heuristic algorithm using the collected statistics to determine processing core workloads generated by operation of a plurality of software systems communicatively coupled to the device. The device can provide a reconfiguration message to instruct at least one software system to switch operations to a different processing core, responsive to detecting an overload state on at least one processing core, based on the processing core workloads. Other embodiments are also described.
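The control loop in the abstract (gather statistics, estimate per-core workload, detect overload, issue a reconfiguration message) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not specify the heuristic, so `core_load`, the 0.9 overload threshold, the `pps_capacity` normalisation, and all class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreSample:
    """One collected statistic per core (field names are hypothetical)."""
    core_id: int
    utilization: float  # 0..1, from the out-of-band platform telemetry
    nic_pps: float      # packets/s handled by this core, from NIC hardware stats


@dataclass(frozen=True)
class Reconfiguration:
    """Message asking one software system to move to another core."""
    system: str
    from_core: int
    to_core: int


def core_load(sample, pps_capacity=1_000_000.0):
    # Assumed heuristic: a core is as loaded as its busiest dimension,
    # either CPU utilization or its share of an assumed packet-rate capacity.
    return max(sample.utilization, sample.nic_pps / pps_capacity)


def plan_reconfigurations(samples, placement, overload_threshold=0.9):
    """Return one move per overloaded core, while spare cores remain.

    placement maps a software system name (for example a VM) to its core id.
    """
    loads = {s.core_id: core_load(s) for s in samples}
    overloaded = sorted(c for c, load in loads.items() if load > overload_threshold)
    # Candidate targets: cores at or below the threshold, least loaded first.
    targets = sorted(
        (c for c, load in loads.items() if load <= overload_threshold),
        key=lambda c: (loads[c], c),
    )
    messages = []
    for core in overloaded:
        systems = sorted(name for name, c in placement.items() if c == core)
        if not systems or not targets:
            continue  # nothing to move, or nowhere to move it
        messages.append(Reconfiguration(systems[0], core, targets.pop(0)))
    return messages


if __name__ == "__main__":
    samples = [
        CoreSample(0, 0.95, 100_000),
        CoreSample(1, 0.20, 50_000),
        CoreSample(2, 0.50, 950_000),
    ]
    placement = {"vm-a": 0, "vm-b": 0, "vm-c": 2, "vm-d": 1}
    for message in plan_reconfigurations(samples, placement):
        print(message)
```

Each target core is used at most once per pass because the sketch has no per-system load figures; a real controller would presumably re-estimate loads after every move.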
Opening claim text (preview).
What is claimed is:
1. An orchestration controller device for a computer system having a multi-core computing platform architecture and a plurality of network adapters, the multi-core computing platform architecture and network adapters to provide resources for facilitating in-band data flow for at least one software system, the device comprising: at least one telemetry interface to a telemetry collection system, the telemetry collection system to collect platform telemetry metrics via out-of-band (OOB) access to processing cores of the multi-core computing platform, the platform telemetry metrics including processing core workloads; at least one network hardware interface to network adapter hardware of the network adaptors, the at least one network hardware interface to collect network adapter silicon hardware statistics via OOB access to the network adapters; wherein the OOB access to processing cores and to the network adapters avoids utilizing the resources for facilitating the in-band data flow; and processing circuitry configured to: receive the platform telemetry metrics from the telemetry collection system, and receive the network adapter silicon hardware statistics over the at least one network hardware interface, to gather collected statistics, apply a heuristic algorithm using the collected statistics to determine processing core workloads generated by operation of a plurality of software systems communicatively coupled to the device, and provide a reconfiguration message to instruct the at least one software system to switch operations to a different processing core, responsive to detecting an overload state on at least one of the processing cores, based on the processing core workloads.
2. The device of claim 1, wherein the plurality of software systems includes at least one virtual machine (VM).
3. The device of claim 1, wherein the processing circuitry is configured to provide the reconfiguration message within a request to a hypervisor.
4. The device of claim 1, wherein the platform telemetry metrics include metrics of at least two metric types selected from a group including processing core data, chipset data, memory element performance data, data received from an encryption unit, data received from a compression unit, storage data, virtual switch (vSwitch) data, and data received over a network interface card (NIC) connection of a NIC, wherein data received over the NIC connection includes NIC telemetry, wherein NIC telemetry includes at least one of an indication of packets per second received at the NIC and average packet size received at the NIC.
5. The device of claim 1, further comprising: at least one platform interface to a platform metrics collection system, and wherein the processing circuitry is further configured to gather platform quality of service (PQoS) metrics over the at least one platform interface, and to use the PQoS metrics as inputs to the heuristic algorithm.
6. The device of claim 1, wherein the processing circuitry is further configured to: instruct a set of at least two processing cores, in sequence, to enter an offline state; provide instructions for performing tests on each of the set of at least two processing cores after a respective one of the set of at least two processing cores has entered the offline state; and rank the set of at least two processing cores based on performance during the tests, subsequent to performing tests, to generate a ranked set of processing cores.
7. The device of claim 6, wherein the tests include evaluations of at least one of: core-to-cache bandwidth, core-to-memory bandwidth, and core-to-I/O bandwidth.
8. The device of claim 7, wherein the processing circuitry is further configured to: provide instructions for steering incoming NIC traffic to a processing core of the ranked set of processing cores, based on priority level of the incoming NIC traffic.
9. The device of claim 1, wherein the processing circuitry is further arranged to: determine, based on the heuristic algorithm, whether service level agreement (SLA) criteria have been met; and report SLA violations to datacenter management software if SLA criteria have not been met.
10. The device of claim 1, wherein the processing circuitry is further arranged to: receive a configuration state from a management and policy server, the configuration state including at least one processing core identifier and at least one of a workload, a policy, a cache sensitivity, and a bandwidth sensitivity for the respective at least one processing core identifier; provide performance feedback, to the management and policy server, for at least one processing core identified by the at least one processing core identifier; and receive recommendations from the management and policy server for providing the reconfiguration message, based on the performance feedback.
11. The device of claim 10, wherein the processing circuitry is further arranged to: upon receiving performance monitoring event codes corresponding to a parameter of interest, detect application performance to generate a performance curve relating application performance to the parameter of interest; generate a sensitivity curve, from the performance curve, to determine sensitivity of application performance to the parameter of interest; and provide the sensitivity curve as an input to an algorithm for generating reconfiguration decisions.
12. The device of claim 11, wherein the parameter of interest includes one of cache occupancy and memory bandwidth, and wherein cache occupancy is independent of memory bandwidth.
13. A method for platform processing core configuration of a computer system having a multi-core computing platform architecture, and a plurality of network adapters, the multi-core computing platform architecture and network adapters facilitating in-band data flow for a plurality of virtual machines (VMs) and a hypervisor hosted on the computer system, the method comprising: receiving platform telemetry metrics from a telemetry collection system, the telemetry collection system collecting platform telemetry metrics via out-of-band (OOB) access to processing cores of the multi-core computing platform, the platform telemetry metrics including processing core workloads, wherein the OOB access avoids utilizing the resources for facilitating the in-band data flow; collecting network adapter silicon hardware statistics over at least one network hardware interface, via OOB access to the network adapters, to gather collected statistics; applying a heuristic algorithm using the collected statistics to determine processing core workloads generated by operation of the plurality of VMs; and providing a reconfiguration message to the hypervisor to instruct at least one of the VMs associated with the hypervisor to switch operations to a different processing core, responsive to detecting an overload state on at least one of the processing cores, based on the processing core workloads.
14. The method of claim 13, wherein the platform telemetry metrics include metrics of at least two metric types selected from a group including processing core data, chipset data, memory element performance data, data received from an encryption unit, data received from a compression unit, storage data, virtual switch (vSwitch) data, and data received over a network interface card (NIC) connection.
15. The method of claim 13, further comprising: instructing a set of at least two processing cores to enter, in sequence, an offline state; providing instructions for performing tests on each of the set of at least two processing
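The core ranking and traffic steering recited in claims 6 to 8 can be sketched as below, assuming the offline tests have already produced one bandwidth measurement per core. The claims do not say how test results are combined or how priority levels map to cores, so the weighted sum in `rank_cores`, the "priority 0 is highest" convention in `steer`, and every name here are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreTestResult:
    """Bandwidth measured while the core was offline (units assumed GB/s)."""
    core_id: int
    cache_bw: float   # core-to-cache bandwidth
    memory_bw: float  # core-to-memory bandwidth
    io_bw: float      # core-to-I/O bandwidth


def rank_cores(results, weights=(1.0, 1.0, 1.0)):
    """Return core ids ordered best first by an assumed weighted-sum score."""
    def score(r):
        w_cache, w_mem, w_io = weights
        return w_cache * r.cache_bw + w_mem * r.memory_bw + w_io * r.io_bw

    # Ties are broken by core id so the ranking is deterministic.
    ordered = sorted(results, key=lambda r: (-score(r), r.core_id))
    return [r.core_id for r in ordered]


def steer(ranked_cores, priority):
    """Pick a core for incoming NIC traffic of the given priority level.

    Assumption: priority 0 is the highest and goes to the best-ranked core;
    priorities beyond the number of cores share the last-ranked core.
    """
    if not ranked_cores:
        raise ValueError("no ranked cores available")
    if priority < 0:
        raise ValueError("priority must be non-negative")
    return ranked_cores[min(priority, len(ranked_cores) - 1)]


if __name__ == "__main__":
    results = [
        CoreTestResult(0, 100.0, 20.0, 5.0),
        CoreTestResult(1, 120.0, 25.0, 6.0),
        CoreTestResult(2, 90.0, 18.0, 4.0),
    ]
    ranked = rank_cores(results)
    print(ranked, steer(ranked, 0), steer(ranked, 5))
```

Changing `weights` lets a deployment favour, say, I/O bandwidth for packet-processing workloads; that choice is outside what the claims state.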
Ensuring fulfilment of SLA · CPC title
Active monitoring, e.g. heartbeat, ping or trace-route · CPC title
Determining service level performance parameters or violations of service level contracts, e.g. violations of agreed response time or mean time between failures [MTBF] · CPC title
Automatically-operated arrangements · CPC title
Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters · CPC title