Machine learning-based traffic classification using compressed network telemetry data

US10375090B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10375090-B2
Application number  US-201715469716-A
Country             US
Kind code           B2
Filing date         Mar 27, 2017
Priority date       Mar 27, 2017
Publication date    Aug 6, 2019
Grant date          Aug 6, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one embodiment, a device in a network receives telemetry data regarding a traffic flow in the network. One or more features in the telemetry data are individually compressed. The device extracts the one or more individually compressed features from the received telemetry data. The device performs a lookup of one or more classifier inputs from an index of classifier inputs using the one or more individually compressed features from the received telemetry data. The device classifies the traffic flow by inputting the one or more classifier inputs to a machine learning-based classifier.
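The flow described in the abstract (extract the compressed features, look up classifier inputs by compressed value, classify) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the record layout, the use of zlib as the compressor, the contents of `CLASSIFIER_INPUT_INDEX`, and the toy logistic-regression weights are all hypothetical. The point it shows is that the compressed bytes themselves can serve as the lookup key, so the receiving device never has to decompress a feature to obtain its classifier inputs.

```python
import math
import zlib


def compress_feature(raw: bytes) -> bytes:
    """Stand-in for the exporter's per-feature compression (assumption: zlib)."""
    return zlib.compress(raw)


# Hypothetical index: (feature name, compressed bytes) -> precomputed classifier inputs.
CLASSIFIER_INPUT_INDEX = {
    ("ciphersuite", compress_feature(b"\xc0\x2f\xc0\x30")): [1.0, 0.0],
    ("ciphersuite", compress_feature(b"\x00\x05\x00\x04")): [0.0, 1.0],
}
UNKNOWN_INPUTS = [0.0, 0.0]  # fallback when a compressed value is not indexed


def extract_features(record: dict) -> dict:
    """Pull the individually compressed features out of a telemetry record."""
    return dict(record["compressed_features"])


def lookup_inputs(features: dict) -> list:
    """Map each compressed feature to classifier inputs without decompressing it."""
    inputs = []
    for name in sorted(features):
        key = (name, features[name])
        inputs.extend(CLASSIFIER_INPUT_INDEX.get(key, UNKNOWN_INPUTS))
    return inputs


def classify(inputs, weights=(-2.0, 3.0), bias=-0.5):
    """Toy regression-based classifier with made-up weights."""
    score = bias + sum(w * x for w, x in zip(weights, inputs))
    probability = 1.0 / (1.0 + math.exp(-score))
    label = "suspicious" if probability >= 0.5 else "benign"
    return label, probability


def classify_flow(record: dict):
    return classify(lookup_inputs(extract_features(record)))


# Hypothetical telemetry record carrying one compressed feature.
record = {
    "flow_id": 1,
    "compressed_features": {"ciphersuite": compress_feature(b"\x00\x05\x00\x04")},
}
label, probability = classify_flow(record)
print(label, round(probability, 3))
```

A real deployment would populate the index from training data and use a trained model; the dictionary lookup is only meant to make the "index of classifier inputs" step concrete.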

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving, at a device in a network, telemetry data regarding a traffic flow in the network, wherein each of a plurality of features in the telemetry data are individually compressed so that a separate data compression context is maintained for each of the plurality of features in the telemetry data; extracting, by the device, the plurality of individually compressed features from the received telemetry data; performing, by the device, a lookup of one or more classifier inputs from an index of classifier inputs using at least one of the plurality of individually compressed features from the received telemetry data; and classifying, by the device, the traffic flow by inputting the one or more classifier inputs to a machine learning-based classifier.

2. The method as in claim 1, wherein classifying the traffic flow comprises: determining, by the device, an application associated with the traffic flow.

3. The method as in claim 1, wherein classifying the traffic flow comprises: determining, by the device, whether the traffic flow is associated with malware.

4. The method as in claim 1, wherein the plurality of individually compressed features in the telemetry data comprises at least one of: sequence of packet lengths and time (SPLT) data regarding the traffic flow, sequence of application lengths and time (SALT) data regarding the traffic flow, byte distribution (BD) data regarding the traffic flow, a ciphersuite, or a Transport Layer Security (TLS) extension.

5. The method as in claim 1, wherein the received telemetry data comprises a NetFlow or Internet Protocol Flow Information Export (IPFIX) record.

6. The method as in claim 1, wherein a particular one of the individually compressed one or more features in the telemetry data references a previously observed feature in the network.

7. The method as in claim 1, wherein a particular one of the individually compressed one or more features in the telemetry data is compressed using Lempel-Ziv compression.

8. The method as in claim 1, wherein the machine learning-based classifier comprises a random forest classifier or a regression-based classifier.

9. An apparatus, comprising: one or more network interfaces to communicate with a network; a processor coupled to the one or more network interfaces and configured to execute a process; and a memory configured to store the process executable by the processor, the process when executed configured to: receive telemetry data regarding a traffic flow in the network, wherein each of a plurality of features in the telemetry data are individually compressed so that a separate data compression context is maintained for each of the plurality of features in the telemetry data; extract the plurality of individually compressed features from the received telemetry data; perform a lookup of one or more classifier inputs from an index of classifier inputs using at least one of the plurality of individually compressed features from the received telemetry data; and classify the traffic flow by inputting the one or more classifier inputs to a machine learning-based classifier.

10. The apparatus as in claim 9, wherein the apparatus classifies the traffic flow by: determining an application associated with the traffic flow.

11. The apparatus as in claim 9, wherein the apparatus classifies the traffic flow by: determining whether the traffic flow is associated with malware.

12. The apparatus as in claim 9, wherein the plurality of individually compressed features in the telemetry data comprises at least one of: sequence of packet lengths and time (SPLT) data regarding the traffic flow, sequence of application lengths and time (SALT) data regarding the traffic flow, byte distribution (BD) data regarding the traffic flow, a ciphersuite, or a Transport Layer Security (TLS) extension.

13. The apparatus as in claim 9, wherein the received telemetry data comprises a NetFlow or Internet Protocol Flow Information Export (IPFIX) record.

14. The apparatus as in claim 9, wherein a particular one of the individually compressed one or more features in the telemetry data references a previously observed feature in the network.

15. The apparatus as in claim 9, wherein a particular one of the individually compressed one or more features in the telemetry data is compressed using Lempel-Ziv compression.

16. The apparatus as in claim 9, wherein the machine learning-based classifier comprises a random forest classifier or a regression-based classifier.

17. A tangible, non-transitory, computer-readable medium storing program instructions that cause a device in a network to execute a process comprising: receiving, at the device, telemetry data regarding a traffic flow in the network, wherein each of a plurality of features in the telemetry data are individually compressed so that a separate data compression context is maintained for each of the plurality of features in the telemetry data; extracting, by the device, the plurality of individually compressed features from the received telemetry data; performing, by the device, a lookup of one or more classifier inputs from an index of classifier inputs using at least one of the plurality of individually compressed features from the received telemetry data; and classifying, by the device, the traffic flow by inputting the one or more classifier inputs to a machine learning-based classifier.

18. The computer-readable medium as in claim 17, wherein classifying the traffic flow comprises: determining, by the device, an application associated with the traffic flow or whether the traffic flow is associated with malware.

19. The computer-readable medium as in claim 17, wherein the plurality individually compressed features in the telemetry data comprises at least one of: sequence of packet lengths and time (SPLT) data regarding the traffic flow, sequence of application lengths and time (SALT) data regarding the traffic flow, byte distribution (BD) data regarding the traffic flow, a ciphersuite, or a Transport Layer Security (TLS) extension.

20. The computer-readable medium as in claim 17, wherein the received telemetry data comprises a NetFlow or Internet Protocol Flow Information Export (IPFIX) record.
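
The independent claims require that "a separate data compression context is maintained for each of the plurality of features", and the dependent claims mention Lempel-Ziv compression and references to previously observed features. The sketch below is one possible reading of that idea, not the patented implementation: it assumes Python's zlib (DEFLATE, which is LZ77-based) as the compressor, and the class names, feature names, and sample bytes are hypothetical. Each feature gets its own streaming context, so a repeated ciphersuite list is encoded as a short back-reference into that feature's own history rather than being mixed with unrelated data.

```python
import zlib


class PerFeatureCompressor:
    """Sender side: one streaming DEFLATE context per feature name (hypothetical)."""

    def __init__(self):
        self._contexts = {}

    def compress(self, feature: str, value: bytes) -> bytes:
        if feature not in self._contexts:
            self._contexts[feature] = zlib.compressobj()
        ctx = self._contexts[feature]
        # Z_SYNC_FLUSH emits a decodable chunk now while keeping the history
        # window, so later values can reference earlier ones of the same feature.
        return ctx.compress(value) + ctx.flush(zlib.Z_SYNC_FLUSH)


class PerFeatureDecompressor:
    """Receiver side: mirrors the sender's per-feature contexts."""

    def __init__(self):
        self._contexts = {}

    def decompress(self, feature: str, chunk: bytes) -> bytes:
        if feature not in self._contexts:
            self._contexts[feature] = zlib.decompressobj()
        return self._contexts[feature].decompress(chunk)


sender = PerFeatureCompressor()
receiver = PerFeatureDecompressor()

ciphersuites = bytes(range(64))            # made-up 64-byte ciphersuite list
extension = b"\x00\x0b\x00\x0a\x00\x23"    # made-up TLS extension bytes

first = sender.compress("ciphersuite", ciphersuites)
other = sender.compress("tls_extension", extension)
second = sender.compress("ciphersuite", ciphersuites)  # same value seen again

# Chunks must be decoded in the order they were produced for each feature.
assert receiver.decompress("ciphersuite", first) == ciphersuites
assert receiver.decompress("tls_extension", extension and other) == extension
assert receiver.decompress("ciphersuite", second) == ciphersuites

print(len(first), len(second))  # the repeated value compresses to fewer bytes
```

Because the contexts are independent, traffic on one feature never evicts or pollutes the history used for another, which is the practical benefit of compressing features individually.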

Assignees

Cisco Tech Inc

Classifications

  • G06N20/00 (Primary)

    Machine learning · CPC title

  • the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms · CPC title

  • relying on flow classification, e.g. using integrated services [IntServ] · CPC title

  • Event detection, e.g. attack signature detection · CPC title

  • by monitoring network traffic (monitoring network traffic per se H04L43/00) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10375090B2 cover?
In one embodiment, a device in a network receives telemetry data regarding a traffic flow in the network. One or more features in the telemetry data are individually compressed. The device extracts the one or more individually compressed features from the received telemetry data. The device performs a lookup of one or more classifier inputs from an index of classifier inputs using the one or mo…
Who is the assignee on this patent?
Cisco Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 6, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).