Service access data enrichment for cybersecurity

US11647034B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11647034-B2
Application number  US-202017019219-A
Country             US
Kind code           B2
Filing date         Sep 12, 2020
Priority date       Sep 12, 2020
Publication date    May 9, 2023
Grant date          May 9, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Enriched access data supports anomaly detection to enhance network cybersecurity. Network access data is enriched using service nodes representing resource provision and other services, with geolocation nodes representing grouped access origins, and access values representing access legitimacy confidence. Data enrichment provides a trained model by mapping IP addresses to geolocations, building a bipartite access graph whose inter-node links indicate aspects of accesses from geolocations to services, and generating semantic vectors from the graph. Vector generation may include collaborative filtering, autoencoding, neural net embedding, and other machine learning tools and techniques. Anomaly detection systems then calculate service-geolocation or geolocation-geolocation vector distances with anomaly candidate vectors and the model's graph-based vectors, and treat distances past a threshold as anomaly indicators. Some embodiments curtail false positives relative to simply checking network access logs or packets for activity coming from unexpected places. Some avoid or reduce model retraining.
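
The enrichment steps summarized in the abstract (map IP addresses to geolocations, build a bipartite access graph, derive vectors from it) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the CIDR-to-geolocation table, the service names, the use of access counts as the access value, and the helper functions. The abstract mentions collaborative filtering, autoencoding, and neural net embedding for vector generation; this sketch substitutes plain normalized count vectors as a simple stand-in, so it should not be read as the patented method.

```python
import ipaddress
import math
from collections import defaultdict

# Hypothetical CIDR-to-geolocation map (documentation address ranges only).
IP_TO_GEO = {
    "192.0.2.0/24": "Seattle",
    "198.51.100.0/24": "Dublin",
    "203.0.113.0/24": "Singapore",
}

def geolocate(ip):
    """Return the geolocation whose CIDR block contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for cidr, geo in IP_TO_GEO.items():
        if addr in ipaddress.ip_network(cidr):
            return geo
    return None

def build_access_graph(access_log):
    """Build bipartite links {(service, geolocation): access count}."""
    links = defaultdict(int)
    for ip, service in access_log:
        geo = geolocate(ip)
        if geo is not None:  # accesses from unmapped IPs are skipped
            links[(service, geo)] += 1
    return dict(links)

def node_vectors(links):
    """Return unit-length vectors for service nodes and geolocation nodes.

    A service vector holds that service's access values per geolocation;
    a geolocation vector holds that geolocation's access values per service.
    """
    services = sorted({s for s, _ in links})
    geos = sorted({g for _, g in links})

    def unit(values):
        norm = math.sqrt(sum(v * v for v in values))
        return [v / norm for v in values] if norm else values

    service_vecs = {s: unit([links.get((s, g), 0) for g in geos]) for s in services}
    geo_vecs = {g: unit([links.get((s, g), 0) for s in services]) for g in geos}
    return service_vecs, geo_vecs

# Hypothetical access log of (source IP, service identifier) pairs.
access_log = [
    ("192.0.2.10", "mail"), ("192.0.2.11", "mail"), ("192.0.2.12", "storage"),
    ("198.51.100.7", "mail"), ("198.51.100.8", "storage"),
    ("203.0.113.5", "billing-api"),
]
graph = build_access_graph(access_log)
service_vecs, geo_vecs = node_vectors(graph)
```

In this toy data, Seattle and Dublin both reach the mail and storage services, so their geolocation vectors point in similar directions, while Singapore, which only reaches billing-api, ends up orthogonal to both. A learned embedding would capture the same kind of similarity in fewer dimensions.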

First claim

Opening claim text (preview).

What is claimed is:

1. A cybersecurity data enrichment system, comprising: a digital memory; and a processor in operable communication with the digital memory, the processor configured to perform service access data enrichment and anomaly detection support steps which include (a) obtaining a map of IP addresses to geolocations, (b) building a bipartite access graph having links, each link having a service node and a geolocation node connected by the link, each service node having a service identifier identifying a service, each geolocation node having a geolocation identifier identifying a geolocation, each link connecting the service node of the link with the geolocation node of the link and having an access value derived from at least one service access from the geolocation to the service, (c) generating a respective service vector for at least one service node, the service vector based on at least the access values of one or more links which connect to the service node, (d) generating a respective geolocation vector for at least one geolocation node, the geolocation vector based on at least the access values of one or more links which connect to the geolocation node, the service vectors and geolocation vectors collectively referred to herein as graph-based vectors, and (e) associating at least two of the generated vectors with an anomaly detection system; whereby the cybersecurity data enrichment system is configured to support detection of anomalous service accesses such that a similarity of two given vectors corresponds with a likelihood that a given service was non-maliciously accessed.

2. The cybersecurity data enrichment system of claim 1, wherein the service identifier includes at least one of the following: an API identifier, a web service identifier, an endpoint URL, a URI, a storage resource identifier, a network resource identifier, a compute resource identifier, a software-as-a-service identifier, a platform-as-a-service identifier, an infrastructure-as-a-service identifier, an email service address, or another denotation of at least one network-accessible item.

3. The cybersecurity data enrichment system of claim 1, wherein the geolocation identifier expressly identifies at least one of the following: a building, a campus, a district, a city, a metropolitan area, a county, a province, a state, a country, a region containing multiple countries, a legal jurisdiction, or a regulatory jurisdiction.

4. The cybersecurity data enrichment system of claim 1, wherein the access value includes at least one of the following: an access count, an access duration, an access frequency, an access recency, an access distribution over time intervals, or another legitimacy confidence value which represents an extent of confidence that the access value arises from non-malicious access actions between the geolocation of the link associated with the access value and the service of the link associated with the access value.

5. The cybersecurity data enrichment system of claim 1, in combination with the anomaly detection system.

6. The combined cybersecurity data enrichment system and anomaly detection system of claim 5, wherein the anomaly detection system comprises code which upon execution performs anomaly detection steps which include (f) getting an anomaly candidate service access description which includes at least a service identifier and a geolocation identifier corresponding to an anomaly candidate service access, (g) procuring an anomaly candidate vector that is based on at least the anomaly candidate service access description, (h) calculating a vector distance using at least the anomaly candidate vector, and (i) classifying the anomaly candidate service access either as anomalous or as non-anomalous, the classifying based at least in part on the vector distance.

7. A cybersecurity method utilizing vector-enriched service access data to support detection of an anomalous service access, the method comprising: acquiring a set of graph-based vectors which include one or more service vectors and one or more geolocation vectors, the service vectors and the geolocation vectors generated from a bipartite access graph having links, each link having a link service node and a link geolocation node connected by the link, each link service node having a service identifier identifying a service, each link geolocation node having a geolocation identifier identifying a geolocation, each link connecting the link service node of the link with the link geolocation node of the link and having an access value derived from at least one service access from the geolocation to the service, each service vector corresponding to a respective service node and based on at least the access values of all links which connect to the respective service node, each geolocation vector corresponding to a respective geolocation node and based on at least the access values of all links which connect to the respective geolocation node; getting an anomaly candidate service access description which includes at least a service identifier and a geolocation identifier corresponding to an anomaly candidate service access; procuring at least one anomaly candidate vector that is based on at least the anomaly candidate service access description; calculating a vector distance using at least the anomaly candidate vector; and classifying the anomaly candidate service access either as anomalous or as non-anomalous, the classifying based at least in part on the vector distance.

8. The method of claim 7, comprising: procuring an anomaly candidate service vector which is a graph-based service vector of a service node for a service that is identified by the anomaly candidate service access description service identifier; procuring an anomaly candidate geolocation vector which is a graph-based geolocation vector of a geolocation node for a geolocation that is identified by the anomaly candidate service access description geolocation identifier; and calculating the vector distance between the anomaly candidate service vector and the anomaly candidate geolocation vector.

9. The method of claim 7, wherein the method procures an anomaly candidate geolocation vector that is based on at least the anomaly candidate service access description, and wherein calculating the vector distance includes calculating the vector distance between the anomaly candidate geolocation vector and each vector of a set of k graph-based geolocation vectors, with k being an integer greater than one.

10. The method of claim 7, wherein at least one of the graph-based vectors is generated at least in part by collaborative filtering.

11. The method of claim 7, wherein procuring the anomaly candidate vector comprises looking up a geolocation vector in the set of graph-based vectors, the looking up based at least in part on the anomaly candidate service access description.

12. The method of claim 7, further comprising selecting bipartite access graph geolocation definitions or services associated with a service node, or both, such that at least a specified service link density threshold amount of service nodes each have multiple links, or such that at least a specified geolocation link density threshold amount of geolocation nodes each have multiple links, or both.

13. The method of claim 7, wherein: the method further comprises storing, for each of multiple services, geolocation vectors for geolocations which accessed the service; and classifying comprises comparing an anomaly candidate geolocation vector to at least two stored geolocation vectors.

14. The method of claim 7, wherein acquiring the set of graph-based v
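
The anomaly detection steps recited in claims 6, 7, 9 and 13 (get a candidate access description, procure a candidate vector, calculate a vector distance, classify against a threshold) might look roughly like the following Python sketch. The vectors, the service-to-geolocation record, the choice of cosine distance, the averaging over the k nearest stored vectors, and the threshold value are all assumptions chosen for illustration; the claims do not fix a particular distance metric or threshold.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)

# Hypothetical graph-based geolocation vectors produced by an earlier
# enrichment step (values invented for this example).
GEO_VECTORS = {
    "Seattle": [0.9, 0.1, 0.0],
    "Portland": [0.8, 0.2, 0.1],
    "Dublin": [0.7, 0.3, 0.0],
    "Singapore": [0.0, 0.1, 0.9],
}

# Hypothetical stored record of geolocations that accessed each service.
SERVICE_GEOS = {"mail": ["Seattle", "Portland", "Dublin"]}

def classify_access(service_id, geo_id, k=2, threshold=0.3):
    """Classify one candidate access; return (label, score).

    The score is the mean cosine distance between the candidate
    geolocation vector and its k nearest stored geolocation vectors
    for the service. Unknown services or geolocations are treated as
    anomalous here, which is a design choice of this sketch.
    """
    candidate = GEO_VECTORS.get(geo_id)            # look up candidate vector
    known = [g for g in SERVICE_GEOS.get(service_id, []) if g != geo_id]
    if candidate is None or not known:
        return "anomalous", 1.0
    distances = sorted(cosine_distance(candidate, GEO_VECTORS[g]) for g in known)
    nearest = distances[:k]
    score = sum(nearest) / len(nearest)
    label = "anomalous" if score > threshold else "non-anomalous"
    return label, score

seattle_result = classify_access("mail", "Seattle")
singapore_result = classify_access("mail", "Singapore")
```

With these invented vectors, an access to the mail service from Seattle scores close to the other known geolocations and is labeled non-anomalous, while an access from Singapore lands far from all of them and is labeled anomalous. Comparing against similar geolocations rather than only the exact origin is what lets this style of check tolerate a first access from a new but similar place.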

Assignees

Inventors

Classifications

  • H04L63/107 (Primary)

    wherein the security policies are location-dependent, e.g. entities privileges depend on current location or allowing specific operations only from locally connected terminals · CPC title

  • Location-dependent; Proximity-dependent · CPC title

  • Clustering or classification · CPC title

  • Traffic logging, e.g. anomaly detection · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11647034B2 cover?
Enriched access data supports anomaly detection to enhance network cybersecurity. Network access data is enriched using service nodes representing resource provision and other services, with geolocation nodes representing grouped access origins, and access values representing access legitimacy confidence. Data enrichment provides a trained model by mapping IP addresses to geolocations, building…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification H04L63/107. Mapped technology areas include Electricity.
When was this patent published?
Publication date May 9, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).