IoT device identification by machine learning with time series behavioral and statistical features
US-2023231860-A1 · Jul 20, 2023 · US
US12488096B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12488096-B2 |
| Application number | US-202318466093-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 13, 2023 |
| Priority date | Sep 13, 2023 |
| Publication date | Dec 2, 2025 |
| Grant date | Dec 2, 2025 |
Official abstract text for this publication.
According to one embodiment, a method, computer system, and computer program product for anomaly detection is provided. The embodiment may include receiving login event data of a customer. The embodiment may include labeling each login request of the event data as non-anomalous or anomalous. The embodiment may include performing aggregate feature extraction for each login request. The embodiment may include filtering data of anomalous login requests from data of non-anomalous login requests. The embodiment may include training an autoencoder machine learning model using the data of non-anomalous login requests to learn non-anomalous login request behavior. The embodiment may include passing the data of anomalous login requests through the trained autoencoder ML model to obtain enriched data. The embodiment may include training a classifier model using the enriched data to identify anomalous login requests and output a classification with confidence value.
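The enrichment step in the abstract (score each request by reconstruction error, then attach that score as an extra feature for the classifier) can be sketched as follows. This is a minimal illustration, not the patented implementation: the feature rows are made up, and `fit_mean_reconstructor` is a hypothetical stand-in that replaces a real autoencoder with a per-feature mean so the example stays dependency-free. Any trained model exposing a `row -> reconstructed row` function could be plugged in instead.

```python
import math
from statistics import fmean


def rmse(original, reconstructed):
    """Root mean square error between one input row and its reconstruction."""
    return math.sqrt(fmean((o - r) ** 2 for o, r in zip(original, reconstructed)))


def fit_mean_reconstructor(rows):
    """Stand-in for autoencoder training (assumption, not the patent's model).

    Learns the per-feature mean of the non-anomalous rows and "reconstructs"
    every input as that mean. A real autoencoder would replace this function.
    """
    means = [fmean(column) for column in zip(*rows)]
    return lambda row: means


def enrich(rows, reconstruct):
    """Append each row's reconstruction RMSE as an extra trailing feature."""
    return [list(row) + [rmse(row, reconstruct(row))] for row in rows]


# Hypothetical feature rows: [failure_count_per_ip, failure_pct_per_ip]
normal_rows = [[1.0, 0.10], [0.0, 0.00], [2.0, 0.20]]
anomalous_rows = [[40.0, 0.95]]

# "Train" on non-anomalous data only, then score both sets.
reconstruct = fit_mean_reconstructor(normal_rows)
enriched_normal = enrich(normal_rows, reconstruct)
enriched_anomalous = enrich(anomalous_rows, reconstruct)
```

Because the reconstructor only ever sees non-anomalous rows, requests that deviate from that behavior tend to reconstruct poorly and receive a larger RMSE, which is what makes the appended value a useful input for the downstream classifier.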
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method, the method comprising: receiving login event data of a customer for a predetermined time period, wherein the login event data comprises login requests; labeling each login request of the event data as non-anomalous or anomalous; performing aggregate feature extraction for each login request using a queue-based mechanism to extract and calculate, in real-time, aggregated features of login requests in a most recent hourly segment of the login event data, wherein the aggregated features comprise failure count per IP address, failure percentage per IP address, failure percent per customer in last x minutes, and number of failed login attempts after a last successful login per user; filtering data of anomalous login requests from data of non-anomalous login requests; training an autoencoder machine learning (ML) model using the data of non-anomalous login requests to learn non-anomalous login request behavior, wherein the data of non-anomalous login requests comprises aggregated features of the non-anomalous login requests, and wherein output from the autoencoder ML model comprises root mean square error (RMSE) values for each input login request; passing the data of anomalous login requests through the trained autoencoder ML model to obtain enriched data comprising the data of anomalous login requests with corresponding RMSE values; and training a classifier model using the enriched data to identify anomalous login requests and output a classification with corresponding confidence value.
2. The method of claim 1, wherein labeling each login request of the event data as non-anomalous or anomalous further comprises: sorting the login event data into hourly segments; performing a timeseries analysis using statistical outlier techniques to obtain a timeseries of failed logins per hour; and for each login request, labeling a login request as anomalous where an internet protocol (IP) address of the login request had a failure login percentage and a failure login count greater than specified thresholds.
3. The method of claim 1, wherein the autoencoder ML model comprises an artificial neural network which regenerates input data and outputs an RMSE value which is indicative of how similar regenerated data is to input data, and wherein input data comprises a login request, and wherein training the autoencoder ML model using the data of non-anomalous login requests further comprises: performing, before the training, feature engineering and transformations on extracted aggregate features of the data of non-anomalous login requests, wherein the feature engineering and transformations comprise one or more of converting categorical data into numerical data, parsing date-time information, scaling features for training the autoencoder ML model, deletion after feature extraction, and one-hot encoding.
4. The method of claim 1, wherein passing the data of anomalous login requests through the trained autoencoder ML model further comprises: performing, before the passing, feature engineering and transformations on extracted aggregate features of the data of anomalous login requests, wherein the feature engineering and transformations comprise one or more of converting categorical data into numerical data using, parsing date-time information, scaling features for training the autoencoder ML model, deletion after feature extraction, and one-hot encoding.
5. The method of claim 1, further comprising: training a global model for anomaly detection using login event data of multiple customers, wherein the login event data is generalized to remove identifying information of a customer, and wherein an autoencoder ML model of the global model is trained to learn non-anomalous login request behavior and a classifier model of the global model is trained to identify anomalous login requests; receiving a classification, with corresponding confidence value, from the trained global model; training an evaluation model to combine the classification, with corresponding confidence value, of the trained classifier model and the classification, with corresponding confidence value, from the trained global model to obtain a final classification with corresponding confidence value; and performing an action with regard to a received login request based on the final classification with corresponding confidence value.
6. The method of claim 1, further comprising: performing an action with regard to a received login request based on a final classification with corresponding confidence value of the trained classifier model, wherein the action comprises one or more of allowing the received login request, blocking the received login request, requiring multi-factor authentication, and alerting a user to the received login request.
7. A computer system, the computer system comprising: one or more processors, one or more computer-readable memories, one or more computer-readable tangible storage medium, and program instructions stored on at least one of the one or more tangible storage medium for execution by at least one of the one or more processors via at least one of the one or more memories, wherein the computer system is capable of performing a method comprising: receiving login event data of a customer for a predetermined time period, wherein the login event data comprises login requests; labeling each login request of the event data as non-anomalous or anomalous; performing aggregate feature extraction for each login request using a queue-based mechanism to extract and calculate, in real-time, aggregated features of login requests in a most recent hourly segment of the login event data, wherein the aggregated features comprise failure count per IP address, failure percentage per IP address, failure percent per customer in last x minutes, and number of failed login attempts after a last successful login per user; filtering data of anomalous login requests from data of non-anomalous login requests; training an autoencoder machine learning (ML) model using the data of non-anomalous login requests to learn non-anomalous login request behavior, wherein the data of non-anomalous login requests comprises aggregated features of the non-anomalous login requests, and wherein output from the autoencoder ML model comprises root mean square error (RMSE) values for each input login request; passing the data of anomalous login requests through the trained autoencoder ML model to obtain enriched data comprising the data of anomalous login requests with corresponding RMSE values; and training a classifier model using the enriched data to identify anomalous login requests and output a classification with corresponding confidence value.
8. The computer system of claim 7, the method further comprising: sorting the login event data into hourly segments; performing a timeseries analysis using statistical outlier techniques to obtain a timeseries of failed logins per hour; and for each login request, labeling a login request as anomalous where an internet protocol (IP) address of the login request had a failure login percentage and a failure login count greater than specified thresholds.
9. The computer system of claim 7, wherein the autoencoder ML model comprises an artificial neural network which regenerates input data and outputs an RMSE value which is indicative of how similar regenerated data is to input data, and wherein input data comprises a login request, and wherein training the autoencoder ML model using the data of non-anomalous login requests further comprises: performing, before the training, feature engineering and transformations on extracted aggregate features of the data of
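The queue-based aggregate feature extraction recited in claims 1 and 7 could be sketched along these lines. This is one possible reading under stated assumptions, not the patent's implementation: the class name `LoginWindow`, the field names, and the one-hour window in seconds are all hypothetical; events are assumed to arrive in timestamp order; and the "failure percent per customer in last x minutes" feature is omitted for brevity.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 3600  # assumed length of the "most recent hourly segment"


class LoginWindow:
    """Sliding window over login requests backed by a FIFO queue."""

    def __init__(self, window_seconds=WINDOW_SECONDS):
        self.window = window_seconds
        self.events = deque()                        # (timestamp, ip, success)
        self.total_by_ip = defaultdict(int)          # requests in window per IP
        self.failed_by_ip = defaultdict(int)         # failures in window per IP
        self.fails_since_success = defaultdict(int)  # per user, not windowed

    def _evict(self, now):
        """Drop events older than the window and undo their counts."""
        while self.events and self.events[0][0] <= now - self.window:
            _, ip, success = self.events.popleft()
            self.total_by_ip[ip] -= 1
            if not success:
                self.failed_by_ip[ip] -= 1

    def add(self, timestamp, ip, user, success):
        """Record one login request and return its aggregated features."""
        self._evict(timestamp)
        self.events.append((timestamp, ip, success))
        self.total_by_ip[ip] += 1
        if success:
            self.fails_since_success[user] = 0
        else:
            self.failed_by_ip[ip] += 1
            self.fails_since_success[user] += 1
        return {
            "failure_count_per_ip": self.failed_by_ip[ip],
            "failure_pct_per_ip": self.failed_by_ip[ip] / self.total_by_ip[ip],
            "fails_after_last_success": self.fails_since_success[user],
        }


# Made-up events: one success followed by two failures from the same IP.
window = LoginWindow()
window.add(0, "10.0.0.1", "alice", True)
window.add(10, "10.0.0.1", "alice", False)
features = window.add(20, "10.0.0.1", "alice", False)

# More than an hour later the earlier events have been evicted from the queue.
later = window.add(3700, "10.0.0.1", "alice", False)
```

Keeping running counters alongside the queue means each request is processed in amortized constant time, since every event is appended once and evicted once, which suits the real-time calculation the claims describe.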
Test or assess a computer or a system · CPC title
Auto-encoder networks; Encoder-decoder networks · CPC title
Non-supervised learning, e.g. competitive learning · CPC title
Combinations of networks · CPC title
involving long-term monitoring or reporting · CPC title