What technology area does this patent fall under?

Primary CPC classification G06N3/088. Mapped technology areas include Physics.

When was this patent published?

Publication date Sep 16, 2021 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method and Apparatus for Augmented Data Anomaly Detection

US2021287071A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2021287071-A1
Application number	US-202117200606-A
Country	US
Kind code	A1
Filing date	Mar 12, 2021
Priority date	Mar 12, 2020
Publication date	Sep 16, 2021
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A data anomaly detection method and apparatus in which a deep neural network is trained on baseline data. Sequences of statistics of each layer of the deep neural network are saved, processed and used to train an LSTM autoencoder across a variety of reconstruction error thresholds, and a preferred threshold is selected for an optimized autoencoder. In an Inference mode, a data sample is presented to the autoencoder; the reconstruction error is calculated and compared to the threshold. If it is above the threshold, then the data sample is an out-of-distribution sample, and the sample is tagged as anomalous.
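The threshold step in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: the labeled validation errors, the F1 selection criterion, the candidate list, and the `toy_reconstruct` stand-in for a trained LSTM autoencoder are all assumptions made for illustration. The abstract only states that a preferred threshold is selected and that a sample whose reconstruction error exceeds it is tagged as anomalous.

```python
from typing import Callable, List, Sequence


def mean_squared_error(original: Sequence[float], reconstructed: Sequence[float]) -> float:
    """Reconstruction error between an input sequence and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def f1_at_threshold(errors: Sequence[float], is_anomaly: Sequence[bool], threshold: float) -> float:
    """F1 score when every sample with error above `threshold` is flagged."""
    tp = sum(1 for e, y in zip(errors, is_anomaly) if e > threshold and y)
    fp = sum(1 for e, y in zip(errors, is_anomaly) if e > threshold and not y)
    fn = sum(1 for e, y in zip(errors, is_anomaly) if e <= threshold and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_threshold(errors: Sequence[float], is_anomaly: Sequence[bool],
                     candidates: Sequence[float]) -> float:
    """Assumed criterion: best F1 over the candidates; ties go to the earlier one."""
    return max(candidates, key=lambda t: f1_at_threshold(errors, is_anomaly, t))


def is_out_of_distribution(sample: Sequence[float],
                           reconstruct: Callable[[Sequence[float]], List[float]],
                           threshold: float) -> bool:
    """Inference rule from the abstract: error above the threshold means anomalous."""
    return mean_squared_error(sample, reconstruct(sample)) > threshold


def toy_reconstruct(sample: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a trained autoencoder: clamps values to [0, 1]."""
    return [min(1.0, max(0.0, x)) for x in sample]


# Made-up validation data: reconstruction errors and whether each sample is anomalous.
validation_errors = [0.02, 0.05, 0.04, 0.30, 0.45, 0.08]
validation_labels = [False, False, False, True, True, False]
candidate_thresholds = [0.01, 0.05, 0.10, 0.20, 0.50]

best = select_threshold(validation_errors, validation_labels, candidate_thresholds)
print(best)  # 0.1 for this toy data
print(is_out_of_distribution([0.2, 3.0, 0.9], toy_reconstruct, best))
```

F1 is only one possible selection criterion. A deployment that cares more about false alarms than about missed anomalies would weight precision and recall differently.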

First claim

Opening claim text (preview).

1. A method for detection of data anomalies via a deep multi-layer neural network architecture, the method being implemented by a computer system that comprises one or more processors executing computer program instructions that, when executed, perform the method, the method comprising:

in a neural network training phase:
a. obtaining a first collection of actual data items corresponding to one or more groups of data categories, said first collection of actual data items having a first data distribution;
b. using a first neural network to generate a set of synthetic data items using a synthetic data generation configuration;
c. providing said collection of actual data items and said set of synthetic items to a second neural network;
d. using the second neural network to (i) make a classification determination using a set of classification determination configurations including whether each data item in said set of synthetic data items is synthetic or actual, and (ii) update said set of classification determination configurations;
e. providing said classification determinations to said first neural network;
f. using said classification determinations by said first neural network to update said synthetic data generation configuration;
g. repeating steps b through f until said second neural network cannot make a valid classification determination;
h. generating autoencoder training sequences of updated classification determination configurations for each layer in said second neural network;

in an autoencoder phase:
i. providing said autoencoder training sequences to an autoencoder, and said autoencoder training itself to differentiate anomalous data from real data using said autoencoder training sequences across a range of reconstruction error thresholds;
j. selecting a preferred reconstruction error threshold based on autoencoder performance during said training step to result in said autoencoder being optimized for recognition of anomalous data;

in a data anomaly detection phase:
k. submitting to the second neural network a purported data item;
l. generating by said second neural network new sequences of classification determination configurations corresponding to said purported data item;
m. providing said new sequences to said autoencoder, said autoencoder generating a prediction as to whether said purported data item falls within said first data distribution;
n. classifying by said autoencoder said purported data item as anomalous if said purported data item falls outside said first data distribution;
o. sending said new sequences to said second neural network if said purported data item is determined by said autoencoder to fall within said first data distribution, and making a classification determination by said second neural network for said purported data items using said set of classification configurations; and
p. notifying a user that said purported data item may be anomalous if said second neural network determines that said purported data item is synthetic.

2. A method according to claim 1, wherein said first neural network and said second neural network are a generator and a discriminator, respectively, of a generative adversarial network.

3. A method according to claim 1, wherein said actual data is text data and said anomalous data is malicious text.

4. A system comprising: a computer system that comprises one or more processors executing computer program instructions that, when executed, cause the computer system to:

in a neural network training phase:
a. obtain a first collection of actual data items corresponding to one or more groups of data categories, said first collection of actual data items having a first data distribution;
b. use a first neural network to generate a set of synthetic data items using a synthetic data generation configuration;
c. provide said collection of actual data items and said set of synthetic items to a second neural network;
d. use the second neural network to (i) make a classification determination using a set of classification determination configurations including whether each data item in said set of synthetic data items are synthetic or actual, and (ii) update said set of classification determination configurations;
e. provide said classification determinations to said first neural network;
f. use said classification determinations by said first neural network to update said synthetic data generation configuration;
g. repeat steps b through f until said second neural network cannot make a valid classification determination;
h. generating autoencoder training sequences of updated classification determination configurations for each layer in said second neural network;

in an autoencoder training phase:
i. provide said autoencoder training sequences to an autoencoder to train itself to differentiate anomalous data from real data using said autoencoder training sequences across a range of reconstruction error thresholds;
j. select a preferred reconstruction error threshold based on autoencoder performance during said training step to result in said autoencoder being optimized for recognition of anomalous data;

in a data anomaly detection phase:
k. submit to the second neural network a purported data item;
l. generate by said second neural network new sequences of classification determination configurations corresponding to said purported data item;
m. provide said new sequences to said autoencoder, and generate by said autoencoder a prediction as to whether said purported data item falls within said first data distribution;
n. classify by said autoencoder said purported data item as anomalous if said purported data item falls outside said first data distribution;
o. send said new sequences to said second neural network if said purported data item is determined by said autocoder to fall within said first data distribution, and make a classification determination by said second neural network for said purported data item using said set of classification configurations;
p. notify a user that said purported data item may be anomalous or malicious if said second neural network determines that said purported data item is synthetic.

5. A system according to claim 4, wherein said first neural network and said second neural network are a generator and a discriminator, respectively of a generative adversarial network.

6. A system according to claim 4, wherein said actual data is text data, and said anomalous data is malicious text.

7. An apparatus comprising:

a first neural network configured to
a. generate a set of synthetic data items using a synthetic data generation configuration; and
b. provide a collection of actual text data items and said set of synthetic items to a second neural network, said collection of actual text data items having a first data distribution;

a second neural network configured to
(i) make a classification determination using a set of classification determination configurations whether each data item in said set of synthetic data items are synthetic or actual data,
(ii) make a classification determination for each data item in said set of synthetic data items and said collection of actual data items using a set of classification configurations; and
(iii) update said set of classification determination configurations;
(iv) provide said classification determinations to said first neural network;

said first neural network further configured to:
c. use said classification determinations by said second neural network to update said synthetic data generation configuration;

said second neural network further configured to:
(v) generate autoencoder training sequences of updated classification determination configurations for each layer in
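Steps k through p of claim 1 describe a two-stage check in the detection phase: the autoencoder first decides whether the item falls within the baseline distribution, and only in-distribution items are passed on to the second neural network for a synthetic-versus-actual decision. The sketch below shows that control flow only. The function names, the string labels, and the three toy stand-ins are hypothetical and chosen for illustration, not taken from the patent.

```python
from typing import Callable, List, Sequence

LayerSequences = List[List[float]]


def detect(item: Sequence[float],
           extract_sequences: Callable[[Sequence[float]], LayerSequences],
           within_distribution: Callable[[LayerSequences], bool],
           looks_synthetic: Callable[[LayerSequences], bool]) -> str:
    """Two-stage decision flow loosely following steps k-p of claim 1."""
    # Steps k-l: the second network turns the item into per-layer sequences.
    sequences = extract_sequences(item)
    # Steps m-n: the autoencoder flags out-of-distribution items as anomalous.
    if not within_distribution(sequences):
        return "anomalous"
    # Steps o-p: in-distribution items go back to the second network;
    # a "synthetic" verdict is the case where the user would be notified.
    if looks_synthetic(sequences):
        return "possibly anomalous (synthetic)"
    return "actual"


# Toy stand-ins (assumptions): real models would replace all three callables.
def toy_sequences(item: Sequence[float]) -> LayerSequences:
    """Pretend layer statistics: the mean and the maximum of the item."""
    return [[sum(item) / len(item)], [max(item)]]


def toy_within_distribution(seqs: LayerSequences) -> bool:
    """Pretend autoencoder check: baseline values never exceed 1.0."""
    return seqs[1][0] <= 1.0


def toy_looks_synthetic(seqs: LayerSequences) -> bool:
    """Pretend discriminator verdict: a very high mean looks generated."""
    return seqs[0][0] > 0.8


for sample in ([0.1, 0.2], [0.1, 5.0], [0.9, 0.95]):
    print(sample, detect(sample, toy_sequences, toy_within_distribution, toy_looks_synthetic))
```

Passing the models in as callables keeps the gating logic separate from any particular network, so the flow can be exercised with simple stand-ins like the ones above.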

Assignees

Morgan State Univ

Inventors

Classifications

G06N3/045: Combinations of networks
G06N3/088 (Primary): Non-supervised learning, e.g. competitive learning
G06N3/044: Recurrent networks, e.g. Hopfield networks
G06F18/24: Classification techniques
G06N3/047: Probabilistic or stochastic networks

Patent family

Related publications grouped by family.

Patent family ID: 77663735

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021287071A1 cover?: A data anomaly detection method and apparatus in which a deep neural network is trained on baseline data. Sequences of statistics of each layer of the deep neural network are saved, processed and used to train an LSTM autoencoder across a variety of reconstruction error thresholds, and a preferred threshold is selected for an optimized autoencoder. In an Inference mode, a data sample is present…
Who is the assignee on this patent?: Morgan State Univ

Related patents

Generating digital video summaries utilizing aesthetics, relevancy, and generative neural networks

Method for the automated creation of rules for a rule-based anomaly recognition in a data stream

Image captioning utilizing semantic text modeling and adversarial learning
