Device, method, and system for concept drift detection

US2023069347A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2023069347-A1
Application number  US-202217697067-A
Country             US
Kind code           A1
Filing date         Mar 17, 2022
Priority date       Aug 31, 2021
Publication date    Mar 2, 2023
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Aspects relate to determining the presence or absence of concept drift in seasonal time series data. A concept drift detection device including a data input unit for receiving a time series data set that includes a set of past time series data and a set of current time series data, a baseline model generation unit for generating a baseline model based on a subset of the set of past time series data, a feature extraction unit for extracting a set of past data features, a set of baseline data features, and a set of current data features, a distance calculation unit for calculating a baseline distance and a current distance, and a concept drift detection unit for determining, based on a baseline statistic and the current distance, presence or absence of concept drift between the set of current time series data and the set of past time series data.
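
The flow in the abstract (windows, features, distances, baseline statistic, decision) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration and is not taken from the patent: the feature set (mean and standard deviation per window), the Euclidean distance, the "mean plus k standard deviations" rule, and the stand-in baseline model (the pointwise average of the past windows, reused for every time frame). The names `windows`, `features`, `distance`, and `detect_drift` are hypothetical.

```python
import math
import statistics

def windows(series, size):
    """Split a series into consecutive non-overlapping windows of `size` points."""
    return [series[i:i + size] for i in range(0, len(series) - size + 1, size)]

def features(window):
    """Summarise one window as (mean, population std dev); an assumed feature set."""
    return (statistics.fmean(window), statistics.pstdev(window))

def distance(a, b):
    """Euclidean distance between two feature vectors; an assumed metric."""
    return math.dist(a, b)

def detect_drift(past, current, size, k=3.0):
    """Return (per-window drift flags for `current`, threshold used)."""
    past_w = windows(past, size)
    cur_w = windows(current, size)

    # Stand-in baseline model (assumption): the pointwise mean of the past
    # windows, i.e. one "typical" window reused for every time frame.
    baseline_window = [statistics.fmean(col) for col in zip(*past_w)]
    base_f = features(baseline_window)

    # Baseline distance: past features vs. baseline features.
    baseline_dist = [distance(features(w), base_f) for w in past_w]
    # Current distance: current features vs. baseline features.
    current_dist = [distance(features(w), base_f) for w in cur_w]

    # Baseline statistic (assumed rule): mean + k * std of baseline distances.
    threshold = (statistics.fmean(baseline_dist)
                 + k * statistics.pstdev(baseline_dist))
    return [d > threshold for d in current_dist], threshold

past = [1, 2, 3, 4, 1.1, 2, 3, 4, 1, 2, 3, 4.1, 0.9, 2, 3, 4]
current = [1, 2, 3, 4, 11, 12, 13, 14]
flags, threshold = detect_drift(past, current, size=4)
print(flags)
```

In this toy data the first current window repeats a past window and is not flagged, while the second window is shifted upward and its distance to the baseline exceeds the threshold. A trained forecasting model could replace the averaged window as the source of baseline data.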

First claim

Opening claim text (preview).

What is claimed is:

1. A concept drift detection device for detecting concept drift in a time series data set, the concept drift detection device comprising:
    a data input unit configured to receive a time series data set that includes a set of past time series data relating to a first time period and a set of current time series data relating to a second time period subsequent to the first time period;
    a baseline model generation unit configured to generate a baseline model based on a subset of the set of past time series data;
    a feature extraction unit configured to: divide the set of past time series data into a set of past windows, divide the set of current time series data into a set of current windows, and divide a set of baseline data created by the baseline model into a set of baseline windows, and calculate a set of baseline data features from the set of baseline windows, calculate a set of past data features from the set of past windows, and calculate a set of current data features from the set of current windows;
    a distance calculation unit configured to: calculate a baseline distance between a subset of the set of past data features and a subset of the baseline data features that relate to a corresponding time frame, and calculate a current distance between a subset of the set of current data features and a subset of the baseline data features that relate to a corresponding time frame; and
    a concept drift detection unit configured to: calculate, based on the baseline distance, a baseline statistic that indicates a reference for determining concept drift, and determine, based on the baseline statistic and the current distance, presence or absence of concept drift between the set of current time series data and the set of past time series data.

2. The concept drift detection device according to claim 1, further comprising: a distance smoothing unit configured to use an exponentially weighted moving average technique to smooth the baseline distance and the current distance.

3. The concept drift detection device according to claim 1, wherein: the baseline model is a trained machine learning model configured to generate, as the baseline data, a set of predicted time series data based on the subset of the set of past time series data.

4. The concept drift detection device according to claim 1, wherein: the time series data set is seasonal time series data that includes a set of seasonal time patterns that repeat periodically over a defined time period; and each seasonal time pattern of the set of seasonal time patterns includes a set of seasonal time pattern points corresponding to a set of time features.

5. The concept drift detection device according to claim 4, wherein: the concept drift detection unit is configured to: calculate, as the baseline statistic, a seasonal baseline that indicates a reference for determining concept drift for each of the set of seasonal time pattern points of the baseline distance; and determine that concept drift exists for a first seasonal time pattern point of the current distance in a case that the first seasonal time pattern point of the current distance exceeds a statistical threshold with respect to the seasonal baseline.

6. The concept drift detection device according to claim 5, wherein: the concept drift detection unit is configured to: output a concept drift notification in a case that concept drift is determined to exist for a predetermined number of seasonal time pattern points of the current distance.

7. The concept drift detection device according to claim 6, wherein: the baseline model generation unit is configured to update the baseline model based on a second time series data set in a case that concept drift is determined to exist for a predetermined number of seasonal time pattern points of the current distance.

8. A concept drift detection method for detecting concept drift in a time series data set, the concept drift detection method comprising:
    receiving a time series data set that includes a set of past time series data relating to a first time period and a set of current time series data relating to a second time period subsequent to the first time period;
    generating a baseline model based on a subset of the set of past time series data;
    dividing the set of past time series data into a set of past windows;
    dividing the set of current time series data into a set of current windows; and
    dividing a set of baseline data created by the baseline model into a set of baseline windows;
    calculating a set of baseline data features from the set of baseline windows;
    calculating a set of past data features from the set of past windows;
    calculating a set of current data features from the set of current windows;
    calculating a baseline distance between a subset of the set of past data features and a subset of the baseline data features that relate to a corresponding time frame;
    calculating a current distance between a subset of the set of current data features and a subset of the baseline data features that relate to a corresponding time frame;
    calculating, based on the baseline distance, a baseline statistic that indicates a reference for determining concept drift; and
    determining, based on the baseline statistic and the current distance, presence or absence of concept drift between the set of current time series data and the set of past time series data.

9. A concept drift detection system for detecting concept drift in a time series data set, the concept drift detection system comprising:
    a client device;
    a data storage device configured to store a time series data set that includes a set of past time series data relating to a first time period and a set of current time series data relating to a second time period subsequent to the first time period; and
    a concept drift detection device configured to detect concept drift in the time series data set, wherein the concept drift detection device further includes:
    a data input unit configured to receive the time series data set from the data storage device;
    a baseline model generation unit configured to generate a baseline model based on a subset of the set of past time series data;
    a feature extraction unit configured to: divide the set of past time series data into a set of past windows, divide the set of current time series data into a set of current windows, and divide a set of baseline data created by the baseline model into a set of baseline windows, and calculate a set of baseline data features from the set of baseline windows, calculate a set of past data features from the set of past windows, and calculate a set of current data features from the set of current windows;
    a distance calculation unit configured to: calculate a baseline distance between a subset of the set of past data features and a subset of the baseline data features that relate to a corresponding time frame, and calculate a current distance between a subset of the set of current data features and a subset of the baseline data features that relate to a corresponding time frame; and
    a concept drift detection unit configured to: calculate, based on the baseline distance, a baseline statistic that indicates a reference for determining concept drift, and determine, based on the baseline statistic and the current distance, presence or absence of concept drift between the set of current time series data and the set of past time series data, and output a concept drift notification to the client device in a case that concept drift is determined to be present between the set of current time series data and the set of past time series data.
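
Claims 2, 5, and 6 add smoothing, a per-point seasonal baseline, and a notification rule. The sketch below shows one possible reading in Python. The smoothing factor `alpha`, the "mean plus k standard deviations" threshold, the `min_points` count, and all function names are hypothetical choices for illustration; the claims only require an exponentially weighted moving average, a statistical threshold, and a predetermined number of points.

```python
import statistics

def ewma(values, alpha=0.3):
    """Exponentially weighted moving average:
    s_t = alpha * x_t + (1 - alpha) * s_(t-1), seeded with the first value."""
    smoothed = []
    for x in values:
        prev = smoothed[-1] if smoothed else x
        smoothed.append(alpha * x + (1 - alpha) * prev)
    return smoothed

def seasonal_baseline(baseline_dist, period):
    """(mean, std) of the baseline distance at each position within the season."""
    stats = []
    for p in range(period):
        at_point = baseline_dist[p::period]  # every value at seasonal point p
        stats.append((statistics.fmean(at_point), statistics.pstdev(at_point)))
    return stats

def drifted_points(current_dist, stats, k=3.0):
    """Indices whose current distance exceeds mean + k * std for their point."""
    period = len(stats)
    hits = []
    for i, d in enumerate(current_dist):
        mean, std = stats[i % period]
        if d > mean + k * std:
            hits.append(i)
    return hits

def should_notify(hits, min_points=2):
    """Notify once drift is seen at a predetermined number of points."""
    return len(hits) >= min_points

# A season of two points: distances near 1.0 at point 0, near 5.0 at point 1.
baseline_dist = [1.0, 5.0, 1.2, 5.2, 0.8, 4.8]
stats = seasonal_baseline(baseline_dist, period=2)
hits = drifted_points([1.1, 5.1, 5.0, 5.2], stats)
print(hits, should_notify(hits))
```

The example shows why the baseline is kept per seasonal point: a distance of 5.0 is ordinary at point 1 but counts as drift at point 0. In a fuller pipeline the distances would be passed through `ewma` before the baseline is computed, so that isolated spikes are damped.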

Assignees

Hitachi Ltd

Inventors

Classifications

  • G06N20/00 (Primary)

    Machine learning (CPC title)

  • G06N5/04 (Primary)

    Inference or reasoning models (CPC title)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2023069347A1 cover?
Aspects relate to determining the presence or absence of concept drift in seasonal time series data. A concept drift detection device including a data input unit for receiving a time series data set that includes a set of past time series data and a set of current time series data, a baseline model generation unit for generating a baseline model based on a subset of the set of past time series …
Who is the assignee on this patent?
Hitachi Ltd
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
It was published on Mar 2, 2023 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).