Relevance decay for time-based evaluation of machine learning applications

US10885464B1 · US · B1

Patent metadata
Publication number: US-10885464-B1
Application number: US-201615338624-A
Country: US
Kind code: B1
Filing date: Oct 31, 2016
Priority date: Oct 31, 2016
Publication date: Jan 5, 2021
Grant date: Jan 5, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Relevance decay techniques are provided for time-based evaluation of machine learning applications and other classifiers. An exemplary method comprises obtaining time series measurement data; generating an input dataset comprising a plurality of records, wherein each record comprises features extracted from the time series measurement data, a target class corresponding to an event to be identified, and a time lag indicating a difference in time between a given extraction and the event to be identified; evaluating a plurality of classifiers during an evaluation phase using a portion of the input dataset and one or more predefined evaluation metrics weighted using a time-based relevance decay function based on the time lag; and selecting one or more of the classifiers to perform classification of the time series measurement data based on the predefined weighted evaluation metrics during a classification phase. The time lags indicate, for example, a time difference between classification moments of the plurality of classifiers and a respective instance of the event to be identified.
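
The evaluation and selection steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The record layout, the function names, the parameter defaults, and the two toy threshold classifiers are all hypothetical. In particular, the sketch assumes that a record's weight shrinks as its time lag grows; the abstract only states that the decay function is based on the time lag.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical record layout: (features, target_class, time_lag).
Record = Tuple[Sequence[float], int, float]
Classifier = Callable[[Sequence[float]], int]
Decay = Callable[[float], float]


def linear_decay(lag: float, horizon: float = 10.0) -> float:
    """Weight falls linearly from 1 at lag 0 to 0 at `horizon` (assumed shape)."""
    return max(0.0, 1.0 - lag / horizon)


def exponential_decay(lag: float, rate: float = 0.5) -> float:
    """Weight falls exponentially with the lag (assumed rate)."""
    return math.exp(-rate * lag)


def step_decay(lag: float, cutoff: float = 5.0) -> float:
    """Full weight up to `cutoff`, zero weight afterwards (assumed cutoff)."""
    return 1.0 if lag <= cutoff else 0.0


def weighted_accuracy(clf: Classifier, records: List[Record], decay: Decay) -> float:
    """Accuracy in which each record counts in proportion to decay(time_lag)."""
    total = 0.0
    correct = 0.0
    for features, target, lag in records:
        weight = decay(lag)
        total += weight
        if clf(features) == target:
            correct += weight
    return correct / total if total > 0 else 0.0


def select_best(classifiers: Dict[str, Classifier], records: List[Record],
                decay: Decay) -> str:
    """Return the name of the classifier with the highest weighted accuracy."""
    return max(classifiers,
               key=lambda name: weighted_accuracy(classifiers[name], records, decay))


# Toy data: both classifiers get two of three records right, but they
# fail on records with different time lags.
records: List[Record] = [
    ([0.4], 1, 1.0),
    ([0.4], 0, 8.0),
    ([0.9], 1, 2.0),
]
classifiers: Dict[str, Classifier] = {
    "A": lambda x: int(x[0] > 0.5),
    "B": lambda x: int(x[0] > 0.3),
}

for name, clf in classifiers.items():
    print(name, weighted_accuracy(clf, records, linear_decay))
print("selected:", select_best(classifiers, records, linear_decay))
```

Under plain accuracy the two toy classifiers tie. With the assumed linear decay, classifier B scores higher because its only error falls on the record with the largest time lag, which carries the smallest weight.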

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising the steps of: obtaining time series measurement data; generating an input dataset comprising a plurality of records, wherein each record comprises one or more features extracted from said time series measurement data, a target class corresponding to an event to be identified, and a time lag indicating a difference in time between a given extraction and said event to be identified; evaluating a plurality of classifiers during an evaluation phase using at least a portion of said input dataset and one or more predefined evaluation metrics that are weighted for each of the plurality of classifiers using a time-based relevance decay function based on a corresponding time lag indicating a difference between an actual time when a given event occurred and a time when each of the respective classifiers of the plurality of classifiers predicted the given event, wherein the plurality of classifiers perform classification of said at least said portion of said input dataset to assign a class to said at least said portion of said input dataset for comparison to said target class during said evaluation phase; and selecting one or more of said plurality of classifiers to perform classification of said time series measurement data based on said one or more predefined weighted evaluation metrics during a classification phase.

2. The method of claim 1, further comprising the step of training said plurality of classifiers using at least a portion of said input dataset.

3. The method of claim 1, wherein said one or more predefined evaluation metrics comprise one or more of accuracy, precision, recall, F1 score, true positive rate and true negative rate that are weighted using said time-based relevance decay function based on said time lag.

4. The method of claim 3, wherein said time-based relevance decay function comprises one or more of a linear decay function, an exponential decay function and a step function.

5. The method of claim 1, wherein said time series measurement data comprises one or more of telemetry data and log data.

6. The method of claim 1, wherein said step of generating said input dataset comprises selecting a portion of said time series measurement data, extracting said one or more features from said selected portion of said time series measurement data, and assigning said target class corresponding to said event to be identified.

7. The method of claim 1, wherein said time lags indicate a time difference between classification moments of the plurality of classifiers and a respective instance of the event to be identified.

8. The method of claim 1, wherein said event to be identified comprises one or more of an event to be classified and an event to be predicted using a machine learning application.

9. A computer program product for annotating time series measurement data, comprising a non-transitory machine-readable storage medium having encoded therein executable code of one or more software programs, wherein the one or more software programs when executed perform the following steps: obtaining time series measurement data; generating an input dataset comprising a plurality of records, wherein each record comprises one or more features extracted from said time series measurement data, a target class corresponding to an event to be identified, and a time lag indicating a difference in time between a given extraction and said event to be identified; evaluating a plurality of classifiers during an evaluation phase using at least a portion of said input dataset and one or more predefined evaluation metrics that are weighted for each of the plurality of classifiers using a time-based relevance decay function based on a corresponding time lag indicating a difference between an actual time when a given event occurred and a time when each of the respective classifiers of the plurality of classifiers predicted the given event, wherein the plurality of classifiers perform classification of said at least said portion of said input dataset to assign a class to said at least said portion of said input dataset for comparison to said target class during said evaluation phase; and selecting one or more of said plurality of classifiers to perform classification of said time series measurement data based on said one or more predefined weighted evaluation metrics during a classification phase.

10. The computer program product of claim 9, wherein said one or more predefined evaluation metrics comprise one or more of accuracy, precision, recall, F1 score, true positive rate and true negative rate that are weighted using said time-based relevance decay function based on said time lag.

11. The computer program product of claim 10, wherein said time-based relevance decay function comprises one or more of a linear decay function, an exponential decay function and a step function.

12. The computer program product of claim 9, wherein said step of generating said input dataset comprises selecting a portion of said time series measurement data, extracting said one or more features from said selected portion of said time series measurement data, and assigning said target class corresponding to said event to be identified.

13. The computer program product of claim 9, wherein said time lags indicate a time difference between classification moments of the plurality of classifiers and a respective instance of the event to be identified.

14. The computer program product of claim 9, wherein said event to be identified comprises one or more of an event to be classified and an event to be predicted using a machine learning application.

15. A system for annotating time series measurement data, comprising: a memory; and at least one hardware device, coupled to the memory, operative to implement the following steps: obtaining time series measurement data; generating an input dataset comprising a plurality of records, wherein each record comprises one or more features extracted from said time series measurement data, a target class corresponding to an event to be identified, and a time lag indicating a difference in time between a given extraction and said event to be identified; evaluating a plurality of classifiers during an evaluation phase using at least a portion of said input dataset and one or more predefined evaluation metrics that are weighted for each of the plurality of classifiers using a time-based relevance decay function based on a corresponding time lag indicating a difference between an actual time when a given event occurred and a time when each of the respective classifiers of the plurality of classifiers predicted the given event, wherein the plurality of classifiers perform classification of said at least said portion of said input dataset to assign a class to said at least said portion of said input dataset for comparison to said target class during said evaluation phase; and selecting one or more of said plurality of classifiers to perform classification of said time series measurement data based on said one or more predefined weighted evaluation metrics during a classification phase.

16. The system of claim 15, wherein said one or more predefined evaluation metrics comprise one or more of accuracy, precision, recall, F1 score, true positive rate and true negative rate that are weighted using said time-based relevance decay function based on said time lag.

17. The system of claim 16, wherein said time-based relevance decay function comprises one or more of a linear decay function, an exponential decay function and a step function.
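
As an illustration of the dataset generation recited in claims 6, 12 and of the weighted metrics recited in claims 3, 10 and 16, the following Python sketch builds records from sliding windows over a toy time series and computes weighted precision, recall and F1. It is only one possible reading under stated assumptions: the window size, the (mean, max) features, the rule that labels a window positive when it ends within `horizon` time units before the event, and the exponential weights in the usage lines are all hypothetical and do not come from the patent text.

```python
import math
from dataclasses import dataclass
from statistics import mean
from typing import List, Sequence, Tuple


@dataclass
class Record:
    features: Tuple[float, float]  # (mean, max) of the window: hypothetical features
    target: int                    # 1 if the window ends within `horizon` of the event
    time_lag: float                # event time minus extraction time


def build_records(series: Sequence[Tuple[float, float]], event_time: float,
                  window: int = 3, horizon: float = 5.0) -> List[Record]:
    """Slide a window over (timestamp, value) pairs and emit one record per window."""
    records: List[Record] = []
    for end in range(window, len(series) + 1):
        chunk = series[end - window:end]          # selected portion of the series
        extraction_time = chunk[-1][0]
        if extraction_time > event_time:          # ignore data after the event
            break
        values = [value for _, value in chunk]
        lag = event_time - extraction_time
        records.append(Record((mean(values), max(values)), int(lag <= horizon), lag))
    return records


def weighted_prf(targets: Sequence[int], predictions: Sequence[int],
                 weights: Sequence[float]) -> Tuple[float, float, float]:
    """Precision, recall and F1 where each record contributes its weight, not 1."""
    rows = list(zip(targets, predictions, weights))
    tp = sum(w for t, p, w in rows if t == 1 and p == 1)
    fp = sum(w for t, p, w in rows if t == 0 and p == 1)
    fn = sum(w for t, p, w in rows if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy series: value equals timestamp; the event of interest occurs at t = 9.
series = [(float(t), float(t)) for t in range(10)]
dataset = build_records(series, event_time=9.0)

# Hypothetical classifier: predict the event when the window maximum exceeds 5.
predictions = [int(r.features[1] > 5.0) for r in dataset]
# Assumed exponential decay: weight shrinks as the time lag grows.
weights = [math.exp(-0.5 * r.time_lag) for r in dataset]

print(weighted_prf([r.target for r in dataset], predictions, weights))
```

Passing the weights as a separate argument keeps the metric independent of the decay function, so a linear, exponential or step function can be swapped in without changing `weighted_prf`.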

Assignees

Inventors

Classifications

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title

  • Probabilistic graphical models, e.g. probabilistic networks · CPC title

  • Administration of product repair or maintenance · CPC title

  • Risk analysis of enterprise or organisation activities · CPC title

  • Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem" (market predictions or forecasting for commercial activities G06Q30/0202) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10885464B1 cover?
Relevance decay techniques are provided for time-based evaluation of machine learning applications and other classifiers. An exemplary method comprises obtaining time series measurement data; generating an input dataset comprising a plurality of records, wherein each record comprises features extracted from the time series measurement data, a target class corresponding to an event to be identif…
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06N20/20. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 5, 2021 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).