Dynamic and continuous composition of features extraction and learning operation tool for episodic industrial process

US12105772B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12105772-B2
Application number  US-202017109022-A
Country             US
Kind code           B2
Filing date         Dec 1, 2020
Priority date       Dec 1, 2020
Publication date    Oct 1, 2024
Grant date          Oct 1, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer implemented method of preparing process data for use in an artificial intelligence (AI) model includes collecting and storing raw data as episodic data for each episode of a process. An episode data generator assigns an episode identifier each set of episodic data. The raw data per episode is transformed into a standardized episodic data format that is usable by the AI model. Metrics are assigned to the episodic data and the episodic data is aggregated in an episode store. The data in the episode store is used by a feature extraction and learning module to extract and rank features.
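The data-preparation flow in the abstract (collect raw data per episode, assign an episode identifier, standardize, attach metrics, aggregate in a store) can be illustrated with a minimal Python sketch. This is not the patented implementation: the names `StandardizedEpisode`, `EpisodeStore`, `ingest`, and `assign_metrics`, the `(time_step, name, value)` raw-sample shape, and the one-row-per-time-step layout are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Tuple


@dataclass
class StandardizedEpisode:
    """Hypothetical standardized format: one row per time step."""
    episode_id: int
    rows: List[Dict[str, float]]
    metrics: Dict[str, float] = field(default_factory=dict)


class EpisodeStore:
    """Hypothetical episode store that aggregates standardized episodes."""

    def __init__(self) -> None:
        self._episodes: Dict[int, StandardizedEpisode] = {}
        self._ids = count(1)  # simple sequential episode identifiers

    def ingest(self, raw: List[Tuple[int, str, float]]) -> StandardizedEpisode:
        """Turn raw (time_step, name, value) samples into one stored episode."""
        episode_id = next(self._ids)
        by_step: Dict[int, Dict[str, float]] = {}
        for step, name, value in raw:
            row = by_step.setdefault(step, {"time_step": float(step)})
            row[name] = float(value)
        rows = [by_step[step] for step in sorted(by_step)]
        episode = StandardizedEpisode(episode_id, rows)
        self._episodes[episode_id] = episode
        return episode

    def assign_metrics(self, episode_id: int, **metrics: float) -> None:
        """Attach outcome metrics (e.g. lab or real-time) to an episode."""
        self._episodes[episode_id].metrics.update(metrics)

    def __len__(self) -> int:
        return len(self._episodes)


store = EpisodeStore()
episode = store.ingest([(1, "temp", 20.5), (0, "temp", 20.0), (0, "valve", 1)])
store.assign_metrics(episode.episode_id, yield_pct=97.2)
```

Out-of-order samples are sorted by time step during ingestion, so downstream feature extraction can assume a consistent row order in every episode.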

First claim

Opening claim text (preview).

What is claimed is:

1. A computer implemented method of preparing process data for use in an artificial intelligence (AI) model, the method executed by one or more processors of a computer, the method comprising: collecting and storing raw data as episodic data for each episode of a process; assigning an episode identifier each set of episodic data; transforming the raw data per episode into a format of standardized episodic data usable by the AI model; assigning metrics to the standardized episodic data; aggregating the standardized episodic data in an episode store; improving a processor cycle efficiency and decreasing storage requirements of the computer by: processing the standardized episodic data related to on arrival features as the data is received; processing the standardized episodic data related to borderline features when computation resource usage permits; and processing the standardized episodic data related to on demand features when needed by the AI model, and optimizing feature generation tasks when an application programming interface (API) call of the AI model is invoked.

2. The computer implemented method of claim 1, wherein the metrics assigned to the episodic data include lab-generated and real-time metrics.

3. The computer implemented method of claim 1, wherein the standardized episodic data format includes a table for a subprocess of the episode, the table including, for a time step of the subprocess, a control variable value associated with each time step and a sensor value associated with each time step.

4. The computer implemented method of claim 1, further comprising communicating an arrival of the standardized episodic data to a feature learning operations (FLOps) module.

5. The computer implemented method of claim 4, further comprising waiting for a number of episodes of the standardized episodic data until a predetermined number of outcomes of the assigned metrics fall below a quality threshold.

6. The computer implemented method of claim 4, further comprising assigning, by the FLOps module, features of the standardized episodic data to a feature tag selected from one of on arrival features, borderline features, on demand features, or blacklisted features.

7. The computer implemented method of claim 6, wherein the features are assigned the feature tags based on key performance indicators for measuring a priority of extraction.

8. The computer implemented method of claim 1, further comprising: determining a number of recent historical episodes from which features are to be extracted for different API calls; extracting features from selected episodes using the assigned feature tags; sending the extracted features to the AI model; and evaluating a performance of the AI model based on standardized episodic data related to the extracted features.

9. The computer implemented method of claim 8, further comprising: ranking and re-assigning the feature tags to features based on the key performance indicators for measuring priority of extraction; and evaluating the performance of the AI model based on standardized episodic data related to the re-assigned feature tags.

10. A computer implemented method for managing feature learning and extraction in an artificial intelligence (AI) lifecycle, the method executed by one or more processors of a computer comprising: improving a processor cycle efficiency and decreasing storage requirements of the computer by: ranking and assigning feature tags to features based on key performance indicators for measuring a priority of feature extraction, wherein the features are ranked according to a root cause analysis model; and storing the feature tags for reference during the feature extraction; extracting the features from episodes based on an application programming interface (API) request of an AI model using the stored feature tags; controlling a features library from which the features are selected for ranking and assigning feature tags; determining a most probable path through a root cause analysis model decision tree; and calculating the features in the most probable path.

11. A non-transitory computer readable storage medium tangibly embodying a computer readable program code having computer readable instructions that, when executed, causes a computer device to carry out a method of preparing process data for use in an artificial intelligence (AI) model, the method comprising: collecting and storing raw data as episodic data for each episode of a process; assigning an episode identifier each set of episodic data; transforming the raw data per episode into a format of standardized episodic data usable by the AI model; assigning metrics to the standardized episodic data; aggregating the standardized episodic data in an episode store; and improving a processor cycle efficiency and decreasing storage requirements of the computer device by: processing the standardized episodic data related to on arrival features as the raw data is received; processing the standardized episodic data related to borderline features when computation resource usage permits; and processing the standardized episodic data related to on demand features when needed by the AI model; and optimizing feature generation tasks when an application programming interface (API) call of the AI model is invoked.

12. The non-transitory computer readable storage medium of claim 11, wherein the standardized episodic data format includes a table for each subprocess of the episode, the table including, for each time step of the subprocess, each control variable value associated with each time step and each sensor value associated with each time step.

13. The non-transitory computer readable storage medium of claim 11, wherein the execution of the code by the processor further configures the computing device to perform acts comprising: communicating the arrival of data to a feature learning operations (FLOps) module; and assigning, by the FLOps module, features of the standardized episodic data to a feature tag selected from one of on arrival features, borderline features, on demand features, or blacklisted features.

14. The non-transitory computer readable storage medium of claim 11, wherein the features are assigned the feature tags based on key performance indicators for measuring a priority of feature extraction.
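The tiered processing recited in the claims (on arrival features computed as data is received, borderline features when resources permit, on demand features only when the AI model asks, blacklisted features never) might be sketched as follows. Everything here is a hypothetical illustration rather than the claimed method itself: the `Tag` enum, the `FeatureScheduler` class, the example feature library, and the use of a simple task budget as a stand-in for "computation resource usage permits" are all assumptions.

```python
from collections import deque
from enum import Enum
from statistics import mean


class Tag(Enum):
    ON_ARRIVAL = "on_arrival"
    BORDERLINE = "borderline"
    ON_DEMAND = "on_demand"
    BLACKLISTED = "blacklisted"


# Hypothetical feature library: name -> (tag, function over sensor values).
FEATURES = {
    "mean_temp": (Tag.ON_ARRIVAL, mean),
    "max_temp": (Tag.BORDERLINE, max),
    "temp_range": (Tag.ON_DEMAND, lambda v: max(v) - min(v)),
    "first_temp": (Tag.BLACKLISTED, lambda v: v[0]),
}


class FeatureScheduler:
    def __init__(self, features):
        self.features = features
        self.episodes = {}       # episode_id -> list of sensor values
        self.cache = {}          # (episode_id, feature name) -> value
        self.deferred = deque()  # borderline work waiting for spare capacity

    def _compute(self, episode_id, name):
        key = (episode_id, name)
        if key not in self.cache:  # never recompute a cached feature
            _, fn = self.features[name]
            self.cache[key] = fn(self.episodes[episode_id])
        return self.cache[key]

    def on_arrival(self, episode_id, values):
        """Compute on-arrival features now; queue borderline ones."""
        self.episodes[episode_id] = list(values)
        for name, (tag, _) in self.features.items():
            if tag is Tag.ON_ARRIVAL:
                self._compute(episode_id, name)
            elif tag is Tag.BORDERLINE:
                self.deferred.append((episode_id, name))

    def run_when_idle(self, budget):
        """Process up to `budget` queued borderline tasks; return the count."""
        done = 0
        while self.deferred and done < budget:
            self._compute(*self.deferred.popleft())
            done += 1
        return done

    def request(self, episode_id, names):
        """Serve an API-style request, skipping blacklisted features."""
        return {
            name: self._compute(episode_id, name)
            for name in names
            if self.features[name][0] is not Tag.BLACKLISTED
        }


scheduler = FeatureScheduler(FEATURES)
scheduler.on_arrival(1, [20.0, 22.0, 24.0])
scheduler.run_when_idle(budget=5)
features = scheduler.request(1, ["temp_range", "first_temp", "mean_temp"])
```

In this sketch, re-tagging a feature (as in claims 9 and 10) would amount to updating its entry in the library; the cache keeps already-computed values so a tag change does not force recomputation.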

Assignees

IBM

Inventors

Classifications

  • G06N20/00 (Primary): Machine learning

  • by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation

  • via adapters, e.g. between incompatible applications

  • characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12105772B2 cover?
A computer implemented method of preparing process data for use in an artificial intelligence (AI) model includes collecting and storing raw data as episodic data for each episode of a process. An episode data generator assigns an episode identifier each set of episodic data. The raw data per episode is transformed into a standardized episodic data format that is usable by the AI model. Metrics…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 1, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).