Natural language keyword tag extraction
US-2021056114-A1 · Feb 25, 2021 · US
US12105772B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12105772-B2 |
| Application number | US-202017109022-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 1, 2020 |
| Priority date | Dec 1, 2020 |
| Publication date | Oct 1, 2024 |
| Grant date | Oct 1, 2024 |
Official abstract text for this publication.
A computer implemented method of preparing process data for use in an artificial intelligence (AI) model includes collecting and storing raw data as episodic data for each episode of a process. An episode data generator assigns an episode identifier to each set of episodic data. The raw data per episode is transformed into a standardized episodic data format that is usable by the AI model. Metrics are assigned to the episodic data and the episodic data is aggregated in an episode store. The data in the episode store is used by a feature extraction and learning module to extract and rank features.
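The data-preparation flow in the abstract (assign an episode identifier, standardize the raw data per episode, attach metrics, aggregate in an episode store) can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the names `Episode`, `standardize`, and `EpisodeStore`, the use of a random UUID as episode identifier, and the split of each row into control and sensor values by a caller-supplied key list are all assumptions made for the example.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Episode:
    """Hypothetical standardized episode: one entry per time step."""
    episode_id: str
    steps: List[dict]
    metrics: Dict[str, float] = field(default_factory=dict)


def standardize(raw_rows: List[dict], control_keys: List[str]) -> Episode:
    """Convert raw per-time-step readings into a standardized episode.

    Assumption: keys listed in control_keys are control variables and
    every other key is a sensor value.
    """
    steps = []
    for t, row in enumerate(raw_rows):
        controls = {k: v for k, v in row.items() if k in control_keys}
        sensors = {k: v for k, v in row.items() if k not in control_keys}
        steps.append({"t": t, "controls": controls, "sensors": sensors})
    # Assumption: a random UUID stands in for the episode identifier.
    return Episode(episode_id=uuid.uuid4().hex, steps=steps)


class EpisodeStore:
    """Hypothetical in-memory episode store keyed by episode identifier."""

    def __init__(self) -> None:
        self._episodes: Dict[str, Episode] = {}

    def add(self, episode: Episode) -> None:
        self._episodes[episode.episode_id] = episode

    def assign_metric(self, episode_id: str, name: str, value: float) -> None:
        self._episodes[episode_id].metrics[name] = value

    def recent(self, n: int) -> List[Episode]:
        # dicts preserve insertion order, so the last n values are the newest.
        return list(self._episodes.values())[-n:]
```

A caller would standardize each batch of raw rows, add the resulting episode to the store, and attach metrics (for example a lab-generated quality score) once they become available.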
Opening claim text (preview).
What is claimed is:
1. A computer implemented method of preparing process data for use in an artificial intelligence (AI) model, the method executed by one or more processors of a computer, the method comprising: collecting and storing raw data as episodic data for each episode of a process; assigning an episode identifier each set of episodic data; transforming the raw data per episode into a format of standardized episodic data usable by the AI model; assigning metrics to the standardized episodic data; aggregating the standardized episodic data in an episode store; improving a processor cycle efficiency and decreasing storage requirements of the computer by: processing the standardized episodic data related to on arrival features as the data is received; processing the standardized episodic data related to borderline features when computation resource usage permits; and processing the standardized episodic data related to on demand features when needed by the AI model, and optimizing feature generation tasks when an application programming interface (API) call of the AI model is invoked.
2. The computer implemented method of claim 1, wherein the metrics assigned to the episodic data include lab-generated and real-time metrics.
3. The computer implemented method of claim 1, wherein the standardized episodic data format includes a table for a subprocess of the episode, the table including, for a time step of the subprocess, a control variable value associated with each time step and a sensor value associated with each time step.
4. The computer implemented method of claim 1, further comprising communicating an arrival of the standardized episodic data to a feature learning operations (FLOps) module.
5. The computer implemented method of claim 4, further comprising waiting for a number of episodes of the standardized episodic data until a predetermined number of outcomes of the assigned metrics fall below a quality threshold.
6. The computer implemented method of claim 4, further comprising assigning, by the FLOps module, features of the standardized episodic data to a feature tag selected from one of on arrival features, borderline features, on demand features, or blacklisted features.
7. The computer implemented method of claim 6, wherein the features are assigned the feature tags based on key performance indicators for measuring a priority of extraction.
8. The computer implemented method of claim 1, further comprising: determining a number of recent historical episodes from which features are to be extracted for different API calls; extracting features from selected episodes using the assigned feature tags; sending the extracted features to the AI model; and evaluating a performance of the AI model based on standardized episodic data related to the extracted features.
9. The computer implemented method of claim 8, further comprising: ranking and re-assigning the feature tags to features based on the key performance indicators for measuring priority of extraction; and evaluating the performance of the AI model based on standardized episodic data related to the re-assigned feature tags.
10. A computer implemented method for managing feature learning and extraction in an artificial intelligence (AI) lifecycle, the method executed by one or more processors of a computer comprising: improving a processor cycle efficiency and decreasing storage requirements of the computer by: ranking and assigning feature tags to features based on key performance indicators for measuring a priority of feature extraction, wherein the features are ranked according to a root cause analysis model; and storing the feature tags for reference during the feature extraction; extracting the features from episodes based on an application programming interface (API) request of an AI model using the stored feature tags; controlling a features library from which the features are selected for ranking and assigning feature tags; determining a most probable path through a root cause analysis model decision tree; and calculating the features in the most probable path.
11. A non-transitory computer readable storage medium tangibly embodying a computer readable program code having computer readable instructions that, when executed, causes a computer device to carry out a method of preparing process data for use in an artificial intelligence (AI) model, the method comprising: collecting and storing raw data as episodic data for each episode of a process; assigning an episode identifier each set of episodic data; transforming the raw data per episode into a format of standardized episodic data usable by the AI model; assigning metrics to the standardized episodic data; aggregating the standardized episodic data in an episode store; and improving a processor cycle efficiency and decreasing storage requirements of the computer device by: processing the standardized episodic data related to on arrival features as the raw data is received; processing the standardized episodic data related to borderline features when computation resource usage permits; and processing the standardized episodic data related to on demand features when needed by the AI model; and optimizing feature generation tasks when an application programming interface (API) call of the AI model is invoked.
12. The non-transitory computer readable storage medium of claim 11, wherein the standardized episodic data format includes a table for each subprocess of the episode, the table including, for each time step of the subprocess, each control variable value associated with each time step and each sensor value associated with each time step.
13. The non-transitory computer readable storage medium of claim 11, wherein the execution of the code by the processor further configures the computing device to perform acts comprising: communicating the arrival of data to a feature learning operations (FLOps) module; and assigning, by the FLOps module, features of the standardized episodic data to a feature tag selected from one of on arrival features, borderline features, on demand features, or blacklisted features.
14. The non-transitory computer readable storage medium of claim 11, wherein the features are assigned the feature tags based on key performance indicators for measuring a priority of feature extraction.
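The tiered processing recited in claims 1, 6, 11, and 13 (on arrival features computed as data is received, borderline features when resources permit, on demand features when the AI model needs them, blacklisted features never) might be sketched as below. This is a minimal sketch under stated assumptions, not the claimed implementation: `FeatureTag`, `FeatureScheduler`, the three trigger methods, and the idea that an external caller signals idle time are all hypothetical, and the claims do not specify how resource availability is detected or how tags are chosen.

```python
from enum import Enum
from typing import Any, Callable, Dict, List, Tuple


class FeatureTag(Enum):
    ON_ARRIVAL = "on_arrival"
    BORDERLINE = "borderline"
    ON_DEMAND = "on_demand"
    BLACKLISTED = "blacklisted"


class FeatureScheduler:
    """Hypothetical scheduler that computes features according to their tag."""

    def __init__(
        self,
        features: Dict[str, Callable[[Any], float]],
        tags: Dict[str, FeatureTag],
    ) -> None:
        self.features = features  # feature name -> function(episode) -> value
        self.tags = tags          # feature name -> assigned tag
        self.cache: Dict[str, Dict[str, float]] = {}  # episode id -> values
        self.pending: List[Tuple[str, Any]] = []      # awaiting borderline work

    def _compute(self, episode_id: str, episode: Any, tag: FeatureTag) -> None:
        values = self.cache.setdefault(episode_id, {})
        for name, fn in self.features.items():
            # Skip features with a different tag or already computed values.
            if self.tags[name] is tag and name not in values:
                values[name] = fn(episode)

    def on_episode_arrival(self, episode_id: str, episode: Any) -> None:
        # On arrival features are computed immediately.
        self._compute(episode_id, episode, FeatureTag.ON_ARRIVAL)
        self.pending.append((episode_id, episode))

    def on_idle(self) -> None:
        # Assumption: the caller invokes this when spare compute is available.
        while self.pending:
            episode_id, episode = self.pending.pop()
            self._compute(episode_id, episode, FeatureTag.BORDERLINE)

    def on_api_call(self, episode_id: str, episode: Any) -> Dict[str, float]:
        # On demand features are computed only when the model asks for them.
        self._compute(episode_id, episode, FeatureTag.ON_DEMAND)
        return dict(self.cache[episode_id])
```

Because blacklisted features match none of the three triggers, they are never computed or stored, which is one plausible way to obtain the processor and storage savings the claims describe.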
by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation · CPC title
Machine learning · CPC title
via adapters, e.g. between incompatible applications · CPC title
characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling · CPC title
Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title