What technology area does this patent fall under?

Primary CPC classification G06N5/01. Mapped technology areas include Physics.

When was this patent published?

Published Jan 7, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Human-in-the-loop interactive model training

US12191007B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12191007-B2
Application number	US-201716618656-A
Country	US
Kind code	B2
Filing date	Sep 29, 2017
Priority date	Aug 30, 2017
Publication date	Jan 7, 2025
Grant date	Jan 7, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Example embodiments relate to a method for training a predictive model from data. The method includes defining a multitude of predicates as binary functions operating on time sequences of the features or logical operations on the time sequences of the features. The method also includes iteratively training a boosting model by generating a number of new random predicates, scoring all the new random predicates by weighted information gain with respect to a class label associated with a prediction of the boosting model, selecting a number of the new random predicates with the highest weighted information gain and adding them to the boosting model, computing weights for all the predicates in the boosting model, removing one or more of the selected new predicates with the highest information gain from the boosting model in response to input from an operator. The method may include repeating the prior steps a plurality of times.
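The loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the threshold predicates on single numbers, the uniform example weights, the `operator_rejects` callback, and every function and parameter name are hypothetical. The abstract does not say how predicate weights are computed, so that step is left as a comment rather than guessed.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

# A predicate is a (readable name, binary function) pair. Hypothetical representation.
Predicate = Tuple[str, Callable[[float], int]]


def weighted_entropy(labels: Sequence[int], weights: Sequence[float]) -> float:
    """Binary entropy (in bits) of the labels, with each example counted by its weight."""
    total = sum(weights)
    if total == 0:
        return 0.0
    positive = sum(w for y, w in zip(labels, weights) if y == 1) / total
    entropy = 0.0
    for p in (positive, 1.0 - positive):
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


def weighted_information_gain(outputs, labels, weights) -> float:
    """Label entropy minus the weight-averaged entropy of the two predicate branches."""
    total = sum(weights)
    gain = weighted_entropy(labels, weights)
    for branch in (0, 1):
        idx = [i for i, o in enumerate(outputs) if o == branch]
        branch_weights = [weights[i] for i in idx]
        branch_labels = [labels[i] for i in idx]
        gain -= (sum(branch_weights) / total) * weighted_entropy(branch_labels, branch_weights)
    return gain


def random_threshold_predicate(rng: random.Random) -> Predicate:
    """Toy stand-in for a random predicate: 'x > t' for a random integer threshold t."""
    t = rng.randint(0, 9)
    return (f"x > {t}", lambda x, t=t: int(x > t))


def train(examples, labels, operator_rejects, rounds=3, n_new=5, n_keep=1, seed=0):
    rng = random.Random(seed)
    example_weights = [1.0] * len(examples)  # assumption: uniform weights throughout
    model: List[Predicate] = []
    for _ in range(rounds):
        # Generate new random predicates and score them by weighted information gain.
        candidates = [random_threshold_predicate(rng) for _ in range(n_new)]
        candidates.sort(
            key=lambda p: weighted_information_gain(
                [p[1](x) for x in examples], labels, example_weights
            ),
            reverse=True,
        )
        # Select the highest-scoring predicates and add them to the model.
        selected = candidates[:n_keep]
        model.extend(selected)
        # Computing weights for all predicates in the model would happen here;
        # the method is not given on this page, so it is omitted from the sketch.
        # Remove any newly selected predicate the operator rejects.
        rejected = [p for p in selected if operator_rejects(p[0])]
        model = [p for p in model if all(p is not r for r in rejected)]
    return model


xs = [1, 2, 3, 7, 8, 9]
ys = [0, 0, 0, 1, 1, 1]
accepted = train(xs, ys, operator_rejects=lambda name: False)
print([name for name, _ in accepted])
```

In an interactive setting, `operator_rejects` would be backed by a user interface rather than a lambda; the callback only marks where the human decision enters the loop.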

First claim

Opening claim text (preview).

We claim:

1. A computer-implemented method of training a predictive model from data comprising a multitude of features, each feature associated with a real value and a time component, comprising the steps of executing the following instructions in a processor of the computer: a) defining a multitude of predicates as binary functions operating on time sequences of the features or logical operations on the time sequences of the features; b) iteratively training a boosting model by performing the following: 1) Generating a number of new random predicates as binary functions operating on at least one of (i) time sequences of the features or (ii) logical operations on the time sequences of the features; 2) Scoring all the new random predicates by weighted information gain with respect to a class label associated with a prediction of the boosting model; 3) Selecting, from the new random predicates, a number of the new random predicates that are the highest with respect to their weighted information gain scores and adding them to the boosting model; 4) Computing weights for all the predicates in the boosting model; 5) Removing one or more of the selected number of the new random predicates from the boosting model in response to input from an operator; and 6) Repeating the performance of steps 1, 2, 3, 4 and 5 a plurality of times and thereby generating a final iteratively trained boosting model.

2. The method of claim 1, further comprising the step of c) evaluating the final iteratively trained boosting model.

3. The method of claim 2, wherein the evaluation step (c) comprises evaluating the final iteratively trained boosting model for at least one of accuracy, complexity, or trustworthiness.

4. The method of claim 1, wherein the data is in a tuple format of the type {X, x_i, t_i} where X is the name of feature, x_i is a real value of the feature and t_i is a time component for the real value x_i, and wherein the predicates are defined as binary functions operating on at least one of (i) sequences of tuples or (ii) logical operations on sequences of the tuples.

5. The method of claim 4, wherein the sequences of tuples are defined by time periods selected from the group consisting of 1 or more days, 1 or more hours, 1 or more minutes, or 1 or more months.

6. The method of claim 1, wherein the data comprises electronic health record data for a multitude of patients.

7. The method of claim 1, wherein the method further comprises the step of dividing the predicates into groups based on understandability, namely a first group of relatively more human understandable predicates and a second group of relatively less human understandable predicates and wherein the new random predicates are selected from the first group.

8. The method of claim 7, wherein the data comprises electronic health record data for a multitude of patients, and wherein the set of predicates are represented in a manner to show the subject matter or source within the electronic health record data of the predicates.

9. The method of claim 8, wherein the predicates comprise an existence predicate returning a result of 0 or 1 depending on whether a feature exists in the electronic health record data for a given patient in the multitude of patients; and a counts predicate returning a result of 0 or 1 depending on the number of counts of a feature in the electronic health record data for a given patient in the multitude of patients relative to a numeric parameter C.

10. The method of claim 1, wherein step b) 5) further comprises the step of graphically representing the predicates currently in the boosting model and providing the operator with the ability to remove one or more of the predicates.

11. The method of claim 10, further comprising the step of graphically representing the weights computed for each of the predicates in step b) 4).

12. The method of claim 1, further comprising the step of graphically representing a set of predicates added to the boosting model after each of the iterations of step b) 6).

13. The method of claim 1, wherein step b) further comprises the step of providing the operator with the ability to define a predicate during model training.

14. The method of claim 1, wherein step b) further comprises the step of removing redundant predicates.

15. The method of claim 1, further comprising the step of ranking the predicates selected in step b) 3).

16. The method of claim 1, further comprising the step of generating statistics of predicates in the boosting model and presenting them to the operator.

17. The method of claim 1, wherein in step b) 5) the one or more predicates are removed which are not causally related to the prediction of the boosting model.

18. A computer-implemented method of training a predictive model from electronic health record data for a multitude of patients, the data comprising a multitude of features, each feature associated with real values and a time component, wherein the data is in a tuple format of the type {X, x_i, t_i} where X is the name of feature, x_i is a real value of the feature and t_i is a time component for the real value x_i, comprising the steps of implementing the following instructions in a processor of the computer: a) defining a multitude of predicates as at least one of (i) binary functions operating on sequences of the tuples or (ii) logical operations on the sequences of the tuples; b) dividing the multitude of predicates into groups based on understandability, namely a first group of relatively more human understandable predicates and a second group of relatively less human understandable predicates; c) iteratively training a boosting model by performing the following: 1) Generating a number of new random predicates from the first group of predicates as binary functions operating on at least one of (i) sequences of the tuples or (ii) logical operations on the sequences of the tuples; 2) Scoring all the new random predicates by weighted information gain with respect to a class label associated with a prediction of the boosting model; 3) Selecting, from the new random predicates, a number of the new random predicates that are the highest with respect to their weighted information gain scores and adding them to the boosting model; 4) Computing weights for all the predicates in the boosting model; 5) Removing one or more of the selected number of the new random predicates from the boosting model in response to input from an operator; and 6) Repeating the performance of steps 1, 2, 3, 4 and 5 a plurality of times and thereby generating a final iteratively trained boosting model.

19. The method of claim 18, further comprising the step d) of evaluating the final iteratively trained boosting model.

20. A workstation for providing operator input into iteratively training a boosting model, wherein the workstation comprises an interface and a processor, and wherein the processor is configured to perform operations comprising: 1) Generating a number of new random predicates as binary functions operating on at least one of (i) time sequences of input features or (ii) logical operations on the time sequences of the input features; 2) Scoring all the new random predicates by weighted information gain with respect to a class label associated with a prediction of the boosting model; 3) Selecting, from the new random predicates, a number of the new random predicates that are the highest with respect to their weighted information gain scores and adding them to the boosting
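To make the tuple format of claim 4 and the two predicate types of claim 9 concrete, here is a small sketch. It is illustrative only: the `Observation` type, the helper names, the feature names, and the sample values are hypothetical, and the claims say only that the counts predicate compares the count "relative to a numeric parameter C", so the `>=` comparison below is an assumption.

```python
from typing import Callable, List, NamedTuple


class Observation(NamedTuple):
    """One {X, x_i, t_i} tuple: feature name, real value, time component."""
    name: str     # X
    value: float  # x_i
    time: float   # t_i (the unit, hours here, is an assumption)


# A predicate maps one patient's sequence of observations to 0 or 1.
Predicate = Callable[[List[Observation]], int]


def exists(feature: str) -> Predicate:
    """Existence predicate: 1 if the feature appears anywhere in the record."""
    def predicate(record: List[Observation]) -> int:
        return int(any(obs.name == feature for obs in record))
    return predicate


def count_at_least(feature: str, c: int) -> Predicate:
    """Counts predicate: 1 if the feature occurs at least c times (assumed comparison)."""
    def predicate(record: List[Observation]) -> int:
        return int(sum(obs.name == feature for obs in record) >= c)
    return predicate


# Made-up record for a single hypothetical patient.
record = [
    Observation("heart_rate", 88.0, 1.0),
    Observation("heart_rate", 112.0, 5.0),
    Observation("lactate", 2.4, 6.0),
]

print(exists("lactate")(record))
print(count_at_least("heart_rate", 3)(record))
```

Because each predicate is a plain function with a readable definition, an operator can inspect what it tests before deciding whether to keep it, which is the property the understandability grouping in claim 7 relies on.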

Assignees

Google LLC

Inventors

Classifications

G06N20/00
Machine learning · CPC title
G06N3/042
Knowledge-based neural networks; Logical representations of neural networks · CPC title
G06F7/00
Methods or arrangements for processing data by operating upon the order or content of the data handled (logic circuits H03K19/00) · CPC title
G06N5/045
Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence · CPC title
G06N3/08
Learning methods · CPC title

Patent family

Related publications grouped by family.

View patent family 65525997

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12191007B2 cover?: Example embodiments relate to a method for training a predictive model from data. The method includes defining a multitude of predicates as binary functions operating on time sequences of the features or logical operations on the time sequences of the features. The method also includes iteratively training a boosting model by generating a number of new random predicates, scoring all the new ran…
Who is the assignee on this patent?: Google LLC

Related patents

Search item generation method and related device

System and method for visual correlation of digital images

Optimized training of linear machine learning models

Interactive interfaces for machine learning model evaluations

Method for in-loop human validation of disambiguated features

Automated license plate recognition system and method using human-in-the-loop based adaptive learning