Parallel ensemble of machine learning algorithms

US11443244B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11443244-B2
Application number: US-201916431802-A
Country: US
Kind code: B2
Filing date: Jun 5, 2019
Priority date: Jun 5, 2019
Publication date: Sep 13, 2022
Grant date: Sep 13, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An aspect of the invention includes receiving machine learning (ML) training data that includes a plurality of features for a plurality of observations. The ML training data is broken into a plurality of non-overlapping subsets of features and observations. A first ML algorithm is trained based on a first subset of the features and observations, and a second ML algorithm is trained based on a second subset of the features and observations. The training of the first ML algorithm overlaps in time with the training of the second ML algorithm. The first and second ML algorithms are tested. Either the first or second ML algorithm is selected based at least in part on results of the testing. The selected ML algorithm is retained as a trained ML algorithm for predicting one or more of the plurality of features based on one or more others of the plurality of features.
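The flow described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the two candidate algorithms (a mean predictor and a one-feature least-squares line), the held-out test slice, the interleaved split, and every function name are hypothetical choices made for the example. Threads are used so that the two training calls overlap in time; a process pool would be the usual choice for CPU-bound training.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def make_data(n=200, seed=0):
    """Synthetic observations (x, y) with y roughly 3x + 1 (assumed toy data)."""
    rng = random.Random(seed)
    return [(x, 3.0 * x + 1.0 + rng.gauss(0, 0.5))
            for x in (rng.uniform(0, 10) for _ in range(n))]


def train_mean(rows):
    """Candidate 1: always predict the mean target of its training subset."""
    mean_y = sum(y for _, y in rows) / len(rows)
    return lambda x: mean_y


def train_line(rows):
    """Candidate 2: ordinary least-squares line fitted on its training subset."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def mse(model, rows):
    """Mean squared error of a model on a set of observations."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)


def select_best(rows, trainers, n_test=50):
    """Train each candidate on its own disjoint subset, test, and pick one."""
    # Assumption: the first n_test observations are held out for testing.
    test, train = rows[:n_test], rows[n_test:]
    k = len(trainers)
    # Interleaved slices never share an observation, so the subsets
    # are non-overlapping.
    subsets = [train[i::k] for i in range(k)]
    # Submitting all jobs before collecting results lets training overlap in time.
    with ThreadPoolExecutor(max_workers=k) as pool:
        futures = [pool.submit(fn, subset)
                   for fn, subset in zip(trainers.values(), subsets)]
        models = [f.result() for f in futures]
    errors = {name: mse(m, test) for name, m in zip(trainers, models)}
    best = min(errors, key=errors.get)  # select on the test error level
    return best, errors, dict(zip(trainers, models))


if __name__ == "__main__":
    best, errors, models = select_best(
        make_data(), {"mean": train_mean, "line": train_line})
    print("retained algorithm:", best)
```

The retained model is then the one used for prediction; here it predicts one feature (`y`) from another (`x`), mirroring the last sentence of the abstract.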

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method comprising: receiving machine learning (ML) training data comprising a plurality of features for a plurality of observations; breaking the ML training data into a plurality of non-overlapping subsets of features and observations; training a first ML algorithm based on a first subset of the features and observations, and a second ML algorithm based on a second subset of the features and observations, the training of the first ML algorithm overlapping in time with the training of the second ML algorithm; testing the first ML algorithm and the second ML algorithm; selecting one of the first ML algorithm and the second ML algorithm based at least in part on results of the testing; and retaining the selected ML algorithm as a trained ML algorithm for predicting one or more of the plurality of features based on one or more other features of the other plurality of features.
2. The computer-implemented method of claim 1, wherein the results of the testing include an error level.
3. The computer-implemented method of claim 1, wherein the first and second subset include a same subset of the plurality of features.
4. The computer-implemented method of claim 3, wherein the training, testing, selecting, and retaining are repeated for multiple different subsets of the plurality of features.
5. The computer-implemented method of claim 3, wherein the training, testing, selecting, and retaining are repeated for all subsets of the plurality of features.
6. The computer-implemented method of claim 1, wherein the testing the first ML algorithm overlaps in time with the testing of the second ML algorithm.
7. The computer-implemented method of claim 1, wherein the training and testing are ML algorithm agnostic.
8. The computer-implemented method of claim 1, wherein the training and testing are repeated until a user defined error threshold is reached.
9. A system comprising: a memory having computer readable instructions; and one or more processors for executing the computer readable instructions, the computer readable instructions controlling the one or more processors to perform operations comprising: receiving machine learning (ML) training data comprising a plurality of features for a plurality of observations; breaking the ML training data into a plurality of non-overlapping subsets of features and observations; training a first ML algorithm based on a first subset of the features and observations, and a second ML algorithm based on a second subset of the features and observations, the training of the first ML algorithm overlapping in time with the training of the second ML algorithm; testing the first ML algorithm and the second ML algorithm; selecting one of the first ML algorithm and the second ML algorithm based at least in part on results of the testing; and retaining the selected ML algorithm as a trained ML algorithm for predicting one or more of the plurality of features based on one or more other features of the other plurality of features.
10. The system of claim 9, wherein the results of the testing include an error level.
11. The system of claim 9, wherein the first and second subset include a same subset of the plurality of features.
12. The system of claim 11, wherein the training, testing, selecting, and retaining are repeated for multiple different subsets of the plurality of features.
13. The system of claim 12, wherein the training, testing, selecting, and retaining are repeated for all subsets of the plurality of features.
14. The system of claim 12, wherein the testing the first ML algorithm overlaps in time with the testing of the second ML algorithm.
15. The system of claim 9, wherein the training and testing are ML algorithm agnostic.
16. The system of claim 9, wherein the training and testing are repeated until a user defined error threshold is reached.
17. A computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform operations comprising: receiving machine learning (ML) training data comprising a plurality of features for a plurality of observations; breaking the ML training data into a plurality of non-overlapping subsets of features and observations; training a first ML algorithm based on a first subset of the features and observations, and a second ML algorithm based on a second subset of the features and observations, the training of the first ML algorithm overlapping in time with the training of the second ML algorithm; testing the first ML algorithm and the second ML algorithm; selecting one of the first ML algorithm and the second ML algorithm based at least in part on results of the testing; and retaining the selected ML algorithm as a trained ML algorithm for predicting one or more of the plurality of features based on one or more other features of the other plurality of features.
18. The computer program product of claim 17, wherein the results of the testing include an error level.
19. The computer program product of claim 17, wherein the first and second subset include a same subset of the plurality of features.
20. The computer program product of claim 19, wherein the training, testing, selecting, and retaining are repeated for multiple different subsets of the plurality of features.
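The dependent claims add two loops around the basic method: repeating it over multiple (or all) subsets of the features, and repeating until a user-defined error threshold is reached. The sketch below shows one way those loops might be combined. It is an illustration only: the smallest-first enumeration order, the early stop, and the `evaluate` callback (which stands in for one full train, test, and select pass) are assumptions, and `toy_evaluate` is a made-up stand-in that returns fabricated errors purely to exercise the loop.

```python
from itertools import combinations


def feature_subsets(features):
    """Yield every non-empty subset of the feature names, smallest first."""
    for size in range(1, len(features) + 1):
        yield from combinations(features, size)


def search(features, evaluate, error_threshold):
    """Repeat train/test/select per feature subset until the error is low enough.

    `evaluate` is a hypothetical callback: given one feature subset, it trains
    and tests the candidate algorithms and returns (algorithm_name, error).
    Returns (subset, algorithm_name, error) for the best result seen.
    """
    best = None
    for subset in feature_subsets(features):
        name, error = evaluate(subset)
        if best is None or error < best[2]:
            best = (subset, name, error)
        if error <= error_threshold:
            break  # user-defined threshold reached; stop repeating
    return best


def toy_evaluate(subset):
    """Made-up evaluator: pretends error shrinks as more features are used."""
    return "line", 1.0 / len(subset)


if __name__ == "__main__":
    print(search(["a", "b", "c"], toy_evaluate, error_threshold=0.5))
```

With three features the loop stops at the first two-feature subset, because that is the first pass whose error meets the 0.5 threshold. Setting the threshold to 0 would force the loop through all seven subsets, which corresponds to the "all subsets" variant.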

Assignees

Inventors

Classifications

  • G06N20/20 (Primary)

    CPC title: Ensemble learning

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11443244B2 cover?
An aspect of the invention includes receiving machine learning (ML) training data that includes a plurality of features for a plurality of observations. The ML training data is broken into a plurality of non-overlapping subsets of features and observations. A first ML algorithm is trained based on a first subset of the features and observations, and a second ML algorithm is trained based on a s…
Who is the assignee on this patent?
IBM (International Business Machines Corporation)
What technology area does this patent fall under?
Primary CPC classification G06N20/20. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 13, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).