Training diverse and robust ensembles of artificial intelligence computer models

US11783025B2 · US · B2

Patent metadata
Publication number: US-11783025-B2
Application number: US-202016816625-A
Country: US
Kind code: B2
Filing date: Mar 12, 2020
Priority date: Mar 12, 2020
Publication date: Oct 10, 2023
Grant date: Oct 10, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Mechanisms are provided to implement a hardened ensemble artificial intelligence (AI) model generator. The hardened ensemble AI model generator co-trains at least two AI models. The hardened ensemble AI model generator modifies, based on a comparison of the at least two AI models, a loss surface of one or more of the at least two AI models to prevent an adversarial attack on one AI model, in the at least two AI models, transferring to another AI model in the at least two AI models, to thereby generate one or more modified AI models. At least one of the one or more modified AI models then processes an input to generate an output result.
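The comparison step in the abstract can be illustrated with a minimal sketch. It assumes, following the first claim, that the comparison is a similarity measure between the two models' gradients checked against a threshold, and that the loss surface is modified by adding a regularizer term to the loss function. The function names, the plain-list gradient representation, and the specific penalty form (regularizer strength times cosine similarity) are assumptions for illustration, not the patent's stated implementation.

```python
import math


def cosine_similarity(g1, g2):
    """Cosine similarity between two gradient vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(g1, g2))
    norm1 = math.sqrt(sum(a * a for a in g1))
    norm2 = math.sqrt(sum(b * b for b in g2))
    if norm1 == 0.0 or norm2 == 0.0:
        # Assumption: a zero gradient is treated as having no similarity.
        return 0.0
    return dot / (norm1 * norm2)


def needs_modification(g1, g2, threshold):
    """True when similarity is equal to or greater than the threshold."""
    return cosine_similarity(g1, g2) >= threshold


def regularized_loss(task_loss, g1, g2, strength):
    """Hypothetical regularized loss: the task loss plus a penalty that
    grows with gradient similarity, scaled by the regularizer strength."""
    return task_loss + strength * cosine_similarity(g1, g2)


if __name__ == "__main__":
    grad_a = [1.0, 2.0]
    grad_b = [2.0, 4.0]  # same direction as grad_a
    if needs_modification(grad_a, grad_b, threshold=0.9):
        print(regularized_loss(0.5, grad_a, grad_b, strength=0.1))
```

In this sketch a larger strength value pushes the two models' gradients further apart at the cost of a higher total loss, which mirrors the trade-off between loss-surface distance and accuracy described in the claims.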

First claim

Opening claim text (preview).

What is claimed is:

1. A method, in a data processing system comprising a processor and a memory, the memory comprising instructions which are executed by the processor to specifically configure the processor to implement a hardened ensemble artificial intelligence (AI) model generator, the method comprising: co-training, by the hardened ensemble AI model generator, at least two AI models, of an ensemble AI model computing system, on a same domain of training data; determining, during co-training of the at least two AI models, a similarity measure between a first gradient of a first AI model in the at least two AI models, and a second gradient of a second AI model in the at least two AI models; in response to the similarity measure between the first gradient and the second gradient being equal to or greater than a predetermined threshold similarity measure, determining that a loss surface of one of the first AI model or the second AI model is to be modified; in response to determining that the similarity measure is equal to or greater than the predetermined threshold, modifying, by the hardened ensemble AI model generator a loss surface of one of the first AI model or the second AI model, in the at least two AI models, to control a distance between the loss surfaces of the first AI model and the second AI model and prevent an adversarial attack on one AI model transferring to another AI model in the at least two AI models, and thereby generate a modified ensemble of AI models computing system; and processing, by the modified ensemble of AI models computing system, an input to generate an output result, wherein at least one of the AI models in the modified ensemble of AI models computing system generates a correct output for the output result when the input is an adversarial attack on the AI models.

2. The method of claim 1, wherein the similarity measure between the gradients of the loss surfaces is determined based on at least one of a cosine similarity or a Lp norm similarity.

3. The method of claim 1, wherein modifying the loss surface of one of the first AI model or the second AI model comprises adding a first regularizer term, having a first regularizer strength value, to a loss function of one of the first AI model or the second AI model, or increasing a second regularizer strength value of a second regularizer term in the loss function of one of the first AI model or the second AI model, and thereby control the similarity of the first gradient and second gradient of the first AI model and the second AI model.

4. The method of claim 3, wherein the first regularizer strength value or second regularizer strength value is set to a value that maximizes a distance between the loss surfaces of the first AI model and second AI model while minimizing accuracy loss of the at least two AI models in outputs generated by the at least two AI models.

5. The method of claim 1, wherein the at least two AI models comprises more than two AI models, and wherein the modifying comprises performing multiple pairwise comparisons of pairs of AI models in the at least two AI models and modifying loss surfaces of one or more of the AI models in each pair based on results of the comparisons, wherein the first AI model and the second AI model is one of the pairs of AI models.

6. The method of claim 5, wherein the pairwise comparisons of pairs of AI models are performed based on at least one of a clique architecture, a star architecture, or a ring architecture.

7. The method of claim 1, further comprising: wherein processing the input to generate the output result comprises processing the input by each of the AI models in the modified ensemble of AI models and combining the outputs of the AI models in the modified ensemble of AI models to generate a single output result for the modified ensemble.

8. The method of claim 7, wherein combining the outputs of the AI models in the modified ensemble of AI models to generate a single output result for the modified ensemble comprises at least one of averaging the outputs of the AI models in the modified ensemble of AI models or performing a majority vote operation on the outputs of the AI models in the modified ensemble of AI models.

9. The method of claim 1, wherein the co-training and modifying operations are performed for each mini-batch of training data used to train the at least two AI models.

10. A computer program product comprising a computer readable storage medium having a computer readable program stored therein, wherein the computer readable program, when executed on a data processing system, causes the data processing system to implement a hardened ensemble artificial intelligence (AI) model generator that operates to: co-train at least two AI models, of an ensemble AI model computing system, on a same domain of training data; determine, during co-training of the at least two AI models, a similarity measure between a first gradient of a first AI model in the at least two AI models, and a second gradient of a second AI model in the at least two AI models; determine, in response to the similarity measure between the first gradient and the second gradient being equal to or greater than a predetermined threshold similarity measure, that a loss surface of one of the first AI model or the second AI model is to be modified; modify, in response to determining that the similarity measure is equal to or greater than the predetermined threshold, a loss surface of one of the first AI model or the second AI model, in the at least two AI models, to control a distance between the loss surfaces of the first AI model and the second AI model and prevent an adversarial attack on one AI model transferring to another AI model in the at least two AI models, and thereby generate a modified ensemble of AI models computing system; and process, by the modified ensemble of AI models computing system, an input to generate an output result, wherein at least one of the AI models in the modified ensemble of AI models computing system generates a correct output for the output result when the input is an adversarial attack on the AI models.

11. The computer program product of claim 10, wherein the similarity measure between the gradients of the loss surfaces is determined based on at least one of a cosine similarity or a Lp norm similarity.

12. The computer program product of claim 10, wherein the computer readable program further causes the hardened ensemble AI model generator to the loss surface of one of the first AI model or the second AI model comprises adding a first regularizer term, having a first regularizer strength value, to a loss function of one of the first AI model or the second AI model, or increasing a second regularizer strength value of a second regularizer term in the loss function of one of the first AI model or the second AI model, and thereby control the similarity of the first gradient and second gradient of the first AI model and the second AI model.

13. The computer program product of claim 12, wherein the first regularizer strength value or second regularizer strength value is set to a value that maximizes a distance between the loss surfaces of the first AI model and second AI model while minimizing accuracy loss of the at least two AI models in outputs generated by the at least two AI models.

14. The computer program product of claim 10, wherein the at least two AI models comprises more than two AI models, and wherein the modifying comprises performing multiple pairwise comparisons of pairs of AI models in the at least two AI models and modifying loss surfaces of one or more of the AI models in each pair
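Two of the dependent claims lend themselves to a short illustration: the choice of which model pairs to compare (clique, star, or ring architecture) and the combination of per-model outputs by averaging or majority vote. The sketch below is one possible reading, not the patent's implementation. It assumes a clique compares every pair, a star compares model 0 with every other model, and a ring compares each model with its next neighbor; the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def model_pairs(n, architecture):
    """Index pairs of models to compare, for an ensemble of n models."""
    if n < 2:
        raise ValueError("an ensemble needs at least two models")
    if architecture == "clique":
        # Assumed reading: every model is compared with every other model.
        return list(combinations(range(n), 2))
    if architecture == "star":
        # Assumed reading: model 0 is the hub compared with all others.
        return [(0, i) for i in range(1, n)]
    if architecture == "ring":
        # Assumed reading: each model is compared with its next neighbor.
        if n == 2:
            return [(0, 1)]
        return [(i, (i + 1) % n) for i in range(n)]
    raise ValueError(f"unknown architecture: {architecture}")


def average_outputs(outputs):
    """Element-wise mean of per-model score vectors."""
    count = len(outputs)
    return [sum(column) / count for column in zip(*outputs)]


def majority_vote(labels):
    """Most common predicted label across the models."""
    return Counter(labels).most_common(1)[0][0]


if __name__ == "__main__":
    print(model_pairs(4, "ring"))
    print(majority_vote(["cat", "dog", "cat"]))
```

Under these assumptions a clique needs n(n-1)/2 comparisons per step, while a star needs n-1 and a ring needs n (for n greater than 2), so the sparser architectures trade comparison coverage for lower training cost.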

Assignees

Inventors

Classifications

  • Supervised learning · CPC title

  • G06F21/52 · Primary

    during program execution, e.g. stack integrity {; Preventing unwanted data erasure; Buffer overflow} · CPC title

  • Matching criteria, e.g. proximity measures · CPC title

  • of input or preprocessed data · CPC title

  • Protecting data · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11783025B2 cover?
Mechanisms are provided to implement a hardened ensemble artificial intelligence (AI) model generator. The hardened ensemble AI model generator co-trains at least two AI models. The hardened ensemble AI model generator modifies, based on a comparison of the at least two AI models, a loss surface of one or more of the at least two AI models to prevent an adversarial attack on one AI model, in th…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F21/52. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 10, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).