System And Method For Verification And Auditing Of Intelligent Systems

US2025190873A1 · US · A1

Patent metadata
Publication number: US-2025190873-A1
Application number: US-202418974727-A
Country: US
Kind code: A1
Filing date: Dec 9, 2024
Priority date: Dec 7, 2023
Publication date: Jun 12, 2025
Grant date: (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

This invention relates to a computer-implemented method and system for performing security evaluation on a machine learning (ML) model. The method includes determining a taxonomy of the ML model and of the environment in which the machine learning model is implemented at one or more stages in the model's lifecycle. The method additionally includes generating, based on the determined taxonomy, a set of assumptions about the ML model and the environment. An adversarial test attack is performed on the ML model at a stage in its lifecycle, based at least in part on the set of assumptions, and one or more failure modes in the ML model are identified based on the result of the first adversarial attack. The system may include a threat modelling component, an assessment component, a reporting component, and a risk mitigation component.
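As a rough illustration of the four steps in the abstract (taxonomy, assumptions, test attack, failure modes), the following Python sketch wires them together around a toy classifier. Everything here is hypothetical and not taken from the patent: the `Taxonomy` and `Assumptions` fields, the threshold model, the perturbation budget, and the rule that picks an evasion attack for the inference stage are assumptions chosen only to make the flow concrete.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Vector = List[float]


@dataclass
class Taxonomy:
    # Hypothetical fields; the claims list many more taxonomy elements.
    lifecycle_stage: str   # e.g. "training" or "inference"
    access: str            # e.g. "query"
    knowledge: str         # e.g. "task"


@dataclass
class Assumptions:
    attack_type: str
    max_perturbation: float


def generate_assumptions(taxonomy: Taxonomy) -> Assumptions:
    # Assumed rule: an inference-stage system is tested with an evasion attack.
    if taxonomy.lifecycle_stage == "inference":
        return Assumptions(attack_type="evasion", max_perturbation=0.5)
    return Assumptions(attack_type="poisoning", max_perturbation=0.1)


def toy_model(x: Vector) -> int:
    # Stand-in classifier: label 1 when the feature sum exceeds 1.0.
    return 1 if sum(x) > 1.0 else 0


def run_test_attack(
    model: Callable[[Vector], int], inputs: List[Vector], assumptions: Assumptions
) -> List[Tuple[Vector, int, int]]:
    # Naive evasion test: shift every feature toward the opposite class
    # by the assumed perturbation budget and record both predictions.
    results = []
    for x in inputs:
        before = model(x)
        direction = -1.0 if before == 1 else 1.0
        perturbed = [v + direction * assumptions.max_perturbation for v in x]
        results.append((x, before, model(perturbed)))
    return results


def identify_failure_modes(results: List[Tuple[Vector, int, int]]) -> List[Vector]:
    # A failure mode here is any input whose label flips under the attack.
    return [x for x, before, after in results if before != after]


taxonomy = Taxonomy(lifecycle_stage="inference", access="query", knowledge="task")
assumptions = generate_assumptions(taxonomy)
results = run_test_attack(toy_model, [[0.6, 0.6], [3.0, 3.0]], assumptions)
failures = identify_failure_modes(results)
```

In this toy setup only the input close to the decision threshold flips its label, so it is the only one reported as a failure mode. A real assessment would substitute an actual model and an established attack implementation.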

First claim

Opening claim text (preview).

We claim:
1. A method for performing security evaluation on a machine learning model, comprising, using a processor: determining a taxonomy of the machine learning model and of an environment in which the machine learning model is implemented at one or more stages in a lifecycle of the machine learning model; generating, based on the determined taxonomy, a set of assumptions about the machine learning model and the environment; performing a first adversarial test attack on the machine learning model at a stage in the machine learning model's lifecycle, based at least in part on the set of assumptions; and identifying one or more failure modes in the machine learning model based on a result of the first adversarial attack.
2. The method of claim 1, further comprising assessing an effect of the one or more failure modes on a subsequent stage in the machine learning model's lifecycle.
3. The method of claim 1, wherein determining the taxonomy comprises identifying one or more of: one or more assets associated with the machine learning model, one or more adversaries, one or more adversary goals, an attack specificity, an error specificity, an attack vector, an attack method, an attack phase, an adversary strategy, one or more resources available to the adversary, a level of access an adversary possesses, a level of knowledge the adversary possesses, a vulnerability of an asset associated with the machine learning model, and a defence mechanism of the machine learning model.
4. The method of claim 3, wherein the taxonomy comprises identifying a level of access the adversary possesses, and wherein the level of access is evaluated based on one or more of: a model or explanation access, a raw data access, a data collector access, a feature extraction and transformations function access, a model training data access, access to a similar model architecture, and a query-based access.
5. The method of claim 3, wherein the taxonomy comprises identifying a level of knowledge the adversary possesses, and wherein the level of knowledge is evaluated based on one or more of: a task knowledge, a platform knowledge, and knowledge of the machine learning model or training data used to build or train the model.
6. The method of claim 1, wherein generating the set of assumptions comprises mapping adversarial attack stages to one or more of: asset(s) associated with the machine learning model, a vulnerability of an asset associated with the machine learning model, an attack being in an inference or a training phase of the machine learning model, a level of access an adversary possesses, and a level of knowledge the adversary possesses.
7. The method of claim 1, wherein the determining the taxonomy is performed by a threat modelling component that is trained on one or more of: data/deployment flow diagrams, machine learning models, data stores, stakeholders' security goals, and attack scenario catalogues.
8. The method of claim 1, wherein one or more determined threats are ranked based at least in part on a degree of cascading impacts of the determined threats on a subsequent stage or stages in the machine learning model's lifecycle or a presence of one or more compensating controls existing in relation to each of the said threats.
9. The method of claim 1, wherein generating the set of assumptions comprises identifying adversarial attack stages in terms of ML ATT&CK techniques and mapping the ML ATT&CK techniques to Common Vulnerabilities and Exposures (CVEs).
10. The method of claim 9, wherein mapping ATT&CK techniques to CVEs comprises computing a distance measurement between context representations in one or more CVE reports and concept representations of ATT&CK descriptions and generating a plurality of data labels for the mapping based on the computation.
11. The method of claim 1, further comprising generating a report comprising information including the one or more failure modes in the machine learning model, an effect of the one or more failure modes on a further stage in the machine learning model's lifecycle, or an adversarial context.
12. The method of claim 1, further comprising determining that a configuration of the machine learning model has been updated, and iterating the determining, generating, performing, and identifying.
13. The method of claim 1, wherein the test attack performed at least in part based on the assumptions comprises an evasion attack, an inference attack, a poisoning attack on a training dataset or a testing dataset, or a model stealing attack.
14. The method of claim 1, further comprising providing a notification at a user device in response to determining a presence of one or more failure modes in the machine learning model.
15. The method of claim 1, further comprising performing remediation step(s) on one or more features or inputs of the machine learning model based on the identified failure mode(s).
16. The method of claim 1, further comprising monitoring failure mode(s) over a period of time, identifying a pattern associated with one or more failure modes, and adjusting one or more parameters of the machine learning model based on the identified pattern.
17. A tangible, non-transitory, computer-readable media having instructions thereupon which when implemented by a processor cause the processor to perform a method for performing security evaluation on a machine learning model, comprising: determining a taxonomy of the machine learning model and of an environment in which the machine learning model is implemented at one or more stages in a lifecycle of the machine learning model; generating, based on the determined taxonomy, a set of assumptions about the machine learning model and the environment; performing a first adversarial test attack on the machine learning model at a stage in the machine learning model's lifecycle, based at least in part on the set of assumptions; and identifying one or more failure modes in the machine learning model based on a result of the first adversarial attack.
18. A system for performing security evaluation on a machine learning model, comprising: a threat modelling component configured to determine a taxonomy of the machine learning model and of an environment in which the machine learning model is implemented at one or more stages in a lifecycle of the machine learning model, wherein the threat modelling component is further configured to generate, based on the determined taxonomy, a set of assumptions about the machine learning model and the environment; an assessment component configured to perform a first adversarial test attack on the machine learning model at a stage in the machine learning model's lifecycle, based at least in part on the set of assumptions generated by the threat modelling component; and wherein the assessment component is further configured to identify one or more failure modes in the machine learning model based on a result of the first adversarial attack.
19. The system of claim 18, further comprising a reporting component configured to generate a report comprising information including the one or more failure modes in the machine learning model, an effect of the one or more failure modes on a further stage in the machine learning model's lifecycle, or an adversarial context.
20. The system of claim 18, further comprising a risk mitigation component configured to remediation step(s) on one or more features or inputs of the machine learning model based on the identif…
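Claim 10 describes mapping ATT&CK techniques to CVEs by computing a distance between representations of CVE reports and of technique descriptions, then deriving labels from that computation. The sketch below shows one possible reading in Python. It assumes word-count vectors as the representation and cosine distance as the measure, neither of which the claim specifies, and the technique names, descriptions, and CVE identifiers are invented placeholders rather than real catalogue entries.

```python
import math
from collections import Counter
from typing import Dict


def embed(text: str) -> Counter:
    # Assumed representation: lowercase word counts (not from the patent).
    return Counter(text.lower().split())


def cosine_distance(a: Counter, b: Counter) -> float:
    # 1 - cosine similarity; returns 1.0 when either vector is empty.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def label_cves(cve_reports: Dict[str, str], techniques: Dict[str, str]) -> Dict[str, str]:
    # Label each CVE report with the technique whose description is nearest.
    technique_vectors = {name: embed(desc) for name, desc in techniques.items()}
    labels = {}
    for cve_id, report in cve_reports.items():
        report_vector = embed(report)
        labels[cve_id] = min(
            technique_vectors,
            key=lambda name: cosine_distance(report_vector, technique_vectors[name]),
        )
    return labels


# Hypothetical placeholder data, not real ATT&CK entries or CVE records.
techniques = {
    "evasion": "craft adversarial input to evade model at inference",
    "poisoning": "inject malicious samples into training data",
}
cve_reports = {
    "CVE-EXAMPLE-1": "attacker can inject crafted samples into the training data pipeline",
    "CVE-EXAMPLE-2": "adversarial input lets attacker evade the model at inference time",
}
labels = label_cves(cve_reports, techniques)
```

With these placeholder texts, the first report shares vocabulary only with the poisoning description and the second only with the evasion description, so each receives the corresponding label. A production system would more plausibly use learned text embeddings, which this sketch does not attempt.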

Assignees

Inventors

Classifications

  • G06N20/00 · Primary

    Machine learning · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025190873A1 cover?
This invention relates to a computer-implemented method and system for performing security evaluation on a machine learning (ML) model. The method includes determining a taxonomy of the ML model and of the environment in which the machine learning model is implemented at one or more stages in the model's lifecycle. The method additionally includes generating, based on the determined taxonomy, a…
Who is the assignee on this patent?
Univ Dublin
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 12, 2025 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).