Quantifying the predictive uncertainty of neural networks via residual estimation with I/O kernel

US11681901B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11681901-B2
Application number: US-202016879934-A
Country: US
Kind code: B2
Filing date: May 21, 2020
Priority date: May 23, 2019
Publication date: Jun 20, 2023
Grant date: Jun 20, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A residual estimation with an I/O kernel (“RIO”) framework provides estimates of predictive uncertainty of neural networks, and reduces their point-prediction errors. The process captures neural network (“NN”) behavior by estimating their residuals with an I/O kernel using a modified Gaussian process (“GP”). RIO is applicable to real-world problems, and, by using a sparse GP approximation, scales well to large datasets. RIO can be applied directly to any pretrained NNs without modifications to model architecture or training pipeline.

First claim

Opening claim text (preview).

The invention claimed is:

1. A process for estimating residuals of a neural network (NN) model to determine uncertainty in the NN model's predictions of a value of at least one of a physical, chemical or electrical variable, the process comprising:

training, by a processor, a NN model to make one or more predictions of the value of the at least one of a physical, chemical or electrical variable using a training data $(\mathcal{X}, \mathbf{y})$ input set, wherein the training data input set includes $\{(x_i, y_i)\}_{i=1}^{n}$, wherein $y_i$ are the expected outcomes by the NN model given input $x_i$;

storing, by the processor, an output data set from the NN model, including the one or more predictions resulting from operation on the training data input set, wherein the output data set includes $\{(x_i, \hat{y}_i)\}_{i=1}^{n}$, wherein $\hat{y}_i$ are the predicted outcomes by the NN model given input $x_i$; and

training, by the processor, a Gaussian process (GP) to estimate residuals of the NN model when applied to raw input data $x_*$ using the training data input set $\{(x_i, y_i)\}_{i=1}^{n}$ and the output data set $\{(x_i, \hat{y}_i)\}_{i=1}^{n}$, wherein the training of the Gaussian process (GP) includes:

calculating, by the processor, residuals $\mathbf{r} = \{r_i = y_i - \hat{y}_i\}_{i=1}^{n}$, wherein $\mathbf{r}$ denotes the vector of all residuals and $\hat{\mathbf{y}}$ denotes the vector of all NN model predictions;

calculating, by the processor, an $n \times n$ covariance matrix at all pairs of training points based on a composite kernel $\mathbf{K}_c((\mathcal{X}, \hat{\mathbf{y}}), (\mathcal{X}, \hat{\mathbf{y}}))$, where each entry is given by

$$k_c((x_i, \hat{y}_i), (x_j, \hat{y}_j)) = k_{in}(x_i, x_j) + k_{out}(\hat{y}_i, \hat{y}_j), \quad \text{for } i, j = 1, 2, \ldots, n;$$

and optimizing, by a gradient-based optimizer, GP hyperparameters $\sigma_{in}^2$, $l_{in}$, $\sigma_{out}^2$, $l_{out}$, and $\sigma_n^2$ by maximizing log marginal likelihood

$$\log p(\mathbf{r} \mid \mathcal{X}, \hat{\mathbf{y}}) = -\tfrac{1}{2}\, \mathbf{r}^\top \left( \mathbf{K}_c((\mathcal{X}, \hat{\mathbf{y}}), (\mathcal{X}, \hat{\mathbf{y}})) + \sigma_n^2 \mathbf{I} \right)^{-1} \mathbf{r} - \tfrac{1}{2} \log \left| \mathbf{K}_c((\mathcal{X}, \hat{\mathbf{y}}), (\mathcal{X}, \hat{\mathbf{y}})) + \sigma_n^2 \mathbf{I} \right| - \tfrac{n}{2} \log 2\pi.$$

2. The process according to claim 1, wherein the NN model is a fully connected feed-forward network.

3. The process according to claim 1, further comprising: applying, by the processor, the trained Gaussian process (GP) to predictions $\hat{y}_*$ of a neural network (NN) model applied to raw input data $x_*$, the applying including: calculating, by the processor, residual mean; calculating, by the processor, residual variance; and returning, by the processor, distribution of calibrated prediction $\hat{y}'_*$.

4. The process according to claim 3, further comprising calculating, by the processor, residual mean in accordance with

$$\hat{r}_* = \mathbf{k}_*^\top \left( \mathbf{K}_c((\mathcal{X}, \hat{\mathbf{y}}), (\mathcal{X}, \hat{\mathbf{y}})) + \sigma_n^2 \mathbf{I} \right)^{-1} \mathbf{r};$$

calculating, by the processor, residual variance in accordance with

$$\operatorname{var}(\hat{r}_*) = k_c((x_*, \hat{y}_*), (x_*, \hat{y}_*)) - \mathbf{k}_*^\top \left( \mathbf{K}_c((\mathcal{X}, \hat{\mathbf{y}}), (\mathcal{X}, \hat{\mathbf{y}})) + \sigma_n^2 \mathbf{I} \right)^{-1} \mathbf{k}_*;$$

and returning, by the processor, a distribution of calibrated prediction $\hat{y}'_*$ in accordance with

$$\hat{y}'_* \sim \mathcal{N}\left( \hat{y}_* + \hat{r}_*, \operatorname{var}(\hat{r}_*) \right).$$

5. The process according to claim 3, wherein the NN model is a fully connected feed-forward network.
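The computations recited in claims 1, 3 and 4 can be illustrated with a minimal Python sketch. It is not the patented implementation. Several things here are assumptions made for illustration only: the squared-exponential (RBF) form of `k_in` and `k_out` (the claims name only their variance and length-scale hyperparameters), the helper names, the `hp` dictionary keys, and the toy data. The hyperparameters are held fixed, so the gradient-based optimization step of claim 1 is not shown. Only the standard library is used, which keeps the sketch self-contained but suits only small `n`.

```python
import math

def rbf(a, b, variance, length_scale):
    """Squared-exponential kernel between two vectors (assumed kernel form)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return variance * math.exp(-0.5 * sq_dist / length_scale ** 2)

def composite_kernel(x_i, yhat_i, x_j, yhat_j, hp):
    """I/O kernel: k_c = k_in(x_i, x_j) + k_out(yhat_i, yhat_j)."""
    k_in = rbf(x_i, x_j, hp["var_in"], hp["len_in"])
    k_out = rbf([yhat_i], [yhat_j], hp["var_out"], hp["len_out"])
    return k_in + k_out

def cholesky(A):
    """Lower-triangular L with L L^T = A, for symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def chol_solve(L, b):
    """Solve (L L^T) z = b by forward, then backward, substitution."""
    n = len(b)
    w = [0.0] * n
    for i in range(n):
        w[i] = (b[i] - sum(L[i][k] * w[k] for k in range(i))) / L[i][i]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (w[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z

def noisy_covariance(X, yhat, hp):
    """n x n matrix K_c + sigma_n^2 I over all pairs of training points."""
    n = len(X)
    return [[composite_kernel(X[i], yhat[i], X[j], yhat[j], hp)
             + (hp["var_noise"] if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def log_marginal_likelihood(X, y, yhat, hp):
    """log p(r | X, yhat) for residuals r_i = y_i - yhat_i."""
    r = [yi - yhi for yi, yhi in zip(y, yhat)]
    n = len(r)
    L = cholesky(noisy_covariance(X, yhat, hp))
    alpha = chol_solve(L, r)  # (K_c + sigma_n^2 I)^{-1} r
    data_fit = -0.5 * sum(ri * ai for ri, ai in zip(r, alpha))
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return data_fit - 0.5 * log_det - 0.5 * n * math.log(2.0 * math.pi)

def calibrated_prediction(X, y, yhat, x_star, yhat_star, hp):
    """Mean and variance of the calibrated prediction at a new point."""
    r = [yi - yhi for yi, yhi in zip(y, yhat)]
    L = cholesky(noisy_covariance(X, yhat, hp))
    k_star = [composite_kernel(X[i], yhat[i], x_star, yhat_star, hp)
              for i in range(len(X))]
    r_mean = sum(k * a for k, a in zip(k_star, chol_solve(L, r)))
    prior_var = composite_kernel(x_star, yhat_star, x_star, yhat_star, hp)
    r_var = prior_var - sum(k * v for k, v in zip(k_star, chol_solve(L, k_star)))
    return yhat_star + r_mean, r_var

# Toy usage with made-up numbers: yhat_train stands in for stored NN outputs.
hp = {"var_in": 1.0, "len_in": 1.0, "var_out": 1.0, "len_out": 1.0,
      "var_noise": 0.1}
X_train = [[0.0], [1.0], [2.0]]
y_train = [0.1, 1.2, 1.9]
yhat_train = [0.0, 1.0, 2.0]
mean, var = calibrated_prediction(X_train, y_train, yhat_train, [1.5], 1.4, hp)
print(mean, var)
```

In a fuller version, `log_marginal_likelihood` would serve as the objective passed to a gradient-based optimizer over the five hyperparameters. The Cholesky factorization is a design choice of this sketch, used to avoid forming an explicit matrix inverse. It costs O(n^3) time, which is the cost the sparse GP approximation mentioned in the abstract is intended to reduce.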

Assignees

Cognizant Tech Solutions U S Corporation

Inventors

Classifications

  • Learning methods · CPC title

  • Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title

  • G06N3/047 · Primary

    Probabilistic or stochastic networks · CPC title

  • Supervised learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11681901B2 cover?
A residual estimation with an I/O kernel (“RIO”) framework provides estimates of predictive uncertainty of neural networks, and reduces their point-prediction errors. The process captures neural network (“NN”) behavior by estimating their residuals with an I/O kernel using a modified Gaussian process (“GP”). RIO is applicable to real-world problems, and, by using a sparse GP approximation, scal…
Who is the assignee on this patent?
Cognizant Tech Solutions U S Corporation
What technology area does this patent fall under?
Primary CPC classification G06N3/047. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 20, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).