Minimizing global error in an artificial neural network

US10068170B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10068170-B2
Application number: US-201414492440-A
Country: US
Kind code: B2
Filing date: Sep 22, 2014
Priority date: Sep 23, 2013
Publication date: Sep 4, 2018
Grant date: Sep 4, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Computer systems, machine-implemented methods, and stored instructions are provided for minimizing an approximate global error in an artificial neural network that is configured to predict model outputs based at least in part on one or more model inputs. A model manager stores the artificial neural network model. The model manager may then minimize an approximate global error in the artificial neural network model at least in part by causing evaluation of a mixed integer linear program that determines weights between artificial neurons in the artificial neural network model. The mixed integer linear program accounts for piecewise linear activation functions for artificial neurons in the artificial neural network model. The mixed integer linear program comprises a functional expression of a difference between actual data and modeled data, and a set of one or more constraints that reference variables in the functional expression.
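
The abstract relies on piecewise linear activation functions, which are what allow a neuron to be written as linear constraints plus integer variables. As a rough illustration only, the sketch below compares a logistic sigmoid with a continuous three-segment replacement and with a step function. The breakpoints at -2.5 and 2.5, the slope of 0.2, and all function names are assumptions chosen for this example; the patent text shown here does not specify them.

```python
import math


def sigmoid(x: float) -> float:
    """Smooth, non-linear activation that a piecewise linear function can replace."""
    return 1.0 / (1.0 + math.exp(-x))


def piecewise_linear_sigmoid(x: float) -> float:
    """Continuous three-segment approximation (breakpoints are assumed, not from the patent).

    Segment 1: 0 for x <= -2.5
    Segment 2: 0.2 * x + 0.5 for -2.5 < x < 2.5
    Segment 3: 1 for x >= 2.5
    """
    if x <= -2.5:
        return 0.0
    if x >= 2.5:
        return 1.0
    return 0.2 * x + 0.5


def step(x: float) -> float:
    """Piecewise linear step function: two constant segments with a jump at 0."""
    return 1.0 if x >= 0.0 else 0.0


# Largest gap between the sigmoid and its three-segment replacement on [-6, 6].
max_gap = max(
    abs(sigmoid(i / 100) - piecewise_linear_sigmoid(i / 100))
    for i in range(-600, 601)
)

if __name__ == "__main__":
    print(f"largest gap on [-6, 6]: {max_gap:.4f}")
```

Each segment is linear, so selecting the active segment can be encoded with binary variables in a mixed integer linear program. Adding more segments would tighten the approximation at the cost of more integer variables.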

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: storing an artificial neural network model that is configured to predict one or more outputs based at least in part on one or more inputs, wherein the artificial neural network model comprises an input layer, one or more intermediate layers, and an output layer; and minimizing a global error in the artificial neural network model at least in part by solving a mixed integer linear program that directly determines, without performing a gradient descent, one or more weights between two or more artificial neurons in the artificial neural network model, wherein: the mixed integer linear program comprises one or more piecewise linear activation functions for one or more artificial neurons in the artificial neural network model, and said directly determines said one or more weights comprises branching a candidate set of weights into candidate sub-sets of weights and determining upper and lower bounds for said global error based on the candidate sub-sets of weights; configuring the artificial neural network model based on the one or more weights; wherein the method is performed by one or more computing devices.

2. The method of claim 1, further comprising replacing, in the artificial neural network model, at least one non-linear activation function with at least one piecewise linear step function.

3. The method of claim 1, further comprising replacing, in the artificial neural network model, at least one non-linear activation function with at least one continuous piecewise linear function.

4. The method of claim 1, further comprising replacing, in the artificial neural network model, at least one non-linear activation function with at least one piecewise linear function that includes three or more segments.

5. The method of claim 1, further comprising replacing, in the artificial neural network model, all of a plurality of non-linear activation functions with corresponding piecewise linear functions that approximate the non-linear activation functions.

6. The method of claim 1, further comprising: creating the artificial neural network model based on known outputs, and after minimizing the global error in the artificial neural network model, using the artificial neural network model to predict one or more unknown outputs based at least in part on one or more known inputs.

7. The method of claim 1, wherein the one or more piecewise linear activation functions are non-differentiable and non-usable with an alternative gradient approach that, if used, would minimize local error in the artificial neural network model by iteratively improving the one or more weights.

8. The method of claim 1, wherein solving the mixed integer linear program comprises using one or more of a branch and cut technique, a cutting plane technique, a branch and price technique, a branch and bound technique, or a Lipschitzian optimization technique.

9. The method of claim 1, wherein the mixed integer linear program comprises a functional expression of a difference between actual data and modeled data, and a set of one or more constraints that reference two or more variables in the functional expression.

10. One or more non-transitory computer-readable media storing instructions which, when executed by one or more processors, cause: storing an artificial neural network model that is configured to predict one or more outputs based at least in part on one or more inputs, wherein the artificial neural network model comprises an input layer, one or more intermediate layers, and an output layer; and minimizing a global error in the artificial neural network model at least in part by solving a mixed integer linear program that directly determines, without performing a gradient descent, one or more weights between two or more artificial neurons in the artificial neural network model, wherein: the mixed integer linear program comprises one or more piecewise linear activation functions for one or more artificial neurons in the artificial neural network model, and said directly determines said one or more weights comprises branching a candidate set of weights into candidate sub-sets of weights and determining upper and lower bounds for said global error based on the candidate sub-sets of weights; configuring the artificial neural network model based on the one or more weights.

11. The one or more non-transitory computer-readable media of claim 10, the instructions further comprising instructions for replacing, in the artificial neural network model, at least one non-linear activation function with at least one piecewise linear step function.

12. The one or more non-transitory computer-readable media of claim 10, the instructions further comprising instructions for replacing, in the artificial neural network model, at least one non-linear activation function with at least one continuous piecewise linear function.

13. The one or more non-transitory computer-readable media of claim 10, the instructions further comprising instructions for replacing, in the artificial neural network model, at least one non-linear activation function with at least one piecewise linear function that includes three or more segments.

14. The one or more non-transitory computer-readable media of claim 10, the instructions further comprising instructions for replacing, in the artificial neural network model, all of a plurality of non-linear activation functions with corresponding piecewise linear functions that approximate the non-linear activation functions.

15. The one or more non-transitory computer-readable media of claim 10, the instructions further comprising instructions for: creating the artificial neural network model based on known outputs, and after minimizing the global error in the artificial neural network model, using the artificial neural network model to predict one or more unknown outputs based at least in part on one or more known inputs.

16. The one or more non-transitory computer-readable media of claim 10, wherein the one or more piecewise linear activation functions are non-differentiable and non-usable with an alternative gradient approach that, if used, would minimize local error in the artificial neural network model by iteratively improving the one or more weights.

17. The one or more non-transitory computer-readable media of claim 10, wherein solving the mixed integer linear program comprises using one or more of a branch and cut technique, a cutting plane technique, a branch and price technique, a branch and bound technique, or a Lipschitzian optimization technique.

18. The one or more non-transitory computer-readable media of claim 10, wherein the mixed integer linear program comprises a functional expression of a difference between actual data and modeled data, and a set of one or more constraints that reference two or more variables in the functional expression.
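
The claims describe branching a candidate set of weights into sub-sets and bounding the global error on each sub-set, without gradient descent. The sketch below is a toy illustration of that idea and is not the patent's mixed integer linear program: it swaps in a plain branch and bound search with Lipschitz-based lower bounds over a single weight. The one-neuron model, the data, the search interval, the tolerance, and every name in the code are assumptions made up for this example.

```python
import heapq


def act(z: float) -> float:
    """Piecewise linear activation (assumed three-segment clip), slope at most 0.2."""
    return min(1.0, max(0.0, 0.2 * z + 0.5))


# Hypothetical training data generated from a known weight so the optimum is known.
XS = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
TRUE_W = 1.5
YS = [act(TRUE_W * x) for x in XS]


def global_error(w: float) -> float:
    """Sum of absolute differences between actual data and modeled data."""
    return sum(abs(y - act(w * x)) for x, y in zip(XS, YS))


# |E(w1) - E(w2)| <= LIPSCHITZ * |w1 - w2| because act has slope at most 0.2.
LIPSCHITZ = 0.2 * sum(abs(x) for x in XS)


def branch_and_bound(lo: float, hi: float, tol: float = 1e-6):
    """Return (weight, error) with error within tol of the global minimum on [lo, hi]."""
    mid = (lo + hi) / 2
    best_w, best_err = mid, global_error(mid)  # upper bound on the global error
    # Heap entries: (lower bound on error over the interval, interval start, interval end).
    heap = [(best_err - LIPSCHITZ * (hi - lo) / 2, lo, hi)]
    while heap:
        lower, lo, hi = heapq.heappop(heap)
        if lower >= best_err - tol:
            break  # no remaining sub-set can improve the best error by more than tol
        mid = (lo + hi) / 2
        for a, b in ((lo, mid), (mid, hi)):  # branch into two candidate sub-sets
            m = (a + b) / 2
            err = global_error(m)
            if err < best_err:
                best_w, best_err = m, err  # tighten the upper bound
            child_lower = err - LIPSCHITZ * (b - a) / 2
            if child_lower < best_err - tol:
                heapq.heappush(heap, (child_lower, a, b))  # keep only promising sub-sets
    return best_w, best_err


best_w, best_err = branch_and_bound(-4.0, 4.0)

if __name__ == "__main__":
    print(f"weight = {best_w:.6f}, global error = {best_err:.2e}")
```

Because every discarded interval has a lower bound no better than the best error found, the result is a global minimum over the search interval up to the tolerance, rather than a local one. A real mixed integer linear programming solver would derive its bounds from linear relaxations instead of a Lipschitz constant.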

Assignees

Oracle Int Corp

Inventors

Classifications

  • G06N3/08 (Primary)

    Learning methods · CPC title

  • Architecture, e.g. interconnection topology · CPC title

  • Supervised learning · CPC title

  • Feedforward networks · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10068170B2 cover?
Computer systems, machine-implemented methods, and stored instructions are provided for minimizing an approximate global error in an artificial neural network that is configured to predict model outputs based at least in part on one or more model inputs. A model manager stores the artificial neural network model. The model manager may then minimize an approximate global error in the artificial …
Who is the assignee on this patent?
Oracle Int Corp
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 4, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).