Configuring a neural network using smoothing splines

US12468931B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12468931-B2
Application number: US-202017075963-A
Country: US
Kind code: B2
Filing date: Oct 21, 2020
Priority date: Oct 21, 2020
Publication date: Nov 11, 2025
Grant date: Nov 11, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An embodiment includes identifying an initial plurality of sets of hyperparameter values at which to evaluate an objective function that relates hyperparameter values to performance values of a neural network. The embodiment also executes training processes on the neural network with the hyperparameters set to each of the initial sets of hyperparameter values such that the training process provides an initial set of the performance values for the objective function. The embodiment also generates an approximation of the objective function using splines at selected performance values. The embodiment approximates a point at which the approximation of the objective function reaches a maximum value, then determines an updated set of hyperparameter values associated with the maximum value. The embodiment then executes a runtime process using the neural network with the hyperparameters set to the updated set of hyperparameter values.
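The loop described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: `train_and_score` is a hypothetical stand-in for a real training run, and the single hyperparameter (a dropout rate), its search range, and the plain random search used to maximize the spline are all choices made only to keep the example short. In one dimension the surrogate below is a natural cubic spline written as a sum of `|x - x_i|^3` terms plus a linear part.

```python
import random

def train_and_score(dropout):
    # Hypothetical stand-in for a full training run: returns a made-up
    # performance value that peaks at dropout = 0.3.
    return 1.0 - (dropout - 0.3) ** 2

def solve(a, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[p] = m[p], m[k]
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_spline(xs, ys):
    # Interpolant s(x) = sum_i w_i |x - x_i|^3 + c0 + c1 * x, with the side
    # conditions sum_i w_i = 0 and sum_i w_i x_i = 0.
    n = len(xs)
    a = [[abs(xi - xj) ** 3 for xj in xs] + [1.0, xi] for xi in xs]
    a.append([1.0] * n + [0.0, 0.0])
    a.append(list(xs) + [0.0, 0.0])
    coef = solve(a, list(ys) + [0.0, 0.0])
    w, c0, c1 = coef[:n], coef[n], coef[n + 1]
    return lambda x: sum(wi * abs(x - xi) ** 3 for wi, xi in zip(w, xs)) + c0 + c1 * x

def tune(initial, rounds=5, samples=2000, seed=0):
    rng = random.Random(seed)
    xs = list(initial)                         # initial support points
    ys = [train_and_score(x) for x in xs]      # initial performance values
    for _ in range(rounds):
        surrogate = fit_spline(xs, ys)         # spline through the support points
        candidates = [rng.uniform(0.0, 0.9) for _ in range(samples)]
        best = max(candidates, key=surrogate)  # approximate maximum of the spline
        if min(abs(best - x) for x in xs) < 1e-2:
            break                              # too close to an existing point
        xs.append(best)                        # updated support point
        ys.append(train_and_score(best))       # retrain at the updated value
    i = max(range(len(xs)), key=ys.__getitem__)
    return xs[i], ys[i]

if __name__ == "__main__":
    print(tune([0.1, 0.5, 0.8]))
```

Each round spends one real training run at the point where the cheap spline surrogate looks best, which is the trade the abstract describes: many inexpensive surrogate evaluations in exchange for few expensive trainings.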

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: identifying a first set of support points comprising an initial plurality of sets of hyperparameter values at which to evaluate an objective function having coefficients that relate hyperparameter values of a neural network to respective performance values, wherein the respective performance values are representative of an error rate; wherein the hyperparameter values comprise at least one of a dropout rate, a weight norm, a hidden layer size, a convolutional kernel size, a pooling size; wherein the neural network is a convolutional neural network; executing initial training processes on the neural network with hyperparameters set to the initial plurality of sets of hyperparameter values such that the initial training processes provide an initial set of the performance values, one for each set of hyperparameter values, for the objective function; generating an approximation of the objective function using splines that pass through the first set of support points at selected performance values; calculating a coefficient for the approximation of the objective function using QR decomposition, wherein the QR decomposition used for calculating the coefficient for the approximation of the objective function comprises a householder QR decomposition; approximating a point at which the approximation of the objective function reaches a maximum value using a hierarchical Monte Carlo technique; determining an updated set of support points comprising an updated set of hyperparameter values associated with the maximum value, wherein the updated set of hyperparameter values comprises a value of a network subcell layout parameter; and executing an updated training process on the neural network with hyperparameters set to the updated set of hyperparameter values.

2. The computer-implemented method of claim 1, wherein the splines used for generating the approximation of the objective function comprise polyharmonic splines.

3. The computer-implemented method of claim 2, wherein the generating of the approximation of the objective function comprises using the polyharmonic splines comprising assembling the approximation using radial basis functions.

4. The computer-implemented method of claim 1, wherein the neural network is configured for classifying types of data selected from the group consisting of image data, audio data, and text data.

5. The computer-implemented method of claim 1, wherein the performance values provide measures of performance of the neural network.

6. A computer program product comprising one or more computer readable storage media, and program instructions collectively stored on the one or more computer readable storage media, the program instructions executable by a processor to cause the processor to perform operations comprising: wherein the stored program instructions are stored in a computer readable storage device in a data processing system, and wherein the stored program instructions are transferred over a network from a remote data processing system; identifying a first set of support points comprising an initial plurality of sets of hyperparameter values at which to evaluate an objective function having coefficients that relate hyperparameter values of a neural network to respective performance values, wherein the respective performance values are representative of an error rate; executing initial training processes on the neural network with hyperparameters set to the initial plurality of sets of hyperparameter values such that the initial training processes provide an initial set of the performance values, one for each set of hyperparameter values, for the objective function; generating an approximation of the objective function using splines that pass through the first set of support points at selected performance values; calculating a coefficient for the approximation of the objective function using QR decomposition, wherein the QR decomposition used for calculating the coefficient for the approximation of the objective function comprises a householder QR decomposition; approximating a point at which the approximation of the objective function reaches a maximum value using a hierarchical Monte Carlo technique; determining an updated set of support points comprising an updated set of hyperparameter values associated with the maximum value, wherein the updated set of hyperparameter values comprises a value of a network subcell layout parameter; and executing an updated training process on the neural network with hyperparameters set to the updated set of hyperparameter values.

7. The computer program product of claim 6, wherein the stored program instructions are stored in a computer readable storage device in a server data processing system, and wherein the stored program instructions are downloaded in response to a request over a network to a remote data processing system for use in a computer readable storage device associated with the remote data processing system, further comprising: program instructions to meter use of the program instructions associated with the request; and program instructions to generate an invoice based on the metered use.

8. The computer program product of claim 6, wherein the neural network is a convolutional neural network.

9. The computer program product of claim 6, wherein the splines used for generating the approximation of the objective function comprise polyharmonic splines.

10. The computer program product of claim 9, wherein the generating of the approximation of the objective function comprises using the polyharmonic splines comprising assembling the approximation using radial basis functions.

11. A computer system comprising a processor and one or more computer readable storage media, and program instructions collectively stored on the one or more computer readable storage media, the program instructions executable by the processor to cause the processor to perform operations comprising: wherein the stored program instructions are stored in a computer readable storage device in a data processing system, and wherein the stored program instructions are transferred over a network from a remote data processing system; identifying a first set of support points comprising an initial plurality of sets of hyperparameter values at which to evaluate an objective function having coefficients that relate hyperparameter values of a neural network to respective performance values, wherein the respective performance values are representative of an error rate; executing initial training processes on the neural network with hyperparameters set to the initial plurality of sets of hyperparameter values such that the initial training processes provide an initial set of the performance values, one for each set of hyperparameter values, for the objective function; generating an approximation of the objective function using splines that pass through the first set of support points at selected performance values; calculating a coefficient for the approximation of the objective function using QR decomposition, wherein the QR decomposition used for calculating the coefficient for the approximation of the objective function comprises a householder QR decomposition; approximating a point at which the approximation of the objective function reaches a maximum value using a hierarchical Monte Carlo technique; determining an updated set of support points comprising an updated set of hyperparameter values associated with the maximum value, wherein the updated set of hyperparameter values comprises a value of a network subcell layout parameter; and execu
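
The claims name three numerical ingredients: polyharmonic splines assembled from radial basis functions, Householder QR for the coefficients, and a hierarchical Monte Carlo search for the maximum. The sketch below illustrates one way these could fit together and should not be read as the claimed method. The thin-plate kernel `r^2 log r`, the linear polynomial tail, and all function names are assumptions. The claim text shown here does not define "hierarchical", so the search interprets it as repeated random sampling inside a box that shrinks around the best point found so far.

```python
import math
import random

def householder_qr_solve(a, b):
    """Solve the square, nonsingular system a x = b with Householder QR."""
    n = len(b)
    r = [row[:] for row in a]   # becomes the upper-triangular factor R
    y = list(b)                 # becomes Q^T b
    for k in range(n - 1):
        norm = math.sqrt(sum(r[i][k] ** 2 for i in range(k, n)))
        if norm == 0.0:
            continue
        # Reflector v that zeroes column k below the diagonal.
        v = [r[i][k] for i in range(k, n)]
        v[0] += math.copysign(norm, v[0])
        vv = sum(t * t for t in v)
        # Apply H = I - 2 v v^T / (v^T v) to the trailing columns and to y.
        for j in range(k, n):
            s = 2.0 * sum(v[i - k] * r[i][j] for i in range(k, n)) / vv
            for i in range(k, n):
                r[i][j] -= s * v[i - k]
        s = 2.0 * sum(v[i - k] * y[i] for i in range(k, n)) / vv
        for i in range(k, n):
            y[i] -= s * v[i - k]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution on R
        x[i] = (y[i] - sum(r[i][j] * x[j] for j in range(i + 1, n))) / r[i][i]
    return x

def phi(r):
    # Thin-plate kernel, one polyharmonic radial basis function (assumed choice).
    return r * r * math.log(r) if r > 0.0 else 0.0

def fit_polyharmonic(points, values):
    """Return an interpolant through (points, values) with a linear tail."""
    n, d = len(points), len(points[0])
    a = [[phi(math.dist(p, q)) for q in points] + [1.0, *p] for p in points]
    a.append([1.0] * n + [0.0] * (d + 1))                   # sum of weights = 0
    for k in range(d):
        a.append([p[k] for p in points] + [0.0] * (d + 1))  # first moments = 0
    coef = householder_qr_solve(a, list(values) + [0.0] * (d + 1))
    w, c = coef[:n], coef[n:]
    def surrogate(x):
        rbf = sum(wi * phi(math.dist(x, p)) for wi, p in zip(w, points))
        return rbf + c[0] + sum(ck * xk for ck, xk in zip(c[1:], x))
    return surrogate

def hierarchical_monte_carlo_max(f, bounds, levels=4, samples=300, shrink=0.5, seed=0):
    """Sample uniformly in a box, then shrink the box around the best point."""
    rng = random.Random(seed)
    box = [tuple(b) for b in bounds]
    best = None
    for _ in range(levels):
        for _ in range(samples):
            x = [rng.uniform(lo, hi) for lo, hi in box]
            fx = f(x)
            if best is None or fx > best[1]:
                best = (x, fx)
        box = [(max(lo0, c - shrink * (hi - lo) / 2), min(hi0, c + shrink * (hi - lo) / 2))
               for c, (lo, hi), (lo0, hi0) in zip(best[0], box, bounds)]
    return best

if __name__ == "__main__":
    # Made-up support points: two normalized hyperparameters on a 3x3 grid.
    pts = [(i / 2, j / 2) for i in range(3) for j in range(3)]
    vals = [-(x - 0.4) ** 2 - (y - 0.6) ** 2 for x, y in pts]
    surrogate = fit_polyharmonic(pts, vals)
    print(hierarchical_monte_carlo_max(surrogate, [(0.0, 1.0), (0.0, 1.0)]))
```

Householder reflections need no pivoting here even though the kernel matrix has zeros on its diagonal (`phi(0) = 0`), which is one practical reason a QR-based solve suits this kind of system.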

Assignees

Inventors

Classifications

  • Probabilistic graphical models, e.g. probabilistic networks

  • Validation; Performance evaluation; Active pattern learning techniques

  • Selection of the most significant subset of features

  • Convolutional networks [CNN, ConvNet]

  • Hyperparameter optimisation; Meta-learning; Learning-to-learn

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12468931B2 cover?
An embodiment includes identifying an initial plurality of sets of hyperparameter values at which to evaluate an objective function that relates hyperparameter values to performance values of a neural network. The embodiment also executes training processes on the neural network with the hyperparameters set to each of the initial sets of hyperparameter values such that the training process …
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 11, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).