System and method for divide-and-conquer checkpointing
US-11868771-B2 · Jan 9, 2024 · US
US12204362B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12204362-B2 |
| Application number | US-202318494357-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 25, 2023 |
| Priority date | Sep 13, 2016 |
| Publication date | Jan 21, 2025 |
| Grant date | Jan 21, 2025 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A system and method which allows the basic checkpoint-reverse-mode AD strategy (of recursively decomposing the computation to reduce storage requirements of reverse-mode AD) to be applied to arbitrary programs: not just programs consisting of loops, but programs with arbitrarily complex control flow. The method comprises (a) transforming the program into a formalism that allows convenient manipulation by formal tools, and (b) introducing a set of operators to allow computations to be decomposed by running them for a given period of time then pausing them, while treating the paused program as a value subject to manipulation.
Opening claim text (preview).
The invention claimed is: 1. A method for computing a gradient of a computation implemented as a computer program using a processor and a memory, the method comprising: a. representing the computation as a sequence of computational steps involving primitive computations; b. splitting the sequence of steps at a split point to generate at least a first portion of the computation and a second portion of the computation; c. computing an intermediate state of the computation at the split point by applying the first portion of the computation prior to the split point; and computing a value of the entire computation and the gradient of the computation at the split point using the intermediate state of the computation and the second portion of the computation. 2. The method of claim 1 , the computing the value of the entire computation and the gradient of the computation at the split point further comprising: d. recursively applying the method to the second portion of the computation beginning at the split point to compute both the value of the entire computation and the gradient of the computation at the split point; and e. recursively applying the method to the first portion of the computation up to the split point to compute a gradient of the computation at an input, when a number of computational steps exceeds a minimum. 3. The method of claim 2 , the recursively applying the method to the first portion of the computation comprising: using automatic differentiation in its reverse-accumulation mode to compute the gradient of the computation at the input, when the number of computational steps does not exceed the minimum. 4. The method of claim 1 , wherein the split point is either a midpoint or a point selected based on a trade-off between computation and storage. 5. The method of claim 1 , wherein a functionality is exposed to a user through an application programmer interface comprising an operator or higher-order function that takes a function and its argument as input and returns as output the value of the function at the input and the gradient of the function at the input. 6. The method of claim 5 , wherein the operator or higher-order function is nested. 7. The method of claim 1 , wherein a general-purpose checkpointing mechanism is used to divide the sequence of computational steps into the first portion and the second portion, the general-purpose checkpointing mechanism comprising: evaluating a function at its input and return both the value at its output and the number of steps required to compute the output: evaluating the first specified number of steps of a computation of a function applied to an argument and return the intermediate state of the computation after that specified number of steps as a reusable checkpoint; and resuming a checkpointed computation at its saved intermediate state and return the result of its computation. 8. The method of claim 7 , wherein the general-purpose checkpointing mechanism is provided by an interpreter written in continuation passing style. 9. The method of claim 7 , wherein the general-purpose checkpointing mechanism is provided by a compiler that generates code in continuation passing style. 10. The method of claim 7 , wherein the general-purpose checkpointing mechanism is provided using the POSIX fork( ) primitive. 11. The method of claim 7 , wherein the general-purpose checkpointing mechanism proceeds without knowledge of an amount of computation in the primal computation, and functions in an online fashion, discarding previously acquired checkpoints so as to maintain the logarithmic overhead in both time and space regardless of when the primal computation terminates.
Function evaluation by approximation methods, e.g. inter- or extrapolation, smoothing, least mean square method ({G06F17/18 takes precedence } ; interpolation for numerical control G05B19/18) · CPC title
Differential equations (using digital differential analysers G06F7/64) · CPC title
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title
Saving or restoring of program or task context · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.