Sliding window memory optimizations for time-series foundation models

US2025117550A1 · US · A1

Patent metadata
Publication number: US-2025117550-A1
Application number: US-202318377564-A
Country: US
Kind code: A1
Filing date: Oct 6, 2023
Priority date: Oct 6, 2023
Publication date: Apr 10, 2025
Grant date: (none listed)

Abstract

Official abstract text for this publication.

An embodiment senses a raw data sequence by a central processing unit, responsive to the raw data sequence, computes by the central processing unit a transfer data size of the raw data sequence based at least in part on the comparison of a data size of the raw data sequence to a memory size of a graphics processing unit. The embodiment transfers by the central processing unit of the raw data sequence to the graphics processing unit based on the transfer data size. The embodiment trains a foundation model on the raw data sequence where a sliding window algorithm is executed on the raw data sequence by the graphics processing unit, where generating a window of the sliding window algorithm is based on a memory pointer to the raw data sequence.
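
The pointer-based window generation described in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: it runs on the CPU using only the Python standard library, and the function name `sliding_windows` and its `window` and `stride` parameters are hypothetical. Each window is a `memoryview` slice, that is, an offset and length into the original buffer rather than a copy of the data.

```python
from array import array


def sliding_windows(buffer, window, stride=1):
    """Yield fixed-length windows over `buffer` as memoryview slices.

    Each slice references the original memory (offset + length), so no
    element of the sequence is copied when a window is produced.
    """
    view = memoryview(buffer)
    for start in range(0, len(view) - window + 1, stride):
        yield view[start:start + window]


# Hypothetical raw data sequence of six float64 samples.
series = array("d", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
windows = list(sliding_windows(series, window=4, stride=1))

# Writing to the underlying buffer is visible through every overlapping
# window, which shows that the windows share storage with the sequence.
series[2] = 99.0
print([w.tolist() for w in windows])
```

Because overlapping windows share storage, memory use stays close to the size of the raw sequence instead of growing with the number of windows. On a GPU the same idea would typically be expressed as strided tensor views, which this sketch does not attempt to show.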

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: sensing a raw data sequence by a central processing unit, responsive to the raw data sequence, computing by the central processing unit a transfer data size of the raw data sequence based at least in part on a comparison of a data size of the raw data sequence to a memory size of a graphics processing unit; transferring by the central processing unit of the raw data sequence to the graphics processing unit based on the transfer data size; and training a foundation model on the raw data sequence wherein a sliding window algorithm is executed on the raw data sequence by the graphics processing unit, wherein generating a window of the sliding window algorithm is based on a memory pointer to the raw data sequence.

2. The computer-implemented method of claim 1, wherein generating the window of the sliding window algorithm comprises a zero-copy operation.

3. The computer-implemented method of claim 1, wherein transferring by the central processing unit of the raw data sequence further comprises partitioning the raw data sequence by the central processing unit determined by the transfer data size wherein the transfer data size is less than the memory size of a graphics processing unit.

4. The computer-implemented method of claim 1, wherein transferring by the central processing unit of the raw data sequence further comprises partitioning the raw data sequence by the central processing unit determined by the transfer data size wherein the transfer data size is based on a batch size and a sequence length determined in part by a training parameter of the foundation model.

5. The computer-implemented method of claim 1, wherein the raw data sequence transferred by the central processing unit to the graphics processing unit is a full dataset.

6. The computer-implemented method of claim 1, wherein training the foundation model further comprises executing a graph-based operation during a forward pass of the training.

7. The computer-implemented method of claim 1, wherein training the foundation model on the raw data sequence further comprises converting the raw data sequence into a tensor on the graphics processing unit.

8. A computer program product comprising one or more computer readable storage media, and program instructions collectively stored on the one or more computer readable storage media, the program instructions executable by a processor to cause the processor to perform operations comprising: sensing a raw data sequence by a central processing unit, responsive to the raw data sequence, computing by the central processing unit a transfer data size of the raw data sequence based at least in part on a comparison of a data size of the raw data sequence to a memory size of a graphics processing unit; transferring by the central processing unit of the raw data sequence to the graphics processing unit based on the transfer data size; and training a foundation model on the raw data sequence wherein a sliding window algorithm is executed on the raw data sequence by the graphics processing unit, wherein generating a window of the sliding window algorithm is based on a memory pointer to the raw data sequence.

9. The computer program product of claim 8, wherein generating the window of the sliding window algorithm comprises a zero-copy operation.

10. The computer program product of claim 8, wherein transferring by the central processing unit of the raw data sequence further comprises partitioning the raw data sequence by the central processing unit determined by the transfer data size wherein the transfer data size is less than the memory size of a graphics processing unit.

11. The computer program product of claim 8, wherein transferring by the central processing unit of the raw data sequence further comprises partitioning the raw data sequence by the central processing unit determined by the transfer data size wherein the transfer data size is based on a batch size and a sequence length determined in part by a training parameter of the foundation model.

12. The computer program product of claim 8, wherein the raw data sequence transferred by the central processing unit to the graphics processing unit is a full dataset.

13. The computer program product of claim 8, wherein training the foundation model further comprises executing a graph-based operation during a forward pass of the training.

14. The computer program product of claim 8, wherein training the foundation model on the raw data sequence further comprises converting the raw data sequence into a tensor on the graphics processing unit.

15. A computer system comprising a processor and one or more computer readable storage media, and program instructions collectively stored on the one or more computer readable storage media, the program instructions executable by the processor to cause the processor to perform operations comprising: sensing a raw data sequence by a central processing unit, responsive to the raw data sequence, computing by the central processing unit a transfer data size of the raw data sequence based at least in part on a comparison of a data size of the raw data sequence to a memory size of a graphics processing unit; transferring by the central processing unit of the raw data sequence to the graphics processing unit based on the transfer data size; and training a foundation model on the raw data sequence wherein a sliding window algorithm is executed on the raw data sequence by the graphics processing unit, wherein generating a window of the sliding window algorithm is based on a memory pointer to the raw data sequence.

16. The computer system of claim 15, wherein generating the window of the sliding window algorithm comprises a zero-copy operation.

17. The computer system of claim 15, wherein transferring by the central processing unit of the raw data sequence further comprises partitioning the raw data sequence by the central processing unit determined by the transfer data size wherein the transfer data size is less than the memory size of a graphics processing unit.

18. The computer system of claim 15, wherein transferring by the central processing unit of the raw data sequence further comprises partitioning the raw data sequence by the central processing unit determined by the transfer data size wherein the transfer data size is based on a batch size and a sequence length determined in part by a training parameter of the foundation model.

19. The computer system of claim 15, wherein the raw data sequence transferred by the central processing unit to the graphics processing unit is a full dataset.

20. The computer system of claim 15, wherein training the foundation model on the raw data sequence further comprises converting the raw data sequence into a tensor on the graphics processing unit.
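
The transfer-size and partitioning steps recited in claims 1 and 3 to 5 can be sketched as follows. This is one possible reading under stated assumptions, not the claimed method itself: the function names, the `headroom` fraction reserved for model state, and the choice to round each partition down to a whole number of batches are all hypothetical. The sketch is CPU-only and works on byte counts, so it never touches a real GPU.

```python
def compute_transfer_size(data_bytes, gpu_memory_bytes, batch_size,
                          seq_len, itemsize, headroom=0.5):
    """Return how many bytes of the raw sequence to send per transfer.

    `headroom` (an assumption) is the fraction of GPU memory that may be
    used for raw data; the rest is left for weights and activations.
    """
    budget = int(gpu_memory_bytes * headroom)
    if data_bytes <= budget:
        # The comparison shows the sequence fits: send the full dataset.
        return data_bytes
    # Otherwise size each transfer as a whole number of batches, where one
    # batch holds batch_size sequences of seq_len samples.
    batch_bytes = batch_size * seq_len * itemsize
    if batch_bytes > budget:
        raise ValueError("a single batch does not fit in the GPU budget")
    return (budget // batch_bytes) * batch_bytes


def partition(data_bytes, transfer_bytes):
    """Split [0, data_bytes) into consecutive (start, end) byte ranges."""
    return [(start, min(start + transfer_bytes, data_bytes))
            for start in range(0, data_bytes, transfer_bytes)]


# Hypothetical numbers: a 10,000-byte sequence, a 4,000-byte GPU,
# batches of 4 sequences x 8 float64 samples.
size = compute_transfer_size(10_000, 4_000, batch_size=4, seq_len=8, itemsize=8)
chunks = partition(10_000, size)
print(size, chunks)
```

A real pipeline would also need to handle windows that straddle a partition boundary, for example by overlapping consecutive partitions by one window length; the sketch leaves that out.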

Assignees

IBM

Inventors

Classifications

  • Processor architectures; Processor configuration, e.g. pipelining · CPC title

  • G06F30/27 · Primary

    using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025117550A1 cover?
An embodiment senses a raw data sequence by a central processing unit, responsive to the raw data sequence, computes by the central processing unit a transfer data size of the raw data sequence based at least in part on the comparison of a data size of the raw data sequence to a memory size of a graphics processing unit. The embodiment transfers by the central processing unit of the raw data se…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F30/27. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 10, 2025 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).