Method and apparatus for data caching

US11803475B2 · US · B2

Patent metadata
Publication number: US-11803475-B2
Application number: US-201917640276-A
Country: US
Kind code: B2
Filing date: Nov 28, 2019
Priority date: Sep 3, 2019
Publication date: Oct 31, 2023
Grant date: Oct 31, 2023

Abstract

Official abstract text for this publication.

The present invention provides a method and apparatus for data caching. The method comprises: output matrixes are acquired one by one; the acquired output matrixes are written alternately into two queue sets of a first cache unit, in the sequence in which they are acquired; the output matrixes stored line by line in the first cache unit are written into a second cache unit one by one; in the sequence in which the output matrixes are written into the second cache unit, valid data of each output matrix in the second cache unit is determined one by one according to preset parameters and written into a third cache unit; and the valid data stored in the third cache unit is configured to be written sequentially into a memory, in the sequence in which it was written into the third cache unit. In the present solution, the output matrixes are cached in cache units whose writing speed matches the computing speed of a processor, and the output matrixes are written completely into the memory one by one in order of generation time. The invention may therefore solve the problem that the computing speed of the processor does not match the writing speed of the memory.
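
The data flow described in the abstract can be sketched in a few lines of Python. This is only an illustrative model, not the patented hardware: the names `PingPongCache`, `cache_pipeline` and `select_valid` are hypothetical, the stages run sequentially rather than concurrently, and `select_valid` stands in for whatever rule the "preset parameters" define.

```python
from collections import deque


class PingPongCache:
    """Hypothetical model of the first cache unit: two queue sets of n row queues."""

    def __init__(self, n):
        self.queue_sets = [[deque() for _ in range(n)] for _ in range(2)]
        self.target = 0  # queue set that receives the next output matrix

    def write(self, matrix):
        """Write each row into its own queue of the target set, then switch sets."""
        used = self.target
        for queue, row in zip(self.queue_sets[used], matrix):
            queue.extend(row)
        self.target = 1 - used  # the next matrix goes to the other queue set
        return used

    def read(self, set_index):
        """Read a stored matrix back line by line and empty the queue set."""
        rows = []
        for queue in self.queue_sets[set_index]:
            rows.append(list(queue))
            queue.clear()
        return rows


def cache_pipeline(matrices, n, select_valid):
    """Pass matrices through three cache stages, preserving acquisition order."""
    first = PingPongCache(n)
    second = deque()   # second cache unit: whole matrices, FIFO
    third = deque()    # third cache unit: valid data only, FIFO
    memory = []
    used_sets = []     # recorded only to show the alternation between sets
    for matrix in matrices:
        used = first.write(matrix)
        used_sets.append(used)
        second.append(first.read(used))
        third.append(select_valid(second.popleft()))
    while third:
        memory.append(third.popleft())
    return memory, used_sets
```

Calling `cache_pipeline(mats, 2, lambda m: m[0])` on three 2x2 matrices keeps the first row of each as "valid data" and returns the results in the order the matrices were acquired, with the queue sets used in the pattern 0, 1, 0.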

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for data caching, comprising: acquiring an output matrix from a processor, wherein the output matrix is an N-order matrix, and N is a positive integer; respectively writing N rows of data of the output matrix into N first-level cache queues of a target queue set of a first cache unit; wherein the first cache unit is preconfigured with two queue sets, the target queue set is the queue set that is not used to store a previous output matrix of the output matrix in the two queue sets; and the writing speed of the first cache unit matches with the computing speed of the processor; after the previous output matrix of the output matrix stored in the first cache unit is written into a second cache unit, writing the data of the output matrix stored in the target queue set into the second cache unit line by line, so as to write the output matrix into the second cache unit; wherein the writing speed of the second cache unit matches with the computing speed of the processor; and after valid data of the previous output matrix of the output matrix stored in the second cache unit is written into a third cache unit, determining valid data in the output matrix according to preset parameters, and writing the valid data of the output matrix into the third cache unit; wherein the valid data of a plurality of output matrixes in the third cache unit is configured to be sequentially written into a memory in a sequence in which the output matrixes are acquired, and wherein the writing speed of the third cache unit matches with the computing speed of the processor.

2. The method according to claim 1, wherein, the output matrix is an output matrix obtained by convolution computation using a two-dimensional systolic array during the computing process of a convolutional neural network; before respectively writing N rows of data of the output matrix into N first-level cache queues of a target queue set of a first cache unit, the method further comprises: rearranging the data matrix according to a preset data storage sequence, to obtain an output matrix after rearranging; respectively writing N rows of data of the output matrix into N first-level cache queues of a target queue set of a first cache unit comprises: respectively writing N rows of data of an output matrix after rearranging into N first-level cache queues of a target queue set of a first cache unit.

3. The method according to claim 1, wherein, the method further comprises the following step before respectively writing N rows of data of the output matrix into N first-level cache queues of a target queue set of a first cache unit: deleting redundant data of the output matrix, to obtain a filtered output matrix; respectively writing N rows of data of the output matrix into N first-level cache queues of a target queue set of a first cache unit comprises: writing the filtered output matrix into a target queue set of a first cache unit, wherein M rows of data of the filtered output matrix are respectively stored in M cache queues of the target queue set, wherein M is a positive integer less than or equal to N.

4. The method according to claim 1, wherein, the output matrix is an output matrix obtained by convolution computation using a two-dimensional systolic array during the computing process of the convolutional neural network; determining the valid data in the output matrix according to preset parameters comprises: determining valid data in the output matrix according to a preset step size in the neural network.

5. The method according to claim 2, wherein, the process of performing convolution computation by using a two-dimensional systolic array to obtain an output matrix comprises: splitting input data of a convolutional layer into a plurality of input matrixes; and performing convolution computation on the input matrix using a two-dimensional systolic array aiming at each input matrix, to obtain an output matrix corresponding to the input matrix.

6. The method according to claim 4, wherein, the process of performing convolution computation by using a two-dimensional systolic array to obtain an output matrix comprises: splitting input data of a convolutional layer into a plurality of input matrixes; and performing convolution computation on the input matrix using a two-dimensional systolic array aiming at each input matrix, to obtain an output matrix corresponding to the input matrix.

Assignees

Inspur Electronic Information Industry Co Ltd

Inventors

Classifications

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Overlapped cache accessing, e.g. pipeline (G06F12/0846 takes precedence) · CPC title

  • with two or more cache hierarchy levels (with multilevel cache hierarchies G06F12/0811) · CPC title

  • Latency reduction · CPC title

  • G06F3/0656 · Primary

    Data buffering arrangements · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11803475B2 cover?
The present invention provides a method and apparatus for data caching. The method comprises: output matrixes are acquired one by one, a plurality of acquired output matrixes are written alternately into two queue sets of a first cache unit according to a sequence in which the output matrixes are acquired, and the output matrixes stored line by line in a first cache unit are written into a seco…
Who is the assignee on this patent?
Inspur Electronic Information Industry Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F12/0855. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 31, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).