Neural network processor for handling differing datatypes
US-2019340489-A1 · Nov 7, 2019 · US
US12254405B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12254405-B2 |
| Application number | US-202117200090-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 12, 2021 |
| Priority date | Mar 12, 2021 |
| Publication date | Mar 18, 2025 |
| Grant date | Mar 18, 2025 |
Official abstract text for this publication.
Technologies are provided for processing data in neural networks. An example method can include processing, by each layer of a neural network, a row in a first stripe of a data input, the row being processed sequentially in a horizontal direction and according to a layer-by-layer sequence; after processing the row, processing, by each layer, subsequent rows in the first stripe on a row-by-row basis, each subsequent row being processed sequentially in the horizontal direction and according to the layer-by-layer sequence; generating an output stripe based on the processing of the row and subsequent rows; processing, by each layer, a second stripe of the data input, each row in the second stripe being processed in the horizontal direction and according to the layer-by-layer sequence, wherein rows in the second stripe are processed on a row-by-row basis; and generating another output stripe based on the processing of the second stripe.
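The traversal order described in the abstract can be illustrated with a minimal sketch. It is a toy model, not the patented implementation: `process_input`, the nested-list data layout, and the use of simple per-block functions as "layers" are all assumptions made for illustration. The sketch only shows the loop nesting (stripe, then row, then layer, then blocks left to right) and records the visiting order in a trace.

```python
from typing import Callable, List, Tuple

Block = float                      # hypothetical: one block is a single number
Row = List[Block]                  # a row is a left-to-right list of blocks
Stripe = List[Row]                 # a stripe is a list of rows
Layer = Callable[[Block], Block]   # hypothetical: a layer maps block -> block


def process_input(
    stripes: List[Stripe], layers: List[Layer]
) -> Tuple[List[Stripe], List[Tuple[int, int, int, int]]]:
    """Process stripes one at a time, rows one at a time within a stripe,
    and layers in sequence for each row. Returns the output stripes and a
    trace of (stripe, row, layer, block) indices in visiting order."""
    trace = []
    output_stripes = []
    for s, stripe in enumerate(stripes):            # first stripe, then second, ...
        out_rows = []
        for r, row in enumerate(stripe):            # row-by-row within the stripe
            current = list(row)
            for l, layer in enumerate(layers):      # layer-by-layer for this row
                nxt = []
                for b, block in enumerate(current):  # horizontal direction
                    nxt.append(layer(block))
                    trace.append((s, r, l, b))
                current = nxt                       # output feeds the next layer
            out_rows.append(current)
        output_stripes.append(out_rows)             # one output stripe per input stripe
    return output_stripes, trace


if __name__ == "__main__":
    stripes = [[[1, 2], [3, 4]], [[5, 6]]]
    layers = [lambda x: x + 1, lambda x: x * 2]
    outputs, trace = process_input(stripes, layers)
    print(outputs)
    print(trace[:4])
```

In this sketch every layer finishes a row before the next row of the same stripe is started, and a stripe is completed before the next stripe begins, which mirrors the ordering the abstract describes.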
Opening claim text (preview).
What is claimed is:
1. An apparatus comprising: memory; and one or more processors coupled to the memory, the one or more processors being configured to: obtain input data comprising stripes that logically segment the input data, each stripe of the input data including at least one row of data; process, by each layer of a neural network, a row of data in a first stripe of the input data, wherein the row of data is processed sequentially in a horizontal direction and according to a layer-by-layer sequence where each preceding layer of the neural network processes a particular row of data before each subsequent layer of the neural network, wherein an input is generated for a particular layer of the neural network based on a combination of a stored output generated by a previous layer of the neural network for one or more blocks in the row of data in the first stripe and one or more stored lines of data from the one or more blocks in the row of data in the first stripe, and wherein an additional output is generated by the particular layer of the neural network based on the output generated by the previous layer for processing by a subsequent layer of the neural network; after processing the row of data in the first stripe, process, by each layer of the neural network, subsequent rows of data in the first stripe on a row-by-row basis, wherein each subsequent row of data is processed sequentially in the horizontal direction and according to the layer-by-layer sequence; generate, by the neural network, a first output stripe based on the processing of the row of data and the subsequent rows of data; after processing the first stripe, process, by each layer of the neural network, rows of data in a second stripe of the input data on a row-by-row basis, wherein each row of data in the second stripe is processed in the horizontal direction and according to the layer-by-layer sequence; and generate a second output stripe based on the processing of the second stripe.
2. The apparatus of claim 1, wherein, to process the row of data in the first stripe sequentially in the horizontal direction, the one or more processors are configured to: sequentially process, by each layer of the neural network, a plurality of blocks of data in the row, wherein each layer of the neural network processes each preceding block of data along a depth direction before processing a subsequent block along the horizontal direction.
3. The apparatus of claim 2, wherein, to process the subsequent rows, the one or more processors are configured to: sequentially process, by each layer of the neural network, a respective plurality of blocks of data in each subsequent row, wherein each layer of the neural network processes preceding blocks of data in a subsequent row along the depth direction before processing subsequent blocks of data in the subsequent row.
4. The apparatus of claim 1, wherein the neural network comprises a pixel-to-pixel neural network, and wherein the input data comprises pixels associated with an image.
5. The apparatus of claim 1, wherein, to obtain the input data, the one or more processors are configured to: logically segment the input data into the stripes, wherein each stripe comprises a respective portion of the input data.
6. The apparatus of claim 1, wherein the one or more processors are configured to: store, in a first memory, the output generated by the previous layer of the neural network for the one or more blocks in the row of data in the first stripe; and store, in a second memory associated with the particular layer of the neural network, the one or more lines of data from the one or more blocks in the row of data, wherein the one or more lines of data comprise a portion of a data input of the particular layer of the neural network on a previous stripe-block-row.
7. The apparatus of claim 6, wherein the one or more processors are configured to: determine a portion of the input for the particular layer; and store the portion of the input in a third memory associated with the particular layer of the neural network.
8. The apparatus of claim 7, wherein the one or more processors are configured to: generate an additional input for the subsequent layer of the neural network based on a combination of the additional output of the particular layer and the portion of the input for a subsequent layer from a previous stripe-block-row of the subsequent layer; and generate, by the subsequent layer of the neural network, a second additional output based on the additional input for the subsequent layer.
9. The apparatus of claim 8, wherein the one or more processors are configured to: store the second additional output in fourth memory, wherein the second memory and the third memory comprise line stores in a line store memory, and wherein the first memory and the fourth memory comprise buffers in scratch memory.
10. The apparatus of claim 1, wherein the one or more processors are configured to: store, in a first memory associated with a particular layer of the neural network, outputs generated by a previous layer of the neural network for one or more blocks in each subsequent row of data; and store, in a second memory, one or more lines of data from the one or more blocks in each subsequent row of data, wherein the one or more lines of data comprise one or more portions of one or more data inputs of the particular layer of the neural network.
11. The apparatus of claim 10, wherein the one or more processors are configured to: generate inputs for the particular layer of the neural network based on combinations of the outputs generated by the previous layer and the one or more lines of data comprising the one or more portions of the one or more data inputs of the particular layer for a previous stripe-block-row; and generate, by the particular layer of the neural network, additional outputs based on the outputs from the previous layer and a subset of the inputs from the particular layer on the previous stripe-block-row.
12. The apparatus of claim 11, wherein the one or more processors are configured to: determine portions of the inputs for the particular layer; store the portions of the inputs in a third memory associated with a subsequent layer of the neural network; and store the additional outputs in a fourth memory, wherein the second memory and the fourth memory comprise memory buffers.
13. The apparatus of claim 1, wherein, to process the row of data, the one or more processors are configured to: process, by each subsequent layer of the neural network, an output of a preceding layer of the neural network, the output corresponding to the row of data.
14. The apparatus of claim 1, wherein, to process the subsequent rows of data, the one or more processors are configured to: process, by each subsequent layer of the neural network, an output of a preceding layer of the neural network, the output corresponding to a subsequent row of data in the first stripe.
15. The apparatus of claim 1, wherein the apparatus comprises a camera device.
16. The apparatus of claim 1, wherein the apparatus comprises a mobile device.
17. A method comprising: obtaining input data comprising stripes that logically segment the input data, each stripe of the input data including at least one row of data; processing, by each layer of a neural network, a row of data in a first stripe of the input data, wherein the row of data is processed sequentially in a horizontal direction and according to a layer-by-layer sequence where each preceding layer of the neural network proce
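Claims 1 and 6 describe building a layer's input by combining the previous layer's stored output with stored lines of data kept from a previous stripe-block-row. One way to picture this is the rough sketch below, under stated assumptions: the class `LineStoreLayer`, its `line_store` attribute, and the choice of a vertical sum over `kernel_height` lines as the layer operation are hypothetical and do not come from the patent. The point is only that keeping the last `kernel_height - 1` input lines lets a layer process data one block-row at a time and still produce the same result as processing all lines at once.

```python
from typing import List

Line = List[float]


class LineStoreLayer:
    """Toy layer that sums `kernel_height` vertically adjacent lines.

    Hypothetical illustration: `line_store` stands in for the per-layer
    line store, and the argument to `process_block_row` stands in for the
    previous layer's output held in a scratch buffer.
    """

    def __init__(self, kernel_height: int) -> None:
        self.kernel_height = kernel_height
        self.line_store: List[Line] = []  # lines kept from the previous block-row

    def process_block_row(self, prev_layer_output: List[Line]) -> List[Line]:
        # Layer input = stored lines from the previous block-row
        # combined with the previous layer's output for this block-row.
        combined = self.line_store + prev_layer_output
        k = self.kernel_height
        out = [
            [sum(col) for col in zip(*combined[i:i + k])]
            for i in range(len(combined) - k + 1)
        ]
        # Keep the trailing k-1 input lines for the next block-row.
        self.line_store = combined[-(k - 1):] if k > 1 else []
        return out


if __name__ == "__main__":
    layer = LineStoreLayer(kernel_height=3)
    first = layer.process_block_row([[1, 1], [2, 2]])
    second = layer.process_block_row([[3, 3], [4, 4]])
    print(first, second)
```

In this toy setup, feeding the four lines in two block-rows yields the same output lines as feeding them in a single call, because the stored lines supply the vertical context that would otherwise be lost at the block-row boundary.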
Convolutional networks [CNN, ConvNet] · CPC title
using neural networks · CPC title
Partitioning the feature space · CPC title
using electronic means · CPC title
Learning methods · CPC title