Efficient memory layout for enabling smart data compression in machine learning environments

US10600147B2 · US · B2

Patent metadata
Publication number: US-10600147-B2
Application number: US-201715682795-A
Country: US
Kind code: B2
Filing date: Aug 22, 2017
Priority date: Aug 22, 2017
Publication date: Mar 24, 2020
Grant date: Mar 24, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A mechanism is described for facilitating efficient memory layout for enabling smart data compression in machine learning environments. A method of embodiments, as described herein, includes facilitating dividing an initial tile representing an image into primary multiple tiles such that each tile of the primary multiple tiles is regarded as an independent image as processed by one or more processors of a computing device. The method may further include computing the primary multiple tiles into secondary multiple tiles compatible in size of a local buffer. The method may further include merging the multiple secondary multiple tiles into a final tile representing the image, and compressing the final tile.
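The four steps named in the abstract (divide, compute, merge, compress) can be illustrated with a minimal sketch. This is not the patented implementation: the function names, the choice of 2x2 max pooling as the per-tile computation, the 8-bit pixel values, and the use of zlib as the lossless compressor are all assumptions made for illustration. The sketch also assumes the image divides evenly into tiles and that tile dimensions are even.

```python
import zlib


def split_into_tiles(image, tile_h, tile_w):
    """Divide a 2D image (list of rows) into a grid of independent tiles."""
    rows, cols = len(image), len(image[0])
    assert rows % tile_h == 0 and cols % tile_w == 0, "sketch assumes exact division"
    return [
        [[row[c:c + tile_w] for row in image[r:r + tile_h]]
         for c in range(0, cols, tile_w)]
        for r in range(0, rows, tile_h)
    ]


def max_pool_2x2(tile):
    """Process one tile on its own: 2x2 max pooling halves each dimension."""
    return [
        [max(tile[r][c], tile[r][c + 1], tile[r + 1][c], tile[r + 1][c + 1])
         for c in range(0, len(tile[0]), 2)]
        for r in range(0, len(tile), 2)
    ]


def merge_tiles(grid):
    """Stitch a grid of tiles back into a single 2D tile."""
    merged = []
    for tile_row in grid:
        for r in range(len(tile_row[0])):
            merged.append([v for tile in tile_row for v in tile[r]])
    return merged


def compress_tile(tile):
    """Losslessly compress 8-bit values (zlib is a stand-in compressor)."""
    raw = bytes(v for row in tile for v in row)
    return zlib.compress(raw)


def run_pipeline(image, tile_h, tile_w):
    """Divide, compute per tile, merge, then compress the merged result."""
    primary = split_into_tiles(image, tile_h, tile_w)
    secondary = [[max_pool_2x2(t) for t in tile_row] for tile_row in primary]
    final = merge_tiles(secondary)
    return final, compress_tile(final)
```

For a 4x4 image holding the values 1 through 16 in row order, `run_pipeline(image, 2, 2)` yields the final tile `[[6, 8], [14, 16]]`, and decompressing the returned bytes with `zlib.decompress` recovers those four values exactly. In this sketch each primary tile is processed without reading its neighbours, which is how the example models a tile being "regarded as an independent image".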

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: one or more processors to: divide an initial tile representing an image into primary multiple tiles such that each tile of the primary multiple tiles is regarded as an independent image as processed by the one or more processors of the apparatus; compute the primary multiple tiles into secondary multiple tiles compatible in size of a local buffer; merge the multiple secondary multiple tiles into a final tile representing the image; and compress the final tile.
2. The apparatus of claim 1, wherein the one or more processors are further to detect one or more of the initial tile and the size of the local buffer.
3. The apparatus of claim 1, wherein the one or more processors are further to predict at least one of one or more layers associated with computing or processing the primary multiple tiles into the secondary multiple tiles based on the size of the local buffer.
4. The apparatus of claim 3, wherein computing the primary multiple tiles into the secondary multiple tiles comprises convoluting or pooling the primary multiple tiles at each of the one or more layers.
5. The apparatus of claim 1, wherein compressing the final tile is lossless, wherein the compressed final tile is used by one or more hardware accelerators for performing machine learning or deep learning processing of the image associated with the compressed final tile.
6. The apparatus of claim 1, wherein the one or more processors comprise one or more of a graphics processor and an application processor, the graphics processor hosting the one or more hardware accelerators including one or more machine learning hardware accelerators.
7. The apparatus of claim 6, wherein the graphics processor is co-located with the application processor on a common semiconductor package.
8. A method comprising: dividing an initial tile representing an image into primary multiple tiles such that each tile of the primary multiple tiles is regarded as an independent image as processed by one or more processors of a computing device; computing the primary multiple tiles into secondary multiple tiles compatible in size of a local buffer; merging the multiple secondary multiple tiles into a final tile representing the image; and compressing the final tile.
9. The method of claim 8, further comprising detecting one or more of the initial tile and the size of the local buffer.
10. The method of claim 8, further comprising predicting at least one of one or more layers necessary for computation or processing of the primary multiple tiles into the secondary multiple tiles based on the size of the local buffer.
11. The method of claim 10, wherein computing the primary multiple tiles into the secondary multiple tiles comprises convoluting or pooling of the primary multiple tiles at each of the one or more layers.
12. The method of claim 8, wherein compressing the final tile is lossless, wherein the compressed final tile is used by one or more hardware accelerators for performing machine learning or deep learning processing of the image associated with the compressed final tile.
13. The method of claim 8, wherein the one or more processors comprise one or more of a graphics processor and an application processor, the graphics processor hosting the one or more hardware accelerators including one or more machine learning hardware accelerators.
14. The method of claim 13, wherein the graphics processor is co-located with the application processor on a common semiconductor package.
15. At least one non-transitory machine-readable storage medium storing a plurality of instructions executed on a computing device to facilitate the computing device to perform operations comprising: dividing an initial tile representing an image into primary multiple tiles such that each tile of the primary multiple tiles is regarded as an independent image as processed by one or more processors of the computing device; computing the primary multiple tiles into secondary multiple tiles compatible in size of a local buffer; merging the multiple secondary multiple tiles into a final tile representing the image; and compressing the final tile.
16. The non-transitory machine-readable storage medium of claim 15, wherein the operations further comprise detecting one or more of the initial tile and the size of the local buffer.
17. The non-transitory machine-readable storage medium of claim 15, wherein the operations further comprise predicting at least one of one or more layers necessary for computing or processing of the primary multiple tiles into the secondary multiple tiles based on the size of the local buffer.
18. The non-transitory machine-readable storage medium of claim 17, wherein computing the primary multiple tiles into the secondary multiple tiles comprises convoluting or pooling the primary multiple tiles at each of the one or more layers.
19. The non-transitory machine-readable storage medium of claim 15, wherein compressing the final tile is lossless, wherein the compressed final tile is used by one or more hardware accelerators for performing machine learning or deep learning processing of the image associated with the compressed final tile.
20. The non-transitory machine-readable storage medium of claim 15, wherein the one or more processors comprise one or more of a graphics processor and an application processor, the graphics processor hosting the one or more hardware accelerators including one or more machine learning hardware accelerators, wherein the graphics processor is co-located with the application processor on a common semiconductor package.
21. A system comprising: one or more processors to divide an initial tile representing an image into primary multiple tiles such that each tile of the primary multiple tiles is regarded as an independent image as processed by the one or more processors of the apparatus, compute the primary multiple tiles into secondary multiple tiles compatible in size of a local buffer, merge the multiple secondary multiple tiles into a final tile representing the image, and compress the final tile; and memory coupled to the one or more processors.
22. The system of claim 21, wherein the one or more processors are further to detect one or more of the initial tile and the size of the local buffer, wherein the one or more processors are further to predict at least one of one or more layers associated with computing or processing the primary multiple tiles into the secondary multiple tiles based on the size of the local buffer, wherein computing the primary multiple tiles into the secondary multiple tiles comprises convoluting or pooling the primary multiple tiles at each of the one or more layers.
23. The system of claim 21, wherein compressing the final tile is lossless, wherein the compressed final tile is used by one or more hardware accelerators for performing machine learning or deep learning processing of the image associated with the compressed final tile.
24. The system of claim 21, wherein the one or more processors comprise one or more of a graphics processor and an application processor, the graphics processor hosting the one or more hardware accelerators including one or more machine learning hardware accelerators, and wherein the graphics processor is co-located with the application processor on a common semiconductor package.
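Claims 3, 10, 17 and 22 tie the choice of layers to the size of the local buffer, and claims 4, 11 and 18 name convolution or pooling as the per-layer computation. One possible reading, offered only as a hedged sketch, is to count how many leading layers produce a per-tile output that still fits in the buffer. The function names, the `(kernel, stride, out_channels)` layer description, the square tiles, and the unpadded output-size formula are assumptions for illustration; the claim text does not specify any of them.

```python
def conv_output_edge(edge, kernel, stride):
    """Output edge length of an unpadded convolution or pooling window."""
    return (edge - kernel) // stride + 1


def layers_fitting_buffer(tile_edge, layers, buffer_bytes, bytes_per_value=1):
    """Count the leading layers whose output for one square tile fits the buffer.

    `layers` is a list of hypothetical (kernel, stride, out_channels) tuples.
    """
    edge, count = tile_edge, 0
    for kernel, stride, out_channels in layers:
        if edge < kernel:
            break  # tile is already smaller than the window
        edge = conv_output_edge(edge, kernel, stride)
        if edge * edge * out_channels * bytes_per_value > buffer_bytes:
            break  # this layer's output would overflow the local buffer
        count += 1
    return count
```

With a 32x32 tile and the layers `[(3, 1, 8), (2, 2, 8), (3, 1, 64)]`, the per-tile outputs occupy 7200, 1800 and 10816 bytes at one byte per value. An 8192-byte buffer therefore admits the first two layers, while a 16384-byte buffer admits all three.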

Assignees

Intel Corp

Inventors

Classifications

  • Filling planar surfaces by adding surface attributes, e.g. adding colours or textures · CPC title

  • Predictors, e.g. intraframe, interframe coding · CPC title

  • G06T1/60 · Primary

    Memory management · CPC title

  • Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks · CPC title

  • Learning methods · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10600147B2 cover?
A mechanism is described for facilitating efficient memory layout for enabling smart data compression in machine learning environments. A method of embodiments, as described herein, includes facilitating dividing an initial tile representing an image into primary multiple tiles such that each tile of the primary multiple tiles is regarded as an independent image as processed by one or more proc…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06T1/60. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 24, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).