What technology area does this patent fall under?

Primary CPC classification G06F12/023. Mapped technology areas include Physics.

When was this patent published?

Publication date Aug 6, 2024 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Compression of machine learning models utilizing pseudo-labeled data training

US12056906B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12056906-B2
Application number	US-202318466141-A
Country	US
Kind code	B2
Filing date	Sep 13, 2023
Priority date	Dec 30, 2017
Publication date	Aug 6, 2024
Grant date	Aug 6, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: one or more processors including a graphical processing unit (GPU); and a memory to store data, including data for machine learning; wherein the one or more processors are to: train an original model utilizing a training set of data, the original model being a machine learning model, perform inference with the original model using a set of unlabeled data examples to generate a set of outputs, generate a set of pseudo labels for the unlabeled data examples based on the generated set of outputs from the trained original model, and generating a pseudo-labeled data set using the unlabeled data and the generated pseudo labels, compress the original model to generate a compressed model, and train the compressed model at a lower precision than a precision of the original model, the training of the compressed model utilizing the pseudo-labeled data set.

2. The apparatus of claim 1, wherein the one or more processors are further to perform one or more additional iterations of model compression, including: further compress the trained compressed model to generate a second compressed model; and train the second compressed model utilizing the pseudo-labeled data set.

3. The apparatus of claim 1, wherein the one or more processors are further to: evaluate accuracy of the compressed model by performing inference with the trained compressed model utilizing a validation set of data.

4. The apparatus of claim 1, wherein compressing the trained original model includes one or more of: reducing a number of layers from the trained original model; or reducing a width of one or more layers of the trained original model.

5. The apparatus of claim 1, wherein performing inference with the original model using the set of unlabeled data examples includes collecting the generated set of outputs as a vector of values.

6. The apparatus of claim 5, wherein training the compressed model includes teaching the compressed model to generate the vector of values.

7. The apparatus of claim 1, wherein one or more of the unlabeled data examples are generated based on one or more data elements of the training set of data.

8. The apparatus of claim 1, wherein the compressed model is trained for installation in an edge device for a system.

9. A non-transitory computer-readable storage medium having stored thereon data representing sequences of instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising: training an original model utilizing a training set of data, the original model being a machine learning model; performing inference with the original model using a set of unlabeled data examples to generate a set of outputs; generating a set of pseudo labels for the unlabeled data examples based on the generated set of outputs from the trained original model, and generating a pseudo-labeled data set using the unlabeled data and the generated pseudo labels; compressing the original model to generate a compressed model; and training the compressed model at a lower precision than a precision of the original model, the training of the compressed model utilizing the pseudo-labeled data set.

10. The medium of claim 9, further comprising instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising: performing one or more additional iterations of model compression, including: further compressing the trained compressed model to generate a second compressed model; and training the second compressed model utilizing the pseudo-labeled data set.

11. The medium of claim 9, further comprising instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising: evaluate accuracy of the compressed model by performing inference with the trained compressed model utilizing a validation set of data.

12. The medium of claim 9, wherein compressing the trained original model includes one or more of: reducing a number of layers from the trained original model; or reducing a width of one or more layers of the trained original model.

13. The medium of claim 9, wherein performing inference with the original model using the set of unlabeled data examples includes collecting the generated set of outputs as a vector of values.

14. The medium of claim 13, wherein training the compressed model includes teaching the compressed model to generate the vector of values.

15. A method comprising: training an original model in a computing system utilizing a training set of data, the original model being a machine learning model; performing inference with the original model using a set of unlabeled data examples to generate a set of outputs; generating a set of pseudo labels for the unlabeled data examples based on the generated set of outputs from the trained original model, and generating a pseudo-labeled data set using the unlabeled data and the generated pseudo labels; compressing the original model to generate a compressed model; and training the compressed model at a lower precision than a precision of the original model, the training of the compressed model utilizing the pseudo-labeled data set.

16. The method of claim 15, further comprising: performing one or more additional iterations of model compression, including: further compressing the trained compressed model to generate a second compressed model; and training the second compressed model utilizing the pseudo-labeled data set.

17. The method of claim 15, further comprising: evaluate accuracy of the compressed model by performing inference with the trained compressed model utilizing a validation set of data.

18. The method of claim 15, wherein compressing the trained original model includes one or more of: reducing a number of layers from the trained original model; or reducing a width of one or more layers of the trained original model.

19. The method of claim 15, wherein performing inference with the original model using the set of unlabeled data examples includes collecting the generated set of outputs as a vector of values.

20. The method of claim 19, wherein training the compressed model includes teaching the compressed model to generate the vector of values.
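
Illustrative sketch (not part of the patent text). The flow recited in the independent claims can be approximated in a few lines of standard-library Python. Everything here is an assumption made for illustration: the toy data, the logistic models, the helper names (`train`, `predict`, `quantize`), and the step size `STEP`. In particular, dropping one input stands in for the layer or width reduction of claim 4, and rounding weights to a coarse grid during training stands in for "lower precision"; the patent does not specify either choice.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def quantize(w, step):
    """Round each weight to the nearest multiple of `step`."""
    return [round(wi / step) * step for wi in w]

def predict(w, x):
    """Logistic output; w[0] is the bias. zip() ignores inputs beyond len(w) - 1."""
    return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))

def train(n_inputs, data, step=None, epochs=200, lr=0.5):
    """Full-batch gradient descent on (x, target) pairs.

    Targets may be hard labels or soft pseudo labels in [0, 1]. When `step`
    is given, the forward pass uses weights rounded to multiples of `step`
    (emulated low precision) and the rounded weights are returned.
    """
    w = [0.0] * (n_inputs + 1)
    for _ in range(epochs):
        fw = quantize(w, step) if step else w
        grad = [0.0] * len(w)
        for x, target in data:
            err = predict(fw, x) - target
            grad[0] += err
            for i in range(n_inputs):
                grad[i + 1] += err * x[i]
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad)]
    return quantize(w, step) if step else w

rng = random.Random(0)

def sample(n):
    """Hypothetical toy inputs: three features drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n)]

def true_label(x):
    return 1.0 if x[0] + x[1] > 0 else 0.0  # the third feature is pure noise

train_set = [(x, true_label(x)) for x in sample(200)]
unlabeled = sample(500)
validation = sample(200)

# Train the original model on the labeled training set.
original = train(3, train_set)

# Run inference on unlabeled examples; the outputs become the pseudo labels.
pseudo_labeled = [(x, predict(original, x)) for x in unlabeled]

# Compress (fewer inputs) and train at emulated lower precision on pseudo labels.
STEP = 0.125
compressed = train(2, pseudo_labeled, step=STEP)

# Evaluate the compressed model on a validation set (compare claim 3).
agreement = sum(
    (predict(original, x) > 0.5) == (predict(compressed, x) > 0.5)
    for x in validation
) / len(validation)
accuracy = sum(
    (predict(compressed, x) > 0.5) == (true_label(x) > 0.5)
    for x in validation
) / len(validation)
print(f"agreement with original: {agreement:.2f}, validation accuracy: {accuracy:.2f}")
```

The compressed model never sees the original labels; it learns only from the original model's outputs, which is the role the pseudo-labeled data set plays in the claims. Repeating the last two steps on `compressed` would correspond loosely to the additional compression iterations of claims 2, 10, and 16.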

Assignees

Intel Corp

Classifications

G06T15/005
General purpose rendering architectures · CPC title
G06F2212/302
In image processor or graphics adapter · CPC title
G06F2212/401
Compressed data · CPC title
G06F12/023 · Primary
Free address space management · CPC title
G06T1/20
Processor architectures; Processor configuration, e.g. pipelining · CPC title

Patent family

Related publications grouped by family.

View patent family 64564589

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12056906B2 cover?: Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data wit…
Who is the assignee on this patent?: Intel Corp