Three-dimensional models of users wearing clothing items
US-2024071019-A1 · Feb 29, 2024 · US
US12518530B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12518530-B2 |
| Application number | US-202318205696-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 5, 2023 |
| Priority date | Jun 5, 2023 |
| Publication date | Jan 6, 2026 |
| Grant date | Jan 6, 2026 |
Official abstract text for this publication.
Methods and systems are disclosed for applying machine learning models to compressed videos. The system receives a video, depicting an object, that has previously been compressed using one or more video compression processes. The system analyzes, using one or more machine learning models, the video that has previously been compressed to generate a prediction corresponding to the object depicted in the video, with one or more artifacts resulting from application of the one or more machine learning models to the video that has been previously compressed being absent from the prediction. The system generates a visual output based on the prediction in which the one or more artifacts are absent.
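The flow in the abstract (compressed video in, per-object prediction out, visual output built from that prediction) can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the disclosed implementation: frames are tiny grayscale grids, `predict_mask` is a hypothetical threshold rule standing in for the one or more machine learning models, and `overlay` mimics the claim language about placing a virtual object over the depicted object. None of these names come from the patent.

```python
from typing import Callable, List

Frame = List[List[int]]    # grayscale frame: rows of 0..255 pixel values
Mask = List[List[bool]]    # True where the "object" is predicted to be


def predict_mask(frame: Frame, threshold: int = 128) -> Mask:
    """Hypothetical stand-in for the ML model: bright pixels count as the object."""
    return [[pixel >= threshold for pixel in row] for row in frame]


def overlay(frame: Frame, mask: Mask, virtual_value: int = 255) -> Frame:
    """Paint a (toy) virtual object over every pixel the prediction marks."""
    return [
        [virtual_value if hit else pixel for pixel, hit in zip(row, mask_row)]
        for row, mask_row in zip(frame, mask)
    ]


def augment_video(
    frames: List[Frame],
    model: Callable[[Frame], Mask] = predict_mask,
) -> List[Frame]:
    """Run the model on each already-decoded frame and build the visual output."""
    return [overlay(frame, model(frame)) for frame in frames]
```

The sketch shows only the data flow. It does not attempt the property the abstract emphasizes, namely that the prediction is free of artifacts even though the input was previously compressed; in the claims that property is tied to how the models are trained.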
Opening claim text (preview).
What is claimed is:
1. A method comprising: receiving a compressed video, depicting an object, the compressed video having been previously compressed using one or more video compression processes; analyzing, using one or more machine learning models, the compressed video to generate a prediction corresponding to the object depicted in the compressed video, one or more artifacts resulting from application of the one or more machine learning models to the compressed video that has been previously compressed being absent from the prediction; generating, by the one or more machine learning models based on the prediction, an augmented version of the compressed video in which a virtual object is overlaid on the object depicted in the compressed video; and generating a visual output comprising the augmented version of the compressed video in which the one or more artifacts are absent.
2. The method of claim 1, wherein the one or more machine learning models comprise a classifier, and wherein the prediction comprises a classification of the object.
3. The method of claim 2, wherein the classification indicates a type of the object.
4. The method of claim 2, wherein the classification indicates whether an object is real or fake.
5. The method of claim 1, wherein the one or more machine learning models comprise a convolutional neural network associated with a fashion item extended reality experience.
6. The method of claim 5, further comprising: receiving a target fashion item associated with the fashion item extended reality experience, wherein the prediction comprises a new video that depicts the object wearing the target fashion item.
7. The method of claim 1, wherein the one or more machine learning models are trained by performing training operations comprising: accessing training data comprising a training pair of a training compressed video depicting a training object and a ground truth data associated with the training compressed video; and analyzing, using the one or more machine learning models, the training compressed video to estimate a prediction for the training object.
8. The method of claim 7, the training operations comprising: computing a loss based on a deviation between the estimated prediction for the training object and the ground truth data associated with the compressed video; and updating one or more parameters of the one or more machine learning models based on the computed loss.
9. The method of claim 8, further comprising repeating the training operations for additional training data until a stopping criterion is met.
10. The method of claim 9, wherein the one or more artifacts resulting from application of the one or more machine learning models to the training compressed video are excluded.
11. The method of claim 8, further comprising: accessing a first training image; generating a training video using the first training image; applying the one or more video compression processes to the training video to generate the training compressed video comprising a set of artifacts; storing the training video as the ground truth data for the training compressed video in which the set of artifacts is absent; and forming the training pair comprising the training compressed video and the ground truth data.
12. The method of claim 8, wherein the ground truth data associated with the training compressed video comprises a first sequence of frames depicting a person wearing a fashion item, and wherein the training compressed video comprises a second sequence of frames depicting the person wearing the fashion item with a set of artifacts.
13. The method of claim 12, wherein the second sequence of frames is generated by applying the one or more video compression processes to the first sequence of frames.
14. The method of claim 13, wherein the fashion item comprises a virtual fashion item, further comprising generating the first sequence of frames by: applying a fashion item machine learning model to an individual image to overlay the virtual fashion item on the person depicted in the individual image; and replicating the individual image overlaid with the virtual fashion item a threshold quantity of times corresponding to a quantity of the first sequence of frames.
15. A system comprising: at least one processor; and at least one memory component having instructions stored thereon that, when executed by the at least one processor, cause the at least one processor to perform operations comprising: receiving a compressed video, depicting an object, the compressed video having been previously compressed using one or more video compression processes; analyzing, using one or more machine learning models, the compressed video to generate a prediction corresponding to the object depicted in the compressed video, one or more artifacts resulting from application of the one or more machine learning models to the compressed video that has been previously compressed being absent from the prediction; generating, by the one or more machine learning models based on the prediction, an augmented version of the compressed video in which a virtual object is overlaid on the object depicted in the compressed video; and generating a visual output comprising the augmented version of the compressed video in which the one or more artifacts are absent.
16. The system of claim 15, wherein the one or more machine learning models comprise a classifier, and wherein the prediction comprises a classification of the object.
17. The system of claim 16, wherein the classification indicates a type of the object.
18. The system of claim 16, wherein the one or more machine learning models are trained by performing training operations comprising: accessing training data comprising a training pair of a training compressed video depicting a training object and a ground truth data associated with the training compressed video; analyzing, using the one or more machine learning models, the training compressed video to estimate a prediction for the training object; computing a loss based on a deviation between the estimated prediction for the training object and the ground truth data associated with the compressed video; and updating one or more parameters of the one or more machine learning models based on the computed loss.
19. The system of claim 18, wherein the training operations comprise: accessing a first training image; generating a training video using the first training image; applying the one or more video compression processes to the training video to generate the training compressed video comprising a set of artifacts; storing the training video as the ground truth data for the training compressed video in which the set of artifacts is absent; and forming the training pair comprising the training compressed video and the ground truth data.
20. A non-transitory computer-readable storage medium having stored thereon instructions that, when executed by at least one processor, cause the at least one processor to perform operations comprising: receiving a compressed video, depicting an object, the compressed video having been previously compressed using one or more video compression processes; analyzing, using one or more machine learning models, the compressed video to generate a prediction corresponding to the object depicted in the compressed video, one or more artifacts resulting from application of the one or more machine lea
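The training-related claims (7 to 9, 11 and 14) describe building a training pair from a single image and then fitting the model against the uncompressed ground truth. The Python sketch below walks through that sequence under loud assumptions: all function names are hypothetical, `toy_compress` is a crude quantization step standing in for a real video codec, and the "model" is a single learned offset rather than a neural network. It is meant only to make the order of operations concrete.

```python
from typing import List, Tuple

Frame = List[List[float]]   # grayscale frame: rows of pixel values in 0..255
Video = List[Frame]


def replicate_frames(image: Frame, num_frames: int) -> Video:
    """Copy one image num_frames times to form a clean video (cf. claim 14)."""
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    return [[row[:] for row in image] for _ in range(num_frames)]


def toy_compress(video: Video, step: int = 32) -> Video:
    """Assumed stand-in for a codec: floor each pixel to a multiple of step."""
    return [
        [[float((int(p) // step) * step) for p in row] for row in frame]
        for frame in video
    ]


def make_training_pair(image: Frame, num_frames: int) -> Tuple[Video, Video]:
    """Return (compressed video with artifacts, clean ground truth) (cf. claim 11)."""
    clean = replicate_frames(image, num_frames)
    return toy_compress(clean), clean


def flatten(video: Video) -> List[float]:
    """List every pixel of every frame in a fixed order."""
    return [p for frame in video for row in frame for p in row]


def train_offset(
    compressed: Video,
    truth: Video,
    lr: float = 0.5,
    tol: float = 1e-9,
    max_steps: int = 1000,
) -> Tuple[float, float]:
    """Fit a one-parameter toy model: prediction = compressed pixel + offset."""
    x, y = flatten(compressed), flatten(truth)
    offset, prev_loss, loss = 0.0, float("inf"), float("inf")
    for _ in range(max_steps):
        errors = [(xi + offset) - yi for xi, yi in zip(x, y)]
        loss = sum(e * e for e in errors) / len(errors)   # deviation from ground truth
        if prev_loss - loss < tol:                        # stopping criterion (cf. claim 9)
            break
        grad = 2.0 * sum(errors) / len(errors)            # d(loss)/d(offset)
        offset -= lr * grad                               # parameter update (cf. claim 8)
        prev_loss = loss
    return offset, loss
```

A possible call is `compressed, truth = make_training_pair(image, 3)` followed by `train_offset(compressed, truth)`. A real system would replace the quantizer with an actual compression process and the offset with the network's parameters; the pairing of artifact-laden input with artifact-free target is the part this sketch is meant to show.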
using neural networks · CPC title
Means for inserting a foreground image in a background image, i.e. inlay, outlay · CPC title
for simulating a person's appearance, e.g. hair style, glasses, clothes · CPC title
Mixing · CPC title
Creating or editing images; Combining images with text · CPC title