Methods, devices, and computer readable media for training a keypoint estimation network using cGAN-based data augmentation

US12430905B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12430905-B2
Application number  US-202318315866-A
Country             US
Kind code           B2
Filing date         May 11, 2023
Priority date       May 11, 2021
Publication date    Sep 30, 2025
Grant date          Sep 30, 2025

Abstract

Official abstract text for this publication.

Method and devices for training a keypoint estimation network are described. In each training iteration, synthetic images are generated by a generator, each synthetic image being assigned respective assigned keypoints by the generator. Using a prior-iteration of the keypoint estimation network, a set of predicted keypoints is obtained for each synthetic image. Based on an error score between the predicted keypoints and the assigned keypoints, poor quality synthetic images are discarded. The remaining synthetic images, together with real world images, are used to train an updated keypoint estimation network. The performance of the updated keypoint estimation network is validated, and the training iterations are performed until a convergence criteria is satisfied.
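
The iteration described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not define the error score, so mean Euclidean distance per keypoint is assumed here, as is the rule that a score at or below the threshold passes. The names `error_score`, `filter_synthetic`, `train_iteratively`, `generate`, `train`, `validate`, `tol`, and `max_iters` are all hypothetical.

```python
import math
from typing import Callable, List, Tuple

Keypoints = List[Tuple[float, float]]
Sample = Tuple[str, Keypoints]        # (image id, assigned keypoints); hypothetical layout
Model = Callable[[str], Keypoints]    # maps an image to its predicted keypoints


def error_score(predicted: Keypoints, assigned: Keypoints) -> float:
    """Mean Euclidean distance per keypoint (an assumed metric, not from the source)."""
    return sum(math.dist(p, a) for p, a in zip(predicted, assigned)) / len(assigned)


def filter_synthetic(samples: List[Sample], model: Model, threshold: float) -> List[Sample]:
    """Keep synthetic samples whose error score is at or below the threshold."""
    return [(img, kps) for img, kps in samples
            if error_score(model(img), kps) <= threshold]


def train_iteratively(initial_model: Model,
                      generate: Callable[[], List[Sample]],
                      train: Callable[[List[Sample]], Model],
                      validate: Callable[[Model], float],
                      real_data: List[Sample],
                      threshold: float,
                      tol: float = 1e-3,
                      max_iters: int = 20) -> Model:
    """Repeat generate -> filter -> train -> validate until the mean error settles."""
    model = initial_model
    prev_mean_error = validate(model)
    for _ in range(max_iters):
        # The prior-iteration model scores the new synthetic images.
        synthetic = filter_synthetic(generate(), model, threshold)
        # The updated model is trained on real plus surviving synthetic data.
        model = train(real_data + synthetic)
        mean_error = validate(model)
        # Assumed convergence test: change in mean error below a tolerance.
        if abs(prev_mean_error - mean_error) < tol:
            break
        prev_mean_error = mean_error
    return model
```

Here the prior-iteration network acts as a quality gate for the generator's output, so each new network is trained only on synthetic images whose assigned keypoints the previous network can roughly reproduce.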

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for training a keypoint estimation network, the method comprising: performing a plurality of training iterations, each training iteration comprising: obtaining a set of synthetic images generated by a generator, each synthetic image being assigned a respective set of assigned keypoints by the generator; using a prior-iteration keypoint estimation network, obtaining a set of predicted keypoints for each synthetic image; based on computation of an error score between the set of predicted keypoints and the respective set of assigned keypoints for each respective synthetic image: identifying and discarding any synthetic image having an error score that fails a preset threshold; and identifying and adding, to a synthetic dataset, any synthetic image having an error score that satisfies the preset threshold; training an updated keypoint estimation network, using a combined dataset comprising the synthetic dataset combined with a real world dataset containing real world images; and computing a mean error score for the updated keypoint estimation network, the mean error score representing performance of the updated keypoint estimation network on a validation dataset; wherein the training iterations are performed until a convergence criteria is satisfied; and storing the updated keypoint estimation network from a final training iteration as a final keypoint estimation network.

2. The method of claim 1, wherein the generator is a pre-trained generator, trained using a conditional generative adversarial network (cGAN), to generate a synthetic image conditioned on a set of keypoints sampled from the real world dataset.

3. The method of claim 1, wherein the generator comprises a pre-trained sub-generator, trained using a conditional generative adversarial network (cGAN), to generate a synthetic feature vector conditioned on a set of keypoints sampled from the real world dataset, the generator further comprising a decoder to reconstruct a synthetic image from the synthetic feature vector.

4. The method of claim 1, further comprising, for each training iteration: comparing the mean error score for the updated keypoint estimation network with a mean error score computed for the prior-iteration keypoint estimation network; and in response to a determination that the mean error score for the updated keypoint estimation network is greater than the mean error score for the prior-iteration keypoint estimation network: replacing the updated keypoint estimation network with the prior-iteration keypoint estimation network; and retraining the replaced updated keypoint estimation network using the combined dataset.

5. The method of claim 1, further comprising, for each training iteration: based on computation of the error score between the set of predicted keypoints and the respective set of assigned keypoints for each respective synthetic image: identifying any synthetic image having an error score that satisfies the preset threshold and is within a preset margin of the threshold; replacing the set of assigned keypoints for the identified synthetic image with the set of predicted keypoints for the identified synthetic image; and adding the identified synthetic image to the synthetic dataset.

6. The method of claim 1, further comprising: fine-tuning the final keypoint estimation network using the real world dataset.

7. The method of claim 1, further comprising: storing the combined dataset as an augmented training dataset.

8. The method of claim 1, further comprising: prior to a first training iteration, training an initial keypoint estimation network using only real world images; wherein the initial keypoint estimation network is used as the prior-iteration keypoint estimation network for the first training iteration.

9. The method of claim 1, wherein the convergence criteria is a convergence of the mean error score.

10. A device for training a keypoint estimation network, the device comprising a processing unit configured to execute instructions to cause the device to: perform a plurality of training iterations, each training iteration comprising: obtaining a set of synthetic images generated by a generator, each synthetic image being assigned a respective set of assigned keypoints by the generator; using a prior-iteration keypoint estimation network, obtaining a set of predicted keypoints for each synthetic image; based on computation of an error score between the set of predicted keypoints and the respective set of assigned keypoints for each respective synthetic image: identifying and discarding any synthetic image having an error score that fails a preset threshold; and identifying and adding, to a synthetic dataset, any synthetic image having an error score that satisfies the preset threshold; training an updated keypoint estimation network, using a combined dataset comprising the synthetic dataset combined with a real world dataset containing real world images; and computing a mean error score for the updated keypoint estimation network, the mean error score representing performance of the updated keypoint estimation network on a validation dataset; wherein the training iterations are performed until a convergence criteria is satisfied; and store the updated keypoint estimation network from a final training iteration as a final keypoint estimation network.

11. The device of claim 10, wherein the generator is a pre-trained generator, trained using a conditional generative adversarial network (cGAN), to generate a synthetic image conditioned on a set of keypoints sampled from the real world dataset.

12. The device of claim 10, wherein the generator comprises a pre-trained sub-generator, trained using a conditional generative adversarial network (cGAN), to generate a synthetic feature vector conditioned on a set of keypoints sampled from the real world dataset, the generator further comprising a decoder to reconstruct a synthetic image from the synthetic feature vector.

13. The device of claim 10, wherein the processing unit is further configured to execute the instructions to cause the device to, for each training iteration: compare the mean error score for the updated keypoint estimation network with a mean error score computed for the prior-iteration keypoint estimation network; and in response to a determination that the mean error score for the updated keypoint estimation network is greater than the mean error score for the prior-iteration keypoint estimation network: replace the updated keypoint estimation network with the prior-iteration keypoint estimation network; and retrain the replaced updated keypoint estimation network using the combined dataset.

14. The device of claim 10, wherein the processing unit is further configured to execute the instructions to cause the device to, for each training iteration: based on computation of the error score between the set of predicted keypoints and the respective set of assigned keypoints for each respective synthetic image: identify any synthetic image having an error score that satisfies the preset threshold and is within a preset margin of the threshold; replace the set of assigned keypoints for the identified synthetic image with the set of predicted keypoints for the identified synthetic image; and add the identified synthetic image to the synthetic dataset.

15. The device of claim 10, wherein the processing unit is further configured to execute the instructions to cause the device to: fine-tune the final keypoint estimation network using the real world dataset.
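
Two of the dependent claims describe refinements that are easy to misread: relabeling borderline synthetic images (claims 5 and 14) and rolling back a network that got worse (claims 4 and 13). The sketch below is one possible reading, not the patented implementation. It assumes that a score at or below the threshold "satisfies" it, that "within a preset margin" means the score lies no more than `margin` below the threshold, and that the error score is mean Euclidean distance per keypoint. All function and parameter names are hypothetical.

```python
import math
from typing import Callable, List, Tuple

Keypoints = List[Tuple[float, float]]
Sample = Tuple[str, Keypoints]   # (image id, keypoints); hypothetical layout


def error_score(predicted: Keypoints, assigned: Keypoints) -> float:
    """Mean Euclidean distance per keypoint (an assumed metric, not from the source)."""
    return sum(math.dist(p, a) for p, a in zip(predicted, assigned)) / len(assigned)


def select_with_relabel(samples: List[Sample],
                        predict: Callable[[str], Keypoints],
                        threshold: float,
                        margin: float) -> List[Sample]:
    """Discard failing images; relabel passing images that sit near the threshold."""
    dataset: List[Sample] = []
    for image, assigned in samples:
        predicted = predict(image)
        score = error_score(predicted, assigned)
        if score > threshold:
            continue                              # fails the threshold: discard
        if threshold - score <= margin:
            dataset.append((image, predicted))    # borderline: use predicted keypoints
        else:
            dataset.append((image, assigned))     # clear pass: keep assigned keypoints
    return dataset


def accept_or_rollback(prior_model, prior_mean_error: float,
                       updated_model, updated_mean_error: float,
                       retrain: Callable, combined_dataset: List[Sample]):
    """If the updated network is worse on validation, restart from the prior one."""
    if updated_mean_error > prior_mean_error:
        # Roll back to the prior-iteration network, then retrain it on the combined dataset.
        return retrain(prior_model, combined_dataset)
    return updated_model
```

Under these assumptions, relabeling trusts the network's prediction over the generator's label only for images that barely pass, while the rollback step prevents a worse network from becoming the next iteration's filter.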

Assignees

Huawei Tech Co Ltd

Inventors

Classifications

  • Validation; Performance evaluation (CPC title)

  • Hand-related biometrics; Hand pose recognition (CPC title)

  • G06V10/82 (Primary): using neural networks (CPC title)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12430905B2 cover?
It covers methods and devices for training a keypoint estimation network. In each training iteration, a generator produces synthetic images with assigned keypoints, the prior-iteration network predicts keypoints for each image, and images whose error score between predicted and assigned keypoints is poor are discarded. The remaining synthetic images are combined with real world images to train an updated network, and iterations continue until a convergence criteria is satisfied.
Who is the assignee on this patent?
Huawei Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06V10/82. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 30, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).