Automatic labeling and segmentation using machine learning models

US11899749B2 · US · B2

Patent metadata
Publication number: US-11899749-B2
Application number: US-202117201816-A
Country: US
Kind code: B2
Filing date: Mar 15, 2021
Priority date: Mar 15, 2021
Publication date: Feb 13, 2024
Grant date: Feb 13, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In various examples, training methods as described to generate a trained neural network that is robust to various environmental features. In an embodiment, training includes modifying images of a dataset and generating boundary boxes and/or other segmentation information for the modified images which is used to train a neural network.
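The idea in the abstract can be illustrated with a minimal sketch: modify an image together with its segmentation mask, then read the bounding box of the modified image from the mask. This is an illustration under stated assumptions, not the patented implementation. The helper names `mask_to_bbox` and `augment`, the toy image, the restriction to 90-degree rotations, and the brightness gain are all hypothetical choices made for this example.

```python
import numpy as np

def mask_to_bbox(mask):
    """Return (x_min, y_min, x_max, y_max) of the nonzero pixels in a 2-D mask."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask is empty")
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

def augment(image, mask, quarter_turns, gain):
    """Rotate image and mask together in 90-degree steps, then scale brightness.

    Assumption: rotations are limited to multiples of 90 degrees so that no
    interpolation is needed; the source text does not impose this limit.
    """
    rot_image = np.rot90(image, k=quarter_turns)
    rot_mask = np.rot90(mask, k=quarter_turns)
    bright = np.clip(rot_image.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    # The box is derived from the transformed mask, not from the original box.
    return bright, rot_mask, mask_to_bbox(rot_mask)

# Toy 6x8 grayscale image with a 2x3 object (hypothetical data).
image = np.zeros((6, 8), dtype=np.uint8)
mask = np.zeros((6, 8), dtype=bool)
mask[1:3, 2:5] = True
image[mask] = 100

# One (image, mask, box) training sample per rotation and brightness setting.
samples = [augment(image, mask, k, gain) for k in range(4) for gain in (0.5, 1.5)]
```

Each entry in `samples` pairs a modified image with a box computed from its own mask, which is the kind of data the abstract describes using to train a network.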

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: obtaining an image including an object depicted in the image, wherein the object is associated with a first set of boundaries; generating, based at least in part on the first set of boundaries, one or more labels corresponding to one or more areas of the image to distinguish areas depicting respective parts of the object from areas depicting respective parts of the image different from the object; performing a set of transformations on the image to generate a set of transformed images; and generating, based at least in part on the one or more generated labels, a second set of boundaries of the object for the set of transformed images, wherein the second set of boundaries enclose a smaller portion of the image than the first set of boundaries.
2. The method of claim 1, wherein performing the set of transformations further comprises rotating the image by a set of values.
3. The method of claim 1, wherein the second set of boundaries comprises bounding boxes of the object in the set of transformed images.
4. The method of claim 1, further comprising: training a neural network based at least in part on the set of transformed images and the second set of boundaries of the object.
5. The method of claim 4, wherein the neural network is to perform object detection.
6. The method of claim 1, wherein: the first set of boundaries comprises a first bounding box for the object; the set of transformations comprises a rotation of the image to result in a rotated image; and the second set of boundaries comprises a second bounding box of the object that has sides parallel to the first bounding box for the object and that has area smaller than a box of minimal perimeter that would completely encompass the first bounding box after the rotation has been applied to the first bounding box.
7. The method of claim 1, wherein performing the set of transformations further comprises modifying a scale value of the image.
8. The method of claim 1, wherein performing the set of transformations further comprises modifying a luminance value of the image.
9. The method of claim 1, wherein performing the set of transformations further comprises modifying a contrast value for the image.
10. The method of claim 1, wherein the one or more generated labels are from one or more segmentations of the image generated by a neural network.
11. The method of claim 10, wherein the neural network comprises a Mask Region-based Convolutional Neural Network.
12. The method of claim 1, further comprising associating the second set of boundaries with a label of the one or more generated labels to train a neural network.
13. The method of claim 1, wherein the image further comprises a frame of a set of frames comprising a video.
14. The method of claim 1, wherein a boundary of the first set of boundaries further comprises a rectangular region encompassing the object.
15. A system comprising: one or more processors; and memory storing instructions that, as a result of being executed by the one or more processors, cause the system to: obtain an image depicting an object that is associated with first boundaries; use a first model to obtain a segmentation of the image to distinguish pixels of the object from other pixels based at least in part on the first boundaries; use the segmentation to determine second boundaries of the object in a set of modified versions of the image, wherein the second boundaries enclose a region that encompasses the object and is smaller than at least one region enclosed by the first boundaries; and associate the object with the second boundaries of the object in a dataset to update a second model.
16. The system of claim 15, wherein the first model further comprises a Convolutional Neural Network trained, based at least in part on, a curated dataset.
17. The system of claim 15, wherein the memory further includes instructions that, as a result of being executed by the one or more processors, cause the system to use the dataset to train the second model to perform object detection.
18. The system of claim 17, wherein the second model includes a Convolutional Neural Network.
19. The system of claim 15, wherein the second boundaries comprise a set of bounding boxes of the object.
20. The system of claim 15, wherein the set of modified versions of the image comprises an image generated based, at least in part, on rotating the image.
21. A system comprising: one or more processors to perform one or more operations, the one or more operations comprising at least: instantiating a pre-labeling tool based at least in part on labels associated with objects within a first set of images of a first dataset and a first set of mask annotations; modifying a second set of images of a second dataset to generate a set of modified images; and updating a neural network based at least part on the set of modified images and a second set of mask annotations, generated by the pre-labeling tool, corresponding to objects within the set of modified images, wherein the second set of mask annotations enclose a smaller area that encompasses the objects compared to the first set of mask annotations.
22. The system of claim 21, wherein the one or more operations include instantiating a convolutional neural network to generate the first set of mask annotations based at least in part on the first set of images.
23. The system of claim 21, wherein a set of bounding boxes is included in the first dataset.
24. The system of claim 21, wherein modifying the second set of images to generate the set of modified images further comprises rotating one or more images of the second set of images different amounts about a rotational axis.
25. The system claim 21, wherein modifying the second set of images to generate the set of modified images further comprises modifying one or more color values associated with images of the second set of images.
26. The system of claim 21, wherein modifying the second set of images to generate the set of modified images further comprises modifying luminance values associated with images of the second set of images.
27. The system of claim 21, wherein modifying the second set of images to generate the set of modified images further comprises modifying one or more scale values associated with images of the second set of images.
28. A method comprising: obtaining a segmentation of an image that is associated with a first boundary; performing a transformation of the image to obtain two or more transformed images; and generating a second boundary of an object depicted in the two or more transformed images based at least in part on the segmentation, wherein the second boundary encloses the object within a smaller region compared to a region enclosed by the first boundary.
29. The method of claim 28, wherein obtaining the segmentation further comprises using a neural network to generate the segmentation of the image.
30. The method of claim 28, wherein performing the transformation of the image further comprises rotating the image.
31. The method of claim 28, wherein the second boundary of the object depicted in the two or more transformed images further comprises a bounding box.
32. The method of claim 28, wherein the method f
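The geometric comparison in claim 6 can be made concrete with a small numeric sketch. It rotates a first bounding box and the object's mask pixels by the same angle, then compares the axis-aligned box around the rotated first box with the axis-aligned box around the rotated object pixels. Reading the claim's comparison box as the axis-aligned box around the rotated first box is an interpretation made for illustration. The disk-shaped object, the 30-degree angle, and every function name below are hypothetical.

```python
import numpy as np

def rotate_points(points, angle_deg, center):
    """Rotate an (N, 2) array of (x, y) points about `center` by `angle_deg`."""
    theta = np.deg2rad(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    return (np.asarray(points, dtype=float) - center) @ rot.T + center

def enclosing_box(points):
    """Axis-aligned box (x0, y0, x1, y1) around a set of (x, y) points."""
    pts = np.asarray(points, dtype=float)
    x0, y0 = pts.min(axis=0)
    x1, y1 = pts.max(axis=0)
    return float(x0), float(y0), float(x1), float(y1)

def box_area(box):
    x0, y0, x1, y1 = box
    return (x1 - x0) * (y1 - y0)

# Hypothetical object: a disk of radius 10 centred at (50, 50) in a 100x100 mask.
yy, xx = np.mgrid[0:100, 0:100]
mask = (xx - 50) ** 2 + (yy - 50) ** 2 <= 10 ** 2
object_pts = np.column_stack(np.nonzero(mask)[::-1])  # columns are (x, y)

# First bounding box: the axis-aligned box around the unrotated object.
first_box = enclosing_box(object_pts)
x0, y0, x1, y1 = first_box
corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]

center = np.array([50.0, 50.0])
angle = 30.0

# Box obtained by rotating only the first box's corners (loose).
loose_box = enclosing_box(rotate_points(corners, angle, center))
# Box obtained from the rotated object pixels themselves (tight).
tight_box = enclosing_box(rotate_points(object_pts, angle, center))

print("loose area:", round(box_area(loose_box), 2))
print("tight area:", round(box_area(tight_box), 2))
```

For a roughly round object the corner-based box grows with the rotation angle while the pixel-based box does not, which is why a box derived from the segmentation can enclose a smaller region.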

Assignees

Nvidia Corp

Inventors

Classifications

  • G06F18/214 (Primary)

    Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • Physics · mapped topic

  • G06T7/12 (Primary)

    Edge-based segmentation · CPC title

  • by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition · CPC title

  • by image rotation, e.g. by 90 degrees · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11899749B2 cover?
In various examples, training methods as described to generate a trained neural network that is robust to various environmental features. In an embodiment, training includes modifying images of a dataset and generating boundary boxes and/or other segmentation information for the modified images which is used to train a neural network.
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06F18/214. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 13, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).