Object detection using image classification models
US-10223611-B1 · Mar 5, 2019 · US
US10733480B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10733480-B2 |
| Application number | US-201816039311-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 18, 2018 |
| Priority date | Jul 18, 2018 |
| Publication date | Aug 4, 2020 |
| Grant date | Aug 4, 2020 |
Official abstract text for this publication.
There is described a computing device and method in a digital medium environment for custom auto tagging of multiple objects. The computing device includes an object detection network and multiple image classification networks. An image is received at the object detection network and includes multiple visual objects. First feature maps are applied to the image at the object detection network and generate object regions associated with the visual objects. The object regions are assigned to the multiple image classification networks, and each image classification network is assigned to a particular object region. The second feature maps are applied to each object region at each image classification network, and each image classification network outputs one or more classes associated with a visual object corresponding to each object region.
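The data flow in the abstract (detect object regions, assign each region to a classification network, collect the classes each network outputs) can be illustrated with a minimal sketch. Everything named here is hypothetical and does not come from the patent: `Region`, `crop`, `toy_detector`, `toy_classifier`, and `auto_tag` are illustrative stand-ins, and the two "networks" are trivial functions rather than trained neural networks that apply feature maps.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]  # toy grayscale image: rows of pixel values


@dataclass(frozen=True)
class Region:
    """Axis-aligned object region: top row, left column, height, width."""
    top: int
    left: int
    height: int
    width: int


def crop(image: Image, region: Region) -> Image:
    """Return the pixels of `image` that fall inside `region`."""
    rows = image[region.top:region.top + region.height]
    return [row[region.left:region.left + region.width] for row in rows]


def toy_detector(image: Image) -> List[Region]:
    """Stand-in for the object detection network (an assumption).

    Treats the left and right halves of the image as two object regions.
    """
    height, width = len(image), len(image[0])
    half = width // 2
    return [Region(0, 0, height, half), Region(0, half, height, width - half)]


def toy_classifier(patch: Image) -> List[str]:
    """Stand-in for one image classification network (an assumption).

    Labels a region by its mean brightness instead of learned features.
    """
    pixels = [p for row in patch for p in row]
    mean = sum(pixels) / len(pixels)
    return ["bright-object"] if mean >= 128 else ["dark-object"]


def auto_tag(
    image: Image,
    detector: Callable[[Image], List[Region]],
    classifiers: Sequence[Callable[[Image], List[str]]],
) -> List[Tuple[Region, List[str]]]:
    """Detect regions, give each region its own classifier, collect classes."""
    regions = detector(image)
    if len(classifiers) < len(regions):
        raise ValueError("need one classifier per object region")
    # zip pairs the i-th region with the i-th classifier, so each classifier
    # handles exactly one particular object region.
    return [(r, clf(crop(image, r))) for r, clf in zip(regions, classifiers)]


image = [[10, 20, 200, 220],
         [15, 25, 210, 230]]
tags = auto_tag(image, toy_detector, [toy_classifier, toy_classifier])
for region, classes in tags:
    print(region, classes)
```

Pairing regions and classifiers one to one is only one reading of "each image classification network is assigned to a particular object region"; the abstract does not say how the assignment is scheduled.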
Opening claim text (preview).
What is claimed is:

1. In a digital medium environment for custom auto tagging of multiple objects, a method implemented by a computing device including an object detection network and image classification networks, the method comprising: training the object detection network and the image classification networks to localize image features and classify the localized image features based on a multi-class dataset of images associated with a plurality of classes, and incrementally training the object detection network and the image classification networks based on a custom dataset of images associated with at least one custom class; receiving, at the object detection network, an image that includes multiple visual objects; applying, at the object detection network, a plurality of first feature maps to the image and generate object regions each associated with a respective one of the multiple visual objects; assigning each of the object regions to one of the image classification networks and applying, at each of the image classification networks, a plurality of second feature maps to each object region and outputting at least one class associated with a visual object of the multiple visual objects corresponding to each object region.

2. The method as described in claim 1, wherein the image is a still image, a video image, or a multimedia image.

3. The method as described in claim 1, wherein the plurality of second feature maps applied at the image classification networks are the same or similar to the plurality of first feature maps applied at the object detection network.

4. The method as described in claim 3, wherein at least a portion of the plurality of first feature maps is communicated from the object detection network to at least one of the image classification networks.

5. The method as described in claim 1, wherein the plurality of second feature maps applied at the image classification networks are different from the plurality of first feature maps applied at the object detection network.

6. The method as described in claim 5, wherein applying the plurality of second feature maps to each object region includes applying the plurality of first feature maps to each object region in conjunction with the plurality of second feature maps.

7. In a digital medium environment for custom auto tagging of multiple objects using a computing device, the computing device comprising: an object detection network and image classification networks trained to localize image features and classify the localized image features based on a multi-class dataset of images associated with a plurality of classes, and the object detection network and the image classification networks incrementally trained based on a custom dataset of images associated with at least one custom class; the object detection network configured to receive an image that includes multiple visual objects, apply a plurality of first feature maps to the image, and generate a plurality of object regions each associated with a respective one of the multiple visual objects; and the image classification networks configured to receive a particular object region of the plurality of object regions at each of the image classification networks, apply a plurality of second feature maps to each object region, and output at least one class associated with a visual object of the multiple visual objects corresponding to each object region.

8. The computing device as described in claim 7, wherein the image is a still image, a video image, or a multimedia image.

9. The computing device as described in claim 7, wherein the plurality of second feature maps applied at the image classification network networks are the same or similar to the plurality of first feature maps applied at the object detection network.

10. The computing device as described in claim 9, wherein each image classification network receives a portion of the plurality of first feature maps from the object detection network.

11. The computing device as described in claim 7, wherein the plurality of second feature maps applied at the image classification networks are different from the plurality of first feature maps applied at the object detection network.

12. The computing device as described in claim 11, wherein the plurality of first feature maps are applied to each object region in conjunction with the plurality of second feature maps.

13. In a digital medium environment for custom auto tagging of multiple objects, a method implemented by a computing device including an object detection network and a plurality of image classification networks, the method comprising: training the object detection network and the plurality of image classification networks to localize image features and classify the localized image features based on a multi-class dataset of images associated with a plurality of classes; training the object detection network and the plurality of image classification networks to localize image features and classify the localized image features based on a custom dataset of images associated with at least one custom class, including incremental training of the object detection network and the plurality of image classification networks, as trained based on the multi-class dataset of images, for a new object category or a sub-category of an existing object category; receiving, at the object detection network, an image that includes a plurality of visual objects; identifying a plurality of object regions associated with the plurality of visual objects, at the object detection network as trained based on the multi-class dataset and the custom dataset; and classifying at least one visual object of the plurality of visual objects with the at least one custom class, at the plurality of image classification networks as trained based on the multi-class dataset and the custom dataset.

14. The method as described in claim 13, wherein the image is a still image, a video image, or a multimedia image.

15. The method as described in claim 13, further comprising assigning the plurality of object regions to the plurality of image classification networks, each object region of the plurality of object regions being assigned to a particular image classification network of the plurality of image classification networks.

16. The method as described in claim 13, wherein training the object detection network and the plurality of image classification networks based on the custom dataset of images is subsequent to training the object detection network and the plurality of image classification networks based on the multi-class dataset of images.

17. The method as described in claim 13, wherein: a first entity initiates the training the object detection network and the plurality of image classification networks based on the multi-class dataset of images; a second entity initiates the training for the object detection network and the plurality of image classification networks based on the custom dataset of images; and the first and second entities are different.

18. The method as described in claim 13, wherein the custom dataset of images associated with at least one custom class includes at least one fine-grained class of a general object category.

19. The method as described in claim 13, wherein classifying the at least one visual object of the plurality of visual objects with the at least one custom class includes classifying the at least one visual object with the at least one fine-grained class of the general object categ
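
Claims 13, 16, and 18 describe training on a multi-class dataset first and then incrementally training on a custom dataset that adds a fine-grained class. The sketch below illustrates only that two-stage idea, and it deliberately swaps in a nearest-centroid classifier in place of the neural networks the claims recite. The class name `CentroidClassifier`, the feature vectors, and the labels `car`, `dog`, and `corgi` are all made-up illustrations, not taken from the patent.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


class CentroidClassifier:
    """Toy nearest-centroid model standing in for a classification network."""

    def __init__(self) -> None:
        self._sums: Dict[str, List[float]] = {}
        self._counts: Dict[str, int] = {}

    def train(self, samples: Sequence[Tuple[Vector, str]]) -> None:
        """Fold labeled samples into running per-class feature sums.

        Calling this again with new data updates the model in place, so
        classes learned earlier are kept (the incremental step).
        """
        for features, label in samples:
            if label not in self._sums:
                self._sums[label] = [0.0] * len(features)
                self._counts[label] = 0
            self._sums[label] = [s + f for s, f in zip(self._sums[label], features)]
            self._counts[label] += 1

    def classes(self) -> List[str]:
        """Return every class the model currently knows, sorted by name."""
        return sorted(self._sums)

    def predict(self, features: Vector) -> str:
        """Return the class whose centroid is closest to `features`."""
        def squared_distance(label: str) -> float:
            n = self._counts[label]
            return sum((s / n - f) ** 2 for s, f in zip(self._sums[label], features))
        return min(self._sums, key=squared_distance)


model = CentroidClassifier()

# Stage 1: base training on a (made-up) multi-class dataset.
model.train([
    ((0.0, 0.0), "car"), ((0.2, 0.1), "car"),
    ((5.0, 5.0), "dog"), ((5.2, 4.9), "dog"),
])
before = model.predict((4.0, 6.0))

# Stage 2: incremental training on a (made-up) custom dataset that adds a
# fine-grained class of the general "dog" category.
model.train([((4.0, 6.0), "corgi"), ((4.1, 6.2), "corgi")])
after = model.predict((4.0, 6.0))

print(before, after, model.classes())
```

In this toy setup the same input is labeled with the general class before the custom stage and with the fine-grained class afterwards, while inputs near the `car` centroid are unaffected. How the patented networks achieve incremental training is not stated in the claims shown here.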
using neural networks · CPC title
Determination of region of interest [ROI] or a volume of interest [VOI] · CPC title
using classification, e.g. of video objects · CPC title
based on distances to training or reference patterns · CPC title
Selection of the most significant subset of features · CPC title