US11544503B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11544503-B2 |
| Application number | US-202016885168-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 27, 2020 |
| Priority date | Apr 6, 2020 |
| Publication date | Jan 3, 2023 |
| Grant date | Jan 3, 2023 |
Official abstract text for this publication.
A domain alignment technique for cross-domain object detection tasks is introduced. During a preliminary pretraining phase, an object detection model is pretrained to detect objects in images associated with a source domain using a source dataset of images associated with the source domain. After completing the pretraining phase, a domain adaptation phase is performed using the source dataset and a target dataset to adapt the pretrained object detection model to detect objects in images associated with the target domain. The domain adaptation phase may involve the use of various domain alignment modules that, for example, perform multi-scale pixel/path alignment based on input feature maps or perform instance-level alignment based on input region proposals.
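The two phases described in the abstract can be sketched as a batch schedule. This is a minimal illustration, not the patented implementation: the function name `training_schedule`, the `pretrain_epochs` parameter, and the one-to-one interleaving of source and target batches during adaptation are assumptions made for the example. The abstract itself only states that pretraining uses the source dataset and that adaptation uses both datasets.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel so that a batch value of None is still allowed


def training_schedule(source_batches, target_batches, pretrain_epochs=1):
    """Yield (phase, domain, batch) tuples for a two-phase training run.

    Hypothetical helper: names and the interleaving policy are assumptions.
    """
    # Phase 1 (pretraining): only labeled source-domain batches are used.
    for _ in range(pretrain_epochs):
        for batch in source_batches:
            yield ("pretrain", "source", batch)

    # Phase 2 (domain adaptation): both datasets are used. Here the batches
    # are interleaved; when one dataset runs out, the other continues alone.
    pairs = zip_longest(source_batches, target_batches, fillvalue=_MISSING)
    for src, tgt in pairs:
        if src is not _MISSING:
            yield ("adapt", "source", src)
        if tgt is not _MISSING:
            yield ("adapt", "target", tgt)


if __name__ == "__main__":
    for step in training_schedule(["s0", "s1"], ["t0"]):
        print(step)
```

A real training loop would replace the string placeholders with image batches and run the detector and the domain alignment modules on each yielded step.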
Opening claim text (preview).
What is claimed is:
1. A method for adapting an object detection model for use with data associated with a target domain, the object detection model pretrained based on a source dataset, the source dataset including labeled image data associated with a source domain that is different than the target domain, the method comprising: inputting, into the object detection model, batches of the source dataset and batches of a target dataset; wherein the target dataset includes image data associated with the target domain; generating, using the object detection model, based on the batches of the source dataset and the batches of the target dataset, a plurality of region proposals; wherein a given region proposal of the plurality of region proposals is indicative of a prediction, by the object detection model, of a bounded region in a given image where a detected object resides; generating, using a segmentation network, based on an image in one or more of the source dataset or target dataset, a dense segmentation map, the dense segmentation map indicative of a drawing type that each pixel in the image belongs to; training, using an instance-level domain alignment objective and a rendering layer segmentation objective as objective functions, the object detection model to generate domain-agnostic region proposals; wherein the instance-level domain alignment objective is based on the plurality of region proposals and the rendering layer segmentation objective is based on the dense segmentation map; and generating, using the object detection model, one or more cross-domain object detection inferences.
2. The method of claim 1, wherein the batches of the source dataset and the batches of the target dataset are alternately input into the object detection model.
3. The method of claim 1, further comprising: determining, using a binary domain classifier, a plurality of binary domain predictions, wherein a given binary domain prediction of the plurality of binary domain predictions is indicative of a prediction, by the binary domain classifier, or whether a given region proposal corresponds to the source domain or the target domain; wherein the instance-level domain alignment objective is further based on the plurality of binary domain predictions.
4. The method of claim 3, wherein training the object detection model includes adjusting a parameter of the object detection model to reduce an accuracy of the binary domain classifier.
5. The method of claim 3, further comprising: training the binary domain classifier using the instance-level domain alignment objective.
6. The method of claim 1, further comprising: generating, using the object detection model, based on the batches of the source dataset and the batches of the target dataset, a plurality of feature maps; and training, using a pixel-level domain alignment objective, the object detection model to generate domain-agnostic feature maps; wherein the pixel-level domain alignment objective is based on the plurality of feature maps.
7. The method of claim 6, wherein the pixel-level domain alignment objective is associated with a first objective function; wherein the instance-level domain alignment objective is associated with a second objective function; and wherein training the object detection model includes: adjusting one or more parameters of the object detection model to minimize an overall loss, the overall loss based on a sum of the first objective function and the second objective function.
8. The method of claim 6, further comprising: generating, using a second binary domain classifier, based on the plurality of feature maps, a second plurality of binary domain predictions, each of the second plurality of binary domain predictions indicative of a prediction, by the second binary domain classifier, of whether a given pixel in a given feature map is associated with the source domain or the target domain; wherein the pixel-level domain alignment objective is based on the second plurality of binary domain predictions.
9. The method of claim 8, wherein the plurality of feature maps include: a first feature map at a first resolution; and a second feature map at a second resolution that is different than the first resolution, wherein the first feature map and second feature map are part of a feature pyramid output by a feature pyramid network (FPN) associated with the object detection model; wherein the second binary domain classifier includes: a first resolution-specific domain classifier associated with the first resolution; and a second resolution-specific domain classifier associated with the second resolution; and wherein generating the second plurality of binary domain predictions includes: generating, using the first resolution-specific domain classifier, based on the first feature map, a first binary domain prediction of the second plurality of binary domain predictions, the first binary domain prediction indicative of a prediction, by the first resolution-specific domain classifier, of whether a given pixel in the first feature map is associated with the source domain or the target domain; and generating, using the second resolution-specific domain classifier, based on the second feature map, a second binary domain prediction of the second plurality of binary domain predictions, the second binary domain prediction indicative of a prediction, by the second resolution-specific domain classifier, of whether a given pixel in the second feature map is associated with the source domain or the target domain.
10. The method of claim 1, wherein the object detection model includes an FPN and a region proposal network (RPN); wherein inputting, into the object detection model, the batches of the source dataset and the batches of the target dataset includes: inputting, into the FPN, the batches of the source dataset and the batches of the target dataset; and generating, using the FPN, based on the batches of the source dataset and the batches of the target dataset, a plurality of feature maps; and wherein generating, using the object detection model, the plurality of region proposals includes: inputting, into the RPN, the plurality of feature maps; wherein the plurality of region proposals are generated, using the RPN, based on the plurality of feature maps.
11. The method of claim 1, wherein training the object detection model includes: for a particular batch of the source dataset: training the object detection model using a first overall objective function; wherein the first overall objective function is based on an object detection objective and the instance-level domain alignment objective; and for a particular batch of the target dataset: training the object detection model using a second overall objective function; wherein the second overall objective function is based on the instance-level domain alignment objective but not the object detection objective.
12. The method of claim 3, further comprising: determining, for each of the plurality of binary domain predictions, a probability value indicating a likelihood that the given region proposal is associated with the source domain or the target domain as predicted by the binary domain classifier; wherein the instance-level domain alignment objective includes a focal loss term configured to assign increased weight to region proposals having lower corresponding probability values compared to region proposals having greater corresponding probability values.
13. The method of claim 1, wherein enabling cross-domain object detection inferences includes: deploying the object detection model
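Claims 11 and 12 describe how the training objective depends on which dataset a batch comes from and how a focal loss term weights region proposals. The sketch below illustrates one possible reading, under stated assumptions: the claims give no formula, so the standard focal loss form, -(1 - p_t)^gamma * log(p_t), is assumed, and the names `focal_domain_loss` and `overall_objective` and the parameters `gamma`, `eps`, and `weight` are hypothetical. It only computes loss values; the adversarial parameter update of claim 4 is not shown.

```python
import math


def focal_domain_loss(probs_source, is_source, gamma=2.0, eps=1e-7):
    """Mean focal loss over the region proposals of one batch.

    probs_source: the domain classifier's predicted P(source) per proposal.
    is_source: True if the batch came from the source dataset.
    gamma, eps: assumed hyperparameters, not taken from the patent.
    """
    if not probs_source:
        return 0.0  # no proposals, nothing to align
    total = 0.0
    for p in probs_source:
        # Probability the classifier assigned to the proposal's true domain.
        p_t = p if is_source else 1.0 - p
        p_t = min(max(p_t, eps), 1.0 - eps)  # keep log() finite
        # Proposals with a low p_t receive a larger weight (1 - p_t)^gamma.
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs_source)


def overall_objective(domain, detection_loss, alignment_loss, weight=1.0):
    """Combine the loss terms depending on the batch's domain (claim 11)."""
    if domain == "source":
        # Labeled source batch: detection objective plus alignment objective.
        return detection_loss + weight * alignment_loss
    # Target batch: alignment objective only, no detection objective.
    return weight * alignment_loss


if __name__ == "__main__":
    align = focal_domain_loss([0.9, 0.6, 0.2], is_source=True)
    print(overall_objective("source", detection_loss=1.5, alignment_loss=align))
    print(overall_objective("target", detection_loss=0.0, alignment_loss=align))
```

With `gamma=0` the focal term reduces to ordinary binary cross-entropy, which makes it easy to compare the weighted and unweighted behavior on the same inputs.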
using neural networks · CPC title
using classification, e.g. of video objects · CPC title
Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Validation; Performance evaluation · CPC title
Multiple classes · CPC title