Domain alignment for object detection domain adaptation tasks

US11544503B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11544503-B2
Application number  US-202016885168-A
Country             US
Kind code           B2
Filing date         May 27, 2020
Priority date       Apr 6, 2020
Publication date    Jan 3, 2023
Grant date          Jan 3, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A domain alignment technique for cross-domain object detection tasks is introduced. During a preliminary pretraining phase, an object detection model is pretrained to detect objects in images associated with a source domain using a source dataset of images associated with the source domain. After completing the pretraining phase, a domain adaptation phase is performed using the source dataset and a target dataset to adapt the pretrained object detection model to detect objects in images associated with the target domain. The domain adaptation phase may involve the use of various domain alignment modules that, for example, perform multi-scale pixel/path alignment based on input feature maps or perform instance-level alignment based on input region proposals.
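
As a rough illustration of the multi-scale pixel-level alignment mentioned in the abstract, the sketch below scores per-pixel domain predictions at several feature-map resolutions. It is not taken from the patent: the function names, the use of binary cross-entropy, the summation across scales, and the level names `p2` and `p3` are all assumptions made for the example. Each probability map stands in for the output of a hypothetical per-resolution domain classifier, where a value near 1 means "source domain".

```python
import math


def pixel_bce(prob_map, is_source, eps=1e-7):
    """Mean binary cross-entropy over every pixel of one probability map.

    prob_map: 2D list of predicted probabilities that a pixel is "source".
    is_source: True if the image behind this map came from the source dataset.
    """
    target = 1.0 if is_source else 0.0
    losses = []
    for row in prob_map:
        for p in row:
            # Clamp to avoid log(0) on saturated predictions.
            p = min(max(p, eps), 1.0 - eps)
            losses.append(-(target * math.log(p) + (1.0 - target) * math.log(1.0 - p)))
    return sum(losses) / len(losses)


def multiscale_pixel_alignment_loss(prob_maps, is_source):
    """Sum the per-scale losses (assumed aggregation); one map per pyramid level."""
    return sum(pixel_bce(m, is_source) for m in prob_maps.values())


# Hypothetical example: two pyramid levels at different resolutions.
maps = {"p2": [[0.5, 0.5], [0.5, 0.5]], "p3": [[0.5]]}
loss = multiscale_pixel_alignment_loss(maps, is_source=True)
```

A classifier that outputs 0.5 everywhere cannot tell the domains apart, so each scale contributes ln 2 to the total. In an adversarial setup, the detector would be updated to push the classifiers toward that state while the classifiers are updated to move away from it.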

First claim

Opening claim text (preview).

What is claimed is:

1. A method for adapting an object detection model for use with data associated with a target domain, the object detection model pretrained based on a source dataset, the source dataset including labeled image data associated with a source domain that is different than the target domain, the method comprising: inputting, into the object detection model, batches of the source dataset and batches of a target dataset; wherein the target dataset includes image data associated with the target domain; generating, using the object detection model, based on the batches of the source dataset and the batches of the target dataset, a plurality of region proposals; wherein a given region proposal of the plurality of region proposals is indicative of a prediction, by the object detection model, of a bounded region in a given image where a detected object resides; generating, using a segmentation network, based on an image in one or more of the source dataset or target dataset, a dense segmentation map, the dense segmentation map indicative of a drawing type that each pixel in the image belongs to; training, using an instance-level domain alignment objective and a rendering layer segmentation objective as objective functions, the object detection model to generate domain-agnostic region proposals; wherein the instance-level domain alignment objective is based on the plurality of region proposals and the rendering layer segmentation objective is based on the dense segmentation map; and generating, using the object detection model, one or more cross-domain object detection inferences.

2. The method of claim 1, wherein the batches of the source dataset and the batches of the target dataset are alternately input into the object detection model.

3. The method of claim 1, further comprising: determining, using a binary domain classifier, a plurality of binary domain predictions, wherein a given binary domain prediction of the plurality of binary domain predictions is indicative of a prediction, by the binary domain classifier, of whether a given region proposal corresponds to the source domain or the target domain; wherein the instance-level domain alignment objective is further based on the plurality of binary domain predictions.

4. The method of claim 3, wherein training the object detection model includes adjusting a parameter of the object detection model to reduce an accuracy of the binary domain classifier.

5. The method of claim 3, further comprising: training the binary domain classifier using the instance-level domain alignment objective.

6. The method of claim 1, further comprising: generating, using the object detection model, based on the batches of the source dataset and the batches of the target dataset, a plurality of feature maps; and training, using a pixel-level domain alignment objective, the object detection model to generate domain-agnostic feature maps; wherein the pixel-level domain alignment objective is based on the plurality of feature maps.

7. The method of claim 6, wherein the pixel-level domain alignment objective is associated with a first objective function; wherein the instance-level domain alignment objective is associated with a second objective function; and wherein training the object detection model includes: adjusting one or more parameters of the object detection model to minimize an overall loss, the overall loss based on a sum of the first objective function and the second objective function.

8. The method of claim 6, further comprising: generating, using a second binary domain classifier, based on the plurality of feature maps, a second plurality of binary domain predictions, each of the second plurality of binary domain predictions indicative of a prediction, by the second binary domain classifier, of whether a given pixel in a given feature map is associated with the source domain or the target domain; wherein the pixel-level domain alignment objective is based on the second plurality of binary domain predictions.

9. The method of claim 8, wherein the plurality of feature maps include: a first feature map at a first resolution; and a second feature map at a second resolution that is different than the first resolution, wherein the first feature map and second feature map are part of a feature pyramid output by a feature pyramid network (FPN) associated with the object detection model; wherein the second binary domain classifier includes: a first resolution-specific domain classifier associated with the first resolution; and a second resolution-specific domain classifier associated with the second resolution; and wherein generating the second plurality of binary domain predictions includes: generating, using the first resolution-specific domain classifier, based on the first feature map, a first binary domain prediction of the second plurality of binary domain predictions, the first binary domain prediction indicative of a prediction, by the first resolution-specific domain classifier, of whether a given pixel in the first feature map is associated with the source domain or the target domain; and generating, using the second resolution-specific domain classifier, based on the second feature map, a second binary domain prediction of the second plurality of binary domain predictions, the second binary domain prediction indicative of a prediction, by the second resolution-specific domain classifier, of whether a given pixel in the second feature map is associated with the source domain or the target domain.

10. The method of claim 1, wherein the object detection model includes an FPN and a region proposal network (RPN); wherein inputting, into the object detection model, the batches of the source dataset and the batches of the target dataset includes: inputting, into the FPN, the batches of the source dataset and the batches of the target dataset; and generating, using the FPN, based on the batches of the source dataset and the batches of the target dataset, a plurality of feature maps; and wherein generating, using the object detection model, the plurality of region proposals includes: inputting, into the RPN, the plurality of feature maps; wherein the plurality of region proposals are generated, using the RPN, based on the plurality of feature maps.

11. The method of claim 1, wherein training the object detection model includes: for a particular batch of the source dataset: training the object detection model using a first overall objective function; wherein the first overall objective function is based on an object detection objective and the instance-level domain alignment objective; and for a particular batch of the target dataset: training the object detection model using a second overall objective function; wherein the second overall objective function is based on the instance-level domain alignment objective but not the object detection objective.

12. The method of claim 3, further comprising: determining, for each of the plurality of binary domain predictions, a probability value indicating a likelihood that the given region proposal is associated with the source domain or the target domain as predicted by the binary domain classifier; wherein the instance-level domain alignment objective includes a focal loss term configured to assign increased weight to region proposals having lower corresponding probability values compared to region proposals having greater corresponding probability values.

13. The method of claim 1, wherein enabling cross-domain object detection inferences includes: deploying the object detection model
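
Claims 11 and 12 describe two pieces of training logic that are easy to misread in claim language: a focal loss term over region proposals, and an overall objective that depends on which dataset a batch came from. The following is a minimal sketch of both, not the patented implementation. It assumes the widely used focal loss form `-(1 - p_t)**gamma * log(p_t)`, where `p_t` is the probability the domain classifier assigned to a proposal's true domain; the names `gamma` and `weight`, their default values, and the simple additive combination are hypothetical choices for illustration.

```python
import math


def focal_domain_loss(p_source, is_source, gamma=2.0, eps=1e-7):
    """Focal loss averaged over the region proposals of one batch.

    p_source: predicted probabilities that each proposal is from the source domain.
    is_source: True if the batch came from the source dataset.
    gamma: focusing parameter (assumed value, not stated in the claims).
    """
    if not p_source:
        return 0.0
    total = 0.0
    for p in p_source:
        # p_t is the probability assigned to the proposal's true domain.
        p_t = p if is_source else 1.0 - p
        p_t = min(max(p_t, eps), 1.0 - eps)
        # (1 - p_t)**gamma down-weights proposals the classifier already gets right,
        # so low-probability (hard) proposals dominate the loss.
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(p_source)


def overall_objective(is_source, detection_loss, alignment_loss, weight=1.0):
    """Combine objectives per batch: detection loss only where labels exist."""
    if is_source:
        # Source batches are labeled, so both terms apply.
        return detection_loss + weight * alignment_loss
    # Target batches contribute only the alignment term.
    return weight * alignment_loss


# Hypothetical usage: one source batch, then one target batch.
src_total = overall_objective(True, detection_loss=1.0,
                              alignment_loss=focal_domain_loss([0.9, 0.6], True))
tgt_total = overall_objective(False, detection_loss=0.0,
                              alignment_loss=focal_domain_loss([0.3, 0.2], False))
```

Setting `gamma` to 0 reduces the focal term to ordinary cross-entropy, which gives a convenient baseline for judging how much the reweighting matters.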

Assignees

Adobe Inc

Inventors

Classifications

  • using neural networks

  • using classification, e.g. of video objects

  • Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting

  • Validation; Performance evaluation

  • Multiple classes

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11544503B2 cover?
A domain alignment technique for cross-domain object detection tasks is introduced. During a preliminary pretraining phase, an object detection model is pretrained to detect objects in images associated with a source domain using a source dataset of images associated with the source domain. After completing the pretraining phase, a domain adaptation phase is performed using the source dataset a…
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06V30/40. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 3, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).