System and method of incremental learning for object detection

US11080558B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11080558-B2
Application number: US-201916360563-A
Country: US
Kind code: B2
Filing date: Mar 21, 2019
Priority date: Mar 21, 2019
Publication date: Aug 3, 2021
Grant date: Aug 3, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems perform incremental learning object detection in images and/or videos without catastrophic forgetting of previously-learned object classes. A two-stage neural network object detector is trained to locate and identify objects pertaining to an additional object class by iteratively updating the two-stage neural network object detector until an overall detection accuracy criterion is met. The updating is performed so as to balance minimizing a loss of an initial ability to locate and identify objects pertaining to the previously-learned object classes and maximizing an ability to additionally locate and identify the objects pertaining to the additional object class. Assessing whether the overall detection accuracy criterion is met compares outputs of an initial version of the two-stage neural network object detector with a current region proposal output by a current version of the two-stage neural network object detector to determine a region proposal distillation loss and a previously-learned-object identification distillation loss.
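The two distillation losses named in the abstract can be illustrated with a minimal sketch. The abstract does not state how the outputs are compared, so the mean squared difference used here is an assumption, as are all function names, the flat-list representation of detector outputs, and the convention that scores for previously-learned classes come first in the score vector.

```python
from typing import Sequence


def mean_squared_difference(initial: Sequence[float], current: Sequence[float]) -> float:
    """Average squared gap between two equally long output vectors."""
    if len(initial) != len(current):
        raise ValueError("outputs must have the same length")
    if not initial:
        return 0.0
    return sum((a - b) ** 2 for a, b in zip(initial, current)) / len(initial)


def region_proposal_distillation_loss(
    initial_proposal: Sequence[float], current_proposal: Sequence[float]
) -> float:
    # Compares the region proposal output of the frozen initial detector
    # with that of the current detector (assumed here to be box coordinates).
    return mean_squared_difference(initial_proposal, current_proposal)


def identification_distillation_loss(
    initial_scores: Sequence[float],
    current_scores: Sequence[float],
    num_old_classes: int,
) -> float:
    # Only scores for previously-learned classes are compared. The current
    # detector also scores the additional class, which the initial detector
    # never produced. Assumption: old-class scores occupy the first slots.
    return mean_squared_difference(
        initial_scores[:num_old_classes], current_scores[:num_old_classes]
    )


# Hypothetical outputs for one image: the current detector has drifted
# on one box coordinate but still agrees on the old-class scores.
box_drift = region_proposal_distillation_loss(
    [0.10, 0.20, 0.50, 0.60], [0.10, 0.20, 0.50, 0.80]
)
score_drift = identification_distillation_loss([0.9, 0.1], [0.9, 0.1, 0.7], 2)
print(box_drift, score_drift)
```

A value near zero for both terms suggests the current detector still behaves like the initial one on previously-learned classes; larger values indicate forgetting under this particular choice of distance.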

First claim

Opening claim text (preview).

What is claimed is:

1. A method for incremental learning object detection in images without catastrophic forgetting of previously-learned one or more object classes, the method comprising: training a two-stage neural network object detector to locate and identify objects pertaining to an additional object class in images by iteratively updating the two-stage neural network object detector until an overall detection accuracy criterion is met, wherein the updating is performed so as to balance minimizing a loss of an initial ability to locate and identify objects pertaining to the previously-learned one or more object classes and maximizing an ability to additionally locate and identify the objects pertaining to the additional object class, and assessing whether the overall detection accuracy criterion is met includes: determining a region proposal distillation loss based on comparing a region proposal output by an initial version of the two-stage neural network object detector with a current region proposal output by a current version of the two-stage neural network object detector, wherein the region proposal distillation loss quantifies a decrease of ability to locate objects pertaining to a previously-learned object classes; and determining a previously learned object identification distillation loss based on comparing a previously learned object prediction obtained by the initial version of the two-stage neural network object detector with an object prediction output by the current version of the two-stage neural network object detector.

2. The method of claim 1, wherein the two-stage neural network object detector is updated such as to minimize the region proposal distillation loss and the previously learned object identification distillation loss.

3. The method of claim 2, wherein the two-stage neural network object detector is updated by applying an optimization method to a total loss, which is a combination of the region proposal distillation loss, the previously learned object identification distillation loss, a region proposal network loss of the current version of the two-stage neural network object detector, and an object identification loss of the current version of the two-stage neural network object detector, and the region proposal network loss and the object identification loss quantify the ability to locate and identify the objects pertaining to the additional object class, respectively.

4. The method of claim 3, wherein, if the total loss exceeds a first predetermined threshold or an ability to detect objects pertaining to one object class among the one or more object classes decreases below a second predetermined threshold, then another optimization method is applied, or the training is abandoned.

5. The method of claim 3, wherein the optimization method is a stochastic gradient descent method applied to the total loss.

6. The method of claim 1, wherein the training uses training images, and the overall detection accuracy criterion is evaluated using validation images distinct from the training images.

7. The method of claim 1, further comprising: additional training the two-stage neural network object detector to locate and identify objects pertaining to another additional object class by iteratively additionally updating the two-stage neural network object detector until another overall detection accuracy criterion is met, wherein the additionally updating is performed so as to minimize a loss of an ability to locate and identify the objects pertaining to the one or more previously-learned object classes and to the additional class while maximizing an additional ability to locate and identify the objects pertaining to the other additional object class.

8. The method of claim 1, wherein the two-stage neural network object detector is simultaneously trained to locate and identify the objects pertaining to the additional object class and objects pertaining to another additional object class.

9. The method of claim 1, further comprising at least one of: receiving a user request for training the two-stage neural network object detector to locate and identify the objects pertaining to the additional object class, and retrieving training images including an object pertaining to the additional object class from Internet.

10. A method for incremental learning object detection without catastrophic forgetting of previously-learned object classes, the method comprising: receiving training images including objects pertaining to an object class unknown to an initial version of a two-stage neural network object detector that is able to detect objects pertaining to at least one previously-learned object class; and training the two-stage neural network object detector to detect the objects of the object class initially unknown, using one-by-one plural images among the training images, until a predetermined condition is met by: inputting one image of the plural images to the initial version of the two-stage neural network object detector to obtain a first region proposal and a first object prediction for the at least one previously-learned object class, inputting the one image to a current version of the two-stage neural network object detector to obtain a second region proposal and a second object prediction for the at least one previously-learned object class and the object class initially unknown, comparing the first region proposal with the second region proposal to estimate a region proposal distillation loss quantifying a decrease of ability to locate objects of the at least one previously-learned object class, comparing the first object prediction with the second object prediction to estimate an object identification distillation loss quantifying decrease of ability to identify the objects of the at least one previously-learned object class, comparing the second region proposal with ground-truth labels of the one image to estimate a region proposal network loss for the object class initially unknown, comparing the second object prediction with the ground-truth labels to estimate an object identification loss for the object class initially unknown, calculating a total loss combining the region proposal distillation loss, the object identification distillation loss, the region proposal network loss and the object identification loss, and updating the current version of the two-stage neural network object detector so as to minimize the total loss, wherein the predetermined condition is met when the number of training iterations reaches a predetermined number, or when a total loss decrease rate is below a predetermined threshold.

11. A computer-readable medium containing a computer-readable code that when read by a computer causes the computer to perform a method for incremental learning for object detection in images without catastrophic forgetting of previously-learned one or more object classes, the computer-readable medium comprises: one or more non-transitory computer-readable medium and method stored on the one or more non-transitory computer-readable medium, the method comprising: training a two-stage neural network object detector to locate and identify objects pertaining to an additional object class in images by iteratively updating the two-stage neural network object detector until an overall detection accuracy criterion is met, wherein the updating is performed so as to balance minimizing a loss of an initial ability to locate and identify objects pertaining to the previously-learned one or more object classes and maximizing an ability to additionally locate and identify the objects pertaining to the additional object class
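The per-image loop recited in claim 10 (combine four losses into a total loss, update, and stop on an iteration budget or a small loss decrease rate) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the claim text does not say how the losses are combined or how the decrease rate is measured, so the weighted sum, the relative decrease between consecutive iterations, and every name here (`total_loss`, `should_stop`, `train_incrementally`, the `step` callback) are hypothetical.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# (region proposal distillation, identification distillation,
#  region proposal network loss, object identification loss)
Losses = Tuple[float, float, float, float]


def total_loss(
    rpn_distill: float,
    id_distill: float,
    rpn_loss: float,
    id_loss: float,
    weights: Sequence[float] = (1.0, 1.0, 1.0, 1.0),
) -> float:
    """Weighted sum of the four loss terms (the weighting is an assumption)."""
    terms = (rpn_distill, id_distill, rpn_loss, id_loss)
    return sum(w * t for w, t in zip(weights, terms))


def should_stop(history: List[float], max_iterations: int, min_decrease_rate: float) -> bool:
    """Stop when the iteration budget is used up or the loss barely decreases."""
    if len(history) >= max_iterations:
        return True
    if len(history) < 2:
        return False
    previous, latest = history[-2], history[-1]
    if previous <= 0.0:
        return True
    # Assumed definition: relative decrease between consecutive iterations.
    return (previous - latest) / previous < min_decrease_rate


def train_incrementally(
    images: Iterable,
    step: Callable[[object], Losses],
    max_iterations: int = 1000,
    min_decrease_rate: float = 1e-3,
) -> List[float]:
    """Feed images one by one; `step` is a hypothetical callback that runs both
    detector versions on the image, updates the current one, and returns the
    four losses it computed."""
    history: List[float] = []
    for image in images:
        history.append(total_loss(*step(image)))
        if should_stop(history, max_iterations, min_decrease_rate):
            break
    return history


def fake_step(image_index: int) -> Losses:
    # Stand-in for a real detector update: every loss halves per image.
    scale = 0.5 ** image_index
    return (0.4 * scale, 0.3 * scale, 0.2 * scale, 0.1 * scale)


history = train_incrementally(range(10), fake_step, max_iterations=5)
print(len(history))  # prints 5: the iteration budget is reached first
```

Because single-image losses are noisy, a practical loop would probably measure the decrease rate over a window of iterations rather than between two consecutive ones; the consecutive comparison is kept here only for brevity.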

Assignees

Inventors

Classifications

  • the supervisor being a human, e.g. interactive learning with a human teacher · CPC title

  • Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • Terrestrial scenes (scenes under surveillance with static cameras G06V20/52; scenes perceived from the exterior of a vehicle G06V20/56; scenes perceived from the interior of a vehicle G06V20/59) · CPC title

  • structured as a network, e.g. client-server architectures · CPC title

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11080558B2 cover?
Methods and systems perform incremental learning object detection in images and/or videos without catastrophic forgetting of previously-learned object classes. A two-stage neural network object detector is trained to locate and identify objects pertaining to an additional object class by iteratively updating the two-stage neural network object detector until an overall detection accuracy criter…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 3, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).