Clustering historical images using a convolutional neural net and labeled data bootstrapping

US10943146B2 · US · B2

Patent metadata
Publication number: US-10943146-B2
Application number: US-201916397114-A
Country: US
Kind code: B2
Filing date: Apr 29, 2019
Priority date: Dec 28, 2016
Publication date: Mar 9, 2021
Grant date: Mar 9, 2021

Abstract

Official abstract text for this publication.

Systems and methods for classifying historical images. A feature extractor may create feature vectors corresponding to a plurality of images. A first classification of the plurality of images may be performed based on the plurality of feature vectors, which may include assigning a label to each of the plurality of images and assigning a probability for each of the assigned labels. The assigned probability for each of the assigned labels may be related to a statistical confidence that a particular assigned label is correctly assigned to a particular image. A subset of the plurality of images may be displayed to a display device. An input corresponding to replacement of an incorrect label with a corrected label for a certain image may be received from a user. A second classification of the plurality of images based on the input from the user may be performed.
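The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the nearest-centroid classifier, the softmax over negative distances, the 2-D feature vectors, the label names, and the 0.6 review threshold are all assumptions chosen to keep the example small. The abstract only states that labels and probabilities are assigned, a subset is reviewed, and a second classification uses the user's correction.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def centroids(vectors, labels):
    """Average the feature vectors of each label (stand-in for a trained classifier)."""
    grouped = {}
    for vec, label in zip(vectors, labels):
        grouped.setdefault(label, []).append(vec)
    return {label: [sum(col) / len(vecs) for col in zip(*vecs)]
            for label, vecs in grouped.items()}

def classify(features, known):
    """Label every image and attach a probability for the assigned label.

    `known` maps image index -> label supplied by a person; those labels are
    kept as-is, and every other image gets the nearest centroid's label.
    """
    idx = sorted(known)
    cents = centroids([features[i] for i in idx], [known[i] for i in idx])
    names = sorted(cents)
    results = []
    for i, vec in enumerate(features):
        scores = [-math.dist(vec, cents[n]) for n in names]
        probs = dict(zip(names, softmax(scores)))
        label = known.get(i) or max(probs, key=probs.get)
        results.append((label, probs[label]))
    return results

# Hypothetical 2-D feature vectors standing in for real extractor output.
features = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1], [0.6, 0.5]]
known = {0: "portrait", 2: "document"}  # small labeled seed set (assumed)

first_pass = classify(features, known)

# Images whose label probability is low are the ones shown for review.
review = [i for i, (_, p) in enumerate(first_pass) if p < 0.6]

known[4] = "portrait"  # simulated user correction for image 4
second_pass = classify(features, known)
```

Here image 4 sits between the two seed images, so it receives a low-probability label in the first pass and is the only image selected for review. After the simulated correction, the second pass recomputes the centroids with the corrected label included.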

First claim

Opening claim text (preview).

What is claimed is:

1. A method for classifying a plurality of images comprising: creating, by a feature extractor, a plurality of feature vectors corresponding to the plurality of images; performing, by a feature classifier, a first classification of the plurality of images based on the plurality of feature vectors, wherein performing the first classification includes: for each image of the plurality of images, assigning a label of a plurality of labels to the image as a whole based on a corresponding feature vector of the plurality of feature vectors such that the label corresponds to every pixel of the image; and assigning a first probability for each of the assigned labels, wherein the assigned first probability for each of the assigned labels is related to a statistical confidence that a particular assigned label is correctly assigned to a particular image; determining a subset of probabilities of the assigned first probabilities; determining a subset of the plurality of images corresponding to the subset of probabilities; receiving a corrected label for replacement of an incorrect label for a certain image of the subset of the plurality of images; receiving a confidence level associated with the corrected label; adjusting the feature classifier using the corrected label and the confidence level associated with the corrected label; and performing, by the adjusted feature classifier, a second classification of the plurality of images based on the plurality of feature vectors, wherein performing the second classification includes: assigning at least one of the plurality of labels to each of the plurality of images, including assigning the corrected label to the certain image; and assigning a second probability for each of the assigned labels.

2. The method of claim 1, wherein the feature extractor is a convolutional neural network (CNN), the CNN having been previously trained and the CNN being compatible with the plurality of images such that the plurality of images are receivable as inputs by the CNN.

3. The method of claim 1, wherein the plurality of images are historical images.

4. The method of claim 1, further comprising: determining a second subset of probabilities of the assigned second probabilities; determining a second subset of the plurality of images corresponding to the second subset of probabilities; receiving a second corrected label for replacement of a second incorrect label for a second certain image of the second subset of the plurality of images.

5. The method of claim 1, wherein each of the plurality of feature vectors comprise 4096 numbers.

6. The method of claim 1, wherein the subset of probabilities of the assigned first probabilities includes one of the following: all assigned first probabilities that are less than a probability threshold; all assigned first probabilities that are between a lower probability threshold and an upper probability threshold; one or more first probabilities that are below an average probability of the assigned first probabilities; and one or more first probabilities that are below a median probability of the assigned first probabilities.

7. The method of claim 1, further comprising: receiving input for creation of a new label, wherein the new label is added to the plurality of labels.
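The four selection rules listed in claim 6 can be illustrated with a short sketch. The function name `select_for_review`, the rule names, the sample probabilities, and the threshold values are hypothetical. The claim also does not say whether the band boundaries are inclusive, so strict inequalities are assumed here, and the mean and median rules return every probability below the cutoff even though the claim only requires "one or more".

```python
from statistics import mean, median

def select_for_review(probs, rule, low=None, high=None):
    """Return indices of images whose first-pass probability matches `rule`."""
    if rule == "below_threshold":
        keep = lambda p: p < high            # less than a single threshold
    elif rule == "between":
        keep = lambda p: low < p < high      # inside a lower/upper band
    elif rule == "below_mean":
        cutoff = mean(probs)
        keep = lambda p: p < cutoff          # below the average probability
    elif rule == "below_median":
        cutoff = median(probs)
        keep = lambda p: p < cutoff          # below the median probability
    else:
        raise ValueError(f"unknown rule: {rule}")
    return [i for i, p in enumerate(probs) if keep(p)]

# Hypothetical first-pass probabilities for five images.
first_probs = [0.95, 0.40, 0.72, 0.55, 0.88]

by_threshold = select_for_review(first_probs, "below_threshold", high=0.6)
by_band = select_for_review(first_probs, "between", low=0.5, high=0.8)
by_mean = select_for_review(first_probs, "below_mean")
by_median = select_for_review(first_probs, "below_median")
```

With these sample values the single-threshold, mean, and median rules all pick images 1 and 3, while the band rule skips the least confident image and picks images 2 and 3 instead.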

Assignees

Ancestry Com Operations Inc

Inventors

Classifications

  • the supervisor being a human, e.g. interactive learning with a human teacher · CPC title

  • G06V10/82 · Primary

    using neural networks · CPC title

  • Interactive pattern learning with a human teacher · CPC title

  • G06K9/6254 · Primary

    Physics · mapped topic

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10943146B2 cover?
Systems and methods for classifying historical images. A feature extractor creates feature vectors for a set of images, a first classification assigns each image a label and a probability reflecting confidence in that label, a subset of the images is displayed so a user can replace incorrect labels, and a second classification is performed based on the user's input.
Who is the assignee on this patent?
Ancestry Com Operations Inc
What technology area does this patent fall under?
Primary CPC classification G06V10/82. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 9, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).