Method and device for Quasi-Gibbs structure sampling by deep permutation for person identity inference

US10339408B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10339408-B2
Application number  US-201615388039-A
Country             US
Kind code           B2
Filing date         Dec 22, 2016
Priority date       Dec 22, 2016
Publication date    Jul 2, 2019
Grant date          Jul 2, 2019

Abstract

Official abstract text for this publication.

The present disclosure provides a method and device for visual appearance based person identity inference. The method may include obtaining a plurality of input images. The input images include a gallery set of images containing persons-of-interest and a probe set of images containing person detections, and one input image corresponds to one person. The method may further include extracting N feature maps from the input images using a Deep Neural Network, N being a natural number; constructing N structure samples of the N feature maps using conditional random field (CRF) graphical models; learning the N structure samples from an implicit common latent feature space embedded in the N structure samples; and according to the learned structures, identifying one or more images from the probe set containing a same person-of-interest as an image in the gallery set.
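
The abstract's idea of building a graph structure per feature map can be illustrated with a minimal sketch. It is not the patented procedure: the function name `knn_graph`, the toy feature vectors, and the use of Euclidean distance as the similarity measure are all assumptions made for illustration. It only shows that the same set of images can yield a different nearest-neighbour graph in each of the N feature spaces.

```python
import math

def knn_graph(features, k):
    """Return undirected edges linking each node to its k nearest neighbours.

    `features` is a list of equal-length tuples, one per image (node).
    Euclidean distance is an assumed stand-in for "feature similarity".
    """
    edges = set()
    for i, fi in enumerate(features):
        # Rank the other nodes by distance to node i (ties broken by index).
        others = sorted(
            (j for j in range(len(features)) if j != i),
            key=lambda j: (math.dist(fi, features[j]), j),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges

# Toy data: 4 images described in N = 2 hypothetical feature spaces.
feature_spaces = [
    [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0)],
    [(0.0,), (1.0,), (0.1,), (1.1,)],
]

# One graph per feature space; the edge sets differ between spaces.
graphs = [knn_graph(space, k=1) for space in feature_spaces]
for n, g in enumerate(graphs):
    print(f"feature space {n}: {sorted(g)}")
```

In this toy data the first space links images 0 and 1 while the second links images 0 and 2, which is the kind of disagreement between feature spaces that a structure-sampling step would have to reconcile.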

First claim

Opening claim text (preview).

What is claimed is:

1. A method for visual appearance based person identity inference, comprising:
  obtaining a plurality of input images, wherein the input images include a gallery set of images containing persons-of-interest and a probe set of images containing person detections, and one input image corresponds to one person;
  extracting N feature maps from the input images using a Deep Neural Network (DNN), N being a natural number;
  constructing N structure samples of the N feature maps using conditional random field (CRF) graphical models, comprising:
    for a feature map, constructing an initial graph structure by K Nearest Neighbor (KNN) based on feature similarity in a feature space corresponding to the feature map, the graph model including nodes and edges, a node representing one person;
    performing structure permutations by a plurality of iterations of KNN computation in N feature spaces with a Quasi-Gibbs Structure Sampling (QGSS) process;
    assigning labels to the nodes that minimize a conditional random field (CRF) energy function over all possible labels, wherein the all possible labels represent all different persons-of-interest in the gallery set; and
    deriving the N structure samples from the plurality of iterations and the assigned labels;
  learning the N structure samples from an implicit common latent feature space embedded in the N structure samples; and
  according to the learned structures, identifying one or more images from the probe set containing a same person-of-interest as an image in the gallery set.

2. The method according to claim 1, wherein: the plurality of iterations include first a iterations and later b iterations, wherein a and b are natural numbers; results of the first a iterations are discarded, and the N structure samples are derived from the later b iterations.

3. The method according to claim 1, wherein: a node in the graph model has m possible states, m representing a quantity of different persons-of-interest in the gallery set.

4. The method according to claim 1, wherein: the labels are assigned to the nodes according to the graph structure after the plurality of iterations are finished.

5. The method according to claim 1, wherein: a graph of a CRF model representing a re-identification structure is learned through the N structure samples; and an energy minimization with sparse approach is performed to cut the graph into a plurality of clusters, each cluster containing images corresponding to one of the persons-of-interest.

6. The method according to claim 1, wherein N different kernels are used in the DNN for convolutions with the images in the gallery set and the probe set; and the N feature maps are produced by a last couple of convolution layers in the DNN.

7. The method according to claim 5, wherein the CRF model with pairwise potentials is:

$$p(Y \mid X) = \frac{1}{Z(X)} \prod_{\langle i,j \rangle} \psi_{ij}(y_i, y_j, X) \prod_{i} \psi_i(y_i, X)$$

wherein: $\langle i,j \rangle$ is product over all edges in the graph, $\psi_i$ is a node potential and $\psi_{ij}$ is an edge potential, X denotes the common latent features derived from the N structure samples implicitly; and Y denotes to labeling of person-of-interest candidates.

8. A device for visual appearance based person identity inference, comprising one or more processors configured to:
  obtain a plurality of input images, wherein the input images include a gallery set of images containing persons-of-interest and a probe set of images containing person detections, and one input image corresponds to one person;
  extract N feature maps from the input images using a Deep Neural Network (DNN), N being a natural number;
  construct N structure samples of the N feature maps using conditional random field (CRF) graphical models, comprising:
    for a feature map, constructing an initial graph structure by K Nearest Neighbor (KNN) based on feature similarity in a feature space corresponding to the feature map, the graph model including nodes and edges, a node representing one person;
    performing structure permutations by a plurality of iterations of KNN computation in N feature spaces with a Quasi-Gibbs Structure Sampling (QGSS) process;
    assigning labels to the nodes that minimize a conditional random field (CRF) energy function over all possible labels, wherein the all possible labels represent all different persons-of-interest in the gallery set; and
    deriving the N structure samples from the plurality of iterations and the assigned labels;
  learn the N structure samples from an implicit common latent feature space embedded in the N structure samples; and accor
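
The pairwise CRF model recited in claim 7 can be evaluated by brute force on a very small graph. The sketch below is only an illustration under stated assumptions: the potentials, the three-node chain, and the helper names `unnormalized_score` and `crf_posterior` are hypothetical and do not come from the patent. It also assumes the usual convention that the energy of a labeling is the negative logarithm of its probability, so the minimum-energy labeling is the most probable one.

```python
import itertools
import math

def unnormalized_score(labels, edges, node_pot, edge_pot):
    """Product of node potentials and edge potentials for one labeling Y."""
    score = 1.0
    for i, y in enumerate(labels):
        score *= node_pot[i][y]                    # psi_i(y_i, X)
    for i, j in edges:
        score *= edge_pot(labels[i], labels[j])    # psi_ij(y_i, y_j, X)
    return score

def crf_posterior(num_labels, edges, node_pot, edge_pot):
    """Enumerate every labeling and normalize by the partition function Z(X)."""
    labelings = list(itertools.product(range(num_labels), repeat=len(node_pot)))
    scores = [unnormalized_score(y, edges, node_pot, edge_pot) for y in labelings]
    z = sum(scores)
    return {y: s / z for y, s in zip(labelings, scores)}

# Hypothetical toy problem: 3 images (nodes), m = 2 persons-of-interest (labels).
edges = [(0, 1), (1, 2)]
node_pot = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]

def edge_pot(a, b):
    # Assumed smoothness prior: neighbouring nodes prefer the same label.
    return 2.0 if a == b else 1.0

posterior = crf_posterior(2, edges, node_pot, edge_pot)

# Energy as negative log-probability; minimizing it maximizes p(Y | X).
energy = {y: -math.log(p) for y, p in posterior.items()}
best = min(energy, key=energy.get)
print("minimum-energy labeling:", best, "with probability", round(posterior[best], 3))
```

Exhaustive enumeration grows as m to the power of the number of nodes, so it is only practical for toy graphs like this one; it is shown here purely to make the roles of the node potentials, edge potentials, and Z(X) concrete.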

Assignees

Tcl Res America Inc

Inventors

Classifications

  • using probabilistic graphical models from image or video features, e.g. Markov models or Bayesian networks · CPC title

  • G06V10/82 (Primary)

    using neural networks · CPC title

  • using classification, e.g. of video objects · CPC title

  • Graphical models, e.g. Bayesian networks · CPC title

  • Distances to closest patterns, e.g. nearest neighbour classification · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10339408B2 cover?
The present disclosure provides a method and device for visual appearance based person identity inference. The method may include obtaining a plurality of input images. The input images include a gallery set of images containing persons-of-interest and a probe set of images containing person detections, and one input image corresponds to one person. The method may further include extracting N …
Who is the assignee on this patent?
Tcl Res America Inc
What technology area does this patent fall under?
Primary CPC classification G06V10/82. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 2, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).