What technology area does this patent fall under?

Primary CPC classification G06V10/50. Mapped technology areas include Physics.

When was this patent published?

Publication date Jul 4, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Efficient object detection with patch-level window processing

Patent metadata
Field	Value
Publication number	US-9697439-B2
Application number	US-201414505031-A
Country	US
Kind code	B2
Filing date	Oct 2, 2014
Priority date	Oct 2, 2014
Publication date	Jul 4, 2017
Grant date	Jul 4, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An object detection method includes for each of a set of patches of an image, encoding features of the patch with a non-linear mapping function, and computing per-patch statistics based on the encoded features for approximating a window-level non-linear operation by a patch-level operation. Then, windows are extracted from the image, each window comprising a sub-set of the set of patches. Each of the windows is scored based on the computed patch statistics of the respective sub-set of patches. Objects, if any, can then be detected in the image, based on the window scores. The method and system allow the non-linear operations to be performed only at the patch level, reducing the computation time of the method, since there are generally many more windows than patches, while not impacting performance unduly, as compared to a system which performs non-linear operations at the window level.
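As a rough illustration of the idea in the abstract, the Python sketch below assumes the window-level operation is an l2-normalization of sum-pooled patch encodings followed by a linear classifier, and that the two stored statistics are the ones named in the claims (a classifier projection and a squared norm). The function names, toy weights, and toy encodings are hypothetical and are not taken from the patent.

```python
import math

def patch_stats(phi, w):
    """Two scalar statistics for one encoded patch: (w . phi, ||phi||^2)."""
    proj = sum(wi * pi for wi, pi in zip(w, phi))
    sq_norm = sum(pi * pi for pi in phi)
    return proj, sq_norm

def approx_window_score(stats, b=0.0):
    """Patch-level approximation: pool the stored scalars, then apply u / sqrt(v)."""
    u = sum(s[0] for s in stats)
    v = sum(s[1] for s in stats)
    return u / math.sqrt(v) + b if v > 0 else b

def exact_window_score(phis, w, b=0.0):
    """Window-level reference (assumed): sum-pool, l2-normalize, then classify."""
    pooled = [sum(col) for col in zip(*phis)]
    norm = math.sqrt(sum(p * p for p in pooled))
    if norm == 0:
        return b
    return sum(wi * pi for wi, pi in zip(w, pooled)) / norm + b

# Toy data (hypothetical): three mutually orthogonal patch encodings.
w = [0.5, -1.0, 2.0]
phis = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]

stats = [patch_stats(phi, w) for phi in phis]  # computed once per patch
approx = approx_window_score(stats)            # cheap per-window work
exact = exact_window_score(phis, w)            # non-linear work per window
```

With mutually orthogonal encodings, as in this toy case, the two scores coincide. In general the patch-level score is only an approximation, because the cross terms between different patches are dropped from the norm.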

First claim

Opening claim text (preview).

What is claimed is:
1. An object detection method comprising: for each of a set of patches, encoding features of each of the patch with a non-linear mapping function; for each patch in the set, computing first and second scalar patch statistics with non-linear operations on the encoded features for approximating a window-level non-linear operation with the patch-level non-linear operations; storing the computed scalar patch statistics; extracting windows from the image, each window comprising a sub-set of the set of patches; scoring each of the windows with a linear function of the stored computed scalar patch statistics of the respective sub-set of patches; and providing for detecting objects in the image based on the window scores, wherein at least one of the encoding patch features, computing patch statistics, extracting windows, scoring the windows, and detection of objects is performed with a processor.
2. The method of claim 1, wherein the encoding of features comprises: computing a patch descriptor for each of a set of patches of an image; and encoding each of the patch descriptors with a non-linear mapping function.
3. The method of claim 2, wherein the patch encoding comprises computing a likelihood that the descriptor is emitted by a generative model.
4. The method of claim 3, wherein the patch encoding comprises a Fisher Vector.
5. The method of claim 1, wherein the computing of the patch statistics includes performing a non-linear operation on each of the encoded features.
6. The method of claim 5, wherein the performing of the non-linear operation comprises computing an $\ell_2$-normalization of each of the encoded features.
7. The method of claim 1, wherein the computed patch statistics include: a statistic which reflects a contribution of the patch to the score of the window for a given target class to be detected; and a statistic which normalizes the encoded patch features.
8. The method of claim 1, wherein the computed patch statistics include: a norm of the encoded patch features or a function thereof; and a weighted function of the encoded patch features in which weights are weights of a linear classifier trained to score window representations.
9. An object detection method, comprising: for each of a set of patches, encoding features of each of the patch with a non-linear mapping function; computing patch statistics on the encoded features for approximating a window-level non-linear operation with a set of patch-level operations; extracting windows from the image, each window comprising a sub-set of the set of patches; scoring each of the windows based on the computed patch statistics of the respective sub-set of patches; and providing for detecting objects in the image based on the window scores, wherein the patch statistics are of the form: $\hat{\psi}(x_k) = \left(w^T \varphi(x_k),\ \|\varphi(x_k)\|_2^2\right)$ (7) wherein $w$ comprises a vector of weights of a classifier function for classifying a window representation with respect to a selected class; $T$ represents the transpose operator; $\varphi(x_k)$ represents an encoded patch descriptor which encodes the patch features with a non-linear mapping function; and $\|\varphi(x_k)\|_2^2$ represents the $\ell_2$-norm of the encoded patch descriptor, wherein at least one of the encoding patch features, computing patch statistics, extracting windows, scoring the windows, and detection of objects is performed with a processor.
10. The method of claim 9, wherein the scoring each of the windows based on the respective window representation comprises computing a score $\hat{s}(\chi)$ for the window representation as a function of: $\frac{\sum_{i=1}^{K} w^T \varphi(x_i)}{\sqrt{\sum_{i=1}^{K} \|\varphi(x_i)\|_2^2}}$ where $K$ represents the set of patches in the window; and $x_i$ represents one of the $K$ patches in the window.
11. An object detection method, comprising: for each of a set of patches, encoding features of each of the patch with a non-linear mapping function; computing patch statistics on the encoded features for approximating a window-level non-linear operation with a set of patch-level operations; extracting windows from the image, each window comprising a sub-set of the set of patches; scoring each of the windows based on the computed patch statistics of the respective sub-set of patches, wherein the scoring of each of the windows employs integral images for pooling of the weighted encoded patch features, wherein when integral images are used, the scoring of each of the windows comprises four look-up operations on an integral image $H$: $\hat{s}(\chi) = \tilde{g}\left(H(x_0, y_0) + H(x_1, y_1) - H(x_0, y_1) - H(x_1, y_0)\right) + b$, (10) where $\tilde{g}(u, v) = u/\sqrt{v}$, $(x_0, y_0)$ are the coordinates of the upper left corner of window $\chi$, and $(x_1, y_1)$ are the coordinates of the lower right corner of window $\chi$, and providing for detecting objects in the image based on the window scores, wherein at least one of the encoding patch features, computing patch statistics, extracting windows, scoring the windows, and detection of objects is performed with a processor.
12. The method of claim 11, wherein the method includes generating a data structure $H$ that, for any location $(x, y)$ in image $\mathcal{I}$, stores the cumulative sums of all the patch statistics $\hat{\psi}(x)$ for all the patches $x$ above and to the left of $(x, y)$: $H(x, y) = \sum_{x \in \mathcal{I}(x, y)} \hat{\psi}(x)$, (9) where $\mathcal{I}(x, y)$ is the restriction of $\mathcal{I}$ to the set of patches above and to the left of $(x, y)$.
13. The method of claim 1, wherein the scoring of each of the windows employs integral images and the scoring of each of the windows comprises four look-up operations on the integral image $H$: $\hat{s}(\chi) = \tilde{g}\left(H(x_0, y_0) + H(x_1, y_1) - H(x_0, y_1) - H(x_1, y_0)\right) + b$, (10) where $\tilde{g}(u, v) = u/\sqrt{v}$, $(x_0, y_0)$ are the coordinates of the upper left corner of window $\chi$, and $(x_1, y_1)$ are the coordinates of the lower right corner of
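
The integral-image scoring recited in the claims above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the grid layout, the zero-padded indexing convention (so that `H[y][x]` covers patches strictly above and to the left), and all function names are hypothetical choices for illustration, not details confirmed by the patent text.

```python
import math

def build_integral(stats_grid):
    """H[y][x] = component-wise sum of (u, v) over patches with row < y and col < x."""
    rows, cols = len(stats_grid), len(stats_grid[0])
    H = [[(0.0, 0.0)] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows):
        for x in range(cols):
            u, v = stats_grid[y][x]
            up, left, diag = H[y][x + 1], H[y + 1][x], H[y][x]
            H[y + 1][x + 1] = (u + up[0] + left[0] - diag[0],
                               v + up[1] + left[1] - diag[1])
    return H

def score_window(H, x0, y0, x1, y1, b=0.0):
    """Four look-ups H(x0,y0) + H(x1,y1) - H(x0,y1) - H(x1,y0), then g(u, v) = u / sqrt(v)."""
    a, d, c, e = H[y0][x0], H[y1][x1], H[y1][x0], H[y0][x1]
    u = a[0] + d[0] - c[0] - e[0]
    v = a[1] + d[1] - c[1] - e[1]
    return u / math.sqrt(v) + b if v > 0 else b

def score_window_brute(stats_grid, x0, y0, x1, y1, b=0.0):
    """Reference: sum the per-patch statistics inside the window directly."""
    cells = [stats_grid[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    u = sum(c[0] for c in cells)
    v = sum(c[1] for c in cells)
    return u / math.sqrt(v) + b if v > 0 else b

# Toy 3x3 grid of per-patch statistics (u = w . phi, v = ||phi||^2); values are made up.
grid = [[(1, 1), (2, 1), (3, 1)],
        [(4, 1), (5, 1), (6, 1)],
        [(7, 1), (8, 1), (9, 1)]]
H = build_integral(grid)

# Window covering patch columns 1..2 and rows 1..2 (corners are exclusive on the right/bottom).
fast = score_window(H, 1, 1, 3, 3)
slow = score_window_brute(grid, 1, 1, 3, 3)
```

Once `H` is built, each window costs a constant number of look-ups regardless of how many patches it contains, which is where the saving over per-window pooling comes from.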

Assignees

Xerox Corp

Inventors

Classifications

G06V10/774
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
G06V10/758
Involving statistics of pixels or of feature values, e.g. histogram matching · CPC title
G06V10/50 (Primary)
by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis · CPC title
G06F18/214
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
G06K9/6256
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 55633027

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9697439B2 cover?: An object detection method includes for each of a set of patches of an image, encoding features of the patch with a non-linear mapping function, and computing per-patch statistics based on the encoded features for approximating a window-level non-linear operation by a patch-level operation. Then, windows are extracted from the image, each window comprising a sub-set of the set of patches. Each …
Who is the assignee on this patent?: Xerox Corp