Neural network target feature detection

US12536249B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12536249-B2
Application number  US-202418435405-A
Country             US
Kind code           B2
Filing date         Feb 7, 2024
Priority date       May 7, 2021
Publication date    Jan 27, 2026
Grant date          Jan 27, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of training a neural network for detecting target features in images is described. The neural network is trained using a first data set that includes labeled images, where at least some of the labeled images having subjects with labeled features, including: dividing each of the labeled images of the first data set into a respective plurality of tiles, and generating, for each of the plurality of tiles, a plurality of feature anchors that indicate target features within the corresponding tile. Target features that correspond to the plurality of feature anchors are detected in a second data set of unlabeled images. Images of the second data set having target features that were not detected are labeled. A third data set that includes the first data set and the labeled images of the second data set is generated. The neural network is trained using the third data set.
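
The data flow in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: `LabeledImage`, `tile_anchors`, `build_third_data_set`, and the callbacks `has_missed_features` and `annotate` are hypothetical names, and representing a feature anchor as a normalized offset inside its tile is an assumption, since the abstract does not define the anchor format.

```python
from dataclasses import dataclass


@dataclass
class LabeledImage:
    width: int
    height: int
    features: list  # (x, y) pixel centers of labeled features (assumed format)


def tile_anchors(image, grid):
    """Split an image into grid x grid tiles and collect, per tile, the
    labeled features inside it as offsets normalized to the tile size.
    The offset representation is an assumption made for this sketch."""
    tile_w = image.width / grid
    tile_h = image.height / grid
    anchors = {}
    for x, y in image.features:
        # Clamp so that points on the right or bottom edge stay in the grid.
        col = min(int(x // tile_w), grid - 1)
        row = min(int(y // tile_h), grid - 1)
        offset = ((x - col * tile_w) / tile_w, (y - row * tile_h) / tile_h)
        anchors.setdefault((row, col), []).append(offset)
    return anchors


def build_third_data_set(first, second, has_missed_features, annotate):
    """Return the first data set plus newly labeled images from the second.

    has_missed_features(image) is a hypothetical check that reports whether
    the trained network failed to detect target features in an image;
    annotate(image) is a hypothetical labeling step returning a LabeledImage.
    """
    newly_labeled = [annotate(img) for img in second if has_missed_features(img)]
    return list(first) + newly_labeled


# Example: two labeled features on a 100 x 100 image with a 4 x 4 tile grid.
example = LabeledImage(100, 100, [(10, 10), (75, 30)])
example_anchors = tile_anchors(example, 4)
```

In this sketch the network would then be retrained on the list returned by `build_third_data_set`; the training step itself is left out because the abstract gives no detail about it.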

First claim

Opening claim text (preview).

What is claimed is:

1. An image processing system that includes a neural network implemented on a computer for feature detection, comprising: a convolutional neural network having a plurality of layers stacked sequentially, including: a first set of layers, each layer of the first set of layers having a depth-wise convolution and a point-wise convolution, wherein the first set of layers is a first subset of a different neural network; and a second set of layers after the first set of layers that is based upon a compression of a remainder subset of layers of the different neural network, each layer of the second set of layers having a point-wise convolution.

2. The image processing system of claim 1, wherein: the convolutional neural network is configured to process an input image to detect target features within the input image; and the image processing system is configured to provide estimated feature locations for detected target features within the input image.

3. The image processing system of claim 2, wherein the image processing system further comprises a post-processor configured to receive tile data generated for the input image by the convolutional neural network and convert the tile data to one or more bounding boxes for the detected target features.

4. The image processing system of claim 3, wherein the target features are one or more of bodies, faces, eyes, or hands of a subject within the input image.

5. The image processing system of claim 3, wherein the image processing system further comprises a central processing unit (CPU) that executes processing of the convolutional neural network.

6. The image processing system of claim 5, wherein: the image processing system further comprises a neural processing unit (NPU); and the CPU executes the processing of the convolutional neural network without utilizing the NPU.

7. The image processing system of claim 2, wherein: the image processing system further comprises a pre-processor configured to process an original image and provide the processed original image to the convolutional neural network; and the pre-processor comprises a resize processor that resizes the original image and a color representation normalizer that normalizes colors for pixels within the original image.

8. The image processing system of claim 7, wherein the pre-processor further comprises a color scale converter that converts a color scale of the original image to a planar format.

9. The image processing system of claim 1, wherein the remainder subset comprises depth-wise convolutions and point-wise convolutions.

10. The image processing system of claim 9, wherein the compression of the remainder subset comprises omitting the depth-wise convolutions from the remainder subset to form the second set of layers.

11. The image processing system of claim 1, wherein the compression of the remainder subset reduces a computational cost of the different neural network.

12. The image processing system of claim 1, wherein the compression of the remainder subset reduces a number of parameters of the convolutional neural network relative to the different neural network.

13. The image processing system of claim 12, wherein the compression of the remainder subset is performed using a width multiplier that reduces the number of parameters of the remainder subset.

14. A method for feature detection, the method comprising: resizing an original image according to a convolutional neural network; changing color parameters of the resized image according to the convolutional neural network to generate an input image; processing the input image using the convolutional neural network to detect target features within the input image and generate tile data representing detected target features within the input image, the convolutional neural network comprising a first set of layers that is a first subset of a different neural network and a second set of layers stacked sequentially after the first set of layers and based upon a compression of a remainder subset of the different neural network; and processing the tile data to identify estimated feature locations of the detected target features.

15. The method of claim 14, wherein processing the input image using the convolutional neural network comprises: processing the input image using the first set of layers to generate first tile data; processing the first tile data using the second set of layers to generate second tile data; and processing the tile data to identify the estimated feature locations comprises processing the second tile data to generate bounding boxes for the detected target features.

16. The method of claim 15, wherein each layer of the first set of layers has a depth-wise convolution and a point-wise convolution.

17. The method of claim 16, wherein each layer of the second set of layers has a point-wise convolution and omits depth-wise convolutions of the remainder subset of the different neural network.

18. The method of claim 17, wherein processing the input image using the convolutional neural network comprises executing processing of the convolutional neural network on a central processing unit without utilizing a neural processing unit.

19. The method of claim 15, wherein the detected target features are one or more of bodies, faces, eyes, or hands of a subject within the input image.

20. The method of claim 14, wherein processing the input image using the convolutional neural network comprises executing processing of the convolutional neural network on a central processing unit.
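
To make the layer structure in claims 1 and 9 to 13, and the pre-processing in claims 7 and 8, more concrete, the sketch below counts weights for one layer and converts interleaved pixels to a planar layout. It is an illustration under assumptions, not the claimed system: the function names are hypothetical, the 3x3 kernel, the 256 and 512 channel counts, and the 0.5 width multiplier are arbitrary example values, biases are ignored, and dividing by 255 is only one possible color normalization.

```python
def depthwise_separable_params(c_in, c_out, kernel=3):
    """Weights in a depth-wise convolution (one kernel x kernel filter per
    input channel) followed by a point-wise (1x1) convolution."""
    return kernel * kernel * c_in + c_in * c_out


def pointwise_params(c_in, c_out):
    """Weights in a point-wise (1x1) convolution alone, as in a layer whose
    depth-wise convolution has been omitted."""
    return c_in * c_out


def scale_channels(channels, alpha):
    """Apply a width multiplier alpha to a channel count (at least 1)."""
    return max(1, int(channels * alpha))


def to_planar(pixels, scale=255.0):
    """Convert interleaved (r, g, b) pixels into three planes, one per
    channel, with values scaled to [0, 1] (assumed normalization)."""
    planes = [[], [], []]
    for r, g, b in pixels:
        planes[0].append(r / scale)
        planes[1].append(g / scale)
        planes[2].append(b / scale)
    return planes


# Example layer with assumed sizes: 256 input channels, 512 output channels.
full_layer = depthwise_separable_params(256, 512)
pointwise_only = pointwise_params(256, 512)
# Same layer after an assumed width multiplier of 0.5 on both channel counts.
thinned = pointwise_params(scale_channels(256, 0.5), scale_channels(512, 0.5))

planar = to_planar([(255, 0, 0), (0, 255, 0)])
```

With these example sizes, dropping the depth-wise convolution removes relatively few weights, while the width multiplier cuts the point-wise weights to about a quarter, because it scales both the input and output channel counts.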

Assignees

Microsoft Technology Licensing LLC

Inventors

Classifications

  • Target detection · CPC title

  • Detection; Localisation; Normalisation · CPC title

  • Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands · CPC title

  • G06N3/08 (Primary)

    Learning methods · CPC title

  • Architecture, e.g. interconnection topology · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12536249B2 cover?
A method of training a neural network for detecting target features in images is described. The neural network is trained using a first data set that includes labeled images, where at least some of the labeled images having subjects with labeled features, including: dividing each of the labeled images of the first data set into a respective plurality of tiles, and generating, for each of the pl…
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 27, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).