What technology area does this patent fall under?

Primary CPC classification G06V20/58. Mapped technology areas include Physics.

When was this patent published?

Publication date Sep 30, 2021 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method of segmenting pedestrians in roadside image by using convolutional network fusing features at different scales

Patent metadata
Field	Value
Publication number	US-2021303911-A1
Application number	US-201917267493-A
Country	US
Kind code	A1
Filing date	May 16, 2019
Priority date	Mar 4, 2019
Publication date	Sep 30, 2021
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present invention discloses a roadside image pedestrian segmentation method based on a variable-scale multi-feature fusion convolutional network. For scenes in which the pedestrian scale changes significantly in intelligent roadside terminal images, the method designs two parallel convolutional neural networks to extract the local and global features of pedestrians at different scales in the image. It then fuses the local and global features extracted by the first network with the local and global features extracted by the second network at the same level, and fuses the resulting local and global features a second time to obtain a variable-scale multi-feature fusion convolutional neural network. The network is then trained, and roadside pedestrian images are input to it to perform pedestrian segmentation. The present invention effectively solves the problem that most current pedestrian segmentation methods based on a single network structure are prone to blurred segmentation boundaries and missed segmentations.
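
The two-level fusion described in the abstract can be sketched as a data flow. This is a minimal illustration only: the abstract does not say which fusion operation is used, so element-wise addition is assumed here, and the names `fuse` and `two_level_fusion` are hypothetical. Feature maps are represented as flat lists of equal length; a real network would first have to bring the global features to the same resolution as the local ones.

```python
def fuse(a, b):
    """Fuse two equally sized feature maps (flat lists).

    Element-wise addition is an ASSUMPTION; the abstract does not
    name the fusion operation.
    """
    if len(a) != len(b):
        raise ValueError("feature maps must have the same length")
    return [x + y for x, y in zip(a, b)]


def two_level_fusion(net1, net2):
    """net1 / net2: dicts holding 'local' and 'global' features of each branch."""
    # Level 1: fuse same-level features across the two parallel networks.
    fused_local = fuse(net1["local"], net2["local"])
    fused_global = fuse(net1["global"], net2["global"])
    # Level 2: fuse the fused local and fused global features a second time.
    return fuse(fused_local, fused_global)


if __name__ == "__main__":
    small_scale_branch = {"local": [1.0, 2.0], "global": [0.5, 0.5]}
    large_scale_branch = {"local": [3.0, 4.0], "global": [1.5, 1.5]}
    print(two_level_fusion(small_scale_branch, large_scale_branch))
```

The point of the sketch is the ordering: features are first combined across networks at the same level, and only then are the combined local and global features merged.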

First claim

Opening claim text (preview).

1. A roadside image pedestrian segmentation method based on a variable-scale multi-feature fusion convolutional network, comprising:

(1) establishing a pedestrian segmentation dataset; and

(2) constructing a variable-scale multi-feature fusion convolutional neural network, comprising the steps of: firstly, designing two parallel convolutional neural networks to extract the local and global features of pedestrians at different scales in the image, wherein a first network designs a fine feature extraction structure for small-scale pedestrians and a second network expands the receptive field of the network at the shallow level for large-scale pedestrians; secondly, providing a two-level fusion strategy to fuse the extracted features by the following steps: first, fusing features of the same level at different scales to obtain local and global features that are suitable for variable-scale pedestrians, and then constructing a jump connection structure to fuse the fused local features and global features for the second time, so as to obtain the complete local detailed information and global information of variable-scale pedestrians, and finally obtaining a variable-scale multi-feature fusion convolutional neural network; the step includes the following sub-steps:

Sub-step 1: designing the first convolutional neural network for small-scale pedestrians, including:

① designing pooling layers: the number of pooling layers is 2; the pooling layers use a maximum pooling operation, their sampling sizes are both 2×2, and their step lengths are both 2;

② designing standard convolutional layers: the number of standard convolutional layers is 18, of which 8 layers all have a convolutional kernel size of 3×3, the numbers of their convolutional kernels are 64, 64, 128, 128, 256, 256, 256 and 2, respectively, and their step length is 1; the remaining 10 layers all have a convolutional kernel size of 1×1, the numbers of their convolutional kernels are 32, 32, 64, 64, 128, 128, 128, 128, 128 and 128, respectively, and their step length is 1;

③ designing deconvolutional layers: the number of deconvolutional layers is 2, the size of their convolutional kernels is 3×3, their step length is 2, and the numbers of convolutional kernels are 2 and 2, respectively;

④ determining the network architecture: establishing different network models according to the network layer parameters involved in ①~③ in sub-step 1 of step (2), then using the dataset established in step (1) to verify these models, and filtering out the optimal network structure in terms of both accuracy and real-time performance; an optimal network structure is obtained as follows:

Standard convolutional layer 1_1: using 64 3×3 convolutional kernels and input samples with A×A pixels to make convolutions with a step length of 1, and then activating the convolutions with ReLU to obtain a feature map with a dimension of A×A×64;

Standard convolutional layer 1_1_1: using 32 1×1 convolutional kernels and the feature map output by standard convolutional layer 1_1 to make convolutions with a step length of 1, and then activating the convolutions with ReLU to obtain a feature map with a dimension of A×A×32;

Standard convolutional layer 1_1_2: using 32 1×1 convolutional kernels and the feature map output by standard convolutional layer 1_1_1 to make convolutions with a step length of 1, and then activating the convolutions with ReLU to obtain a feature map with a dimension of A×A×32;

Standard convolutional layer 1_2: using 64 3×3 convolutional kernels and the feature map output by standard convolutional layer 1_1_2 to make convolutions with a step length of 1, and then activating the convolutions with ReLU to obtain a feature map with a dimension of A×A×64;

Pooling layer 1: using a 2×2 kernel on the feature map output by standard convolutional layer 1_2 to make the maximum pooling with a step length of 2 to get a feature map with a dimension of A/2 × A/2 × 64;

Standard convolutional layer 2_1: using 128 3×3 convolutional kernels and the feature map output by pooling layer 1 to make convolutions with a step length of 1, and then activating the convolutions with ReLU to obtain a feature map with a dimension of A/2 × A/2 × 128;

Standard convolutional layer 2_1_1: using 64 1×1 convolutional kernels and the feature map output by standard convolutional layer 2_1 to make convolutions with a step length of 1, and then activating the convolutions with ReLU to obtain a feature map with a dimension of A/2 × A/2 × 64;

Standard convolutional layer 2_1_2: using 64 1×1 convolutional kernels and the feature map output by standard convolutional layer 2_1_1 to make convolutions with a step length of 1, and then activating the convolutions with ReLU to obtain a feature map with a dimension of A/2 × A/2 × 64;

Standard convolutional layer 2_2: using 128 3×3 convolutional kernels and the feature map output by standard convolutional layer 2_1_2 to make convolutions with a step length of 1, and then activating the convolutions with ReLU to obtain a feature map with a dimension of A/2 × A/2 × 128;

Pooling layer 2: using a 2×2 kernel on the feature map output by standard convolutional layer 2_2 to make the maximum pooling with a step length of 2 to get a feature map with a dimension of A/4 × A/4 × 128;
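
The feature-map dimensions quoted in the claim preview can be checked with a short shape trace. The sketch below transcribes only the ten layers that appear in the preview (standard convolutional layer 1_1 through pooling layer 2). The padding is not stated in the preview; "same" padding of (k - 1) / 2 is assumed for the convolutions because the claim reports unchanged spatial size after each one. The names `LAYERS` and `trace_shapes` are illustrative, not taken from the patent.

```python
# (name, kind, kernel_size, out_channels, stride), transcribed from the claim preview.
LAYERS = [
    ("conv1_1",   "conv", 3, 64,   1),
    ("conv1_1_1", "conv", 1, 32,   1),
    ("conv1_1_2", "conv", 1, 32,   1),
    ("conv1_2",   "conv", 3, 64,   1),
    ("pool1",     "pool", 2, None, 2),
    ("conv2_1",   "conv", 3, 128,  1),
    ("conv2_1_1", "conv", 1, 64,   1),
    ("conv2_1_2", "conv", 1, 64,   1),
    ("conv2_2",   "conv", 3, 128,  1),
    ("pool2",     "pool", 2, None, 2),
]


def trace_shapes(a):
    """Return [(layer_name, (height, width, channels)), ...] for an a x a input."""
    h = w = a
    c = None  # input channel count does not affect the output shapes
    shapes = []
    for name, kind, k, out_c, stride in LAYERS:
        if kind == "conv":
            pad = (k - 1) // 2  # ASSUMED "same" padding; not stated in the preview
            h = (h + 2 * pad - k) // stride + 1
            w = (w + 2 * pad - k) // stride + 1
            c = out_c
        else:
            # Max pooling without padding; the channel count is unchanged.
            h = (h - k) // stride + 1
            w = (w - k) // stride + 1
        shapes.append((name, (h, w, c)))
    return shapes


if __name__ == "__main__":
    for layer_name, shape in trace_shapes(224):
        print(f"{layer_name:10s} {shape}")
```

With an input of A = 224 (an arbitrary example size), the trace ends at 56 × 56 × 128, which matches the A/4 × A/4 × 128 stated for pooling layer 2.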

Assignees

Univ Southeast

Classifications

G06V20/58 (Primary)
Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads · CPC title
G06V10/806
of extracted features · CPC title
G06V10/82
using neural networks · CPC title
G06V10/454
Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title
G06V10/764
using classification, e.g. of video objects · CPC title
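
The CPC symbols listed above follow a fixed layout: section letter, two-digit class, subclass letter, main group, then a subgroup after the slash. The sketch below splits a symbol into those parts, which also shows why G06V20/58 maps to the Physics technology area (CPC section G). The function name `parse_cpc` and the regular expression are illustrative assumptions, and the section table is deliberately partial.

```python
import re

# Partial section table; only the entries needed for this example are included.
CPC_SECTION_TITLES = {
    "G": "Physics",
    "H": "Electricity",
}

# Section letter, 2-digit class, subclass letter, main group, "/", subgroup.
CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})$")


def parse_cpc(symbol):
    """Split a CPC symbol such as 'G06V20/58' into its components."""
    match = CPC_PATTERN.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a recognizable CPC symbol: {symbol!r}")
    section, cls, subclass, main_group, subgroup = match.groups()
    return {
        "section": section,
        "section_title": CPC_SECTION_TITLES.get(section, "(not in partial table)"),
        "class": section + cls,
        "subclass": section + cls + subclass,
        "main_group": main_group,
        "subgroup": subgroup,
    }


if __name__ == "__main__":
    for code in ["G06V20/58", "G06V10/806", "G06V10/82"]:
        print(code, parse_cpc(code))
```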

Patent family

Related publications grouped by family.

View patent family 67077920

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021303911A1 cover?: The present invention discloses a roadside image pedestrian segmentation method based on a variable-scale multi-feature fusion convolutional network. For scenes where the pedestrian scale changes significantly in the intelligent roadside terminal image, this method designs two parallel convolutional neural networks to extract the local and global features of pedestrians at different scales in t…
Who is the assignee on this patent?: Univ Southeast