Method and system of facial expression recognition using linear relationships within landmark subsets
US-2017286759-A1 · Oct 5, 2017 · US
US10068135B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10068135-B2 |
| Application number | US-201615388372-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 22, 2016 |
| Priority date | Dec 22, 2016 |
| Publication date | Sep 4, 2018 |
| Grant date | Sep 4, 2018 |
Official abstract text for this publication.
A face detection and tracking method of a robotic device. The method includes obtaining a video frame from a camera of the robotic device; performing a face detection process on the video frame to detect one or more faces in the video frame and, after the face detection process, identifying the detected one or more faces in the video frame. The method also includes performing a vision-based tracking process to track the identified one or more faces using a combination of a feature points tracker and a correlation tracker and, after the vision-based tracking process, performing a detection and tracking fusion process and providing desired target prediction of the identified one or more faces.
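The stage ordering in the abstract can be sketched as a per-frame loop. This is a minimal illustration under stated assumptions, not the patented implementation: `detect`, `identify`, and `extract_points` are hypothetical callables standing in for the face detector, the identification step, and feature point extraction, and `TrackState` is an assumed container. The tracked/not-tracked branch follows the wording of claim 1. The correlation tracker's translation and scale estimation and the fusion step are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]   # (lx, ly, rx, ry), as in claim 3
Point = Tuple[float, float]


@dataclass
class TrackState:
    """Assumed per-face tracking state (hypothetical container)."""
    feature_points: List[Point]
    box: Box
    updates: int = 0              # number of refinements after initialization


def process_frame(
    frame: object,
    detect: Callable[[object], List[Box]],
    identify: Callable[[object, Box], str],
    extract_points: Callable[[object, Box], List[Point]],
    tracks: Dict[str, TrackState],
) -> Dict[str, TrackState]:
    """Run detect -> identify -> track once for a single video frame."""
    for box in detect(frame):                  # face detection
        label = identify(frame, box)           # identification after detection
        points = extract_points(frame, box)
        if label not in tracks:
            # Face has not been tracked yet: initialize its state
            # from the feature points of the identified face.
            tracks[label] = TrackState(feature_points=points, box=box)
        else:
            # Face is already tracked: refine its feature points.
            state = tracks[label]
            state.feature_points = points
            state.box = box
            state.updates += 1
    return tracks
```

Keying the track table by identity label is a design choice made for this sketch; the source only states that the method determines, for each identified face, whether it has been tracked.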
Opening claim text (preview).
What is claimed is:

1. A face detection and tracking method of a robotic device, comprising: obtaining a video frame from a camera of the robotic device; performing a face detection process on the video frame to detect one or more faces in the video frame; after the face detection process, identifying the detected one or more faces in the video frame; performing a vision-based tracking process to track the identified one or more faces using a combination of a feature points tracker and a correlation tracker, comprising: for each identified face, determining whether the identified face has been tracked; when it is determined that the identified face has not been tracked, initializing the feature points tracker and the correlation tracker based on a set of feature points of the identified face; when it is determined that the identified face has been tracked, refining the feature points of the identified face; and using the correlation tracker to provide both translation and scale estimation of the identified face; and after the vision-based tracking process, performing a detection and tracking fusion process and providing desired target prediction of the identified one or more faces.

2. The face detection and tracking method according to claim 1, wherein performing a face detection process further includes: applying a histogram-of-oriented-gradient (HOG) face detector on the video frame to generate a set of bounding boxes of faces in the video frame to represent the detected one or more faces in the video frame.

3. The face detection and tracking method according to claim 2, wherein the set of bounding boxes are provided as: $BB_{faces} = \{bb_{f_1}(lx_1, ly_1, rx_1, ry_1), bb_{f_2}(lx_2, ly_2, rx_2, ry_2), \ldots, bb_{f_n}(lx_n, ly_n, rx_n, ry_n)\}$, wherein $n$ is an integer representing a total number of the one or more faces $f_1, f_2, \ldots, f_n$, and each bounding box $bb_f(lx, ly, rx, ry)$ includes a location of left upper corner $(lx, ly)$ and a location of bottom-right corner $(rx, ry)$, $lx$, $ly$, $rx$, $ry$ being coordinates.

4. The face detection and tracking method according to claim 2, wherein identifying the detected one or more faces further includes: extracting a facial feature vector of each of the detected one or more faces; comparing the extracted facial feature vector with a database stored with labeled facial feature vectors each with a person identity label; and labeling each face with the personal label of a facial feature vector in the database with a shortest distance to the extracted facial feature vector.

5. The face detection and tracking method according to claim 1, wherein performing a detection and tracking fusion process further includes: based on the set of bounding boxes, the set of feature points of the identified face, and the translation and scale estimation of the identified face, providing an estimation vector of the identified face including both location information and velocity vector of the identified face.

6. The face detection and tracking method according to claim 5, wherein providing an estimation vector further includes: based on the set of bounding boxes, the set of feature points of the identified face, and the translation and scale estimation of the identified face, building a feature map and applying an attention mask, a Conventional Neural Network and to a Clock-Work Recurrent Neuron Network to generate the estimation vector of the identified face.

7. A non-transitory computer-readable medium having computer program for, when being executed by a processor, performing a face detection and tracking method on a robotic device, the method comprising: obtaining a video frame from a camera of the robotic device; performing a face detection process on the video frame to detect one or more faces in the video frame; after the face detection process, identifying the detected one or more faces in the video frame; performing a vision-based tracking process to track the identified one or more faces using a combination of a feature points tracker and a correlation tracker, comprising: for each identified face, determining whether the identified face has been tracked; when it is determined that the identified face has not been tracked, initializing the feature points tracker and the correlation tracker based on a set of feature points of the identified face; when it is determined that the identified face has been tracked, refining the feature points of the identified face; and using the correlation tracker to provide both translation and scale estimation of the identified face; and after the vision-based tracking process, performing a detection and tracking fusion process and providing desired target prediction of the identified one or more faces.

8. The non-transitory computer-readable medium according to claim 7, wherein performing a face detection process further includes: applying a histogram-of-oriented-gradient (HOG) face detector on the video frame to generate a set of bounding boxes of faces in the video frame to represent the detected one or more faces in the video frame.

9. The non-transitory computer-readable medium according to claim 8, wherein the set of bounding boxes are provided as: $BB_{faces} = \{bb_{f_1}(lx_1, ly_1, rx_1, ry_1), bb_{f_2}(lx_2, ly_2, rx_2, ry_2), \ldots, bb_{f_n}(lx_n, ly_n, rx_n, ry_n)\}$, wherein $n$ is an integer representing a total number of the one or more faces $f_1, f_2, \ldots, f_n$, and each bounding box $bb_f(lx, ly, rx, ry)$ includes a location of left upper corner $(lx, ly)$ and a location of bottom-right corner $(rx, ry)$, $lx$, $ly$, $rx$, $ry$ being coordinates.

10. The non-transitory computer-readable medium according to claim 8, wherein identifying the detected one or more faces further includes: extracting a facial feature vector of each of the detected one or more faces; comparing the extracted facial feature vector with a database stored with labeled facial feature vectors each with a person identity label; and labeling each face with the personal label of a facial feature vector in the database with a shortest distance to the extracted facial feature vector.

11. The non-transitory computer-readable medium according to claim 7, wherein performing a detection and tracking fusion process further includes: based on the set of bounding boxes, the set of feature points of the identified face, and the translation and scale estimation of the identified face, providing an estimation vector of the identified face including both location information and velocity vector of the identified face.

12. The non-transitory computer-readable medium according to claim 11, wherein providing an estimation vector further includes: based on the set of bounding boxes, the set of feature points of the identified face, and the translation and scale estimation of the identified face, building a feature map and applying an attention mask, a Conventional Neural Network and to a Clock-Work Recurrent Neuron Network to generate the estimation vector of the identified face.

13. A face detection and tracking system of a robotic device, comprising: a memory; and a processor coupled to the memory, wherein the processor is configured to: obtain a video frame from a camera of the robotic device and to perform a face detection process on the video frame to detect one or more faces in the video frame; after the face detection process, identify the detected o
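
The bounding-box set of claims 3 and 9 and the shortest-distance labeling of claims 4 and 10 can be illustrated with a short sketch. The claims do not name a distance metric or a storage format, so the Euclidean distance, the `BoundingBox` class, and the `nearest_label` function below are assumptions made for illustration only.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class BoundingBox:
    """One face box: upper-left corner (lx, ly), bottom-right corner (rx, ry)."""
    lx: int
    ly: int
    rx: int
    ry: int

    @property
    def width(self) -> int:
        return self.rx - self.lx

    @property
    def height(self) -> int:
        return self.ry - self.ly


def nearest_label(
    feature: Sequence[float],
    database: List[Tuple[str, Sequence[float]]],
) -> Optional[str]:
    """Return the identity label of the stored vector closest to `feature`.

    `database` holds (label, vector) pairs. Euclidean distance is an
    assumption; the claims only require "a shortest distance".
    Returns None when the database is empty.
    """
    best_label: Optional[str] = None
    best_dist = math.inf
    for label, vector in database:
        dist = math.dist(feature, vector)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# The set BB_faces is simply a list of boxes, one per detected face.
bb_faces = [BoundingBox(10, 20, 60, 90), BoundingBox(100, 40, 150, 110)]
labeled_vectors = [("alice", [0.0, 0.0]), ("bob", [1.0, 1.0])]
label = nearest_label([0.9, 0.8], labeled_vectors)
```

A linear scan is enough for a small identity database; a larger one would typically use an index structure, which the source does not discuss.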
using neural networks · CPC title
using classification, e.g. of video objects · CPC title
in video content (extracting overlay text G06V20/62; video retrieval G06F16/70; processing of video elementary streams in video servers H04N21/234; processing of video elementary streams in video clients H04N21/44) · CPC title
Smoothing the distance, e.g. radial basis function networks [RBFN] · CPC title
Bounding box · CPC title