Who is the assignee on this patent?

National Yang Ming Chiao Tung Univ

What technology area does this patent fall under?

Primary CPC classification G06V40/176. Mapped technology areas include Physics.

When was this patent published?

Publication date Nov 28, 2023 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

System and method of image processing based emotion recognition

US11830292B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11830292-B2
Application number	US-202117467398-A
Country	US
Kind code	B2
Filing date	Sep 6, 2021
Priority date	Jun 30, 2021
Publication date	Nov 28, 2023
Grant date	Nov 28, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system of image processing based emotion recognition is disclosed. The system principally comprises a camera and a main processor. Particularly, there a plurality of function units provided in the main processor, including: face detection unit, feature processing module, feature combination unit, conversion module, facial action judging unit, and emotion recognition unit. According to the present invention, the emotion recognition unit is configured to utilize a facial emotion recognition (FER) model to evaluate or distinguish an emotion state of a user based on at least one facial action, at least one emotional dimension, and a plurality of emotional scores. As a result, the accuracy of the emotion recognition conducted by the emotion recognition unit is significantly enhanced because basis of the emotion recognition comprises basic emotions, emotional dimension(s) and the user's facial action.
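The abstract names a feature combination unit, a conversion module, and a facial action judging unit but gives no formulas. The Python sketch below is one hedged reading, not the patented implementation. It combines two feature vectors by concatenation or pointwise addition (the two options named in claim 3), then converts raw outputs into emotional scores and a set of active facial actions. The softmax, the sigmoid, the 0.5 threshold, and all function names are assumptions made for illustration; the RNN dimensionality-reduction step is omitted.

```python
import math
from typing import List, Sequence

# Seven basic emotions as listed in claim 6 of this publication.
BASIC_EMOTIONS = ["neutral", "surprise", "happiness", "angry",
                  "disgust", "fear", "sadness"]


def combine_concat(first: Sequence[float], second: Sequence[float]) -> List[float]:
    """Vector concatenation: output length is len(first) + len(second)."""
    return list(first) + list(second)


def combine_add(first: Sequence[float], second: Sequence[float]) -> List[float]:
    """Pointwise addition: both vectors must have the same length."""
    if len(first) != len(second):
        raise ValueError("pointwise addition needs equal-length vectors")
    return [a + b for a, b in zip(first, second)]


def softmax(logits: Sequence[float]) -> List[float]:
    """Assumed conversion of raw outputs to scores that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def top_emotion(scores: Sequence[float]) -> str:
    """Label of the highest emotional score."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return BASIC_EMOTIONS[best]


def active_facial_actions(action_logits: Sequence[float],
                          threshold: float = 0.5) -> List[int]:
    """Indices whose sigmoid value reaches the (assumed) threshold."""
    return [i for i, v in enumerate(action_logits) if sigmoid(v) >= threshold]
```

Concatenation keeps both feature sets intact at the cost of a longer vector, while pointwise addition keeps the length fixed but requires the two extractors to emit vectors of equal size.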

First claim

Opening claim text (preview).

What is claimed is:
1. A system of image processing based emotion recognition, comprising: a camera, being faced to a user for capturing a user image; a main processor, being coupled to the camera, and comprising one or more embedded programs including instructions for: detecting a face region from the user image; extracting a plurality of facial features and a plurality of facial expression features from the face region, and subsequently outputting the plurality of facial features by a form of a first feature vector, and outputting the plurality of facial expression features by a form of a second feature vector; combining the first feature vector and the second feature vector to a third feature vector, and then utilizing a recurrent neural network (RNN) model to conduct a dimensionality reduction of the third feature vector, thereby producing an input feature vector; converting the input feature vector to a plurality of emotional scores that are respectively corresponding to a plurality of basic emotions, and also converting the input feature vector to an emotion dimension value; converting the input feature vector to a plurality of facial action values, and then determining a facial action based on the plurality of facial action values; and utilizing a facial emotion recognition (FER) model to evaluate an emotion state of the user according to the facial action, the emotion dimension value, the plurality of emotional scores.
2. The system of claim 1, wherein the RNN model is established by using artificial neural networks selected from a group consisting of long short-term memory (LSTM) neural networks and gate recurrent unit (GRU) neural networks.
3. The system of claim 1, wherein the main processor combines the first feature vector and the second feature vector to the third feature vector after completing an operation selected from a group consisting of pointwise addition operation and vector concatenation operation.
4. The system of claim 1, wherein a pre-trained model is used by the main processor to extract the plurality of facial expression features from the face region, so as to output the second feature vector.
5. The system of claim 4, wherein the pre-trained model being selected from a group consisting of VGG16 model and VGG19 model.
6. The system of claim 1, wherein the plurality of basic emotions comprise neutral, surprise, happiness, angry, disgust, fear, and sadness.
7. The system of claim 1, wherein the main processor is further embedded with one program including instructions for: conducting a brightness quality estimation of the face region, and then outputting a first estimation value; conducting a head rotation angle estimation of the face region, and then outputting a second estimation value; calculating an image quality loss weight based on the first estimation value and the second estimation value; and adjusting an image quality of the face region by using the image quality loss weight.
8. The system of claim 1, wherein the main processor is further embedded with one program including instructions for: conducting a model training of the FER model by using a training sample set, the facial action, the emotion dimension value, the plurality of emotional scores.
9. The system of claim 8, wherein the main processor is further embedded with one program including instructions for: calculating a plurality of average emotion feature vectors based on the plurality of emotional scores that are in correspondence to the plurality of basic emotions; calculating a plurality of Euclidean distances based on the plurality of emotional scores that are in correspondence to the plurality of basic emotions; calculating an emotion feature loss weight based on the plurality of average emotion feature vectors and the plurality of Euclidean distances; and adjusting at least one of the plurality of emotional scores by using the emotion feature loss weight before starting to conduct the model training of the FER model.
10. The system of claim 1, wherein the main processor is further embedded with one program including instructions for: calculating a score loss of each of the plurality of emotional scores by using a cross entropy loss algorithm; calculating a value loss of the emotion dimension value by using a mean square error loss algorithm and a concordance correlation coefficient loss algorithm; and calculating a value loss of each of the plurality of facial action values by using a binary cross entropy loss algorithm.
11. The system of claim 1, wherein the main processor and the camera are both integrated in an electronic device selected from a group consisting of desktop computer, smart television, smartphone, tablet computer, laptop computer, physiological parameter measuring device, electronic kiosk, and video door phone system.
12. The system of claim 1, wherein the main processor is integrated in an electronic device, and being coupled to the camera; the electronic device being selected from a group consisting of smart television, smartphone, tablet computer, laptop computer, physiological parameter measuring device, electronic kiosk, and video door phone system.
13. A method of image processing based emotion recognition, comprising a plurality of steps of: (1) capturing a user image form a user by using a camera; (2) detecting a face region from the user image by using a main processor; (3) using the main processor to extract a plurality of facial features and a plurality of facial expression features from the face region, to output the plurality of facial features by a form of a first feature vector, and to output the plurality of facial expression features by a form of a second feature vector; (4) using the main processor to combine the first feature vector and the second feature vector to a third feature vector, and subsequently to utilize a recurrent neural network (RNN) model to conduct a dimensionality reduction of the third feature vector, thereby producing an input feature vector; (5) using the main processor to convert the input feature vector to a plurality of emotional scores that are respectively corresponding to a plurality of basic emotions, and to convert the input feature vector to an emotion dimension value; (6) using the main processor to convert the input feature vector to a plurality of facial action values, and then to determine a facial action based on the plurality of facial action values; and (7) using the main processor to utilize a facial emotion recognition (FER) model to evaluate an emotion state of the user according to the facial action, the emotion dimension value, the plurality of emotional scores.
14. The method of claim 13, wherein the RNN model is established by using artificial neural networks selected from a group consisting of long short-term memory (LSTM) neural networks and gate recurrent unit (GRU) neural networks.
15. The method of claim 13, wherein the main processor combines the first feature vector and the second feature vector to the third feature vector after completing an operation selected from a group consisting of pointwise addition operation and vector concatenation operation.
16. The method of claim 13, wherein a pre-trained model is used by the main processor to extract the plurality of facial expression features from the face region, so as to output the second feature vector.
17. The method of claim 16, wherein the pre-trained model being selected from a group consisting of VGG16 model and VGG19 model, and the plurality of basic emotions comprising neutral, surprise, hap
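Claim 10 names four loss functions without defining them. A minimal Python sketch of their standard textbook forms follows, assuming probabilities as inputs. The function names and the epsilon clamp are illustrative assumptions, and how the patent weights or combines the mean square error and concordance correlation coefficient terms is not stated in this preview.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[float], target_index: int,
                  eps: float = 1e-12) -> float:
    """Cross entropy for one sample: -log of the true class probability."""
    return -math.log(max(probs[target_index], eps))


def binary_cross_entropy(probs: Sequence[float], targets: Sequence[int],
                         eps: float = 1e-12) -> float:
    """Mean binary cross entropy over independent on/off outputs."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean square error between predictions and labels."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def ccc(pred: Sequence[float], target: Sequence[float]) -> float:
    """Concordance correlation coefficient, in the range [-1, 1]."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_t = sum((t - mean_t) ** 2 for t in target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target)) / n
    denom = var_p + var_t + (mean_p - mean_t) ** 2
    if denom == 0.0:
        # Both sequences are the same constant: treat as perfect agreement.
        return 1.0
    return 2.0 * cov / denom


def ccc_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Loss form: 0 for perfect agreement, up to 2 for perfect disagreement."""
    return 1.0 - ccc(pred, target)
```

The three groups map onto the three outputs in claim 1: cross entropy for the emotional scores, mean square error and the concordance term for the emotion dimension value, and binary cross entropy for the facial action values.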

Assignees

National Yang Ming Chiao Tung Univ

Inventors

Classifications

G06V40/176 · Primary
Dynamic expression · CPC title
G06F18/214
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
G06N3/045
Combinations of networks · CPC title
G06T7/0002
Inspection of images, e.g. flaw detection · CPC title
G06V40/171 · Primary
Local features and components; Facial parts (eye characteristics G06V40/18); Occluding parts, e.g. glasses; Geometrical relationships · CPC title

Patent family

Related publications grouped by family.

View patent family 83103846
