Quantitative analysis method and system for attention based on line-of-sight estimation neural network
US-2023025527-A1 · Jan 26, 2023 · US
US12175704B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12175704-B2 |
| Application number | US-202217681510-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 25, 2022 |
| Priority date | Jul 26, 2021 |
| Publication date | Dec 24, 2024 |
| Grant date | Dec 24, 2024 |
Official abstract text for this publication.
Embodiments of the present disclosure provide a quantitative method and system for attention based on a line-of-sight estimation neural network, which improves the stability and training efficiency of the line-of-sight estimation neural network. A few-sample learning method is applied to training of the line-of-sight estimation neural network, which improves generalization performance of the line-of-sight estimation neural network. A nonlinear division method for small intervals of angles of the line of sight is provided, which reduces an estimation error of the line-of-sight estimation neural network. Eye opening and closing detection is added to avoid the line-of-sight estimation error caused by an eye closing state. A method for solving a landing point of the line of sight is provided, which has high environmental adaptability and can be quickly used in actual deployment.
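The nonlinear division of small angle intervals mentioned in the abstract can be illustrated with a short sketch. The source states only that intervals should be denser for small line-of-sight rotation angles and sparser for large ones; the power-law spacing, the `gamma` exponent, the ±90° range, and the function names `nonlinear_bin_edges` and `angle_to_bin` are assumptions made here for illustration, not the patent's actual division rule.

```python
import bisect
import math


def nonlinear_bin_edges(num_bins, max_angle=90.0, gamma=2.0):
    """Return num_bins + 1 edges on [-max_angle, max_angle].

    Assumption: a signed power law u -> sign(u) * |u|**gamma is applied to
    uniformly spaced u in [-1, 1]; gamma > 1 packs the edges near 0 degrees.
    """
    edges = []
    for k in range(num_bins + 1):
        u = -1.0 + 2.0 * k / num_bins  # uniform position in [-1, 1]
        edges.append(math.copysign(abs(u) ** gamma, u) * max_angle)
    return edges


def angle_to_bin(angle, edges):
    """Index of the small interval containing `angle`, clamped to a valid bin."""
    idx = bisect.bisect_right(edges, angle) - 1
    return min(max(idx, 0), len(edges) - 2)


edges = nonlinear_bin_edges(num_bins=8)
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
print(edges)   # edges cluster around 0 degrees
print(widths)  # central intervals are narrower than the outer ones
```

With this spacing, a classification error of one interval near the centre costs only a few degrees, while the coarse outer intervals cover the rarely seen extreme angles.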
Opening claim text (preview).
What is claimed is:

1. A quantitative method for attention based on a line-of-sight estimation neural network, comprising:
step 1, calculating an attention area, wherein the attention area is an entire planar area requiring attention focusing of a target; fitting the attention area into one or more surfaces capable of being represented by equations, wherein the surfaces are called attention surfaces; and recording a mapping relationship between points on an actual object plane and the surfaces, wherein in the teaching field, the attention area is an entire teaching area comprising a blackboard, a platform, or multimedia;
step 2, obtaining an image of a detection area using a binocular camera, and locating a position of each face using a face detection method, wherein each face is a target for calculation of the attention;
step 3, correcting each target image and intercepting the target image;
step 3.1, calculating positions of key points on the target face, wherein the key points comprise: corners of eyes, the tip of the nose, and corners of the mouth;
step 3.2, correcting the target using key point information, such that the corners of the eyes or the corners of the mouth of the target are on the same horizontal line; and recording a rotation angle of the target; and
step 3.3, rotating the image for each target, then intercepting the image of the target face, and scaling the intercepted image to a required size;
step 4, detecting whether eyes of the target are open;
step 5, calculating a line-of-sight direction of the target using the line-of-sight estimation neural network in response to the eyes of the target being open;
step 6, calculating a landing point of the line of sight of the target;
step 6.1, obtaining a positional relationship between an viewing angle area and the camera, and establishing a space rectangular coordinate system with a main camera in the binocular camera as a reference;
step 6.2, according to the rotation angle of the target image recorded in step 3.2, rotating the line-of-sight direction of the target obtained in step 5 back to normal;
step 6.3, according to a line-of-sight direction back to normal obtained in step 6.2, calculating yaw and pitch angles of the line of sight in the space rectangular coordinate system established in step 6.1;
step 6.4, calculating a distance between the target and the main camera according to a binocular imaging principle, and then calculating coordinates of the target in the space rectangular coordinate system; and calculating an intersection of the line of sight and a viewing angle surface according to the coordinates and the angles of the line of sight in step 6.3; and
step 6.5, mapping the intersection back to the point on the plane to obtain the landing point of the line of sight; and
step 7, sampling the landing point of the line of sight for the target at multiple preset time intervals to obtain landing point information of the target; and performing weighting operation on the landing point information of the target based on a correlation degree of attention focusing required by the target in the attention area to obtain multiple weighted attention values, wherein the multiple weighted attention values form a quantized attention value sequence.

2. The quantitative method for attention based on a line-of-sight estimation neural network according to claim 1, wherein the line-of-sight estimation neural network in step 5 comprises: a feature extraction backbone network, a fully connected layer for small interval classification of the yaw angle, a fully connected layer for small interval classification of the pitch angle, a fully connected layer for yaw angle regression, a fully connected layer for pitch angle regression, and a fully connected layer for eye opening and closing detection branches;
the feature extraction backbone network has an input of the target image, and an output of extracted features, and extracted features are respectively input to the fully connected layer for small interval classification of the yaw angle, the fully connected layer for small interval classification of the pitch angle, and the fully connected layer for eye opening and closing detection branches;
outputs of the fully connected layer for small interval classification of the yaw angle, and the fully connected layer for small interval classification of the pitch angle are correspondingly input to the fully connected layer for yaw angle regression and the fully connected layer for pitch angle regression; and
outputs of the fully connected layer for yaw angle regression, the fully connected layer for pitch angle regression, and the fully connected layer for eye opening and closing detection branches are respectively an estimated yaw angle, an estimated pitch angle, and eye opening and closing detection results.

3. The quantitative method for attention based on a line-of-sight estimation neural network according to claim 2, wherein in the line-of-sight estimation neural network, each unit in the fully connected layer for yaw angle regression represents a small interval of the angle after yaw angle division, and each unit in the fully connected layer for pitch angle regression represents a small interval of the angle after pitch angle division; and a method for the division is as follows: making the small intervals denser when the rotation angle of the line of sight is smaller, and making the small intervals sparser when the rotation angle of the line of sight is larger.

4. The quantitative method for attention based on a line-of-sight estimation neural network according to claim 2, wherein a method for training the line-of-sight estimation neural network comprises the following steps:
step 202, freezing weight parameters of the fully connected layer for yaw angle regression and the fully connected layer for pitch angle regression in the line-of-sight estimation neural network, such that these weight parameters are not updated and adjusted during network training, and the line-of-sight estimation neural network enters a classification training state;
step 204, obtaining line-of-sight direction estimation information of each sample image in a line-of-sight dataset using the line-of-sight estimation neural network in the classification training state;
step 206, calculating a classification loss part and a regression loss part according to the line-of-sight direction estimation information and line-of-sight direction annotation information, and calculating a line-of-sight estimation loss function value by weighting a loss function;
step 208, since the line-of-sight dataset is divided into a training part and a verification part with no intersection with the training part, estimating images in the verification part in the dataset using the line-of-sight estimation neural network in the classification training state, and calculating performance parameters of the line-of-sight estimation neural network using the corresponding annotation information;
step 210, determining whether the line-of-sight estimation neural network meets performance requirements of a first training stage, if not, proceeding to step 212, and if yes, proceeding to step 214;
step 212, updating and adjusting unfrozen weight parameters of the line-of-sight estimation neural network according to the line-of-sight estimation loss function value, and continuing to return to step 204 for iterative operations;
step 214, adjusting an anchor point value of the small interval of the angle, wherein values of the weight parameters of the fully connected layer for yaw angle regression and the fully connected layer for pitch angle regression are the anchor point values of the corresponding small interval of the angle, and an initialization anchor
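The geometry of steps 6.3 to 6.5 and the weighting of step 7 in claim 1 can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration rather than the patent's implementation: the axis convention (x right, y up, z pointing from the main camera into the room, with yaw = pitch = 0 looking straight back toward the camera plane), a single flat attention surface, the helper names, and the example region weights. The depth formula Z = f·B/d is the standard rectified-stereo relation, used here as one reading of the "binocular imaging principle" in step 6.4.

```python
import math


def stereo_depth(focal_px, baseline_m, disparity_px):
    """Distance to the target from the stereo relation Z = f * B / d."""
    return focal_px * baseline_m / disparity_px


def gaze_vector(yaw_deg, pitch_deg):
    """Unit line-of-sight vector under the assumed axis convention."""
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    return (math.cos(pitch) * math.sin(yaw),
            math.sin(pitch),
            -math.cos(pitch) * math.cos(yaw))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def landing_point(origin, direction, plane_point, plane_normal):
    """Intersect the gaze ray with a planar attention surface.

    Returns None when the ray is parallel to the surface or points away from it.
    """
    denom = _dot(direction, plane_normal)
    if abs(denom) < 1e-9:
        return None  # line of sight is parallel to the surface
    diff = tuple(p - o for p, o in zip(plane_point, origin))
    t = _dot(diff, plane_normal) / denom
    if t < 0:
        return None  # surface lies behind the target
    return tuple(o + t * d for o, d in zip(origin, direction))


def attention_sequence(points, weight_fn):
    """One weighted attention value per sample; missed samples score 0."""
    return [0.0 if p is None else weight_fn(p) for p in points]


# Hypothetical setup: the surface is the plane z = 0, the target sits on the
# camera axis at the depth recovered from stereo disparity.
depth = stereo_depth(focal_px=800.0, baseline_m=0.1, disparity_px=40.0)
target = (0.0, 0.0, depth)
plane_point, plane_normal = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)

samples = [
    landing_point(target, gaze_vector(0.0, 0.0), plane_point, plane_normal),
    None,  # e.g. eyes closed at this sampling instant, so no gaze estimate
    landing_point(target, gaze_vector(45.0, 0.0), plane_point, plane_normal),
]


def region_weight(point):
    """Assumed weights: 1.0 inside a 3 m wide board region, 0.2 elsewhere."""
    return 1.0 if abs(point[0]) <= 1.5 else 0.2


print(attention_sequence(samples, region_weight))
```

For a flat surface the mapping of step 6.5 is the identity, so the intersection is used directly as the landing point; a curved attention surface would need the recorded mapping from step 1 applied to the intersection first.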
Face · CPC title
Artificial neural networks [ANN] · CPC title
Training; Learning · CPC title
Validation; Performance evaluation · CPC title
Eye characteristics, e.g. of the iris · CPC title