Fakecatcher: detection of synthetic portrait videos using biological signals
US-2021209388-A1 · Jul 8, 2021 · US
US12213767B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12213767-B2 |
| Application number | US-202217696909-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 17, 2022 |
| Priority date | May 25, 2020 |
| Publication date | Feb 4, 2025 |
| Grant date | Feb 4, 2025 |
Official abstract text for this publication.
Provided are a video-based method and system for accurately estimating heart rate and facial blood volume distribution. The method mainly comprises the following steps: first, performing face detection on video frames containing a human face, and extracting a face image sequence and a sequence of face key position points in the time dimension; second, compressing the face image sequence and the face key position point sequence to obtain facial signals in the time dimension; third, estimating the facial blood volume distribution from the facial signals obtained in the second step; finally, estimating heart rate values using a deep-learning-based model and a spectrum analysis method, respectively, then fusing the two estimates with a Kalman filter to improve the accuracy of heart rate estimation.
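The final fusion step can be illustrated with a minimal sketch. The abstract does not give the state model or the noise settings, so everything here is an assumption: a scalar Kalman filter with a constant heart-rate state, hypothetical measurement variances `r_spectral` and `r_model`, a hypothetical process variance `q`, and the hypothetical function name `fuse_heart_rates`.

```python
def fuse_heart_rates(hr_spectral, hr_model, r_spectral=9.0, r_model=4.0, q=1.0):
    """Fuse two heart-rate series (bpm) with a scalar Kalman filter.

    Assumptions (not from the patent text): the heart rate is modelled as a
    slowly drifting constant, and both estimators have fixed noise variances.
    """
    if len(hr_spectral) != len(hr_model):
        raise ValueError("both series must have the same length")
    if not hr_spectral:
        return []

    # Initialise the state from the average of the first pair of estimates.
    x = 0.5 * (hr_spectral[0] + hr_model[0])
    p = 0.5 * (r_spectral + r_model)

    fused = []
    for z_s, z_m in zip(hr_spectral, hr_model):
        # Predict: the state is kept, its uncertainty grows by q.
        p += q
        # Update: apply the two independent measurements one after the other.
        for z, r in ((z_s, r_spectral), (z_m, r_model)):
            k = p / (p + r)      # Kalman gain
            x += k * (z - x)     # move the state toward the measurement
            p *= (1.0 - k)       # shrink the uncertainty
        fused.append(x)
    return fused


# Usage: the lower-variance estimator pulls the fused value more strongly.
print(fuse_heart_rates([70.0, 71.0, 69.5], [74.0, 73.0, 73.5]))
```

Applying the two measurements sequentially is equivalent to one joint update when their errors are independent, which keeps the sketch free of matrix algebra.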
Opening claim text (preview).
What is claimed is:

1. A video-based method for accurately estimating a human heart rate and facial blood volume distribution, comprising the following steps:

(1) detecting a human face region in video frame, and extracting a human face image sequence and face key position points in time dimension, extracting a global face signal and a set of face roi signals based on the face image sequence, preprocessing the signals; wherein the step (1) specifically comprises:

(1.1) using a convolution neural network model to detect the human face region and the face position key points in the video frame, and respectively generating a human face image sequence and a face key position point sequence in time dimension;

(1.2) extracting the global face signal and the set of the face roi signals, respectively, based on the face image sequence, the global face signal can be extracted as shown by Formula 3, where: face_sig is a compressed signal, PCompress() is a compression function which is used to calculate an average pixel intensity of a face image of the face image sequence, and face_seq is the face image sequence;

$$\mathrm{face\_sig} = \mathrm{PCompress}(\mathrm{face\_seq}) \tag{3}$$

segmenting the face image by roi blocks with R×R size to obtain roi block image sequences in time dimension, as shown in Formula 4, where: $\mathrm{face\_roi}_i$ represents an i-th roi block image sequence, face_roi_seq is a set of roi block image sequences, and m×n is a sum of the roi blocks;

$$\mathrm{face\_roi\_seq} = \{\mathrm{face\_roi}_1, \mathrm{face\_roi}_2, \ldots, \mathrm{face\_roi}_i, \ldots, \mathrm{face\_roi}_{m \times n}\} \tag{4}$$

compressing each roi block image sequence, as shown in Formula 5, where: face_roi_seq is the set of roi block image sequences, PCompress() is the compression function for calculating mean of pixel intensity of the image of the sequence, and face_roi_sig is the result of PCompress();

$$\mathrm{face\_roi\_sig} = \mathrm{PCompress}(\mathrm{face\_roi\_seq}) \tag{5}$$

where:

$$\mathrm{face\_roi\_sig} = \{\mathrm{face\_roi\_sig}_1, \ldots, \mathrm{face\_roi\_sig}_i, \ldots, \mathrm{face\_roi\_sig}_{m \times n}\} \tag{6}$$

in Formula 6, $\mathrm{face\_roi\_sig}_i$ is a signal compressed by the i-th roi block image sequence, and m×n is the sum of the roi blocks;

(1.3) preprocessing the global face signal and the set of the face roi signals to eliminate components outside a specified frequency range;

(2) estimate heart rate value and facial blood volume distribution based on a reference signal and the set of roi signals;

(3) estimate heart rate value based on heart rate distribution probability by using a heart rate estimation model based on Long and Short Time Memory Network (LSTM) and a residual convolution neural network model;

(4) fusing results of the heart rate value of the step (2) and the step (3) based on Kalman filtering.

2. The video-based method for accurately estimating a human heart rate and a facial blood volume distribution according to claim 1, wherein the step (2) specifically comprises:

(2.1) calculating the reference signal by linear weighting, as shown in Formula 9, where sig_ref is the reference signal, roi_sig_r is the preprocessed set of the face roi signals, and m×n is the sum of the roi blocks;

$$\mathrm{sig\_ref} = \mathrm{weight\_set} \times \mathrm{roi\_sig\_r} = \sum_{i=1}^{m \times n} w_i \times \mathrm{roi\_sig\_r}_i, \qquad \mathrm{weight\_set} = \{w_1, w_2, \ldots, w_i, \ldots, w_{m \times n}\} \tag{9}$$

$$\mathrm{roi\_sig\_r} = \mathrm{sigprocess}(\mathrm{face\_roi\_sig}) \tag{8}$$

where: weight_set is a calculated weight set; sigprocess() is a signal preprocessing function;

(2.2) calculating a frequency spectrum of the reference signal by using a lomb-scargle spectrum analysis method, the heart rate value corresponds to a extremum value of the frequency spectrum;

(2.3) estimating the facial blood volume distribution.

3. The video-based method for accurately estimating human heart rate and a facial blood volume distribution according to claim 2, wherein the step (2.3) is specifically as below: as shown in Formula 13, sig_ref_sd is the spectrum of the reference signal, and v is the blood volume distribution;

$$v = \mathrm{Volume}(\mathrm{sig\_ref\_sd}) \tag{13}$$

where, Volume() is a function for calculating
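Steps (1.2), (2.1) and (2.2) of the claims can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the function names are hypothetical, the input is assumed to be a grayscale array of shape (frames, height, width), the weight set is assumed uniform because the excerpt does not say how it is calculated, the band-limiting preprocessing of step (1.3) is replaced by restricting the frequency grid to an assumed 0.7 to 4.0 Hz band, and the periodogram is the classic Lomb-Scargle formula written out with NumPy.

```python
import numpy as np


def compress_rois(face_seq, r):
    """Mean pixel intensity per R x R block and frame (cf. Formulas 4 to 6).

    face_seq: array of shape (T, H, W); returns an array of shape (T, m*n).
    """
    t, h, w = face_seq.shape
    m, n = h // r, w // r
    # Crop to a whole number of blocks, then average inside each block.
    blocks = face_seq[:, :m * r, :n * r].reshape(t, m, r, n, r)
    return blocks.mean(axis=(2, 4)).reshape(t, m * n)


def lomb_scargle(times, signal, freqs_hz):
    """Classic Lomb-Scargle periodogram evaluated at freqs_hz (all > 0)."""
    y = np.asarray(signal, dtype=float)
    y = y - y.mean()
    w = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float)[:, None]  # (F, 1)
    tt = np.asarray(times, dtype=float)[None, :]                  # (1, T)
    # Time offset tau that decouples the sine and cosine terms.
    tau = np.arctan2(np.sin(2 * w * tt).sum(axis=1),
                     np.cos(2 * w * tt).sum(axis=1)) / (2 * w[:, 0])
    arg = w * (tt - tau[:, None])
    c, s = np.cos(arg), np.sin(arg)
    return 0.5 * ((c @ y) ** 2 / (c ** 2).sum(axis=1)
                  + (s @ y) ** 2 / (s ** 2).sum(axis=1))


def estimate_heart_rate(face_seq, fps, r=8, band_hz=(0.7, 4.0)):
    """Return the heart rate (bpm) at the spectral peak of the reference signal."""
    roi_sig = compress_rois(np.asarray(face_seq, dtype=float), r)
    # Assumption: uniform weights stand in for the patent's calculated weight set.
    weights = np.full(roi_sig.shape[1], 1.0 / roi_sig.shape[1])
    sig_ref = roi_sig @ weights                      # linear weighting (cf. Formula 9)
    times = np.arange(len(sig_ref)) / fps
    freqs = np.linspace(band_hz[0], band_hz[1], 331)  # 0.01 Hz grid for the default band
    power = lomb_scargle(times, sig_ref, freqs)
    return 60.0 * freqs[np.argmax(power)]


# Usage: a synthetic 10 s clip whose brightness oscillates at 1.2 Hz.
t = np.arange(300) / 30.0
frames = 100.0 + 2.0 * np.sin(2 * np.pi * 1.2 * t)[:, None, None] * np.ones((1, 32, 32))
print(estimate_heart_rate(frames, fps=30.0))
```

Lomb-Scargle does not require evenly spaced samples, which may be why the claims name it: video frames with dropped or jittered timestamps can be passed through `times` unchanged.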
Face · CPC title
Artificial neural networks [ANN] · CPC title
Training; Learning · CPC title
Video; Image sequence · CPC title
Biomedical image inspection · CPC title