Method for presenting face in video call, video call apparatus, and vehicle

US12192586B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12192586-B2
Application number: US-202217708876-A
Country: US
Kind code: B2
Filing date: Mar 30, 2022
Priority date: Sep 30, 2019
Publication date: Jan 7, 2025
Grant date: Jan 7, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for presenting a face in a video call includes: obtaining a key feature point of a facial expression of a user based on a face image of the user; driving a three-dimensional (3D) head image of the user by using the key feature point of the facial expression of the user, to obtain a target 3D avatar of the user, where the target 3D avatar of the user has an expression of the user; rotating the target 3D avatar based on a preset presentation angle, to obtain a target 3D avatar at the preset presentation angle; and transmitting the target 3D avatar at the preset presentation angle to a peer video call device. A user can see a 3D avatar that is of a user and that is at a preset presentation angle in real time.
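
The rotation step in the abstract can be illustrated with a small sketch. It assumes the avatar is represented as a list of 3D vertices and that the preset presentation angle is a single yaw angle about the vertical (y) axis; the abstract specifies neither, and the function name `rotate_about_y` and the sample vertices are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def rotate_about_y(points: List[Point], angle_deg: float) -> List[Point]:
    """Rotate 3D points about the vertical (y) axis by angle_deg degrees.

    Assumption: the preset presentation angle is modeled as one yaw angle.
    """
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    # Right-handed rotation about y: x' = c*x + s*z, z' = -s*x + c*z
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]


# Hypothetical avatar vertices: nose tip, one ear, the other ear.
avatar = [(0.0, 0.0, 1.0), (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

# Turn the avatar by a hypothetical preset presentation angle of 90 degrees.
rotated = rotate_about_y(avatar, 90.0)
```

A full implementation would more likely apply a general 3x3 rotation matrix covering yaw, pitch, and roll; the single-axis case is used here only to keep the arithmetic easy to follow.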

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: obtaining, based on a face image of a user in a video call process, a key feature point of a facial expression of the user, wherein the face image comprises N infrared images of the user and a color face image of the user, and wherein N is an integer greater than 0; obtaining, based on the N infrared images and the color face image, a three-dimensional (3D) head point cloud information of the user; constructing, based on the 3D head point cloud information, a 3D head image of the user; driving, using the key feature point, the 3D head image to obtain a first target 3D avatar of the user, wherein the first target 3D avatar has an expression of the user; rotating, based on a preset presentation angle, the first target 3D avatar to obtain a second target 3D avatar at the preset presentation angle; and sending, to a peer video call device, the second target 3D avatar.

2. The method of claim 1, further comprising: obtaining, based on the color face image, a face texture feature of the user; and further constructing, based on the face texture feature, the 3D head image that is a color image, wherein the 3D head point cloud information comprises first 3D head point cloud information of the user or second 3D head point cloud information of the user.

3. The method of claim 1, further comprising: inputting the color face image and the N infrared images into a feature extraction model to obtain 3D head point cloud information of the user and a face texture feature of the user; and further constructing, based on the 3D head point cloud information and the face texture feature, the 3D head image that is a color image.

4. The method of claim 3, wherein the feature extraction model comprises a 3D head feature extraction network and a texture feature extraction network, and wherein the method further comprises: further inputting the color face image and the N infrared images into the 3D head feature extraction network to obtain the 3D head point cloud information; and inputting the color face image into the texture feature extraction network to obtain the face texture feature.

5. An apparatus comprising: a non-transitory memory configured to store instructions; and a processor coupled to the non-transitory memory, wherein when executed by the processor, the instructions cause the apparatus to: obtain, based on a face image of a user in a video call process, a key feature point of a facial expression of the user, wherein the face image comprises N infrared images of the user and a color face image of the user, and wherein N is an integer greater than 0; obtain, based on the N infrared images and the color face image, a three-dimensional (3D) head point cloud information of the user; construct, based on the 3D head point cloud information, a 3D head image of the user; drive, using the key feature point, the 3D head image to obtain a first target 3D avatar of the user, wherein the first target 3D avatar has an expression of the user; rotate, based on a preset presentation angle, the first target 3D avatar to obtain a second target 3D avatar at the preset presentation angle; and send, to a peer video call device, the second target 3D avatar.

6. The apparatus of claim 5, wherein when executed by the processor, the instructions further cause the apparatus to: obtain, based on the color face image, a face texture feature of the user; and further construct, based on the face texture feature, the 3D head image that is a color image, wherein the 3D head point cloud information comprises first 3D head point cloud information of the user or second 3D head point cloud information of the user.

7. The apparatus of claim 5, wherein when executed by the processor, the instructions further cause the apparatus to: input the color face image and the N infrared images into a feature extraction model to obtain 3D head point cloud information of the user and a face texture feature of the user; and further construct, based on the 3D head point cloud information and the face texture feature, the 3D head image that is a color image.

8. The apparatus of claim 7, wherein the feature extraction model comprises a 3D head feature extraction network and a texture feature extraction network, and wherein when executed by the processor, the instructions further cause the apparatus to: input the color face image and the N infrared images into the 3D head feature extraction network to obtain the 3D head point cloud information; and input the color face image into the texture feature extraction network to obtain the face texture feature.

9. The apparatus of claim 8, wherein the 3D head feature extraction network is a neural network that uses an encoder-decoder architecture, and wherein when executed by the processor, the instructions further cause the apparatus to: obtain, based on the color face image and the N infrared images, N image pairs, wherein each of the N image pairs comprises the color face image and an infrared image of the user from the N infrared images; and input the N image pairs into the neural network to obtain the 3D head point cloud information.

10. The apparatus of claim 9, wherein when executed by the processor, the instructions further cause the apparatus to obtain, based on the N infrared images, the preset presentation angle.

11. The apparatus of claim 5, wherein the face image is a color depth image, and wherein when executed by the processor, the instructions further cause the apparatus to: obtain, based on the color depth image, 3D head point cloud information of the user and a face texture feature of the user; and construct, based on the 3D head point cloud information and the face texture feature, the 3D head image that is a color image.

12. The apparatus of claim 11, wherein when executed by the processor, the instructions further cause the apparatus to obtain, based on the color depth image, the preset presentation angle.

13. A vehicle comprising: a video call system comprising: a processor configured to: obtain, based on a face image of a user in a video call process, a key feature point of a facial expression of the user, wherein the face image comprises N infrared images of the user and a color face image of the user, and wherein N is an integer greater than 0; obtain, based on the N infrared images and the color face image, a three-dimensional (3D) head point cloud information of the user; construct, based on the 3D head point cloud information, a 3D head image of the user; drive, using the key feature point, the 3D head image to obtain a first target 3D avatar of the user, wherein the first target 3D avatar of the user has an expression of the user; rotate, based on a preset presentation angle, the first target 3D avatar to obtain a second target 3D avatar at the preset presentation angle; and transmit the second target 3D avatar; and a communications apparatus coupled to the processor and configured to: receive the second target 3D avatar; and send, to a peer video call device, the second target 3D avatar.

14. The vehicle of claim 13, wherein the processor is further configured to: obtain, based on the color face image, a face texture feature of the user; and further construct, based on the face texture feature, the 3D head image that is a color image, wherein the 3D head point cloud information comprises first 3D head point cloud information of the user or second 3D head point cloud information of the user.

15. The vehicle of claim 13, wherein the processor is further configured to: input the color face ima
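
Claim 9 describes forming N image pairs, each combining the single color face image with one of the N infrared images, before they are fed to the encoder-decoder network. A minimal sketch of that pairing step follows. The function name `make_image_pairs` is hypothetical, and images are stood in for by plain strings; the claims do not specify any data representation.

```python
from typing import List, Tuple, TypeVar

Image = TypeVar("Image")


def make_image_pairs(
    color_image: Image, infrared_images: List[Image]
) -> List[Tuple[Image, Image]]:
    """Pair the one color face image with each of the N infrared images.

    The claims require N to be an integer greater than 0, so an empty
    infrared list is rejected.
    """
    if len(infrared_images) < 1:
        raise ValueError("at least one infrared image is required (N > 0)")
    # One pair per infrared image; the color image is reused in every pair.
    return [(color_image, ir) for ir in infrared_images]


# Placeholder image handles (hypothetical); real inputs would be pixel arrays.
pairs = make_image_pairs("rgb", ["ir0", "ir1"])
```

Here N is 2, so two pairs are produced and the same color image appears in both. How the network consumes the pairs (for example, stacked as channels or processed one at a time) is not stated in the claim text shown on this page.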

Assignees

Huawei Tech Co Ltd

Inventors

Classifications

  • for generating image signals from visible and infrared light wavelengths · CPC title

  • Feature extraction; Face representation · CPC title

  • using acquisition arrangements · CPC title

  • inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions · CPC title

  • Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12192586B2 cover?
A method for presenting a face in a video call includes: obtaining a key feature point of a facial expression of a user based on a face image of the user; driving a three-dimensional (3D) head image of the user by using the key feature point of the facial expression of the user, to obtain a target 3D avatar of the user, where the target 3D avatar of the user has an expression of the user; rotat…
Who is the assignee on this patent?
Huawei Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification H04N21/4788. Mapped technology areas include Electricity.
When was this patent published?
Publication date Jan 7, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).