Identity authentication method and apparatus

US10789343B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10789343-B2
Application number	US-201816192401-A
Country	US
Kind code	B2
Filing date	Nov 15, 2018
Priority date	May 19, 2016
Publication date	Sep 29, 2020
Grant date	Sep 29, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An audio/video stream generated by a target object to be authenticated is obtained. The target object is associated with a user. A determination is made whether a lip reading component and voice component in the audio/video stream are consistent. In response to determining that the lip reading component and voice component are consistent, voice recognition is performed on an audio stream in the audio/video stream to obtain voice content. The voice content is used as an object identifier of the target object. A model physiological feature corresponding to the object identifier is obtained from object registration information. Physiological recognition is performed on the audio/video stream to obtain a physiological feature of the target object. The physiological feature of the target object is compared with the model physiological feature to obtain a comparison result. If the comparison result satisfies an authentication condition, the target object is authenticated.
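The flow in the abstract can be sketched roughly as follows. This is only an illustrative toy, not the patented implementation: the `Stream` fields, the `registry` dictionary, the cosine-similarity measure, and the 0.9 threshold are all assumptions made for the example. The abstract only says that a comparison result must satisfy an authentication condition.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List, Optional


@dataclass
class Stream:
    """Toy stand-in for an audio/video stream (all fields are hypothetical)."""
    voice_syllables: List[str]  # what speech recognition would produce
    lip_syllables: List[str]    # what lip reading would produce
    face_vector: List[float]    # physiological feature extracted from the video


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Similarity measure chosen for illustration; the source does not name one."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def authenticate(stream: Stream,
                 registry: Dict[str, List[float]],
                 threshold: float = 0.9) -> Optional[str]:
    """Return the authenticated identifier, or None if any step fails."""
    # 1. Lip reading and voice must be consistent.
    if stream.voice_syllables != stream.lip_syllables:
        return None
    # 2. The recognized voice content doubles as the object identifier.
    object_id = "".join(stream.voice_syllables)
    # 3. Look up the model feature stored at registration time.
    model_feature = registry.get(object_id)
    if model_feature is None:
        return None
    # 4. Compare the live feature with the model feature.
    score = cosine_similarity(stream.face_vector, model_feature)
    # 5. Authentication condition (assumed here: score above a threshold).
    return object_id if score > threshold else None
```

Because the spoken content is itself the identifier, the user does not type an account name separately; a single stream supplies both the claimed identity and the evidence used to check it.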

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method, comprising: obtaining an audio/video stream of a user that is to be authenticated; determining that the user's voice in the audio/video stream matches the user's lips in the audio/video stream; in response to determining that the user's voice in the audio/video stream matches the user's lips in the audio/video stream, determining, based on performing automated speech recognition on the audio/video stream, a user identifier for the user; determining, based on performing automated physiological feature extraction on the audio/video stream, a physiological feature of the user; obtaining, from stored, object registration information, a stored, model physiological feature corresponding to the determined, user identifier; generating a comparison result based on comparing the physiological feature of the target object that was determined based on performing automated physiological feature extraction on the audio/video stream with the stored, model physiological feature; and in response to determining that the comparison result satisfies an authentication condition, determining that the user has been authenticated.
2. The method of claim 1, wherein the physiological feature of the user comprises a facial feature of the user.
3. The method of claim 1, wherein the comparison result comprises a similarity score.
4. The method of claim 1, wherein determining that the comparison result satisfies an authentication condition comprises determining that a similarity score exceeds a threshold score.
5. The method of claim 1, wherein determining that the user's voice in the audio/video stream matches the user's lips comprises: determining a lip reading syllable in a video image of the audio/video stream at a particular point in time; determining a voice syllable in audio of the audio/video stream at the particular point in time; and determining that the lip reading syllable and the voice syllable match.
6. The method of claim 1, comprising storing the model physiological feature in the object registration information.
7. The method of claim 1, comprising receiving a request from the user to authenticate, wherein the audio/video stream of the user is obtained in response to the request.
8. A non-transitory, computer-readable medium storing one or more instructions executable by a computer system to perform operations comprising: obtaining an audio/video stream of a user that is to be authenticated; determining that the user's voice in the audio/video stream matches the user's lips in the audio/video stream; in response to determining that the user's voice in the audio/video stream matches the user's lips in the audio/video stream, determining, based on performing automated speech recognition on the audio/video stream, a user identifier for the user; determining, based on performing automated physiological feature extraction on the audio/video stream, a physiological feature of the user; obtaining, from stored, object registration information, a stored, model physiological feature corresponding to the determined, user identifier; generating a comparison result based on comparing the physiological feature of the target object that was determined based on performing automated physiological feature extraction on the audio/video stream with the stored, model physiological feature; and in response to determining that the comparison result satisfies an authentication condition, determining that the user has been authenticated.
9. The medium of claim 8, wherein the physiological feature of the user comprises a facial feature of the user.
10. The medium of claim 8, wherein the comparison result comprises a similarity score.
11. The medium of claim 8, wherein determining that the comparison result satisfies an authentication condition comprises determining that a similarity score exceeds a threshold score.
12. The medium of claim 8, wherein determining that the user's voice in the audio/video stream matches the user's lips comprises: determining a lip reading syllable in a video image of the audio/video stream at a particular point in time; determining a voice syllable in audio of the audio/video stream at the particular point in time; and determining that the lip reading syllable and the voice syllable match.
13. The medium of claim 8, wherein the operations comprise storing the model physiological feature in the object registration information.
14. The medium of claim 8, wherein the operations comprise receiving a request from the user to authenticate, wherein the audio/video stream of the user is obtained in response to the request.
15. A computer-implemented system, comprising: one or more computers; and one or more computer memory devices interoperably coupled with the one or more computers and having tangible, non-transitory, machine-readable media storing one or more instructions that, when executed by the one or more computers, perform one or more operations comprising: obtaining an audio/video stream of a user that is to be authenticated; determining that the user's voice in the audio/video stream matches the user's lips in the audio/video stream; in response to determining that the user's voice in the audio/video stream matches the user's lips in the audio/video stream, determining, based on performing automated speech recognition on the audio/video stream, a user identifier for the user; determining, based on performing automated physiological feature extraction on the audio/video stream, a physiological feature of the user; obtaining, from stored, object registration information, a stored, model physiological feature corresponding to the determined, user identifier; generating a comparison result based on comparing the physiological feature of the target object that was determined based on performing automated physiological feature extraction on the audio/video stream with the stored, model physiological feature; and in response to determining that the comparison result satisfies an authentication condition, determining that the user has been authenticated.
16. The system of claim 15, wherein the physiological feature of the user comprises a facial feature of the user.
17. The system of claim 15, wherein the comparison result comprises a similarity score.
18. The system of claim 15, wherein determining that the comparison result satisfies an authentication condition comprises determining that a similarity score exceeds a threshold score.
19. The system of claim 15, wherein determining that the user's voice in the audio/video stream matches the user's lips comprises: determining a lip reading syllable in a video image of the audio/video stream at a particular point in time; determining a voice syllable in audio of the audio/video stream at the particular point in time; and determining that the lip reading syllable and the voice syllable match.
20. The system of claim 15, wherein the operations comprise storing the model physiological feature in the object registration information.
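
The lip/voice consistency check described in claims 5, 12, and 19 compares a lip reading syllable and a voice syllable taken at the same point in time. A minimal sketch of that idea is shown below. It assumes both recognizers have already produced syllables keyed by timestamp; the function name, the dictionary representation, and the `min_match_ratio` tolerance are hypothetical and do not come from the claims.

```python
from typing import Dict


def lips_match_voice(lip_by_time: Dict[float, str],
                     voice_by_time: Dict[float, str],
                     min_match_ratio: float = 1.0) -> bool:
    """Compare lip and voice syllables at the time points both streams cover.

    min_match_ratio is an assumed tolerance: 1.0 requires every shared
    time point to agree, lower values allow some recognition errors.
    """
    shared_times = sorted(set(lip_by_time) & set(voice_by_time))
    if not shared_times:
        # Nothing to compare, so consistency cannot be established.
        return False
    matches = sum(1 for t in shared_times
                  if lip_by_time[t] == voice_by_time[t])
    return matches / len(shared_times) >= min_match_ratio
```

Returning `False` when no time points overlap is a conservative choice for this sketch: a stream with no comparable syllables gives no evidence that the voice and the lips belong together.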

Assignees

Alibaba Group Holding Ltd

Inventors

Classifications

Code	CPC title
H04L9/40	Network security protocols
G06V40/16	Human faces, e.g. facial parts, sketches or expressions
G06V40/20	Movements or behaviour, e.g. gesture recognition (recognition of facial expressions G06V40/16)
G06V40/172	Classification, e.g. identification
G10L15/08	Speech classification or search

Patent family

Related publications grouped by family.

View patent family 60324817

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10789343B2 cover?: An audio/video stream generated by a target object to be authenticated is obtained. The target object is associated with a user. A determination is made whether a lip reading component and voice component in the audio/video stream are consistent. In response to determining that the lip reading component and voice component are consistent, voice recognition is performed on an audio stream in the…
Who is the assignee on this patent?: Alibaba Group Holding Ltd
What technology area does this patent fall under?: Primary CPC classification H04L9/3231. Mapped technology areas include Electricity.
When was this patent published?: Publication date Sep 29, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).