Identity authentication method and apparatus

US10789343B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10789343-B2
Application number  US-201816192401-A
Country             US
Kind code           B2
Filing date         Nov 15, 2018
Priority date       May 19, 2016
Publication date    Sep 29, 2020
Grant date          Sep 29, 2020

Abstract

Official abstract text for this publication.

An audio/video stream generated by a target object to be authenticated is obtained. The target object is associated with a user. A determination is made whether a lip reading component and voice component in the audio/video stream are consistent. In response to determining that the lip reading component and voice component are consistent, voice recognition is performed on an audio stream in the audio/video stream to obtain voice content. The voice content is used as an object identifier of the target object. A model physiological feature corresponding to the object identifier is obtained from object registration information. Physiological recognition is performed on the audio/video stream to obtain a physiological feature of the target object. The physiological feature of the target object is compared with the model physiological feature to obtain a comparison result. If the comparison result satisfies an authentication condition, the target object is authenticated.
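The flow described in the abstract can be sketched as a short pipeline. This is a minimal illustration only, not the patented implementation: the `AVStream` container, the callables `lips_match_voice`, `recognize_speech`, and `extract_feature`, the dictionary used as registration information, the cosine similarity measure, and the `0.8` threshold are all assumptions introduced here for the example.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Sequence

Feature = Sequence[float]


@dataclass
class AVStream:
    """Hypothetical container; the abstract does not define a data format."""
    audio: bytes
    video: bytes


def cosine_similarity(a: Feature, b: Feature) -> float:
    """Assumed similarity measure; the source only speaks of a comparison result."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def authenticate(
    stream: AVStream,
    lips_match_voice: Callable[[AVStream], bool],
    recognize_speech: Callable[[bytes], str],
    extract_feature: Callable[[AVStream], Feature],
    registry: Mapping[str, Feature],
    threshold: float = 0.8,  # assumed value, not taken from the patent
) -> bool:
    # Step 1: stop early if the lip reading and voice components disagree.
    if not lips_match_voice(stream):
        return False
    # Step 2: the recognized voice content serves as the object identifier.
    object_id = recognize_speech(stream.audio)
    # Step 3: fetch the model physiological feature registered for that identifier.
    model_feature = registry.get(object_id)
    if model_feature is None:
        return False
    # Step 4: extract the physiological feature from the stream itself.
    observed_feature = extract_feature(stream)
    # Step 5: compare, then apply the authentication condition (here a threshold).
    return cosine_similarity(observed_feature, model_feature) > threshold
```

Passing the recognizers in as callables keeps the sketch independent of any particular speech or face recognition library; a real system would substitute its own components.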

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: obtaining an audio/video stream of a user that is to be authenticated; determining that the user's voice in the audio/video stream matches the user's lips in the audio/video stream; in response to determining that the user's voice in the audio/video stream matches the user's lips in the audio/video stream, determining, based on performing automated speech recognition on the audio/video stream, a user identifier for the user; determining, based on performing automated physiological feature extraction on the audio/video stream, a physiological feature of the user; obtaining, from stored, object registration information, a stored, model physiological feature corresponding to the determined, user identifier; generating a comparison result based on comparing the physiological feature of the target object that was determined based on performing automated physiological feature extraction on the audio/video stream with the stored, model physiological feature; and in response to determining that the comparison result satisfies an authentication condition, determining that the user has been authenticated.

2. The method of claim 1, wherein the physiological feature of the user comprises a facial feature of the user.

3. The method of claim 1, wherein the comparison result comprises a similarity score.

4. The method of claim 1, wherein determining that the comparison result satisfies an authentication condition comprises determining that a similarity score exceeds a threshold score.

5. The method of claim 1, wherein determining that the user's voice in the audio/video stream matches the user's lips comprises: determining a lip reading syllable in a video image of the audio/video stream at a particular point in time; determining a voice syllable in audio of the audio/video stream at the particular point in time; and determining that the lip reading syllable and the voice syllable match.

6. The method of claim 1, comprising storing the model physiological feature in the object registration information.

7. The method of claim 1, comprising receiving a request from the user to authenticate, wherein the audio/video stream of the user is obtained in response to the request.

8. A non-transitory, computer-readable medium storing one or more instructions executable by a computer system to perform operations comprising: obtaining an audio/video stream of a user that is to be authenticated; determining that the user's voice in the audio/video stream matches the user's lips in the audio/video stream; in response to determining that the user's voice in the audio/video stream matches the user's lips in the audio/video stream, determining, based on performing automated speech recognition on the audio/video stream, a user identifier for the user; determining, based on performing automated physiological feature extraction on the audio/video stream, a physiological feature of the user; obtaining, from stored, object registration information, a stored, model physiological feature corresponding to the determined, user identifier; generating a comparison result based on comparing the physiological feature of the target object that was determined based on performing automated physiological feature extraction on the audio/video stream with the stored, model physiological feature; and in response to determining that the comparison result satisfies an authentication condition, determining that the user has been authenticated.

9. The medium of claim 8, wherein the physiological feature of the user comprises a facial feature of the user.

10. The medium of claim 8, wherein the comparison result comprises a similarity score.

11. The medium of claim 8, wherein determining that the comparison result satisfies an authentication condition comprises determining that a similarity score exceeds a threshold score.

12. The medium of claim 8, wherein determining that the user's voice in the audio/video stream matches the user's lips comprises: determining a lip reading syllable in a video image of the audio/video stream at a particular point in time; determining a voice syllable in audio of the audio/video stream at the particular point in time; and determining that the lip reading syllable and the voice syllable match.

13. The medium of claim 8, wherein the operations comprise storing the model physiological feature in the object registration information.

14. The medium of claim 8, wherein the operations comprise receiving a request from the user to authenticate, wherein the audio/video stream of the user is obtained in response to the request.

15. A computer-implemented system, comprising: one or more computers; and one or more computer memory devices interoperably coupled with the one or more computers and having tangible, non-transitory, machine-readable media storing one or more instructions that, when executed by the one or more computers, perform one or more operations comprising: obtaining an audio/video stream of a user that is to be authenticated; determining that the user's voice in the audio/video stream matches the user's lips in the audio/video stream; in response to determining that the user's voice in the audio/video stream matches the user's lips in the audio/video stream, determining, based on performing automated speech recognition on the audio/video stream, a user identifier for the user; determining, based on performing automated physiological feature extraction on the audio/video stream, a physiological feature of the user; obtaining, from stored, object registration information, a stored, model physiological feature corresponding to the determined, user identifier; generating a comparison result based on comparing the physiological feature of the target object that was determined based on performing automated physiological feature extraction on the audio/video stream with the stored, model physiological feature; and in response to determining that the comparison result satisfies an authentication condition, determining that the user has been authenticated.

16. The system of claim 15, wherein the physiological feature of the user comprises a facial feature of the user.

17. The system of claim 15, wherein the comparison result comprises a similarity score.

18. The system of claim 15, wherein determining that the comparison result satisfies an authentication condition comprises determining that a similarity score exceeds a threshold score.

19. The system of claim 15, wherein determining that the user's voice in the audio/video stream matches the user's lips comprises: determining a lip reading syllable in a video image of the audio/video stream at a particular point in time; determining a voice syllable in audio of the audio/video stream at the particular point in time; and determining that the lip reading syllable and the voice syllable match.

20. The system of claim 15, wherein the operations comprise storing the model physiological feature in the object registration information.
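The lip and voice check in claims 5, 12, and 19 compares a lip reading syllable with a voice syllable at the same point in time. A rough sketch of that comparison follows. It assumes the syllables have already been extracted upstream and are keyed by a timestamp in milliseconds. The function name `syllables_consistent`, the dictionary inputs, and the `min_match_ratio` parameter are hypothetical; the claims describe a match at a particular point in time and do not say how several time points would be combined.

```python
from typing import Mapping


def syllables_consistent(
    lip_syllables: Mapping[int, str],
    voice_syllables: Mapping[int, str],
    min_match_ratio: float = 1.0,  # assumed policy: every shared time point must match
) -> bool:
    """Compare lip reading and voice syllables at shared timestamps (milliseconds)."""
    # Only time points present in both the video and the audio can be compared.
    shared_times = sorted(set(lip_syllables) & set(voice_syllables))
    if not shared_times:
        # Nothing to compare, so consistency cannot be established.
        return False
    matches = sum(
        1 for t in shared_times if lip_syllables[t] == voice_syllables[t]
    )
    return matches / len(shared_times) >= min_match_ratio


# Example with made-up syllables: the second time point disagrees.
lip = {0: "ni", 400: "hao"}
voice = {0: "ni", 400: "ma"}
print(syllables_consistent(lip, lip))                         # True
print(syllables_consistent(lip, voice))                       # False
print(syllables_consistent(lip, voice, min_match_ratio=0.5))  # True
```

Lowering `min_match_ratio` would tolerate occasional recognition errors, at the cost of a weaker check against a replayed or dubbed recording; which trade-off the patent intends is not stated in the claims.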

Assignees

Alibaba Group Holding Ltd

Inventors

Classifications

  • Network security protocols

  • Human faces, e.g. facial parts, sketches or expressions

  • Movements or behaviour, e.g. gesture recognition (recognition of facial expressions G06V40/16)

  • Classification, e.g. identification

  • Speech classification or search

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Alibaba Group Holding Ltd
What technology area does this patent fall under?
Primary CPC classification H04L9/3231. Mapped technology areas include Electricity.
When was this patent published?
Publication date Sep 29, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).