Method and apparatus for providing interactive avatar services

US12450809B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12450809-B2
Application number	US-202318098428-A
Country	US
Kind code	B2
Filing date	Jan 18, 2023
Priority date	Jan 18, 2022
Publication date	Oct 21, 2025
Grant date	Oct 21, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of providing an avatar service includes obtaining a user-uttered voice and a spatial information of a user-utterance space, transmitting the user-uttered voice and the spatial information to a server, receiving, from the server, a first avatar voice answer and an avatar facial expression sequence corresponding to the first avatar voice, which are determined based on the user-uttered voice and the spatial information, determining first avatar facial expression data, based on the first avatar voice answer and the avatar facial expression sequence, identifying a certain event during reproduction of a first avatar animation created based on the first avatar voice answer and the first avatar facial expression data, determining second avatar facial expression data or a second avatar voice answer, based on the certain event, and reproducing a second avatar animation created based on the second avatar facial expression data or the second avatar voice answer.
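The control flow in the abstract (play a first animation, notice an event partway through, switch to a second animation) can be sketched as below. This is only an illustration under stated assumptions: `Animation`, `play_with_interrupt`, `make_second`, and `event_at` are hypothetical names that do not come from the patent, and frames are plain strings standing in for rendered avatar frames.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Animation:
    """Hypothetical stand-in for an avatar animation: a name plus ordered frames."""
    name: str
    frames: List[str]


def play_with_interrupt(
    first: Animation,
    make_second: Callable[[str], Animation],
    event_at: Callable[[int], Optional[str]],
) -> List[str]:
    """Return the frames actually shown.

    Before showing frame i of `first`, `event_at(i)` is polled. If it returns
    an event name, playback of `first` stops and the animation built by
    `make_second(event)` is played in full instead.
    """
    shown: List[str] = []
    for i, frame in enumerate(first.frames):
        event = event_at(i)
        if event is not None:
            # An event was identified during reproduction of the first animation.
            second = make_second(event)
            shown.extend(second.frames)
            return shown
        shown.append(frame)
    return shown


if __name__ == "__main__":
    first = Animation("first", ["a0", "a1", "a2"])
    shown = play_with_interrupt(
        first,
        make_second=lambda ev: Animation("second", [ev + "-0", ev + "-1"]),
        event_at=lambda i: "observe" if i == 1 else None,
    )
    print(shown)
```

How the second animation is derived from the event is left to the hypothetical `make_second` callback; the abstract only says it is based on second facial expression data or a second voice answer.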

First claim

Opening claim text (preview).

What is claimed is:

1. A method, performed by an electronic device, of providing an avatar service, comprising: obtaining a user-uttered voice and a spatial information of a user-utterance space where a user utters the user-uttered voice, the spatial information comprising spatial characteristics of the user-utterance space based on at least one of images captured by a camera and a sound obtained through a microphone, wherein the spatial information comprises at least one of: whether the user-utterance space is public or private; or whether the user-utterance space is quiet or noisy; transmitting the user-uttered voice and the spatial information to a server; receiving, from the server, a first avatar voice answer and an avatar facial expression sequence corresponding to the first avatar voice answer, which are determined based on the user-uttered voice and an avatar response mode, wherein the avatar response mode is determined based on the spatial information; determining first avatar facial expression data, based on the first avatar voice answer and the avatar facial expression sequence; identifying a certain event during reproduction of a first avatar animation created based on the first avatar voice answer and the first avatar facial expression data; determining second avatar facial expression data or a second avatar voice answer, based on the certain event; and stopping reproduction of the first avatar animation, and reproducing a second avatar animation created based on the second avatar facial expression data or the second avatar voice answer.

2. The method of claim 1, wherein the spatial information comprises information about whether the user-utterance space is a public place and a level of noise in the user-utterance space.

3. The method of claim 1, wherein the first avatar facial expression data and the second avatar facial expression data each comprise a set of coefficients for each of a plurality of reference three-dimensional (3D) meshes for modeling a facial expression of the first avatar animation and the second avatar animation, respectively.

4. The method of claim 1, wherein: the second avatar facial expression data comprises lip sync data, and the lip sync data is obtained using an artificial intelligence (AI) model.

5. The method of claim 4, wherein the AI model is trained using data normalized based on an available range according to the lip sync data.

6. The method of claim 1, wherein the certain event comprises at least one of an utterance mode change event, an observation mode event, or a refresh mode event.

7. The method of claim 6, wherein, based on the certain event being the refresh mode event, the stopping reproduction of the first avatar animation, and the reproducing of the second avatar animation comprises: stopping the reproduction of the first avatar animation at a point in time; reproducing a preset refresh animation; and reproducing the first avatar animation from the point in time at which the first avatar animation is stopped.

8. The method of claim 6, wherein, based on the certain event being the utterance mode change event, the determining of the second avatar facial expression data or the second avatar voice answer comprises: determining the second avatar facial expression data by modifying the first avatar facial expression data, based on an utterance mode obtained as a result of the certain event; and modifying the first avatar voice answer, based on the utterance mode.

9. The method of claim 6, wherein, based on the certain event being the observation mode event, the determining of the second avatar facial expression data or the second avatar voice answer comprises determining the second avatar facial expression data by changing a face direction or eye direction of the first avatar animation.

10. A method, performed by a server, of providing an avatar service through an electronic device, the method comprising: receiving, from the electronic device, a user-uttered voice and spatial information of a user-utterance space where a user utters the user-uttered voice, the spatial information comprising spatial characteristics of the user-utterance space based on at least one of images captured by a camera and a sound obtained through a microphone, wherein the spatial information comprises at least one of: whether the user-utterance space is public or private; or whether the user-utterance space is quiet or noisy; determining an avatar response mode for the user-uttered voice, based on the spatial information; generating a first avatar voice answer for an avatar to respond to the user-uttered voice and an avatar facial expression sequence corresponding to the first avatar voice answer, based on the user-uttered voice and the avatar response mode; and transmitting the first avatar voice answer and the avatar facial expression sequence for generating a first avatar animation to the electronic device.

11. An electronic device for providing an avatar service, comprising: a communication interface; a storage storing at least one instruction; and at least one processor configured to execute the at least one instruction stored in the storage, wherein the at least one processor is configured to execute the at least one instruction to: obtain a user-uttered voice and a spatial information of a user-utterance space where a user utters the user-uttered voice, the spatial information comprising spatial characteristics of the user-utterance space based on at least one of images captured by a camera and a sound obtained through a microphone, wherein the spatial information comprises at least one of: whether the user-utterance space is public or private; or whether the user-utterance space is quiet or noisy; transmit the user-uttered voice and the spatial information to a server; receive, from the server through the communication interface, a first avatar voice answer and an avatar facial expression sequence corresponding to the first avatar voice answer, which are determined based on the user-uttered voice and an avatar response mode, wherein the avatar response mode is determined based on the spatial information; determine first avatar facial expression data, based on the first avatar voice answer and the avatar facial expression sequence; identify a certain event during reproduction of a first avatar animation created based on the first avatar voice answer and the first avatar facial expression data; determine second avatar facial expression data or a second avatar voice answer, based on the certain event; and reproduce a second avatar animation created based on the second avatar facial expression data or the second avatar voice answer.

12. The electronic device of claim 11, wherein the spatial information comprises information about whether the user-utterance space is a public place and a level of noise in the user-utterance space.

13. The electronic device of claim 11, wherein the first avatar facial expression data and the second avatar facial expression data each comprise a set of coefficients for each of a plurality of reference three-dimensional (3D) meshes for modeling a facial expression of the first avatar animation and the second avatar animation, respectively.

14. The electronic device of claim 11, wherein: the second avatar facial expression data comprises lip sync data, and the lip sync data is obtained using an artificial intelligence (AI) model.

15. The electronic device of claim 14, wherein the AI model is trained using data normalized based on an available range according to the lip sync data.

16. The electro
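Claims 3 and 13 describe facial expression data as a set of coefficients over reference 3D meshes. One common way to turn such coefficients into a face is a blendshape-style weighted combination; the sketch below assumes that reading. The claim preview on this page does not give a formula, so the delta form used here (neutral mesh plus weighted offsets toward each reference mesh) is an assumption, as are the names `blend_meshes`, `neutral`, `references`, and `coefficients`.

```python
from typing import List, Sequence, Tuple

Vertex = Tuple[float, float, float]
Mesh = List[Vertex]


def blend_meshes(
    neutral: Mesh,
    references: Sequence[Mesh],
    coefficients: Sequence[float],
) -> Mesh:
    """Combine reference meshes into one expression mesh.

    Assumed formula (not taken from the patent text):
        v = neutral_v + sum_i coefficients[i] * (references[i]_v - neutral_v)
    All meshes must share the same vertex count and ordering.
    """
    if len(references) != len(coefficients):
        raise ValueError("need exactly one coefficient per reference mesh")
    for ref in references:
        if len(ref) != len(neutral):
            raise ValueError("reference mesh vertex count differs from neutral mesh")

    blended: Mesh = []
    for idx, (bx, by, bz) in enumerate(neutral):
        x, y, z = bx, by, bz
        for ref, weight in zip(references, coefficients):
            rx, ry, rz = ref[idx]
            # Move toward the reference shape in proportion to its coefficient.
            x += weight * (rx - bx)
            y += weight * (ry - by)
            z += weight * (rz - bz)
        blended.append((x, y, z))
    return blended


if __name__ == "__main__":
    neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    smile = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]     # hypothetical reference mesh
    jaw_open = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]  # hypothetical reference mesh
    print(blend_meshes(neutral, [smile, jaw_open], [0.5, 0.25]))
```

Under this reading, an avatar facial expression sequence would amount to one coefficient vector per animation frame, which is much smaller than sending full meshes.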

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

G10L15/22
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
G06T13/205
driven by audio data · CPC title
G06T17/20
Finite element generation, e.g. wire-frame surface description, {tesselation} · CPC title
G10L21/10 · Primary
Transforming into visible information · CPC title
G10L15/04
Segmentation; Word boundary detection · CPC title

Patent family

Related publications grouped by family.

View patent family 87162196

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12450809B2 cover?: A method of providing an avatar service includes obtaining a user-uttered voice and a spatial information of a user-utterance space, transmitting the user-uttered voice and the spatial information to a server, receiving, from the server, a first avatar voice answer and an avatar facial expression sequence corresponding to the first avatar voice, which are determined based on the user-uttered vo…
Who is the assignee on this patent?: Samsung Electronics Co Ltd
What technology area does this patent fall under?: Primary CPC classification G10L21/10. Mapped technology areas include Physics.
When was this patent published?: Publication date Oct 21, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).