Player audio analysis in online gaming environments
US-10293260-B1 · May 21, 2019 · US
US12450809B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12450809-B2 |
| Application number | US-202318098428-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 18, 2023 |
| Priority date | Jan 18, 2022 |
| Publication date | Oct 21, 2025 |
| Grant date | Oct 21, 2025 |
Official abstract text for this publication.
A method of providing an avatar service includes obtaining a user-uttered voice and a spatial information of a user-utterance space, transmitting the user-uttered voice and the spatial information to a server, receiving, from the server, a first avatar voice answer and an avatar facial expression sequence corresponding to the first avatar voice, which are determined based on the user-uttered voice and the spatial information, determining first avatar facial expression data, based on the first avatar voice answer and the avatar facial expression sequence, identifying a certain event during reproduction of a first avatar animation created based on the first avatar voice answer and the first avatar facial expression data, determining second avatar facial expression data or a second avatar voice answer, based on the certain event, and reproducing a second avatar animation created based on the second avatar facial expression data or the second avatar voice answer.
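The interruption step in the abstract (identify an event while the first avatar animation is playing, then switch to a second animation) can be illustrated with a minimal sketch. The event names come from claim 6. Everything else is a hypothetical stand-in and not the patent's implementation: the `Animation` type, the `reproduce`, `poll_event`, and `build_second` functions, and the string "frames" are all assumed for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, List, Optional


class AvatarEvent(Enum):
    # Event names follow claim 6; how each event is detected is not modeled here.
    UTTERANCE_MODE_CHANGE = auto()
    OBSERVATION_MODE = auto()
    REFRESH_MODE = auto()


@dataclass
class Animation:
    # Hypothetical container: a label plus a list of frame identifiers.
    label: str
    frames: List[str]


def reproduce(
    first: Animation,
    poll_event: Callable[[int], Optional[AvatarEvent]],
    build_second: Callable[[AvatarEvent, Animation, int], Animation],
) -> List[str]:
    """Play `first` frame by frame; on an event, stop and play a second animation."""
    shown: List[str] = []
    for index, frame in enumerate(first.frames):
        event = poll_event(index)
        if event is not None:
            # Stop the first animation here and switch to the second one.
            second = build_second(event, first, index)
            shown.extend(second.frames)
            return shown
        shown.append(frame)
    return shown


def poll_event(index: int) -> Optional[AvatarEvent]:
    # Assumption for the demo: an observation mode event fires at frame 2.
    return AvatarEvent.OBSERVATION_MODE if index == 2 else None


def build_second(event: AvatarEvent, first: Animation, index: int) -> Animation:
    # Assumption: the second animation re-renders the remaining frames,
    # tagged with the event that triggered the change.
    remaining = first.frames[index:]
    return Animation("second", [f"{event.name.lower()}:{frame}" for frame in remaining])


first = Animation("first", ["f0", "f1", "f2", "f3"])
shown = reproduce(first, poll_event, build_second)
print(shown)  # ['f0', 'f1', 'observation_mode:f2', 'observation_mode:f3']
```

Passing the event source and the second-animation builder as callables keeps the playback loop independent of how events are detected, which the abstract does not specify.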
Opening claim text (preview).
What is claimed is:

1. A method, performed by an electronic device, of providing an avatar service, comprising: obtaining a user-uttered voice and a spatial information of a user-utterance space where a user utters the user-uttered voice, the spatial information comprising spatial characteristics of the user-utterance space based on at least one of images captured by a camera and a sound obtained through a microphone, wherein the spatial information comprises at least one of: whether the user-utterance space is public or private; or whether the user-utterance space is quiet or noisy; transmitting the user-uttered voice and the spatial information to a server; receiving, from the server, a first avatar voice answer and an avatar facial expression sequence corresponding to the first avatar voice answer, which are determined based on the user-uttered voice and an avatar response mode, wherein the avatar response mode is determined based on the spatial information; determining first avatar facial expression data, based on the first avatar voice answer and the avatar facial expression sequence; identifying a certain event during reproduction of a first avatar animation created based on the first avatar voice answer and the first avatar facial expression data; determining second avatar facial expression data or a second avatar voice answer, based on the certain event; and stopping reproduction of the first avatar animation, and reproducing a second avatar animation created based on the second avatar facial expression data or the second avatar voice answer.

2. The method of claim 1, wherein the spatial information comprises information about whether the user-utterance space is a public place and a level of noise in the user-utterance space.

3. The method of claim 1, wherein the first avatar facial expression data and the second avatar facial expression data each comprise a set of coefficients for each of a plurality of reference three-dimensional (3D) meshes for modeling a facial expression of the first avatar animation and the second avatar animation, respectively.

4. The method of claim 1, wherein: the second avatar facial expression data comprises lip sync data, and the lip sync data is obtained using an artificial intelligence (AI) model.

5. The method of claim 4, wherein the AI model is trained using data normalized based on an available range according to the lip sync data.

6. The method of claim 1, wherein the certain event comprises at least one of an utterance mode change event, an observation mode event, or a refresh mode event.

7. The method of claim 6, wherein, based on the certain event being the refresh mode event, the stopping reproduction of the first avatar animation, and the reproducing of the second avatar animation comprises: stopping the reproduction of the first avatar animation at a point in time; reproducing a preset refresh animation; and reproducing the first avatar animation from the point in time at which the first avatar animation is stopped.

8. The method of claim 6, wherein, based on the certain event being the utterance mode change event, the determining of the second avatar facial expression data or the second avatar voice answer comprises: determining the second avatar facial expression data by modifying the first avatar facial expression data, based on an utterance mode obtained as a result of the certain event; and modifying the first avatar voice answer, based on the utterance mode.

9. The method of claim 6, wherein, based on the certain event being the observation mode event, the determining of the second avatar facial expression data or the second avatar voice answer comprises determining the second avatar facial expression data by changing a face direction or eye direction of the first avatar animation.

10. A method, performed by a server, of providing an avatar service through an electronic device, the method comprising: receiving, from the electronic device, a user-uttered voice and spatial information of a user-utterance space where a user utters the user-uttered voice, the spatial information comprising spatial characteristics of the user-utterance space based on at least one of images captured by a camera and a sound obtained through a microphone, wherein the spatial information comprises at least one of: whether the user-utterance space is public or private; or whether the user-utterance space is quiet or noisy; determining an avatar response mode for the user-uttered voice, based on the spatial information; generating a first avatar voice answer for an avatar to respond to the user-uttered voice and an avatar facial expression sequence corresponding to the first avatar voice answer, based on the user-uttered voice and the avatar response mode; and transmitting the first avatar voice answer and the avatar facial expression sequence for generating a first avatar animation to the electronic device.

11. An electronic device for providing an avatar service, comprising: a communication interface; a storage storing at least one instruction; and at least one processor configured to execute the at least one instruction stored in the storage, wherein the at least one processor is configured to execute the at least one instruction to: obtain a user-uttered voice and a spatial information of a user-utterance space where a user utters the user-uttered voice, the spatial information comprising spatial characteristics of the user-utterance space based on at least one of images captured by a camera and a sound obtained through a microphone, wherein the spatial information comprises at least one of: whether the user-utterance space is public or private; or whether the user-utterance space is quiet or noisy; transmit the user-uttered voice and the spatial information to a server; receive, from the server through the communication interface, a first avatar voice answer and an avatar facial expression sequence corresponding to the first avatar voice answer, which are determined based on the user-uttered voice and an avatar response mode, wherein the avatar response mode is determined based on the spatial information; determine first avatar facial expression data, based on the first avatar voice answer and the avatar facial expression sequence; identify a certain event during reproduction of a first avatar animation created based on the first avatar voice answer and the first avatar facial expression data; determine second avatar facial expression data or a second avatar voice answer, based on the certain event; and reproduce a second avatar animation created based on the second avatar facial expression data or the second avatar voice answer.

12. The electronic device of claim 11, wherein the spatial information comprises information about whether the user-utterance space is a public place and a level of noise in the user-utterance space.

13. The electronic device of claim 11, wherein the first avatar facial expression data and the second avatar facial expression data each comprise a set of coefficients for each of a plurality of reference three-dimensional (3D) meshes for modeling a facial expression of the first avatar animation and the second avatar animation, respectively.

14. The electronic device of claim 11, wherein: the second avatar facial expression data comprises lip sync data, and the lip sync data is obtained using an artificial intelligence (AI) model.

15. The electronic device of claim 14, wherein the AI model is trained using data normalized based on an available range according to the lip sync data.

16. The electro
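
Claims 1, 2, and 10 state that the avatar response mode is determined from spatial information (public or private, quiet or noisy) but the claim text shown here does not say how. The following is a minimal sketch of one possible rule, under loud assumptions: the mode names in `ResponseMode`, the decibel unit, the 60 dB threshold, and the mapping itself are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from enum import Enum


class ResponseMode(Enum):
    # Hypothetical mode names chosen for illustration only.
    NORMAL = "normal"
    SUBDUED = "subdued"        # e.g. lower volume, restrained expression
    EMPHASIZED = "emphasized"  # e.g. higher volume, stronger expression


@dataclass(frozen=True)
class SpatialInfo:
    is_public: bool        # public vs. private space (claims 1 and 2)
    noise_level_db: float  # noise level (claim 2); the dB unit is an assumption


def choose_response_mode(info: SpatialInfo, noisy_threshold_db: float = 60.0) -> ResponseMode:
    """Map spatial information to a response mode using an assumed rule."""
    is_noisy = info.noise_level_db >= noisy_threshold_db
    if is_noisy:
        # Assumed: in a noisy space the avatar responds more emphatically.
        return ResponseMode.EMPHASIZED
    if info.is_public:
        # Assumed: in a quiet public space the avatar responds discreetly.
        return ResponseMode.SUBDUED
    return ResponseMode.NORMAL


quiet_public = choose_response_mode(SpatialInfo(is_public=True, noise_level_db=35.0))
quiet_private = choose_response_mode(SpatialInfo(is_public=False, noise_level_db=35.0))
noisy_public = choose_response_mode(SpatialInfo(is_public=True, noise_level_db=75.0))
print(quiet_public.value, quiet_private.value, noisy_public.value)
# subdued normal emphasized
```

In the server-side method of claim 10, a rule of this kind would run before the voice answer and facial expression sequence are generated, since both are claimed to depend on the response mode.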
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
driven by audio data · CPC title
Finite element generation, e.g. wire-frame surface description, {tesselation} · CPC title
Transforming into visible information · CPC title
Segmentation; Word boundary detection · CPC title