Who is the assignee on this patent?

Beijing Baidu Netcom Sci & Tech Co Ltd

What technology area does this patent fall under?

Primary CPC classification G10L13/02. Mapped technology areas include Physics.

When was this patent published?

Publication date: Dec 16, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Data processing method

US12499868B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12499868-B2
Application number	US-202318296914-A
Country	US
Kind code	B2
Filing date	Apr 6, 2023
Priority date	Apr 8, 2022
Publication date	Dec 16, 2025
Grant date	Dec 16, 2025
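
The metadata fields above follow regular formats, so they are easy to split into parts. The following is a minimal sketch, assuming publication numbers have the shape country code, serial number, kind code (with or without hyphens) and dates have the `Apr 6, 2023` shape shown in the table. The names `PublicationNumber`, `parse_publication_number`, and `parse_table_date` are hypothetical helpers and not part of any patentsdb API.

```python
import re
from datetime import date, datetime
from typing import NamedTuple


class PublicationNumber(NamedTuple):
    country: str  # two-letter office code, e.g. "US"
    number: str   # serial number kept as a string to preserve digits
    kind: str     # kind code, e.g. "B2"


# Assumption: two uppercase letters, digits, then a letter plus optional digit.
_PUB_RE = re.compile(r"^([A-Z]{2})-?(\d+)-?([A-Z]\d?)$")


def parse_publication_number(text: str) -> PublicationNumber:
    """Split 'US-12499868-B2' or 'US12499868B2' into its three parts."""
    match = _PUB_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized publication number: {text!r}")
    return PublicationNumber(*match.groups())


def parse_table_date(text: str) -> date:
    """Parse a date written like 'Apr 6, 2023' (English month abbreviations)."""
    return datetime.strptime(text.strip(), "%b %d, %Y").date()


if __name__ == "__main__":
    print(parse_publication_number("US-12499868-B2"))
    priority = parse_table_date("Apr 8, 2022")
    filing = parse_table_date("Apr 6, 2023")
    print((filing - priority).days, "days between priority and filing")
```

Keeping the serial number as a string avoids losing leading zeros in numbers from other offices.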

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A data processing method is provided. The method includes: obtaining a speech pattern data of a target user based on a speech information of the target user, where the speech pattern data indicates a speech feature of the target user; and converting a broadcast text into an audio content based on the speech pattern data, where a text of the audio content corresponds to the broadcast text, and the audio content has the speech feature.

First claim

Opening claim text (preview).

What is claimed is:

1. A data processing method, comprising: obtaining a speech pattern data of a target user based on speech information of the target user, wherein the speech pattern data indicates a speech feature of the target user, wherein the speech feature comprises a lexical feature and a grammatical feature, wherein the speech pattern data comprises a lexical data indicating the lexical feature of the target user, and a grammatical data indicating the grammatical feature of the target user, and wherein the obtaining the speech pattern data of the target user based on the speech information of the target user comprises: performing a speech recognition on the speech information of the target user to obtain a speech text; obtaining a modal particle commonly used by the target user from the speech text to determine the modal particle as the lexical data; and obtaining, by analyzing a grammar in the speech text, a tag representing a grammar commonly used by the target user to determine the tag as the grammatical data; and converting a broadcast text into an audio content for delivery to the target user based on the speech pattern data, wherein a text of the audio content corresponds to the broadcast text, and the audio content has the speech feature.

2. The method according to claim 1, wherein the speech feature further comprises at least one of a pronunciation feature and a speech rate feature, and wherein the speech pattern data further comprises at least one of following: a pronunciation data indicating the pronunciation feature of the target user; and a speech rate data indicating the speech rate feature of the target user.

3. The method according to claim 1, further comprising: obtaining, based on an image information of the target user, a facial expression data indicating a facial expression of the target user, wherein the image information corresponds to the speech information; and driving, based on the facial expression data, a digital human to deliver the audio content, wherein the digital human has a broadcast expression when delivering the audio content, and wherein the broadcast expression corresponds to the facial expression of the target user.

4. The method according to claim 3, wherein the facial expression data comprises an expression type data and an expression intensity data.

5. The method according to claim 3, further comprising: obtaining a behavior data of the target user based on the image information, wherein the behavior data indicates a behavior of the target user; and wherein the driving, based on the facial expression data, the digital human to deliver the audio content comprises: driving, based on the facial expression data and the behavior data, the digital human to deliver the audio content, wherein the digital human has the broadcast expression and a broadcast behavior when delivering the audio content, and wherein the broadcast behavior corresponds to the behavior of the target user.

6. The method according to claim 5, wherein the behavior of the target user comprises at least one of a motion, a posture, a gesture, and a breathing rate, and the behavior data further comprises at least one of following: a motion data indicating the motion of the target user; a posture data indicating the posture of the target user; a gesture data indicating the gesture of the target user; and a breathing rate data indicating the breathing rate of the target user.

7. An electronic device, comprising: one or more processors; and a memory storing one or more programs configured to be executed by the one or more processors, the one or more programs comprising instructions for performing operations comprising: obtaining a speech pattern data of a target user based on speech information of the target user, wherein the speech pattern data indicates a speech feature of the target user, wherein the speech feature comprises a lexical feature and a grammatical feature, wherein the speech pattern data comprises a lexical data indicating the lexical feature of the target user, and a grammatical data indicating the grammatical feature of the target user, and wherein the obtaining the speech pattern data of the target user based on the speech information of the target user comprises: performing a speech recognition on the speech information of the target user to obtain a speech text; obtaining a modal particle commonly used by the target user from the speech text to determine the modal particle as the lexical data; and obtaining, by analyzing a grammar in the speech text, a tag representing a grammar commonly used by the target user to determine the tag as the grammatical data; and converting a broadcast text into an audio content for delivery to the target user based on the speech pattern data, wherein a text of the audio content corresponds to the broadcast text, and the audio content has the speech feature.

8. The electronic device according to claim 7, wherein the speech feature further comprises at least one of a pronunciation feature and a speech rate feature, and wherein the speech pattern data further comprises at least one of following: a pronunciation data indicating the pronunciation feature of the target user; and a speech rate data indicating the speech rate feature of the target user.

9. The electronic device according to claim 7, wherein the operations further comprising: obtaining, based on an image information of the target user, a facial expression data indicating a facial expression of the target user, wherein the image information corresponds to the speech information; and driving, based on the facial expression data, a digital human to deliver the audio content, wherein the digital human has a broadcast expression when delivering the audio content, and wherein the broadcast expression corresponds to the facial expression of the target user.

10. The electronic device according to claim 9, wherein the facial expression data comprises an expression type data and an expression intensity data.

11. The electronic device according to claim 9, wherein the operations further comprising: obtaining a behavior data of the target user based on the image information, wherein the behavior data indicates a behavior of the target user; and wherein the driving, based on the facial expression data, the digital human to deliver the audio content comprises: driving, based on the facial expression data and the behavior data, the digital human to deliver the audio content, wherein the digital human has the broadcast expression and a broadcast behavior when delivering the audio content, and wherein the broadcast behavior corresponds to the behavior of the target user.

12. The electronic device according to claim 11, wherein the behavior of the target user comprises at least one of a motion, a posture, a gesture, and a breathing rate, and the behavior data further comprises at least one of following: a motion data indicating the motion of the target user; a posture data indicating the posture of the target user; a gesture data indicating the gesture of the target user; and a breathing rate data indicating the breathing rate of the target user.

13. A non-transitory computer-readable storage medium storing one or more programs comprising instructions that, when executed by one or more processors of a computing device, cause the computing device to perform operations comprising: obtaining a speech pattern data of a target user based on speech information of the target user, wherein the speech pattern data indicates a speech feature of the target user, wherein the sp…
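
Claim 1 recites two extraction steps on the recognized speech text: picking a commonly used modal particle as the lexical data, and deriving a tag for a commonly used grammar as the grammatical data. The sketch below illustrates one possible reading of those two steps only. It is not the patented implementation, and the claim text does not disclose how either step is performed. The particle list `CANDIDATE_PARTICLES`, the punctuation-based `grammar_tag` rule, and all function and class names are assumptions made for illustration. Speech recognition and the text-to-audio conversion are out of scope here, so the input is assumed to be already-transcribed text.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional

# Assumption: a small illustrative set of Mandarin sentence-final particles.
CANDIDATE_PARTICLES = ("啊", "吧", "呢", "嘛", "呀")


@dataclass
class SpeechPattern:
    lexical_data: Optional[str]      # most frequent modal particle, if any
    grammatical_data: Optional[str]  # most frequent grammar tag, if any


def split_sentences(speech_text: str) -> List[str]:
    """Split on sentence-ending punctuation, keeping the punctuation mark."""
    parts = re.findall(r"[^。？！?!.]+[。？！?!.]?", speech_text)
    return [p.strip() for p in parts if p.strip()]


def grammar_tag(sentence: str) -> str:
    """Crude stand-in for grammar analysis: tag a sentence by its final mark."""
    if sentence.endswith(("？", "?")):
        return "interrogative"
    if sentence.endswith(("！", "!")):
        return "exclamatory"
    return "declarative"


def extract_speech_pattern(speech_text: str) -> SpeechPattern:
    """Pick the most common particle and the most common grammar tag."""
    particle_counts = Counter(
        ch for ch in speech_text if ch in CANDIDATE_PARTICLES
    )
    tag_counts = Counter(grammar_tag(s) for s in split_sentences(speech_text))
    lexical = particle_counts.most_common(1)[0][0] if particle_counts else None
    grammatical = tag_counts.most_common(1)[0][0] if tag_counts else None
    return SpeechPattern(lexical_data=lexical, grammatical_data=grammatical)


if __name__ == "__main__":
    transcript = "今天天气不错吧？我们出去走走吧。你觉得呢？"
    print(extract_speech_pattern(transcript))
```

A frequency count is the simplest way to read "commonly used". A real system would more likely rely on a tokenizer and a syntactic parser than on character matching and punctuation.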

Assignees

Beijing Baidu Netcom Sci & Tech Co Ltd

Inventors

Classifications

Code	CPC title
G06F40/216	using statistical methods
G06F40/30	Semantic analysis
G10L15/25	using position of the lips, movement of the lips or face analysis
G10L15/02	Feature extraction for speech recognition; Selection of recognition unit
G06T13/40	of characters, e.g. humans, animals or virtual beings
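
CPC symbols such as the ones above have a fixed hierarchy: section letter, two-digit class, subclass letter, main group, and subgroup after the slash. The sketch below splits a symbol into those levels and looks up the section title, which is how a code beginning with G maps to "Physics". The names `CpcSymbol` and `parse_cpc` are hypothetical, the regular expression is a simplified assumption about symbol shape, and the Y section title is abbreviated.

```python
import re
from typing import NamedTuple

# Top-level CPC section titles (Y abbreviated).
CPC_SECTIONS = {
    "A": "Human Necessities",
    "B": "Performing Operations; Transporting",
    "C": "Chemistry; Metallurgy",
    "D": "Textiles; Paper",
    "E": "Fixed Constructions",
    "F": "Mechanical Engineering; Lighting; Heating; Weapons; Blasting",
    "G": "Physics",
    "H": "Electricity",
    "Y": "General tagging of new technological developments",
}


class CpcSymbol(NamedTuple):
    section: str     # e.g. "G"
    cpc_class: str   # e.g. "10"
    subclass: str    # e.g. "L"
    main_group: str  # e.g. "13"
    subgroup: str    # e.g. "02"


# Assumption: section, 2-digit class, subclass letter, main group "/" subgroup.
_CPC_RE = re.compile(r"^([A-HY])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def parse_cpc(symbol: str) -> CpcSymbol:
    """Split a CPC symbol like 'G10L13/02' into its hierarchy levels."""
    match = _CPC_RE.match(symbol.strip())
    if match is None:
        raise ValueError(f"unrecognized CPC symbol: {symbol!r}")
    return CpcSymbol(*match.groups())


if __name__ == "__main__":
    for code in ("G10L13/02", "G06F40/216", "G06T13/40"):
        parsed = parse_cpc(code)
        print(code, "->", parsed, "|", CPC_SECTIONS[parsed.section])
```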

Patent family

Related publications grouped by family.

Patent family 83024202

Related patents

Processing multimodal user input for assistant systems

Audio playing method, electronic device, and storage medium

Two-Level Speech Prosody Transfer

Speech to media translation

Generating digital avatar

Method and apparatus for broadcasting a response based on artificial intelligence, and storage medium
