Personalized custom synthetic speech

US10902841B2 · US · B2

Patent metadata
Publication number: US-10902841-B2
Application number: US-201916276869-A
Country: US
Kind code: B2
Filing date: Feb 15, 2019
Priority date: Feb 15, 2019
Publication date: Jan 26, 2021
Grant date: Jan 26, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, and computer program products customizing and delivering contextually relevant, artificially synthesized, voiced content that is targeted toward the individual user behaviors, viewing habits, experiences and preferences of each individual user accessing the content of a content provider. A network accessible profile service collects and analyzes collected user profile data and recommends contextually applicable voices based on the user's profile data. As user input to access voiced content or triggers voiced content maintained by a content provider, the voiced content being delivered to the user is a modified version comprising artificially synthesized human speech mimicking the recommended voice and delivering the dialogue of the voiced content, in a manner that imitates the sounds and speech patterns of the recommended voice.
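
As a rough illustration of the recommendation step described in the abstract, the Python sketch below picks a voice from profile data. The `UserProfile` fields, the `STATE_TO_VOICE` mapping, and the "emotional state first, then most-heard voice" rule are all assumptions made for this example; the patent does not specify a data model or a selection rule.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UserProfile:
    """Hypothetical profile record; the field names are illustrative only."""
    voice_history: List[str] = field(default_factory=list)  # voices heard before
    emotional_state: Optional[str] = None                   # e.g. "stressed"


# Hypothetical mapping from an emotional state to a suitable voice.
STATE_TO_VOICE = {"stressed": "calm_narrator"}


def recommend_voice(profile: UserProfile, default: str = "neutral") -> str:
    """Pick a voice: emotional state first, then the most-heard voice."""
    if profile.emotional_state in STATE_TO_VOICE:
        return STATE_TO_VOICE[profile.emotional_state]
    if profile.voice_history:
        # most_common(1) returns a list holding one (voice, count) pair.
        return Counter(profile.voice_history).most_common(1)[0][0]
    return default


profile = UserProfile(voice_history=["narrator_a", "narrator_b", "narrator_b"])
print(recommend_voice(profile))
```

A real profile service would presumably weigh many more signals (location, application data, and so on); the fixed priority order here only keeps the sketch short.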

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising the steps of: receiving a request to deliver voiced content to a user; analyzing user profile data associated with the user; recommending a voice based on analysis of the user profile, wherein the voice is contextually applicable to the user; transcribing the voiced content into text; conditioning a neural network using a voice sample of the voice to synthesize a waveform comprising artificial speech reciting the text of the voiced content in an artificially created voice indistinguishable from the voice recommended; synthesizing a modified version of the voiced content comprising the waveform; and delivering the modified version of the voiced content to the user.
2. The computer-implemented method of claim 1, wherein the user profile data is stored in a database selected from the group consisting of a user context database, user history database and a speech context database.
3. The computer-implemented method of claim 1, wherein the analyzing step is performed by a remotely accessible profile service.
4. The computer-implemented method of claim 2, wherein the user history database comprises a historical bank of voices previously experienced by the user.
5. The computer-implemented method of claim 4, wherein the voice is selected from the historical bank of voices.
6. The computer-implemented method of claim 2, wherein the speech context database comprises a description of an emotional state of the user and the voice is selected in response to the emotional state.
7. The computer-implemented method of claim 2, wherein the user context database identifies a contextual association between the user and recommended voice based on location information, API call, social media or application data.
8. A computer system comprising: a processor; and a computer-readable storage media coupled to a processor, wherein the computer readable storage media contains program instructions executing a computer-implemented method comprising the steps of: receiving a request to deliver voiced content to a user; analyzing user profile data associated with the user; recommending a voice based on analysis of the user profile, wherein the voice is contextually applicable to the user; transcribing the voiced content into text; conditioning a neural network using a voice sample of the voice to synthesize a waveform comprising artificial speech reciting the text of the voiced content in an artificially created voice indistinguishable from the voice recommended; synthesizing a modified version of the voiced content comprising the waveform; and delivering the modified version of the voiced content to the user.
9. The computer system of claim 8, wherein the user profile data is stored in a database selected from the group consisting of a user context database, user history database and a speech context database.
10. The computer system of claim 8, wherein the analyzing step is performed by a remotely accessible profile service.
11. The computer system of claim 9, wherein the user history database comprises a historical bank of voices previously experienced by the user.
12. The computer system of claim 11, wherein the voice is selected from the historical bank of voices.
13. The computer system of claim 9, wherein the speech context database comprises a description of an emotional state of the user and the voice is selected in response to the emotional state.
14. The computer system of claim 9, wherein the user context database identifies a contextual association between the user and the contextually applicable voice based on location information, API call, social media or application data.
15. A computer program product comprising: one or more computer readable storage media having computer-readable program instructions stored on the one or more computer readable storage media, said program instructions executes a computer-implemented method comprising the steps of: receiving a request to deliver voiced content to a user; analyzing user profile data associated with the user; recommending a voice based on analysis of the user profile, wherein the voice is contextually applicable to the user; transcribing the voiced content into text; conditioning a neural network using a voice sample of the voice to synthesize a waveform comprising artificial speech reciting the text of the voiced content in an artificially created voice indistinguishable from the voice recommended; synthesizing a modified version of the voiced content comprising the waveform; and delivering the modified version of the voiced content to the user.
16. The computer program product of claim 15, wherein the user profile data is stored in a database selected from the group consisting of a user context database, user history database and a speech context database.
17. The computer program product of claim 15, wherein the analyzing step is performed by a remotely accessible profile service.
18. The computer program product of claim 16, wherein the user history database comprises a historical bank of voices previously experienced by the user.
19. The computer program product of claim 18, wherein the voice is selected from the historical bank of voices.
20. The computer program product of claim 16, wherein the speech context database comprises a description of an emotional state of the user and the voice is selected in response to the emotional state.
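
The ordering of steps in the independent claims (recommend a voice, transcribe, synthesize, deliver) can be sketched as a small Python pipeline. This is only an outline under stated assumptions: `VoicedContent`, `deliver_modified_content`, and the three `fake_*` helpers are hypothetical names invented here, and the neural network conditioned on a voice sample is replaced by a toy function so the example runs without any speech library.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class VoicedContent:
    """Hypothetical container for a piece of voiced content."""
    content_id: str
    audio: bytes


def deliver_modified_content(
    content: VoicedContent,
    profile_data: Dict[str, str],
    recommend: Callable[[Dict[str, str]], str],
    transcribe: Callable[[bytes], str],
    synthesize: Callable[[str, str], bytes],
) -> VoicedContent:
    """Run the claimed steps in order using caller-supplied components."""
    voice = recommend(profile_data)     # analyze profile data, recommend a voice
    text = transcribe(content.audio)    # transcribe the voiced content into text
    waveform = synthesize(text, voice)  # stand-in for the conditioned network
    # The returned object plays the role of the "modified version".
    return VoicedContent(content.content_id, waveform)


# Toy stand-ins (assumptions): real systems would call speech-to-text
# and neural text-to-speech components here.
def fake_recommend(profile: Dict[str, str]) -> str:
    return profile.get("favorite_voice", "neutral")


def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_synthesize(text: str, voice: str) -> bytes:
    return f"[{voice}] {text}".encode("utf-8")


original = VoicedContent("clip-1", b"Welcome back")
modified = deliver_modified_content(
    original,
    {"favorite_voice": "narrator_a"},
    fake_recommend,
    fake_transcribe,
    fake_synthesize,
)
print(modified.audio)
```

Passing each step in as a callable keeps the sketch independent of any particular recognizer or synthesizer; the claims do not say how the components are wired together.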

Assignees

Inventors

Classifications

  • Speech to text systems (G10L15/08 takes precedence) · CPC title

  • for estimating an emotional state · CPC title

  • G10L13/033 · Primary

    Voice editing, e.g. manipulating the voice of the synthesiser · CPC title

  • using artificial neural networks · CPC title

  • G10L13/04 · Primary

    Details of speech synthesis systems, e.g. synthesiser structure or memory management · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10902841B2 cover?
Systems, methods, and computer program products customizing and delivering contextually relevant, artificially synthesized, voiced content that is targeted toward the individual user behaviors, viewing habits, experiences and preferences of each individual user accessing the content of a content provider. A network accessible profile service collects and analyzes collected user profile data and…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G10L13/033. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 26, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).