Method and apparatus for response using voice matching user category

US11475897B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11475897-B2
Application number: US-201916505648-A
Country: US
Kind code: B2
Filing date: Jul 8, 2019
Priority date: Aug 30, 2018
Publication date: Oct 18, 2022
Grant date: Oct 18, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A response method and an apparatus thereof are provided. The method includes: receiving voice information sent through a client by a user; determining, based on the voice information, a user category to which the user belongs; and using voice matching the user category to respond to the voice information. Accordingly, response voice matches the user category of the user, which implements that the response is performed using the response voice targeted for the user category, and thus, the user experience may be improved.
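
The three steps in the abstract (receive voice information, determine a user category, respond in a matching voice) can be sketched as follows. This is only an illustrative outline, not the patented implementation: the `VoiceProfile` type, the `VOICES` table, the `respond` function, the category names, and all numeric values are assumptions introduced here, and the classifier is passed in as a placeholder because the abstract does not say how the category is determined.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class VoiceProfile:
    """Hypothetical pre-stored voice with two illustrative parameters."""
    name: str
    fundamental_frequency_hz: float  # placeholder value, not from the patent
    sound_intensity_db: float        # placeholder value, not from the patent


# Hypothetical table of voices pre-stored per user category.
VOICES: Dict[str, VoiceProfile] = {
    "child": VoiceProfile("child_voice", 300.0, 60.0),
    "adult": VoiceProfile("adult_voice", 150.0, 65.0),
}


def respond(
    voice_info: bytes,
    classify: Callable[[bytes], str],
    reply_text: str,
) -> Tuple[str, VoiceProfile]:
    """Pick the voice that matches the speaker's category for a reply.

    `classify` stands in for whatever category recognizer is used;
    speech synthesis itself is out of scope for this sketch.
    """
    category = classify(voice_info)   # step 2: determine the user category
    voice = VOICES[category]          # look up the matching pre-stored voice
    return reply_text, voice          # step 3: reply to be played in that voice
```

A caller would supply a real recognizer in place of the stub, for example `respond(audio_bytes, my_model.predict, "Hello")`, where `my_model` is again hypothetical.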

First claim

Opening claim text (preview).

What is claimed is:

1. A response method, wherein the method comprises: receiving voice information sent through a client by a user; determining, based on the voice information, an output user category to which the user belongs, wherein voices and voice parameters matching user categories are pre-stored, the voice parameters comprising at least one of a sound spectrum, a fundamental frequency, a pitch, or a sound intensity; and using a voice matching the output user category to respond to the voice information, wherein the using the voice matching the output user category to respond to the voice information comprises: determining that the voice information includes a polite term, and playing a preset polite term expressing thanks by using the voice, in response to determining the voice information including the polite term, wherein the using the voice matching the output user category to respond to the voice information comprises: analyzing semantics of the voice information; acquiring a plurality of pieces of chat information from the Internet; determining, from the plurality of pieces of chat information, conversation information including semantics identical to the semantics of the voice information; extracting a response from the conversation information; and playing the response using the voice.

2. The method according to claim 1, wherein the determining, based on the voice information, an output user category to which the user belongs comprises: inputting the voice information into a pre-trained user category recognition model, and determining the output user category to which the user belongs according to an output of the user category recognition model.

3. The method according to claim 1, wherein the using voice matching the user category to respond to the voice information comprises: analyzing semantics of the voice information; determining response information matching the semantics; and playing the response information using the voice.

4. The method according to claim 1, wherein the output user category refers to a child user, and the method further comprises: pushing multimedia information matching the child user to the user.

5. A response apparatus, wherein the apparatus comprises: at least one processor; and a memory storing instructions, wherein the instructions when executed by the at least one processor, cause the at least one processor to perform operations, the operations comprising: receiving voice information sent through a client by a user; determining, based on the voice information, an output user category to which the user belongs, wherein voices and voice parameters matching user categories are pre-stored, the voice parameters comprising at least one of a sound spectrum, a fundamental frequency, a pitch, or a sound intensity; and using a voice matching the output user category to respond to the voice information, wherein the using the voice matching the output user category to respond to the voice information comprises: determining that the voice information includes a polite term, and playing a preset polite term expressing thanks by using the voice, in response to determining the voice information including the polite term, wherein the using the voice matching the output user category to respond to the voice information comprises: analyzing semantics of the voice information; acquiring a plurality of pieces of chat information from the Internet; determining, from the plurality of pieces of chat information, conversation information including semantics identical to the semantics of the voice information; extracting a response from the conversation information; and playing the response using the voice.

6. The apparatus according to claim 5, wherein the determining, based on the voice information, an output user category to which the user belongs comprises: inputting the voice information into a pre-trained user category recognition model, and determine the output user category to which the user belongs according to an output of the user category recognition model.

7. The apparatus according to claim 5, wherein the using voice matching the user category to respond to the voice information comprises: analyzing semantics of the voice information; determining response information matching the semantics; and playing the response information using the voice.

8. The apparatus according to claim 5, wherein the output user category refers to a child user, and the operations further comprise: pushing multimedia information matching the child user to the user.

9. A non-transitory computer readable medium, storing a computer program, wherein the program, when executed by a processor, causes the processor to perform operations, wherein the operations comprise: receiving voice information sent through a client by a user; determining, based on the voice information, an output user category to which the user belongs, wherein voices and voice parameters matching user categories are pre-stored, the voice parameters comprising at least one of a sound spectrum, a fundamental frequency, a pitch, or a sound intensity; and using a voice matching the output user category to respond to the voice information, wherein the using the voice matching the output user category to respond to the voice information comprises: determining that the voice information includes a polite term, and playing a preset polite term expressing thanks by using the voice, in response to determining the voice information including the polite term, wherein the using the voice matching the output user category to respond to the voice information comprises: analyzing semantics of the voice information; acquiring a plurality of pieces of chat information from the Internet; determining, from the plurality of pieces of chat information, conversation information including semantics identical to the semantics of the voice information; extracting a response from the conversation information; and playing the response using the voice.

10. The method according to claim 1, wherein the output voice parameter includes the sound spectrum, the fundamental frequency, the pitch, and the sound intensity, matching the output user category.

11. The method according to claim 1, wherein an interval corresponding to an acoustic characteristic having a commonality for a plurality of users in a given age range is acquired in advance by performing statistics, wherein the determining comprising: extracting a characteristic value of an acoustic characteristic of a user, determining that the interval includes the characteristic value, and determining the user category to which the user belongs based on the given age range, wherein the acoustic characteristic includes a time length, a fundamental frequency, energy, a formant frequency, wideband, a frequency perturbation, an amplitude perturbation, a zero-crossing rate, and a Mel frequency cepstral parameter.

12. The method according to claim 1, wherein the using comprises: determining an output voice and an output voice parameter matching the output user category from the voices and the voice parameters matching user categories, and outputting the output voice with the output voice parameter, wherein the output voice parameter includes a sound spectrum, a fundamental frequency, a pitch, and a sound intensity, matching the output user category, and the output voice with the output voice parameter sounds identical or similar to a true voice of the user of the output user category.
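
Claim 11 describes determining the user category by checking whether a characteristic value extracted from the user's voice falls inside an interval that was computed in advance for an age range. A minimal sketch of that lookup is shown below, assuming a single acoustic characteristic (mean fundamental frequency). The `F0_INTERVALS` table, its boundaries, the category names, and the `category_from_interval` function are hypothetical and chosen only for illustration; the claim itself lists several acoustic characteristics and gives no numeric values.

```python
from typing import Dict, Optional, Tuple

# Hypothetical intervals of mean fundamental frequency (Hz) per age-range
# category. In the claim these would come from statistics over many users;
# the numbers below are placeholders, not values from the patent.
F0_INTERVALS: Dict[str, Tuple[float, float]] = {
    "child": (250.0, 400.0),
    "adult": (85.0, 250.0),
}


def category_from_interval(f0_hz: float) -> Optional[str]:
    """Return the category whose interval contains f0_hz, or None.

    Intervals are treated as half-open [low, high) so that a boundary
    value belongs to exactly one category.
    """
    for category, (low, high) in F0_INTERVALS.items():
        if low <= f0_hz < high:
            return category
    return None  # value falls outside every pre-computed interval
```

Returning `None` for out-of-range values is a design choice of this sketch; the claim does not say what happens when no interval matches.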

Assignees

Inventors

Classifications

  • G06F3/167 (Primary)

    Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title

  • Details of speech synthesis systems, e.g. synthesiser structure or memory management · CPC title

  • Execution procedure of a spoken command · CPC title

  • Speech synthesis; Text to speech systems · CPC title

  • G10L17/00 (Primary)

    Speaker identification or verification techniques · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11475897B2 cover?
A response method and an apparatus thereof are provided. The method includes: receiving voice information sent through a client by a user; determining, based on the voice information, a user category to which the user belongs; and using voice matching the user category to respond to the voice information. Accordingly, response voice matches the user category of the user, which implements that t…
Who is the assignee on this patent?
Baidu Online Network Technology (Beijing) Co., Ltd.
What technology area does this patent fall under?
Primary CPC classification G06F3/167. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 18, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).