Method for enhancing the playback of information in interactive voice response systems
US-8983841-B2 · Mar 17, 2015 · US
US9685152B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9685152-B2 |
| Application number | US-201414892624-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 2, 2014 |
| Priority date | May 31, 2013 |
| Publication date | Jun 20, 2017 |
| Grant date | Jun 20, 2017 |
Official abstract text for this publication.
The present invention is provided with: a voice input section that receives a remark (a question) via a voice signal; a reply creation section that creates a voice sequence of a reply (response) to the remark; a pitch analysis section that analyzes the pitch of a first segment (e.g., word ending) of the remark; and a voice generation section (a voice synthesis section, etc.) that generates a reply, in the form of voice, represented by the voice sequence. The voice generation section controls the pitch of the entire reply in such a manner that the pitch of a second segment (e.g., word ending) of the reply assumes a predetermined pitch (e.g., five degrees down) with respect to the pitch of the first segment of the remark. Such arrangements can realize synthesis of replying voice capable of giving a natural feel to the user.
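The pitch relationship in the abstract can be illustrated with a minimal sketch. It assumes that "five degrees down" is read as a perfect fifth below (7 semitones in equal temperament), that the reply is represented as a list of pitch values in Hz, and that the "second segment" is simply the last value of that list. The names `shift_hz` and `shift_reply_contour` are hypothetical and do not come from the patent.

```python
# Assumption: "five degrees down" = perfect fifth below = -7 semitones
# in equal temperament. The patent text itself does not fix a tuning system.
FIFTH_DOWN_SEMITONES = -7


def shift_hz(freq_hz: float, semitones: float) -> float:
    """Return freq_hz moved by the given number of equal-tempered semitones."""
    return freq_hz * 2.0 ** (semitones / 12.0)


def shift_reply_contour(remark_ending_hz, reply_contour_hz,
                        interval_semitones=FIFTH_DOWN_SEMITONES):
    """Scale the whole reply so that its final pitch (standing in for the
    'second segment') sits at the chosen interval from the remark's ending
    pitch (standing in for the 'first segment')."""
    if remark_ending_hz <= 0 or not reply_contour_hz:
        raise ValueError("need a positive remark pitch and a non-empty reply")
    target_hz = shift_hz(remark_ending_hz, interval_semitones)
    ratio = target_hz / reply_contour_hz[-1]
    # Multiplying every value by the same ratio shifts the entire reply
    # while keeping its internal intonation pattern intact.
    return [f * ratio for f in reply_contour_hz]


if __name__ == "__main__":
    print(shift_reply_contour(300.0, [200.0, 220.0, 180.0]))
```

For a remark ending near 300 Hz, the reply ending is moved to roughly 200 Hz, and the rest of the reply is scaled by the same factor so that only its overall pitch changes.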
Opening claim text (preview).
What is claimed is:
1. A voice synthesis apparatus comprising: a voice input section configured to receive a voice signal of a remark; a pitch analysis section configured to analyze a pitch of a first segment of the remark; an acquisition section configured to acquire a reply to the remark; and a voice generation section configured to generate voice of the reply acquired by said acquisition section, said voice generation section controlling a pitch of the voice of the reply in such a manner that a second segment of the reply has a pitch associated with the pitch of the first segment analyzed by said pitch analysis section, wherein said voice generation section controls the pitch of the voice of the reply in such a manner that an interval of the pitch of said second segment relative to the pitch of said first segment becomes a consonant interval except in a case where the pitch of said second segment and the pitch of said first segment are in perfect unison.
2. A voice synthesis apparatus comprising: a voice input section configured to receive a voice signal of a remark; a pitch analysis section configured to analyze a pitch of a first segment of the remark; an acquisition section configured to acquire a reply to the remark; and a voice generation section configured to generate voice of the reply acquired by said acquisition section, said voice generation section controlling a pitch of the voice of the reply in such a manner that a second segment of the reply has a pitch associated with the pitch of the first segment analyzed by said pitch analysis section, wherein said voice generation section controls the pitch of the voice of the reply in such a manner that an interval of the pitch of said second segment relative to the pitch of said first segment becomes any one of intervals, except in a case where the pitch of said second segment and the pitch of said first segment are in perfect unison, within a range of one octave up and one octave down from the pitch of said first segment.
3. The voice synthesis apparatus as claimed in claim 2, wherein any one of a first mode and a second mode is settable as an operation mode of said voice generation section, and wherein, in said first mode, said voice generation section controls the pitch of the voice of the reply in such a manner that the interval of the pitch of said second segment relative to the pitch of said first segment becomes a consonant interval except in a case where the pitch of said second segment and the pitch of said first segment are in perfect unison, and in said second mode, said voice generation section controls the pitch of the voice of the reply in such a manner that the interval of the pitch of said second segment relative to the pitch of said first segment becomes a dissonant interval.
4. The voice synthesis apparatus as claimed in claim 1, wherein said voice generation section controls the pitch of the voice of the reply in such a manner that the interval of the pitch of said second segment relative to the pitch of said first segment becomes a consonant interval of five degrees lower than the pitch of said first segment.
5. The voice synthesis apparatus as claimed in claim 1, wherein said voice generation section provisionally sets the pitch of the second segment of the voice of the reply at the pitch associated with the pitch of the first segment, and said voice generation section is further configured to perform at least one of: an operation of, if the provisionally-set pitch of the second segment is lower than a predetermined first threshold value, changing the provisionally-set pitch of the second segment to a pitch shifted one octave up; and an operation of, if the provisionally-set pitch of the second segment is higher than a predetermined second threshold value, changing the provisionally-set pitch of the second segment to a pitch one octave down.
6. The voice synthesis apparatus as claimed in claim 1, wherein said voice generation section provisionally sets the pitch of the second segment of the voice of the reply at the pitch associated with the pitch of the first segment, and said voice generation section is further configured to change the provisionally-set pitch to a pitch shifted one octave up or down in accordance with a designated attribute.
7. The voice synthesis apparatus as claimed in claim 1, wherein any one of a first mode and a second mode is settable as an operation mode of said voice generation section, and wherein, in said first mode, said voice generation section controls the pitch of the voice of the reply in such a manner that the interval of the pitch of said second segment relative to the pitch of said first segment becomes a consonant interval except in a case where the pitch of said second segment and the pitch of said first segment are in perfect unison, and in said second mode, said voice generation section controls the pitch of the voice of the reply in such a manner that the interval of the pitch of said second segment relative to the pitch of said first segment becomes a dissonant interval.
8. A computer-implemented method comprising: receiving a voice signal of a remark; analyzing a pitch of a first segment of the remark; acquiring a reply to the remark; synthesizing voice of the acquired reply; and controlling a pitch of the reply in such a manner that a pitch of a second segment of the voice of the reply has a pitch associated with the analyzed pitch of the first segment and an interval of the pitch of the second segment relative to the pitch of the first segment becomes a consonant interval except in a case where the pitch of the second segment and the pitch of the first segment are in perfect unison.
9. A computer-implemented method comprising: receiving a voice signal of a remark; analyzing a pitch of a first segment of the remark; acquiring a reply to the remark; synthesizing voice of the acquired reply; and controlling a pitch of the reply in such a manner that a pitch of a second segment of the voice of the reply has a pitch associated with the analyzed pitch of the first segment and an interval of the pitch of the second segment relative to the pitch of the first segment becomes any one of intervals, except in a case where the pitch of the second segment and the pitch of the first segment are in perfect unison, within a range of one octave up and one octave down from the pitch of said first segment.
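The interval conditions in claims 1 and 2 and the octave correction in claim 5 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: intervals are measured in whole equal-tempered semitones, the consonant set is taken to be the conventional perfect and imperfect consonances, and the function names and threshold values are hypothetical.

```python
# Assumption: consonance is judged on the interval reduced to one octave,
# using the conventional set: octave (0 after reduction), minor/major third
# (3, 4), perfect fourth (5), perfect fifth (7), minor/major sixth (8, 9).
CONSONANT_CLASSES = {0, 3, 4, 5, 7, 8, 9}


def is_consonant_non_unison(semitones: int) -> bool:
    """Claim 1 style check: consonant interval, perfect unison excluded."""
    if semitones == 0:  # perfect unison is explicitly excluded
        return False
    return abs(semitones) % 12 in CONSONANT_CLASSES


def is_within_octave_non_unison(semitones: int) -> bool:
    """Claim 2 style check: any interval within one octave up or down,
    perfect unison excluded."""
    return semitones != 0 and -12 <= semitones <= 12


def octave_correct(pitch_hz: float, low_hz: float, high_hz: float) -> float:
    """Claim 5 style correction of a provisionally set pitch: one octave up
    if below the first threshold, one octave down if above the second."""
    if pitch_hz < low_hz:
        return pitch_hz * 2.0
    if pitch_hz > high_hz:
        return pitch_hz / 2.0
    return pitch_hz


if __name__ == "__main__":
    print(is_consonant_non_unison(-7))         # fifth down
    print(is_consonant_non_unison(6))          # tritone
    print(octave_correct(90.0, 100.0, 400.0))  # below the low threshold
```

The second mode in claims 3 and 7 would correspond to choosing an interval for which `is_consonant_non_unison` returns false and which is not a unison, for example a tritone.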
Elementary speech units used in speech synthesisers; Concatenation rules · CPC title
Concept to speech synthesisers; Generation of natural phrases from machine-based concepts (generation of parameters for speech synthesis out of text G10L13/08) · CPC title
using speech synthesis · CPC title
Prosody rules derived from text; Stress or intonation · CPC title
using natural language modelling · CPC title