Technology for responding to remarks using speech synthesis

US9685152B2 · US · B2

Patent metadata
Publication number: US-9685152-B2
Application number: US-201414892624-A
Country: US
Kind code: B2
Filing date: Jun 2, 2014
Priority date: May 31, 2013
Publication date: Jun 20, 2017
Grant date: Jun 20, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present invention is provided with: a voice input section that receives a remark (a question) via a voice signal; a reply creation section that creates a voice sequence of a reply (response) to the remark; a pitch analysis section that analyzes the pitch of a first segment (e.g., word ending) of the remark; and a voice generation section (a voice synthesis section, etc.) that generates a reply, in the form of voice, represented by the voice sequence. The voice generation section controls the pitch of the entire reply in such a manner that the pitch of a second segment (e.g., word ending) of the reply assumes a predetermined pitch (e.g., five degrees down) with respect to the pitch of the first segment of the remark. Such arrangements can realize synthesis of replying voice capable of giving a natural feel to the user.
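The pitch relationship described in the abstract can be illustrated with a minimal Python sketch. It assumes that "five degrees down" means a perfect fifth below (frequency ratio 2:3 in just intonation, or 7 equal-tempered semitones), and that "controls the pitch of the entire reply" can be modelled as scaling every value of a pitch contour by one factor. The function names `fifth_below` and `shift_contour`, and the list-of-Hz contour representation, are hypothetical and do not come from the patent.

```python
from typing import List


def fifth_below(remark_end_hz: float, just: bool = True) -> float:
    """Return a pitch a perfect fifth below the remark's ending pitch.

    Assumption: "five degrees down" is read as a perfect fifth below.
    just=True uses the 2:3 ratio; just=False uses 7 equal-tempered semitones.
    """
    if remark_end_hz <= 0:
        raise ValueError("pitch must be positive")
    if just:
        return remark_end_hz * 2.0 / 3.0
    return remark_end_hz * 2.0 ** (-7.0 / 12.0)


def shift_contour(reply_contour_hz: List[float], remark_end_hz: float) -> List[float]:
    """Scale a whole reply contour so its last value lands a fifth below the remark.

    Assumption: the last contour value stands in for the reply's "second
    segment" (e.g., word ending). Scaling by one factor keeps the reply's
    internal intonation shape unchanged.
    """
    if not reply_contour_hz:
        raise ValueError("contour must not be empty")
    if any(f <= 0 for f in reply_contour_hz):
        raise ValueError("contour values must be positive")
    target = fifth_below(remark_end_hz)
    factor = target / reply_contour_hz[-1]
    return [f * factor for f in reply_contour_hz]


if __name__ == "__main__":
    # A remark ending at 300 Hz and a reply whose default contour ends at 180 Hz.
    print(shift_contour([220.0, 200.0, 180.0], 300.0))
```

With a remark ending at 300 Hz, this sketch places the reply ending at about 200 Hz while preserving the ratios between the reply's own pitch values.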

First claim

Opening claim text (preview).

What is claimed is:

1. A voice synthesis apparatus comprising: a voice input section configured to receive a voice signal of a remark; a pitch analysis section configured to analyze a pitch of a first segment of the remark; an acquisition section configured to acquire a reply to the remark; and a voice generation section configured to generate voice of the reply acquired by said acquisition section, said voice generation section controlling a pitch of the voice of the reply in such a manner that a second segment of the reply has a pitch associated with the pitch of the first segment analyzed by said pitch analysis section, wherein said voice generation section controls the pitch of the voice of the reply in such a manner that an interval of the pitch of said second segment relative to the pitch of said first segment becomes a consonant interval except in a case where the pitch of said second segment and the pitch of said first segment are in perfect unison.

2. A voice synthesis apparatus comprising: a voice input section configured to receive a voice signal of a remark; a pitch analysis section configured to analyze a pitch of a first segment of the remark; an acquisition section configured to acquire a reply to the remark; and a voice generation section configured to generate voice of the reply acquired by said acquisition section, said voice generation section controlling a pitch of the voice of the reply in such a manner that a second segment of the reply has a pitch associated with the pitch of the first segment analyzed by said pitch analysis section, wherein said voice generation section controls the pitch of the voice of the reply in such a manner that an interval of the pitch of said second segment relative to the pitch of said first segment becomes any one of intervals, except in a case where the pitch of said second segment and the pitch of said first segment are in perfect unison, within a range of one octave up and one octave down from the pitch of said first segment.

3. The voice synthesis apparatus as claimed in claim 2, wherein any one of a first mode and a second mode is settable as an operation mode of said voice generation section, and wherein, in said first mode, said voice generation section controls the pitch of the voice of the reply in such a manner that the interval of the pitch of said second segment relative to the pitch of said first segment becomes a consonant interval except in a case where the pitch of said second segment and the pitch of said first segment are in perfect unison, and in said second mode, said voice generation section controls the pitch of the voice of the reply in such a manner that the interval of the pitch of said second segment relative to the pitch of said first segment becomes a dissonant interval.

4. The voice synthesis apparatus as claimed in claim 1, wherein said voice generation section controls the pitch of the voice of the reply in such a manner that the interval of the pitch of said second segment relative to the pitch of said first segment becomes a consonant interval of five degrees lower than the pitch of said first segment.

5. The voice synthesis apparatus as claimed in claim 1, wherein said voice generation section provisionally sets the pitch of the second segment of the voice of the reply at the pitch associated with the pitch of the first segment, and said voice generation section is further configured to perform at least one of: an operation of, if the provisionally-set pitch of the second segment is lower than a predetermined first threshold value, changing the provisionally-set pitch of the second segment to a pitch shifted one octave up; and an operation of, if the provisionally-set pitch of the second segment is higher than a predetermined second threshold value, changing the provisionally-set pitch of the second segment to a pitch one octave down.

6. The voice synthesis apparatus as claimed in claim 1, wherein said voice generation section provisionally sets the pitch of the second segment of the voice of the reply at the pitch associated with the pitch of the first segment, and said voice generation section is further configured to change the provisionally-set pitch to a pitch shifted one octave up or down in accordance with a designated attribute.

7. The voice synthesis apparatus as claimed in claim 1, wherein any one of a first mode and a second mode is settable as an operation mode of said voice generation section, and wherein, in said first mode, said voice generation section controls the pitch of the voice of the reply in such a manner that the interval of the pitch of said second segment relative to the pitch of said first segment becomes a consonant interval except in a case where the pitch of said second segment and the pitch of said first segment are in perfect unison, and in said second mode, said voice generation section controls the pitch of the voice of the reply in such a manner that the interval of the pitch of said second segment relative to the pitch of said first segment becomes a dissonant interval.

8. A computer-implemented method comprising: receiving a voice signal of a remark; analyzing a pitch of a first segment of the remark; acquiring a reply to the remark; synthesizing voice of the acquired reply; and controlling a pitch of the reply in such a manner that a pitch of a second segment of the voice of the reply has a pitch associated with the analyzed pitch of the first segment and an interval of the pitch of the second segment relative to the pitch of the first segment becomes a consonant interval except in a case where the pitch of the second segment and the pitch of the first segment are in perfect unison.

9. A computer-implemented method comprising: receiving a voice signal of a remark; analyzing a pitch of a first segment of the remark; acquiring a reply to the remark; synthesizing voice of the acquired reply; and controlling a pitch of the reply in such a manner that a pitch of a second segment of the voice of the reply has a pitch associated with the analyzed pitch of the first segment and an interval of the pitch of the second segment relative to the pitch of the first segment becomes any one of intervals, except in a case where the pitch of the second segment and the pitch of the first segment are in perfect unison, within a range of one octave up and one octave down from the pitch of said first segment.
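As a rough illustration of how the mode selection in claims 3 and 7 and the octave correction in claim 5 might fit together, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the grouping of semitone counts into consonant and dissonant sets follows conventional Western music theory rather than any definition in the claims, the threshold values of 80 Hz and 400 Hz are placeholders, and the names `INTERVALS_BY_MODE`, `octave_correct`, and `reply_end_pitch` are hypothetical.

```python
# Semitone distances below the first segment, grouped per operation mode.
# Assumption: conventional consonant/dissonant grouping; unison (0) is
# left out of the consonant set, mirroring the exception in the claims.
INTERVALS_BY_MODE = {
    "first": {3, 4, 5, 7, 8, 9, 12},   # consonant intervals
    "second": {1, 2, 6, 10, 11},       # dissonant intervals
}
# Hypothetical defaults: perfect fifth for the first mode, tritone for the second.
DEFAULT_SEMITONES = {"first": 7, "second": 6}


def provisional_pitch(first_hz: float, semitones_down: int) -> float:
    """Pitch lying `semitones_down` equal-tempered semitones below first_hz."""
    return first_hz * 2.0 ** (-semitones_down / 12.0)


def octave_correct(pitch_hz: float, low_hz: float, high_hz: float) -> float:
    """Shift one octave up if below low_hz, one octave down if above high_hz."""
    if pitch_hz < low_hz:
        return pitch_hz * 2.0
    if pitch_hz > high_hz:
        return pitch_hz / 2.0
    return pitch_hz


def reply_end_pitch(first_hz, mode="first", semitones_down=None,
                    low_hz=80.0, high_hz=400.0):
    """Pick the reply's ending pitch from the remark's ending pitch."""
    if mode not in INTERVALS_BY_MODE:
        raise ValueError(f"unknown mode: {mode!r}")
    if semitones_down is None:
        semitones_down = DEFAULT_SEMITONES[mode]
    if semitones_down not in INTERVALS_BY_MODE[mode]:
        raise ValueError("interval does not belong to the selected mode")
    provisional = provisional_pitch(first_hz, semitones_down)
    return octave_correct(provisional, low_hz, high_hz)


if __name__ == "__main__":
    print(reply_end_pitch(220.0))                  # fifth below, within thresholds
    print(reply_end_pitch(100.0))                  # too low, shifted one octave up
    print(reply_end_pitch(220.0, mode="second"))   # dissonant (tritone) reply
```

For a remark ending at 220 Hz in the first mode, the provisional pitch is about 146.8 Hz and stays unchanged. An octave shift turns a fifth below into a fourth above, so under the assumed grouping the corrected pitch still forms a consonant interval with the remark.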

Assignees

Yamaha Corp

Inventors

Classifications

  • Elementary speech units used in speech synthesisers; Concatenation rules · CPC title

  • Concept to speech synthesisers; Generation of natural phrases from machine-based concepts (generation of parameters for speech synthesis out of text G10L13/08) · CPC title

  • using speech synthesis · CPC title

  • Prosody rules derived from text; Stress or intonation · CPC title

  • using natural language modelling · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9685152B2 cover?
The present invention is provided with: a voice input section that receives a remark (a question) via a voice signal; a reply creation section that creates a voice sequence of a reply (response) to the remark; a pitch analysis section that analyzes the pitch of a first segment (e.g., word ending) of the remark; and a voice generation section (a voice synthesis section, etc.) that generates a re…
Who is the assignee on this patent?
Yamaha Corp
What technology area does this patent fall under?
Primary CPC classification G10L13/0335. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 20, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).