Voice processing device and voice processing method for controlling silent period between sound periods

US9443537B2 · US · B2

Patent metadata
Publication number: US-9443537-B2
Application number: US-201414269389-A
Country: US
Kind code: B2
Filing date: May 5, 2014
Priority date: May 23, 2013
Publication date: Sep 13, 2016
Grant date: Sep 13, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A voice processing device includes a processor; and a memory which stores a plurality of instructions, which when executed by the processor, causing the processor to execute: acquiring an input voice; detecting a sound period included in the input voice and a silent period adjacent to a back end of the sound period; calculating a number of words included in the sound period; and controlling a length of the silent period according to the number of words.
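As a rough illustration of the last two steps in the abstract (counting words in a sound period, then sizing the pause that follows it), here is a minimal Python sketch. It is not the patented implementation: the linear rule, the `base_ms` and `per_word_ms` values, and the helper names `count_words` and `silent_period_ms` are all assumptions made for this example, and the word count is taken from a transcript rather than from audio.

```python
def count_words(transcript: str) -> int:
    """Count whitespace-separated words in a transcript of one sound period.

    Assumption: a transcript is already available for the sound period.
    """
    return len(transcript.split())


def silent_period_ms(num_words: int,
                     base_ms: float = 200.0,
                     per_word_ms: float = 80.0) -> float:
    """Return a pause length (ms) that grows with the preceding word count.

    The linear form and both constants are hypothetical; the abstract only
    says the length is controlled according to the number of words.
    """
    if num_words < 0:
        raise ValueError("num_words must be non-negative")
    return base_ms + per_word_ms * num_words


if __name__ == "__main__":
    phrase = "please call me back tomorrow"
    n = count_words(phrase)
    print(n, silent_period_ms(n))  # 5 600.0
```

With these assumed constants, a five-word sound period is followed by a 600 ms pause, and a one-word sound period by a 280 ms pause.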

First claim

Opening claim text (preview).

What is claimed is:

1. A voice processing device comprising: a hardware processor; and a memory which stores a plurality of instructions, which when executed by the hardware processor, causes the hardware processor to execute: storing attribute information according to characteristics of a user; acquiring an input voice signal; detecting a first sound period and a second sound period included in the input voice signal and a first silent period adjacent to a back end of the first sound period and adjacent to a start of the second sound period; calculating a number of words included in the first sound period; setting a length of a second silent period of an output voice signal based on the attribute information and the number of words; and outputting the output voice signal including a third sound period, a fourth sound period and the second silent period adjacent to a back end of the third sound period and adjacent to a start of the fourth sound period, the words included in the first sound period of the input voice signal being outputted in the third sound period and words included in the second sound period of the input voice signal being outputted in the fourth sound period.

2. The device according to claim 1, wherein the attribute information is at least one of a language skill test score of the user, a length of a period of language learning by the user, an age of the user, and a time taken by the user to respond to a voice.

3. The device according to claim 2, wherein the acquiring further acquires a response input from the user, wherein the time taken by the user to respond to a voice is a time from termination of the third sound period to a response input by the user.

4. The device according to claim 1, further comprising: extracting acoustic features included in the first sound period, wherein the calculating calculates the number of words included in the first sound period according to the acoustic features.

5. The device according to claim 4, wherein the acoustic features are one of a number of moras included in the first sound period and a number of sudden power changes included in the first sound period.

6. The device according to claim 1, wherein the detecting detects a ratio of signal power to noise from a plurality of frames included in the input voice signal, wherein the detecting detects frames for which the ratio of signal power to noise is equal to or greater than a first threshold as the first sound period, and detects frames for which the ratio of signal power to noise is smaller than the first threshold as the first silent period.

7. The device according to claim 1, wherein the setting sets the second silent period where the larger is the number of words, the longer is the second silent period, and where the smaller is the number of words, the shorter is the second silent period.

8. The device according to claim 1, wherein the acquiring acquires a voice signal including a predetermined number of words as the input voice signal.

9. The device according to claim 1, further comprising: recognizing the input voice signal as text information, wherein the calculating calculates the number of words according to the text information.

10. A voice processing method comprising: storing attribute information according to characteristics of a user; acquiring an input voice signal; detecting a first sound period and a second sound period included in the input voice signal and a first silent period adjacent to a back end of the first sound period and adjacent to a start of the second sound period; calculating, by a computer processor, a number of words included in the first sound period; setting a length of a second silent period of an output voice signal based on the attribute information and the number of words; and outputting the output voice signal including a third sound period, a fourth sound period and the second silent period adjacent to a back end of the third sound period and adjacent to a start of the fourth sound period, the words included in the first sound period of the input voice signal being outputted in the third sound period, and words included in the second sound period of the input voice signal being outputted in the fourth sound period.

11. The method according to claim 10, wherein the attribute information is at least one of a language skill test score of the user, a length of a period of language learning by the user, an age of the user, and a time taken by the user to respond to a voice.

12. The method according to claim 11, wherein the acquiring further acquires a response input from the user, wherein the time taken by the user to respond to a voice is a time from termination of the third sound period to a response input by the user.

13. The method according to claim 12, further comprising: extracting acoustic features included in the first sound period, wherein the calculating calculates the number of words included in the first sound period according to the acoustic features.

14. The method according to claim 10, wherein the acoustic features are one of a number of moras included in the first sound period and a number of sudden power changes included in the first sound period.

15. The method according to claim 10, wherein the detecting detects a ratio of signal power to noise from a plurality of frames included in the input voice, wherein the detecting detects frames for which the ratio of signal power to noise is equal to or greater than a first threshold as the first sound period, and detects frames for which the ratio of signal power to noise is smaller than the first threshold as the first silent period.

16. The method according to claim 10, wherein the setting sets the second silent period where the larger is the number of words, the longer is the second silent period, and where the smaller is the number of words, the shorter is the second silent period.

17. A non-transitory computer-readable storage medium storing a voice processing program that causes a computer to execute a process comprising: storing attribute information according to characteristics of a user; acquiring an input voice signal; detecting a first sound period and a second sound period included in the input voice signal and a first silent period adjacent to a back end of the first sound period and adjacent to a start of the second sound period; calculating a number of words included in the first sound period; setting a length of a second silent period of an output voice signal based on the attribute information and the number of words; and outputting the output voice signal including a third sound period, and fourth sound period and the second silent period adjacent to a back end of the third sound period and adjacent to a start of the fourth sound period, the words included in the first sound period of the input voice signal being outputted in the third sound period, and words included in the second sound period of the input voice signal being outputted in the fourth sound period.

18. A mobile terminal device comprising: a microphone configured to receive an input voice of a speaking person and generate an input voice signal; a memory storing attribute information according to characteristics of a user; a hardware processor coupled to the memory and configured to receive the input voice signal from the microphone, detect a first sound period and a second sound period included in the input voice signal and a first silent period adjacent to a back end of the
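
The frame-level detection recited in claims 6 and 15 (frames whose signal-to-noise ratio is at or above a first threshold count as sound, the rest as silence) can be sketched in Python as follows. This is an illustrative sketch only: the fixed, non-overlapping frames, the known `noise_power` value, the decibel scale, and the function names are assumptions, not details taken from the patent.

```python
import math
from typing import List, Sequence, Tuple


def frame_powers(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean power of each full, non-overlapping frame (assumed framing)."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def label_frames(samples: Sequence[float], frame_len: int,
                 noise_power: float, threshold_db: float) -> List[bool]:
    """True for frames whose SNR (dB) is >= threshold_db, else False."""
    labels = []
    for power in frame_powers(samples, frame_len):
        # Floor the power to avoid log10(0) on all-zero frames.
        snr_db = 10.0 * math.log10(max(power, 1e-12) / noise_power)
        labels.append(snr_db >= threshold_db)
    return labels


def group_periods(labels: Sequence[bool]) -> List[Tuple[str, int, int]]:
    """Merge consecutive equal labels into (kind, start_frame, end_frame)."""
    periods: List[Tuple[str, int, int]] = []
    for i, is_sound in enumerate(labels):
        kind = "sound" if is_sound else "silent"
        if periods and periods[-1][0] == kind:
            periods[-1] = (kind, periods[-1][1], i + 1)
        else:
            periods.append((kind, i, i + 1))
    return periods


if __name__ == "__main__":
    # Two loud frames, one quiet frame, one loud frame (synthetic data).
    signal = [0.5] * 8 + [0.001] * 4 + [0.5] * 4
    labels = label_frames(signal, frame_len=4,
                          noise_power=1e-6, threshold_db=10.0)
    print(group_periods(labels))
    # [('sound', 0, 2), ('silent', 2, 3), ('sound', 3, 4)]
```

In this synthetic run the output corresponds to a first sound period, the silent period adjacent to its back end, and a second sound period. End indices are exclusive frame indices.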

Assignees

Fujitsu Ltd

Inventors

Classifications

  • for improving intelligibility · CPC title

  • Foreign languages (with audible presentation of material to be studied G09B5/04) · CPC title

  • G10L25/78 (Primary)

    Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems H04M9/10) · CPC title

  • based on threshold decision · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9443537B2 cover?
A voice processing device includes a processor; and a memory which stores a plurality of instructions, which when executed by the processor, causing the processor to execute: acquiring an input voice; detecting a sound period included in the input voice and a silent period adjacent to a back end of the sound period; calculating a number of words included in the sound period; and controlling a l…
Who is the assignee on this patent?
Fujitsu Ltd
What technology area does this patent fall under?
Primary CPC classification G10L25/78. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 13, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).