What technology area does this patent fall under?

Primary CPC classification G10L25/78. Mapped technology areas include Physics.

When was this patent published?

Publication date Sep 13, 2016 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Voice processing device and voice processing method for controlling silent period between sound periods

US9443537B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9443537-B2
Application number	US-201414269389-A
Country	US
Kind code	B2
Filing date	May 5, 2014
Priority date	May 23, 2013
Publication date	Sep 13, 2016
Grant date	Sep 13, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A voice processing device includes a processor; and a memory which stores a plurality of instructions, which when executed by the processor, causing the processor to execute: acquiring an input voice; detecting a sound period included in the input voice and a silent period adjacent to a back end of the sound period; calculating a number of words included in the sound period; and controlling a length of the silent period according to the number of words.
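The last two steps of the abstract (count the words in a sound period, then control the silent period that follows it) can be illustrated with a minimal sketch. The abstract says only that the silent-period length is controlled according to the number of words; the linear mapping, the function name `silent_period_ms`, and the parameters `base_ms`, `per_word_ms`, and `max_ms` with their default values are assumptions made for illustration, not taken from the patent.

```python
def silent_period_ms(word_count, base_ms=200, per_word_ms=80, max_ms=2000):
    """Return a pause length (ms) that grows with the number of words just spoken.

    Assumption: a capped linear rule. The patent abstract does not give a formula.
    """
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    return min(base_ms + per_word_ms * word_count, max_ms)


if __name__ == "__main__":
    # Longer utterances get a longer pause before the next sound period.
    for n in (1, 5, 12):
        print(n, "words ->", silent_period_ms(n), "ms")
```

Under these assumed defaults, a 5-word sound period would be followed by a 600 ms pause, and the cap keeps very long utterances from producing unbounded silence.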

First claim

Opening claim text (preview).

What is claimed is:

1. A voice processing device comprising: a hardware processor; and a memory which stores a plurality of instructions, which when executed by the hardware processor, causes the hardware processor to execute: storing attribute information according to characteristics of a user; acquiring an input voice signal; detecting a first sound period and a second sound period included in the input voice signal and a first silent period adjacent to a back end of the first sound period and adjacent to a start of the second sound period; calculating a number of words included in the first sound period; setting a length of a second silent period of an output voice signal based on the attribute information and the number of words; and outputting the output voice signal including a third sound period, a fourth sound period and the second silent period adjacent to a back end of the third sound period and adjacent to a start of the fourth sound period, the words included in the first sound period of the input voice signal being outputted in the third sound period and words included in the second sound period of the input voice signal being outputted in the fourth sound period.
2. The device according to claim 1, wherein the attribute information is at least one of a language skill test score of the user, a length of a period of language learning by the user, an age of the user, and a time taken by the user to respond to a voice.
3. The device according to claim 2, wherein the acquiring further acquires a response input from the user, wherein the time taken by the user to respond to a voice is a time from termination of the third sound period to a response input by the user.
4. The device according to claim 1, further comprising: extracting acoustic features included in the first sound period, wherein the calculating calculates the number of words included in the first sound period according to the acoustic features.
5. The device according to claim 4, wherein the acoustic features are one of a number of moras included in the first sound period and a number of sudden power changes included in the first sound period.
6. The device according to claim 1, wherein the detecting detects a ratio of signal power to noise from a plurality of frames included in the input voice signal, wherein the detecting detects frames for which the ratio of signal power to noise is equal to or greater than a first threshold as the first sound period, and detects frames for which the ratio of signal power to noise is smaller than the first threshold as the first silent period.
7. The device according to claim 1, wherein the setting sets the second silent period where the larger is the number of words, the longer is the second silent period, and where the smaller is the number of words, the shorter is the second silent period.
8. The device according to claim 1, wherein the acquiring acquires a voice signal including a predetermined number of words as the input voice signal.
9. The device according to claim 1, further comprising: recognizing the input voice signal as text information, wherein the calculating calculates the number of words according to the text information.
10. A voice processing method comprising: storing attribute information according to characteristics of a user; acquiring an input voice signal; detecting a first sound period and a second sound period included in the input voice signal and a first silent period adjacent to a back end of the first sound period and adjacent to a start of the second sound period; calculating, by a computer processor, a number of words included in the first sound period; setting a length of a second silent period of an output voice signal based on the attribute information and the number of words; and outputting the output voice signal including a third sound period, a fourth sound period and the second silent period adjacent to a back end of the third sound period and adjacent to a start of the fourth sound period, the words included in the first sound period of the input voice signal being outputted in the third sound period, and words included in the second sound period of the input voice signal being outputted in the fourth sound period.
11. The method according to claim 10, wherein the attribute information is at least one of a language skill test score of the user, a length of a period of language learning by the user, an age of the user, and a time taken by the user to respond to a voice.
12. The method according to claim 11, wherein the acquiring further acquires a response input from the user, wherein the time taken by the user to respond to a voice is a time from termination of the third sound period to a response input by the user.
13. The method according to claim 12, further comprising: extracting acoustic features included in the first sound period, wherein the calculating calculates the number of words included in the first sound period according to the acoustic features.
14. The method according to claim 10, wherein the acoustic features are one of a number of moras included in the first sound period and a number of sudden power changes included in the first sound period.
15. The method according to claim 10, wherein the detecting detects a ratio of signal power to noise from a plurality of frames included in the input voice, wherein the detecting detects frames for which the ratio of signal power to noise is equal to or greater than a first threshold as the first sound period, and detects frames for which the ratio of signal power to noise is smaller than the first threshold as the first silent period.
16. The method according to claim 10, wherein the setting sets the second silent period where the larger is the number of words, the longer is the second silent period, and where the smaller is the number of words, the shorter is the second silent period.
17. A non-transitory computer-readable storage medium storing a voice processing program that causes a computer to execute a process comprising: storing attribute information according to characteristics of a user; acquiring an input voice signal; detecting a first sound period and a second sound period included in the input voice signal and a first silent period adjacent to a back end of the first sound period and adjacent to a start of the second sound period; calculating a number of words included in the first sound period; setting a length of a second silent period of an output voice signal based on the attribute information and the number of words; and outputting the output voice signal including a third sound period, and fourth sound period and the second silent period adjacent to a back end of the third sound period and adjacent to a start of the fourth sound period, the words included in the first sound period of the input voice signal being outputted in the third sound period, and words included in the second sound period of the input voice signal being outputted in the fourth sound period.
18. A mobile terminal device comprising: a microphone configured to receive an input voice of a speaking person and generate an input voice signal; a memory storing attribute information according to characteristics of a user; a hardware processor coupled to the memory and configured to receive the input voice signal from the microphone, detect a first sound period and a second sound period included in the input voice signal and a first silent period adjacent to a back end of the
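The threshold-based detection recited in claims 6 and 15 (frames whose signal-to-noise ratio is at or above a first threshold count as a sound period, frames below it as a silent period) might look roughly like the sketch below. The use of decibels, the fixed `noise_power` estimate, the 10 dB default, and the helper names `classify_frames` and `to_periods` are assumptions for illustration; the claims do not specify how the ratio is computed or how the noise is estimated.

```python
import math


def classify_frames(frame_powers, noise_power, threshold_db=10.0):
    """Label each frame True (sound) if its SNR in dB >= threshold_db, else False (silent).

    Assumption: noise_power is a single, already-estimated noise level.
    """
    labels = []
    for power in frame_powers:
        # Clamp to a tiny positive value so log10 never sees zero.
        snr_db = 10.0 * math.log10(max(power, 1e-12) / noise_power)
        labels.append(snr_db >= threshold_db)
    return labels


def to_periods(labels):
    """Merge runs of equal labels into (kind, first_frame, last_frame) tuples."""
    periods = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            kind = "sound" if labels[start] else "silent"
            periods.append((kind, start, i - 1))
            start = i
    return periods


if __name__ == "__main__":
    # Hypothetical per-frame powers, with noise power normalized to 1.0.
    powers = [1.0, 50.0, 80.0, 2.0, 1.5, 60.0]
    print(to_periods(classify_frames(powers, noise_power=1.0)))
```

With these made-up frame powers, frames 1-2 and 5 come out as sound periods and frames 3-4 as the silent period between them, which is the "first silent period" between two sound periods that claim 1 describes.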

Assignees

Fujitsu Ltd

Inventors

Classifications

G10L21/057
for improving intelligibility · CPC title
G09B19/06
Foreign languages (with audible presentation of material to be studied G09B5/04) · CPC title
G10L25/78 (Primary)
Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems H04M9/10) · CPC title
G10L2025/783
based on threshold decision · CPC title
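For readers unfamiliar with CPC symbols, the sketch below splits the codes listed above into their hierarchical parts (section, class, subclass, main group, subgroup). The regular expression and the function name `parse_cpc` are illustrative assumptions rather than an official parser, and the section-title table is deliberately partial.

```python
import re

# Partial table of CPC section titles, enough for the symbols on this page.
CPC_SECTIONS = {"G": "Physics", "H": "Electricity"}

# Section letter, two-digit class, subclass letter, main group "/" subgroup.
CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})$")


def parse_cpc(symbol):
    """Split a CPC symbol such as 'G10L25/78' into its hierarchy levels."""
    match = CPC_PATTERN.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a recognized CPC symbol: {symbol!r}")
    section, cls, subclass, group, subgroup = match.groups()
    return {
        "section": section,
        "section_title": CPC_SECTIONS.get(section, "unknown"),
        "class": section + cls,
        "subclass": section + cls + subclass,
        "main_group": group,
        "subgroup": subgroup,
    }


if __name__ == "__main__":
    for code in ("G10L21/057", "G09B19/06", "G10L25/78", "G10L2025/783"):
        print(parse_cpc(code))
```

All four symbols on this page fall in section G, which is why the page maps the patent to the Physics technology area.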

Patent family

Related publications grouped by family.

View patent family 50628713

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9443537B2 cover?: A voice processing device includes a processor; and a memory which stores a plurality of instructions, which when executed by the processor, causing the processor to execute: acquiring an input voice; detecting a sound period included in the input voice and a silent period adjacent to a back end of the sound period; calculating a number of words included in the sound period; and controlling a l…
Who is the assignee on this patent?: Fujitsu Ltd