US9443537B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9443537-B2 |
| Application number | US-201414269389-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 5, 2014 |
| Priority date | May 23, 2013 |
| Publication date | Sep 13, 2016 |
| Grant date | Sep 13, 2016 |
Official abstract text for this publication.
A voice processing device includes a processor; and a memory which stores a plurality of instructions, which when executed by the processor, cause the processor to execute: acquiring an input voice; detecting a sound period included in the input voice and a silent period adjacent to a back end of the sound period; calculating a number of words included in the sound period; and controlling a length of the silent period according to the number of words.
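The control step in the abstract can be illustrated with a minimal sketch. It assumes the input has already been split into alternating sound and silent segments and that each sound segment carries a word count. The `Segment` type, the linear rule in `silent_length`, and the `base_ms` and `per_word_ms` defaults are hypothetical and are not taken from the patent; the only property borrowed from the text is that a larger word count yields a longer silent period.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    kind: str                                   # "sound" or "silent"
    samples: List[float] = field(default_factory=list)
    word_count: int = 0                         # meaningful only for "sound" segments


def silent_length(word_count: int, sample_rate: int,
                  base_ms: int = 200, per_word_ms: int = 100) -> int:
    """Hypothetical linear rule: pause length in samples grows with word count."""
    return (base_ms + per_word_ms * word_count) * sample_rate // 1000


def control_silence(segments: List[Segment], sample_rate: int) -> List[Segment]:
    """Replace each silent segment with one sized from the preceding sound segment."""
    out: List[Segment] = []
    prev_words = 0
    for seg in segments:
        if seg.kind == "sound":
            prev_words = seg.word_count
            out.append(seg)                     # sound periods pass through unchanged
        else:
            n = silent_length(prev_words, sample_rate)
            out.append(Segment("silent", [0.0] * n))
    return out


if __name__ == "__main__":
    demo = [
        Segment("sound", [0.1] * 800, word_count=3),
        Segment("silent", [0.0] * 50),
        Segment("sound", [0.1] * 400, word_count=1),
    ]
    for seg in control_silence(demo, sample_rate=1000):
        print(seg.kind, len(seg.samples))
```

A real system would also need the detection and word-counting stages, which this sketch takes as given.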
Opening claim text (preview).
What is claimed is:
1. A voice processing device comprising: a hardware processor; and a memory which stores a plurality of instructions, which when executed by the hardware processor, causes the hardware processor to execute: storing attribute information according to characteristics of a user; acquiring an input voice signal; detecting a first sound period and a second sound period included in the input voice signal and a first silent period adjacent to a back end of the first sound period and adjacent to a start of the second sound period; calculating a number of words included in the first sound period; setting a length of a second silent period of an output voice signal based on the attribute information and the number of words; and outputting the output voice signal including a third sound period, a fourth sound period and the second silent period adjacent to a back end of the third sound period and adjacent to a start of the fourth sound period, the words included in the first sound period of the input voice signal being outputted in the third sound period and words included in the second sound period of the input voice signal being outputted in the fourth sound period.
2. The device according to claim 1, wherein the attribute information is at least one of a language skill test score of the user, a length of a period of language learning by the user, an age of the user, and a time taken by the user to respond to a voice.
3. The device according to claim 2, wherein the acquiring further acquires a response input from the user, wherein the time taken by the user to respond to a voice is a time from termination of the third sound period to a response input by the user.
4. The device according to claim 1, further comprising: extracting acoustic features included in the first sound period, wherein the calculating calculates the number of words included in the first sound period according to the acoustic features.
5. The device according to claim 4, wherein the acoustic features are one of a number of moras included in the first sound period and a number of sudden power changes included in the first sound period.
6. The device according to claim 1, wherein the detecting detects a ratio of signal power to noise from a plurality of frames included in the input voice signal, wherein the detecting detects frames for which the ratio of signal power to noise is equal to or greater than a first threshold as the first sound period, and detects frames for which the ratio of signal power to noise is smaller than the first threshold as the first silent period.
7. The device according to claim 1, wherein the setting sets the second silent period where the larger is the number of words, the longer is the second silent period, and where the smaller is the number of words, the shorter is the second silent period.
8. The device according to claim 1, wherein the acquiring acquires a voice signal including a predetermined number of words as the input voice signal.
9. The device according to claim 1, further comprising: recognizing the input voice signal as text information, wherein the calculating calculates the number of words according to the text information.
10. A voice processing method comprising: storing attribute information according to characteristics of a user; acquiring an input voice signal; detecting a first sound period and a second sound period included in the input voice signal and a first silent period adjacent to a back end of the first sound period and adjacent to a start of the second sound period; calculating, by a computer processor, a number of words included in the first sound period; setting a length of a second silent period of an output voice signal based on the attribute information and the number of words; and outputting the output voice signal including a third sound period, a fourth sound period and the second silent period adjacent to a back end of the third sound period and adjacent to a start of the fourth sound period, the words included in the first sound period of the input voice signal being outputted in the third sound period, and words included in the second sound period of the input voice signal being outputted in the fourth sound period.
11. The method according to claim 10, wherein the attribute information is at least one of a language skill test score of the user, a length of a period of language learning by the user, an age of the user, and a time taken by the user to respond to a voice.
12. The method according to claim 11, wherein the acquiring further acquires a response input from the user, wherein the time taken by the user to respond to a voice is a time from termination of the third sound period to a response input by the user.
13. The method according to claim 12, further comprising: extracting acoustic features included in the first sound period, wherein the calculating calculates the number of words included in the first sound period according to the acoustic features.
14. The method according to claim 10, wherein the acoustic features are one of a number of moras included in the first sound period and a number of sudden power changes included in the first sound period.
15. The method according to claim 10, wherein the detecting detects a ratio of signal power to noise from a plurality of frames included in the input voice, wherein the detecting detects frames for which the ratio of signal power to noise is equal to or greater than a first threshold as the first sound period, and detects frames for which the ratio of signal power to noise is smaller than the first threshold as the first silent period.
16. The method according to claim 10, wherein the setting sets the second silent period where the larger is the number of words, the longer is the second silent period, and where the smaller is the number of words, the shorter is the second silent period.
17. A non-transitory computer-readable storage medium storing a voice processing program that causes a computer to execute a process comprising: storing attribute information according to characteristics of a user; acquiring an input voice signal; detecting a first sound period and a second sound period included in the input voice signal and a first silent period adjacent to a back end of the first sound period and adjacent to a start of the second sound period; calculating a number of words included in the first sound period; setting a length of a second silent period of an output voice signal based on the attribute information and the number of words; and outputting the output voice signal including a third sound period, and fourth sound period and the second silent period adjacent to a back end of the third sound period and adjacent to a start of the fourth sound period, the words included in the first sound period of the input voice signal being outputted in the third sound period, and words included in the second sound period of the input voice signal being outputted in the fourth sound period.
18. A mobile terminal device comprising: a microphone configured to receive an input voice of a speaking person and generate an input voice signal; a memory storing attribute information according to characteristics of a user; a hardware processor coupled to the memory and configured to receive the input voice signal from the microphone, detect a first sound period and a second sound period included in the input voice signal and a first silent period adjacent to a back end of the
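The frame-level detection recited in claims 6 and 15 (frames whose signal-to-noise ratio is at or above a first threshold count as sound, the rest as silence) can be sketched as follows. This is an illustrative reading only: the fixed, non-overlapping frames, the externally supplied `noise_power`, the decibel scale, and the function names are all assumptions that the claim text does not specify.

```python
import math
from typing import List, Sequence, Tuple


def frame_power(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def label_frames(signal: Sequence[float], frame_len: int,
                 noise_power: float, threshold_db: float) -> List[bool]:
    """True for frames whose SNR (dB) is >= threshold_db, False otherwise.

    noise_power is assumed to be known in advance (hypothetical input).
    """
    labels = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        p = frame_power(signal[start:start + frame_len])
        snr_db = 10.0 * math.log10(max(p, 1e-12) / noise_power)
        labels.append(snr_db >= threshold_db)
    return labels


def merge_periods(labels: Sequence[bool]) -> List[Tuple[str, int, int]]:
    """Merge consecutive equal labels into (kind, first_frame, end_frame) runs."""
    periods: List[Tuple[str, int, int]] = []
    for i, is_sound in enumerate(labels):
        kind = "sound" if is_sound else "silent"
        if periods and periods[-1][0] == kind:
            periods[-1] = (kind, periods[-1][1], i + 1)   # extend the current run
        else:
            periods.append((kind, i, i + 1))              # start a new run
    return periods


if __name__ == "__main__":
    signal = [0.5] * 4 + [0.001] * 4 + [0.5] * 4
    labels = label_frames(signal, frame_len=4, noise_power=1e-6, threshold_db=10.0)
    print(merge_periods(labels))
```

In this toy signal the loud frames sit roughly 54 dB above the assumed noise floor and the quiet frame sits at about 0 dB, so a 10 dB threshold separates them. The end index in each tuple is exclusive.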
for improving intelligibility · CPC title
Foreign languages (with audible presentation of material to be studied G09B5/04) · CPC title
Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems H04M9/10) · CPC title
based on threshold decision · CPC title