Information processing apparatus and information processing method
US-2019147870-A1 · May 16, 2019 · US
US11238852B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11238852-B2 |
| Application number | US-201916363407-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 25, 2019 |
| Priority date | Mar 29, 2018 |
| Publication date | Feb 1, 2022 |
| Grant date | Feb 1, 2022 |
Official abstract text for this publication.
A speech translation device includes: a beamformer which forms, from a speech signal obtained by a microphone array, a first beam and a second beam having different directions; a direction designator which designates one of the first beam and the second beam according to a user operation; a signal-to-noise (SN) ratio calculator which calculates an SN ratio using the designated beam as a signal component in the SN ratio, and the other beam not designated as a noise component; a display determiner which determines whether recognition of the designated beam is difficult, using the calculated SN ratio, and determines a speaking instruction for overcoming difficulty of the recognition when determining that the recognition is difficult; and a display which displays the speaking instruction determined by the display determiner in a display area.
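The SN ratio step in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the decibel formulation, the mean-power estimate, the 10 dB default threshold, the instruction wording, and the function names `mean_power`, `sn_ratio_db`, and `speaking_instruction` are all assumptions made for the example. The abstract only states that the designated beam is the signal component, the other beam is the noise component, and the resulting ratio is used to decide whether recognition is difficult.

```python
import math
from typing import Optional, Sequence


def mean_power(samples: Sequence[float]) -> float:
    """Mean squared amplitude of a block of samples (0.0 for an empty block)."""
    return sum(s * s for s in samples) / len(samples) if samples else 0.0


def sn_ratio_db(designated: Sequence[float], other: Sequence[float],
                eps: float = 1e-12) -> float:
    """SN ratio in dB, treating the designated beam as signal and the
    non-designated beam as noise. eps avoids log(0) and division by zero."""
    return 10.0 * math.log10((mean_power(designated) + eps) /
                             (mean_power(other) + eps))


def speaking_instruction(snr_db: float, threshold_db: float = 10.0) -> Optional[str]:
    """Return a speaking instruction when recognition is judged difficult
    (SN ratio below the threshold), otherwise None.
    The threshold value and the message text are hypothetical."""
    if snr_db < threshold_db:
        return "Please move closer to the microphone and speak."
    return None


# Example: the designated beam is ten times larger in amplitude than the other beam.
first_beam = [1.0, -1.0] * 4
second_beam = [0.1, -0.1] * 4
snr = sn_ratio_db(first_beam, second_beam)
message = speaking_instruction(snr)
```

Swapping the two arguments models the other user pressing the designation control: the same pair of beams then yields a negative SN ratio, and an instruction is returned.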
Opening claim text (preview).
What is claimed is:

1. A speech translation device, comprising:
a memory configured to store a program; and
a processor configured to execute the program and control the speech translation device to function as:
a first beamformer which calculates first beamformer output which is a signal resulting from processing a speech signal obtained by a microphone array to direct a directivity for picking up sound in a first direction in which a first user is located;
a second beamformer which calculates second beamformer output which is a speech signal resulting from processing the speech signal obtained by the microphone array to direct the directivity for picking up sound in a second direction different from the first direction, the second direction being a direction in which a second user different from the first user is located;
a direction designator which designates, according to a user operation made by one of the first user and the second user, one output among output from the first beamformer and output from the second beamformer;
a first recognizer which recognizes content indicated by the first beamformer output as first content in a first language by performing, on the first beamformer output, recognition processing in the first language when the one output designated by the direction designator is a first beam formed by the first beamformer;
a first translator which translates the first content recognized by the first recognizer into a second language;
a second recognizer which recognizes content indicated by the second beamformer output as second content in the second language by performing, on the second beamformer output, recognition processing in the second language when the one output designated by the direction designator is a second beam formed by the second beamformer;
a second translator which translates the second content recognized by the second recognizer into the first language;
a signal-to-noise (SN) ratio calculator which calculates an SN ratio, using the one output designated by the direction designator as a signal component in the SN ratio, and the other output not designated by the direction designator among the output from the first beamformer and the output from the second beamformer, as a noise component in the SN ratio;
a display determiner which determines, using the SN ratio calculated by the SN ratio calculator, whether recognition of the one output designated by the direction designator is difficult and determines, when the display determiner determines that the recognition is difficult, a speaking instruction for overcoming difficulty of the recognition, the speaking instruction being to be notified to a user, the user being the one of the first user and the second user; and
a display which displays, in a display area, one of output from the first translator, output from the second translator, and the speaking instruction determined by the display determiner,
wherein when the SN ratio is below a threshold, the display determiner determines that the recognition is difficult, and determines an action for increasing the SN ratio to at least the threshold, as the speaking instruction, and
when the SN ratio is below the threshold and the action determined by the display determiner as the speaking instruction instructs the user to move closer to the microphone array and speak, the display determiner switches, for input to one of the first recognizer and the second recognizer which is to receive the one output designated by the direction designator, from the one output to output from the microphone array, and causes the speech signal obtained by the microphone array to be input to the one of the first recognizer and the second recognizer.

2. The speech translation device according to claim 1, wherein the display determiner further calculates a volume of the one output designated by the direction designator, and determines to display the volume calculated in the display area, and the display further displays a level meter indicating a level of the volume in the display area.

3. The speech translation device according to claim 1, wherein the display determiner further determines to display the SN ratio calculated by the SN ratio calculator in the display area, and the display further displays a level meter indicating a level of the SN ratio in the display area.

4. The speech translation device according to claim 1, wherein the display determiner further calculates a signal volume which is a volume of the one output designated by the direction designator, and a noise volume which is a volume of the speech signal obtained by the microphone array, and determines to display the signal volume calculated and the noise volume calculated in the display area, and the display further displays a level meter indicating a level of the signal volume and a level meter indicating a level of the noise volume in the display area.

5. The speech translation device according to claim 2, wherein the level indicated by the level meter varies within a range from a lower threshold to an upper threshold when the display displays the level meter in the display area.

6. The speech translation device according to claim 2, wherein the display changes a color of the level meter according to the level when the display displays the level meter.

7. The speech translation device according to claim 2, wherein the display further displays a notification according to the level in the display area.

8. The speech translation device according to claim 1, wherein the processor is further configured to execute the program and control the speech translation device to function as a noise characteristic calculator which calculates noise characteristics, using one of the speech signal obtained by the microphone array and the one output designated by the direction designator, and the display determiner further determines whether the recognition of the one output is difficult, using the noise characteristics calculated by the noise characteristic calculator.

9. The speech translation device according to claim 1, wherein the processor is further configured to execute the program and control the speech translation device to function as a speech determiner which determines that a speech section includes the one output designated by the direction designator, and the display determiner further determines whether the recognition of the one output is difficult, using the speech section determined by the speech determiner to include the one output.

10. The speech translation device according to claim 9, wherein the display determiner further determines whether the user has made an erroneous operation, using the speech section determined by the speech determiner to include the one output.

11. A speech translation device, comprising:
a memory configured to store a program; and
a processor configured to execute the program and control the speech translation device to function as:
a first beamformer which calculates first beamformer output which is a signal resulting from processing a speech signal obtained by a microphone array to direct a directivity for picking up sound in a first direction in which a first user is located;
a second beamformer which calculates second beamformer output which is a speech signal resulting from processing the speech signal obtained by the microphone array to direct the directivity for picking up sound in a second direction different from the first direction, the second direction being a direction in which a second user different from the first user is located;
a direction designator which designates, according to a user operation made by one of the firs
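
The level meter behaviour recited in claims 2, 5, and 6 (a level that varies only between a lower and an upper threshold, and a color that changes with the level) might look like the sketch below. The claims give no numeric values, so the dB range, the normalisation to 0.0 through 1.0, the color bands, and the names `meter_level` and `meter_color` are hypothetical choices for illustration only.

```python
def meter_level(value_db: float, lower_db: float = -60.0, upper_db: float = 0.0) -> float:
    """Clamp a volume (or SN ratio) in dB to [lower_db, upper_db] and
    normalise it to the range 0.0..1.0 for drawing a level meter."""
    clamped = max(lower_db, min(upper_db, value_db))
    return (clamped - lower_db) / (upper_db - lower_db)


def meter_color(level: float) -> str:
    """Pick a meter color from the normalised level.
    The band boundaries (0.3 and 0.6) are assumed, not taken from the patent."""
    if level < 0.3:
        return "red"
    if level < 0.6:
        return "yellow"
    return "green"


# Example: a -30 dB input sits halfway up a meter spanning -60 dB to 0 dB.
level = meter_level(-30.0)
color = meter_color(level)
```

Clamping before normalising keeps the meter pinned at its ends for inputs outside the range, which is one straightforward reading of "varies within a range from a lower threshold to an upper threshold".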
Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation · CPC title
for microphones (H04R1/34 and H04R1/40 take precedence) · CPC title
Microphone arrays; Beamforming · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
characterised by the method used for estimating noise · CPC title