US9020823B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9020823-B2 |
| Application number | US-91587910-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 29, 2010 |
| Priority date | Oct 30, 2009 |
| Publication date | Apr 28, 2015 |
| Grant date | Apr 28, 2015 |
Official abstract text for this publication.
An apparatus, a system and a method for voice dialogue activation and/or conduct. The apparatus for voice dialogue activation and/or conduct has a voice recognition unit, a speaker recognition unit and a decision-maker unit. The decision-maker unit is designed to activate a result action on the basis of results from the voice and speaker recognition units.
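The decision logic summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the command words, the `Decision` type, and the split into protected and open commands are hypothetical, chosen only to show a result action that depends on both the recognized command word and the recognized speaker. A speaker value of `None` stands for a voice signal that could not be attributed to any speaker.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Hypothetical command words; the patent text does not list concrete ones.
PROTECTED_COMMANDS = {"navigate", "call"}   # require a recognized, authorized speaker
OPEN_COMMANDS = {"volume up"}               # performed regardless of the speaker


@dataclass
class Decision:
    command: Optional[str]
    speaker: Optional[str]
    activate: bool


def decide(command: Optional[str], speaker: Optional[str],
           authorized: Set[str]) -> Decision:
    """Combine voice-recognition and speaker-recognition results.

    command:    command word from the voice recognition unit, or None
    speaker:    speaker name from the speaker recognition unit, or None
                when the signal was attributed to no speaker
    authorized: names of speakers whose profiles permit protected commands
    """
    if command is None:
        # No command word recognized: nothing to activate.
        return Decision(command, speaker, False)
    if command in OPEN_COMMANDS:
        # Speaker-independent command: activate without checking the speaker.
        return Decision(command, speaker, True)
    if command in PROTECTED_COMMANDS:
        # Speaker-dependent command: activate only for an authorized speaker.
        return Decision(command, speaker, speaker in authorized)
    # Unknown command word: do not activate.
    return Decision(command, speaker, False)


if __name__ == "__main__":
    print(decide("navigate", "driver", {"driver"}))
    print(decide("navigate", "passenger", {"driver"}))
```

Keeping the speaker check inside a single function mirrors the idea of one decision-maker unit that receives both recognition results; a real system would presumably also weigh recognition confidence, which this sketch omits.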
Opening claim text (preview).
We claim:
1. An apparatus for at least one of voice dialogue activation and voice dialogue conduct, for use in a vehicle, comprising: at least one input for a voice signal; a voice recognition unit configured to establish one or more command words contained in the voice signal; a speaker recognition unit configured to determine a current speaker using the voice signal and at least one stored speaker profile; a decision-maker unit comprising: a voice recognition unit connection coupled to an output of the voice recognition unit configured to perform a result action based on the one or more command words, and a speaker recognition unit connection coupled to the speaker recognition unit, the decision-maker unit being configured such that the activation of the result action is dependent, at least in the case of at least one command word, on whether the at least one command word has been identified as coming from a speaker associated with a speaker profile; and an echo cancellation unit that receives a multichannel voice signal and, on the basis of transit time differences among components of the multichannel signal with respect to the at least one input, removes all components from non-authorized speakers, wherein: the speaker recognition unit is configured to identify the current speaker by extracting speaker features from the voice signal and comparing the speaker features with stored speaker-dependent features, and comprises a further unit configured for speaker adaptation to continually ascertain refined speaker-dependent features and store the refined speaker-dependent features in the stored speaker profiles, and the speaker recognition unit is configured to, in the case that a plurality of speakers are speaking simultaneously, attribute the voice signal to no speaker.
2. The apparatus as claimed in claim 1, wherein the decision-maker unit is configured to align and correlate results from the speaker recognition unit and from the voice recognition unit with speaker-specific information stored in a speaker profile, wherein performance of at least one command-word-dependent result action is suppressed if a current speaker is not authorized to perform the result actions.
3. The apparatus as claimed in claim 1, wherein the apparatus is configured as a combined apparatus for voice dialogue conduct and activation.
4. The apparatus as claimed in claim 1, wherein the voice evaluation unit comprises a word recognition unit configured to recognize words and also a downstream structure evaluation unit configured to recognize command-forming structures.
5. The apparatus as claimed in claim 1, wherein the echo cancellation unit is connected directly or indirectly upstream of at least one of the speaker recognition unit and the voice recognition unit, wherein the echo cancellation unit has one or more inputs for loudspeaker signals that comprise at least one of mono, stereo, and multichannel loudspeaker signals, the echo cancellation unit configured to compensate for the influence of the loudspeaker signals on the voice signal.
6. The apparatus as claimed in claim 5, wherein the echo cancellation unit comprises a subunit configured to compensate for voice components from other persons, said subunit connected to at least one input for the connection of additional microphones.
7. The apparatus as claimed in claim 1, wherein at least one of the speaker recognition unit and the voice recognition unit has a noise rejection unit connected directly or indirectly upstream.
8. The apparatus as claimed in claim 1, wherein at least one of the speaker recognition unit and the voice recognition unit is configured to synchronize an output from a speaker recognized by the speaker recognition unit to the decision-maker unit with an output of command words recognized by the voice recognition unit.
9. The apparatus as claimed in claim 1, wherein a driver state sensing unit for sensing a state of the driver using the voice signal is arranged in parallel with the speaker recognition unit and the voice recognition unit.
10. The apparatus as claimed in claim 1, wherein the voice recognition unit comprises an additional unit configured to capture time-related alterations in the speaker features of a speaker as an attribute and to store them in a stored speaker profile associated with the speaker.
11. The apparatus as claimed in claim 1, further comprising at least one memory apparatus configured to store at least one of user profiles and speaker profiles.
12. The apparatus as claimed in claim 11, wherein the at least one memory apparatus has at least one interface configured to input or output the stored at least one of the user profiles and speaker profiles such that the stored at least one of the user profiles and speaker profiles may be transferred to/from another vehicle.
13. The apparatus as claimed in claim 1, wherein the apparatus is activated to evaluate the voice signals even during the performance of a result action, such that recognition of a command from an authorized speaker prompts at least partial interruption of the performance of a result action triggered by a prior command.
14. The apparatus as claimed in claim 1, wherein the decision-maker unit is configured such that some command words are performed independently of the recognition of a speaker associated with the speaker profile.
15. The apparatus as claimed in claim 1, further comprising at least one memory apparatus configured to store speaker profiles, wherein the at least one memory apparatus has at least one interface configured to input or output the stored speaker profiles such that the stored speaker profiles may be transferred to/from another vehicle.
16. A system for voice dialogue activation and/or voice dialogue conduct comprising: at least one input for a voice signal; a voice recognition unit configured to establish one or more command words contained in the voice signal; a speaker recognition unit configured to determine a current speaker using the voice signal and at least one stored speaker profile; a decision-maker unit comprising: a voice recognition unit connection coupled to an output of the voice recognition unit configured to perform a result action based on the one or more command words, and a speaker recognition unit connection coupled to the speaker recognition unit, the decision-maker unit being configured such that the activation of the result action is dependent, at least in the case of at least one command word, on whether the at least one command word has been identified as coming from a speaker associated with a speaker profile; at least one microphone coupled to the voice recognition unit; and at least one loudspeaker coupled to the voice recognition unit; and an echo cancellation unit that receives a multichannel voice signal and, on the basis of transit time differences among components of the multichannel signal with respect to the at least one input, removes all components from non-authorized speakers, wherein: the speaker recognition unit is configured to identify the current speaker by extracting speaker features from the voice signal and comparing the speaker features with stored speaker-dependent features, and comprises a further unit configured for speaker adaptation to continually ascertain refined speaker-dependent features and store the refined speaker-dependent features in the stored speaker profiles, and the speaker recognition unit is configured to, in the case that a plurality of speakers are speaking simultaneously, attribute the voice signal to no speaker.
17
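The claims refer to "transit time differences among components of the multichannel signal" without saying how such differences are measured. One common way to estimate an arrival-time difference between two microphone channels is to find the lag that maximizes their cross-correlation. The Python sketch below shows only that estimation step, under stated assumptions: the function names, the sample values, the `expected_lag` for an authorized seat, and the `tolerance` are all hypothetical, and nothing here is taken from the patent's description. Actually removing the non-authorized components would need further processing that is not shown.

```python
from typing import Sequence


def estimate_delay(ref: Sequence[float], other: Sequence[float], max_lag: int) -> int:
    """Return the lag in samples at which `other` best matches `ref`.

    A positive result means the sound reached the `other` microphone later
    than the `ref` microphone. Brute-force cross-correlation, for clarity.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, sample in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += sample * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def from_expected_position(mic_a: Sequence[float], mic_b: Sequence[float],
                           expected_lag: int, tolerance: int = 1,
                           max_lag: int = 8) -> bool:
    """Check whether the measured delay fits a hypothetical seat position.

    `expected_lag` is an assumed, pre-calibrated delay (in samples) for sound
    arriving from the authorized speaker's seat.
    """
    lag = estimate_delay(mic_a, mic_b, max_lag)
    return abs(lag - expected_lag) <= tolerance


if __name__ == "__main__":
    # A short pulse, and the same pulse arriving two samples later.
    mic_a = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
    mic_b = [0, 0, 0, 0, 1, 2, 1, 0, 0, 0]
    print(estimate_delay(mic_a, mic_b, max_lag=4))
    print(from_expected_position(mic_a, mic_b, expected_lag=2))
```

The brute-force loop is quadratic in the window length; it is used here only because it keeps the relationship between lag and correlation score easy to follow.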
Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech (G10L21/02 takes precedence) · CPC title
Speaker identification or verification techniques · CPC title
Microphone arrays; Beamforming · CPC title
Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title