Object pose estimation
US-2020363815-A1 · Nov 19, 2020 · US
US11200897B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11200897-B2 |
| Application number | US-201916559184-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 3, 2019 |
| Priority date | Jun 25, 2019 |
| Publication date | Dec 14, 2021 |
| Grant date | Dec 14, 2021 |
Official abstract text for this publication.
A method and apparatus for selecting a voice-enabled device are disclosed. The voice-enabled device selecting apparatus may reduce the amount of communication load between a home IoT server and home IoT devices and minimize the amount of computation of the home IoT server by obtaining information related to the direction from which each voice recognition device receives a wakeup word from a plurality of voice recognition devices, determining the position where the wakeup word is spoken by using the information related to the direction from which the wakeup word is received, and selecting the voice recognition device closest to the speech position as a voice-enabled device. At least one of the voice-enabled device selecting apparatus, an IoT device, and a server may be associated with an artificial intelligence (AI) module, an unmanned aerial vehicle (UAV) (or drone), a robot, an augmented reality (AR) device, a virtual reality (VR) device, and a device related to a 5G service.
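The abstract's three steps (collect the direction of arrival from each device, estimate where the wakeup word was spoken, pick the nearest device) can be illustrated with a minimal sketch. The abstract does not say how the position is computed; the least-squares intersection of bearing lines used here is an assumption, as are the coordinate convention (horizontal angle measured in the x-y plane, vertical angle measured up from it) and every function and variable name.

```python
import math


def direction_from_angles(horizontal_deg, vertical_deg):
    """Unit vector for an assumed azimuth/elevation convention."""
    az = math.radians(horizontal_deg)
    el = math.radians(vertical_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))


def solve_3x3(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("bearings are parallel; position is not determined")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [x - factor * y for x, y in zip(m[row], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def estimate_speech_position(devices):
    """Point minimizing the squared distance to every device's bearing line.

    devices: list of (location_xyz, horizontal_deg, vertical_deg).
    Each bearing is treated as an infinite line, not a one-sided ray.
    """
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for location, horizontal_deg, vertical_deg in devices:
        d = direction_from_angles(horizontal_deg, vertical_deg)
        for i in range(3):
            for j in range(3):
                # (I - d d^T) projects onto the plane perpendicular to d.
                proj = (1.0 if i == j else 0.0) - d[i] * d[j]
                a[i][j] += proj
                b[i] += proj * location[j]
    return solve_3x3(a, b)


def select_closest_device(devices, names):
    """Return the name of the device nearest the estimated speech position."""
    position = estimate_speech_position(devices)
    distances = [math.dist(location, position) for location, _, _ in devices]
    best = min(range(len(devices)), key=distances.__getitem__)
    return names[best], position


if __name__ == "__main__":
    # Hypothetical room layout in metres; the angles are made-up readings.
    devices = [
        ((0.0, 0.0, 1.0), 45.0, 19.5),
        ((4.0, 0.0, 1.0), 161.6, 9.0),
        ((2.0, 4.0, 0.5), -108.4, 17.5),
    ]
    print(select_closest_device(devices, ["speaker", "tv", "fridge"]))
```

Under these assumptions, each device only needs to report two angles rather than a raw audio signal, which is consistent with the abstract's stated goal of reducing communication load and server computation. At least two non-parallel bearings are needed for the position to be determined.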
Opening claim text (preview).
What is claimed is:
1. A method for selecting a voice-enabled device, the method comprising: obtaining information related to a wakeup word from a plurality of voice recognition devices; and selecting the voice-enabled device from among the plurality of voice-recognition devices based on the information related to the wakeup word, wherein the obtaining of the information related to the wakeup word comprises obtaining information related to a direction from which each voice recognition device receives the wakeup word from the plurality of voice recognition devices, wherein the selecting of the voice-enabled device comprises: determining a position where the wakeup word is spoken based on the information related to the direction from which each voice recognition device receives the wakeup word; and selecting a voice recognition device related to the position where the wakeup word is spoken, wherein the selecting of the voice-enabled device comprises selecting the voice recognition device closest to the position where the wakeup word is spoken from among the plurality of voice recognition devices, and wherein the information related to the direction from which the wakeup word is received comprises: information related to a vertical angle at which each voice recognition device receives the wakeup word; and information related to a horizontal angle at which each voice recognition device receives the wakeup word.
2. The method of claim 1, wherein the determining of the position where the wakeup word is spoken comprises: obtaining information related to a location of each voice recognition device; and estimating the position where the wakeup word is spoken by using the information related to the location of each voice recognition device and the information related to the direction from which the wakeup word is received.
3. The method of claim 1, further comprising: obtaining information related to a situation in which the wakeup word is recognized by each voice recognition device; and applying the information related to the situation in which the wakeup word is recognized to a pre-learned threshold situation detection and classification model, wherein the selecting of the voice-enabled device comprises: determining whether the situation in which the wakeup word is recognized is a threshold situation based on a result of the applying; and selecting the voice-enabled device using both the information related to the direction from which the wakeup word is received and a plurality of voice signals each voice recognition device obtains by recognizing the wakeup word.
4. The method of claim 3, further comprising, if the situation in which the wakeup word is recognized is determined to be a threshold situation, obtaining the plurality of voice signals from the plurality of voice recognition devices, wherein the selecting of the voice recognition device comprises: determining a direction in which the wakeup word is spoken based on the plurality of voice signals; and selecting the voice recognition device related to the position where the wakeup word is spoken.
5. The method of claim 4, wherein the selecting of the voice recognition device comprises selecting the voice recognition device closest to the position where the wakeup word is spoken from among at least one voice recognition device positioned in the direction in which the wakeup word is spoken.
6. The method of claim 3, wherein information related to a situation in which the wakeup word is spoken comprises: information related to time at which the wakeup word is recognized; information related to a user who is recognized as having spoken the wakeup word; information related to a location of each voice recognition device; and information related to the position where the wakeup word is spoken.
7. The method of claim 3, wherein the pre-learned threshold situation detection and classification model is stored in an external artificial intelligence (AI) device, and wherein the applying of the information related to the situation in which the wakeup word is recognized to the pre-learned threshold situation detection and classification model comprises: transmitting feature values related to information related to a situation in which the wakeup word is spoken to the AI device; and obtaining, from the AI device, the result of the applying the information related to the situation in which the wakeup word is recognized to the pre-learned threshold situation detection and classification model.
8. The method of claim 3, wherein the pre-learned threshold situation detection and classification model is stored in a network, and wherein the applying of the information related to the situation in which the wakeup word is recognized to the pre-learned threshold situation detection and classification model comprises: transmitting information related to a situation in which the wakeup word is spoken to the network; and obtaining, from the network, the result of the applying the information related to the situation in which the wakeup word is recognized to the pre-learned threshold situation detection and classification model.
9. The method of claim 8, further comprising receiving, from the network, downlink control information (DCI) which is used to schedule the transmission of the information related to the situation in which the wakeup word is spoken, wherein the information related to the situation in which the wakeup word is spoken is transmitted to the network based on the DCI.
10. The method of claim 9, further comprising performing an initial access procedure with the network based on a synchronization signal block (SSB), wherein the information related to the situation in which the wakeup word is spoken is transmitted to the network via a physical uplink shared channel (PUSCH), and the SSB and a demodulation reference signal (DM-RS) of the PUSCH are quasi co-located with a quasi co-location (QCL) type D.
11. The method of claim 9, further comprising: controlling a communication module to transmit the information related to the situation in which the wakeup word is spoken to an artificial intelligence (AI) processor included in the network; and controlling the communication module to receive AI-processed information from the AI processor, wherein the AI-processed information is information related to threshold situation probability for determining whether the situation in which the wakeup word is spoken is the threshold situation.
12. An apparatus for selecting a voice-enabled device, the apparatus comprising: a communication module configured to obtain information related to a wakeup word from a plurality of voice recognition devices; and a processor configured to select the voice-enabled device from among the plurality of voice-recognition devices based on the information related to the wakeup word, wherein the processor obtains information related to a direction from which each voice recognition device receives the wakeup word from the plurality of voice recognition devices through the communication module, determines a position where the wakeup word is spoken based on the information related to the direction from which each voice recognition device receives the wakeup word, and selects a voice recognition device related to the position where the wakeup word is spoken, wherein the processor selects the voice recognition device closest to the position where the wakeup word is spoken from among the plurality of voice recognition devices, and wherein the information related to the direction from which the wakeup word is received comprises: information related to a vertical an
Supervised learning · CPC title
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
the user being prompted to utter a password or a predefined phrase · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Speech classification or search · CPC title