Who is the assignee on this patent?

Beijing Didi Infinity Technology & Dev Co Ltd

What technology area does this patent fall under?

Primary CPC classification G10L15/22. Mapped technology areas include Physics.

When was this patent published?

Publication date Nov 2, 2021 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

System and method for uninterrupted application awakening and speech recognition

US11164584B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11164584-B2
Application number	US-201916563981-A
Country	US
Kind code	B2
Filing date	Sep 9, 2019
Priority date	Oct 24, 2017
Publication date	Nov 2, 2021
Grant date	Nov 2, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods are provided for application awakening and speech recognition. Such system may comprise a microphone configured to record an audio in an audio queue. The system may further comprise a processor configured to monitor the audio queue for an awakening phrase, in response to detecting the awakening phrase, obtain an audio segment from the audio queue, and transmit the obtained audio segment to a server. The recording of the audio may be continuous from a beginning of the awakening phrase to an end of the audio segment.
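As a rough illustration of the arrangement the abstract describes, the sketch below keeps one time-ordered queue that is appended to without interruption, scans it for the awakening phrase, and then reads the following audio from the same queue. Everything in it is an assumption made for illustration and not taken from the patent: the names `AudioQueue`, `WAKE_PHRASE`, `find_wake_end` and `frames_from` are hypothetical, word tokens stand in for audio frames, and exact token matching stands in for real wake-phrase detection.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

# Hypothetical wake phrase, expressed as word tokens instead of audio.
WAKE_PHRASE = ("hello", "app")


class AudioQueue:
    """Time-ordered queue that the recorder appends to without pausing."""

    def __init__(self, max_frames: int = 1000) -> None:
        self._frames: Deque[Tuple[int, str]] = deque(maxlen=max_frames)
        self._next_index = 0

    def record(self, frame: str) -> None:
        # Each frame gets a monotonically increasing index (a stand-in for time).
        self._frames.append((self._next_index, frame))
        self._next_index += 1

    def frames_from(self, index: int) -> List[str]:
        """Return every retained frame recorded at or after `index`."""
        return [frame for i, frame in self._frames if i >= index]

    def find_wake_end(self) -> Optional[int]:
        """Return the index just past the first wake phrase, or None."""
        items = list(self._frames)
        n = len(WAKE_PHRASE)
        for k in range(len(items) - n + 1):
            window = tuple(frame for _, frame in items[k:k + n])
            if window == WAKE_PHRASE:
                return items[k + n - 1][0] + 1
        return None


queue = AudioQueue()
for token in ["noise", "hello", "app", "take", "me", "home"]:
    # Recording continues through and after the wake phrase.
    queue.record(token)

wake_end = queue.find_wake_end()
segment = queue.frames_from(wake_end) if wake_end is not None else []
print(segment)
```

Because detection only reads the queue and never stops the recorder, the frames that follow the awakening phrase are already in the queue when the phrase is recognized. A real system would then transmit the segment to a server, which this sketch leaves out.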

First claim

Opening claim text (preview).

The invention claimed is:
1. A computing system, comprising: a microphone configured to record an audio in an audio queue; and a processor configured to: monitor the audio queue for an awakening phrase; in response to detecting the awakening phrase, obtain an audio segment from the audio queue by performing operations comprising: monitoring the audio queue for a first absence of voice activity, wherein the first absence of voice activity corresponds to a first-detected duration in the audio queue after the awakening phrase with no voice recorded and exceeding a first preset threshold; in response to detecting the first absence of voice activity exceeding the first preset threshold, monitoring the audio queue for a first presence of voice activity after the first absence of voice activity, wherein the first presence of voice activity corresponds to a first-detected duration with voice recorded in the audio queue after the first absence of voice activity; and in response to not detecting the first presence of voice activity within a second preset threshold from an end of the awakening phrase, obtaining the audio segment comprising at least a portion of the audio queue from the end of the awakening phrase to a start of the first absence of voice activity; and transmit the obtained audio segment to a server, wherein the recording of the audio is continuous from a beginning of the awakening phrase to an end of the audio segment.
2. The system of claim 1, wherein: the system is implemented on a mobile device including a mobile phone; the server is caused to perform the speech recognition on the audio segment and return information to the mobile device based on the speech recognition.
3. The system of claim 2, further comprising: a display configured to display the returned information, wherein the returned information comprises texts of a machine-recognized speech corresponding to the audio segment.
4. The system of claim 1, wherein: the audio queue is associated with time; and to monitor the audio queue for the awakening phrase, the processor is configured to screen the recorded audio for a match with the awakening phrase.
5. The system of claim 4, wherein: the recording of the audio in the audio queue is continuous throughout the detecting of the awakening phrase.
6. The system of claim 1, wherein: the audio segment further comprises the awakening phrase.
7. The system of claim 1, wherein: to obtain the audio segment from the audio queue in response to detecting the awakening phrase, the processor is further configured to: in response to detecting the first presence of voice activity within the second preset threshold from an end of the awakening phrase, monitor the audio queue for a second absence of voice activity, wherein the second absence of voice activity corresponds to a first-detected duration in the audio queue after the first presence of voice activity with no voice recorded and exceeding the first preset threshold; and in response to detecting the second absence of voice activity, obtain the audio segment comprising at least a portion of the audio queue from a start of the first presence of voice activity to an end of the first presence of voice activity.
8. The system of claim 7, wherein: the first preset threshold is 700 milliseconds; and the second preset threshold is longer than the first preset threshold.
9. A method, comprising: recording an audio in an audio queue; and monitoring the audio queue for an awakening phrase; in response to detecting the awakening phrase, obtaining an audio segment from the audio queue, wherein the obtaining comprises: monitoring the audio queue for a first absence of voice activity, wherein the first absence of voice activity corresponds to a first-detected duration in the audio queue after the awakening phrase with no voice recorded and exceeding a first preset threshold; in response to detecting the first absence of voice activity exceeding the first preset threshold, monitoring the audio queue for a first presence of voice activity after the first absence of voice activity, wherein the first presence of voice activity corresponds to a first-detected duration with voice recorded in the audio queue after the first absence of voice activity; in response to detecting the first presence of voice activity within a second preset threshold from an end of the awakening phrase, monitoring the audio queue for a second absence of voice activity, wherein the second absence of voice activity corresponds to a first-detected duration in the audio queue after the first presence of voice activity with no voice recorded and exceeding the first preset threshold; and in response to detecting the second absence of voice activity, obtaining the audio segment comprising at least a portion of the audio queue from a start of the first presence of voice activity to an end of the first presence of voice activity; and transmitting the obtained audio segment to a server, wherein the recording of the audio is continuous from a beginning of the awakening phrase to an end of the audio segment.
10. The method of claim 9, wherein: the method is implemented by a mobile device including a mobile phone; the server is caused to perform the speech recognition on the audio segment and return information to the mobile device based on the speech recognition.
11. The method of claim 10, further comprising: displaying the returned information, wherein the returned information comprises texts of a machine-recognized speech corresponding to the audio segment.
12. The method of claim 9, wherein: the audio queue is associated with time; and monitoring the audio queue for the awakening phrase comprises screening the recorded audio for a match with the awakening phrase.
13. The method of claim 12, wherein the recording of the audio in the audio queue is continuous throughout the detecting of the awakening phrase.
14. The method of claim 9, wherein: obtaining the audio segment from the audio queue in response to detecting the awakening phrase further comprises: in response to not detecting the first presence of voice activity within the second preset threshold from an end of the awakening phrase, obtaining the audio segment comprising at least a portion of the audio queue from the end of the awakening phrase to a start of the first absence of voice activity.
15. The method of claim 14, wherein: the audio segment further comprises the awakening phrase.
16. The method of claim 9, wherein: the first preset threshold is 700 milliseconds; and the second preset threshold is longer than the first preset threshold.
17. A non-transitory computer-readable medium, comprising instructions stored therein, wherein the instructions, when executed by one or more processors, cause the one or more processors to perform a method comprising: obtaining a recorded audio in an audio queue; and monitoring the audio queue for an awakening phrase; in response to detecting the awakening phrase, obtaining an audio segment from the audio queue, wherein the obtaining comprises: monitoring the audio queue for a first absence of voice activity, wherein the first absence of voice activity corresponds to a first-detected duration in the audio queue after the awakening phrase with no voice recorded and exceeding a first preset threshold; and in response to detecting the first absence of voice activity exceeding the first preset threshold, obtaining the audio segment comprising at least a portion of the audio queue from an end of
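The segment-selection logic recited in claims 1 and 7 can be read as a two-branch decision over voice-activity flags. The sketch below is one possible reading, not the patented implementation. It assumes a pre-computed list of per-frame voice flags that starts at the end of the awakening phrase and covers at least the second threshold. The function names, the 100 ms frame length and the 3000 ms second threshold are hypothetical; only the 700 ms first threshold comes from the claims (claims 8 and 16).

```python
from typing import List, Optional, Tuple


def first_silence(vad: List[bool], start: int, frame_ms: int,
                  threshold_ms: int) -> Optional[Tuple[int, int]]:
    """Find the first silent run at or after `start` longer than threshold_ms.

    Returns (run_start, detected_at), where detected_at is the frame index
    just past the point at which the run first exceeded the threshold.
    """
    run_start = None
    for i in range(start, len(vad)):
        if vad[i]:
            run_start = None  # voice resets the silent run
            continue
        if run_start is None:
            run_start = i
        if (i - run_start + 1) * frame_ms > threshold_ms:
            return run_start, i + 1
    return None


def select_segment(vad: List[bool], frame_ms: int = 100,
                   first_threshold_ms: int = 700,
                   second_threshold_ms: int = 3000) -> Optional[Tuple[int, int]]:
    """Return the half-open frame range to transmit, or None to keep listening.

    vad[i] is True when frame i (counted from the end of the awakening
    phrase) contains voice. The 3000 ms default is an assumed value.
    """
    first = first_silence(vad, 0, frame_ms, first_threshold_ms)
    if first is None:
        return None  # no qualifying pause yet
    silence_start, scan_from = first

    # Look for voice after the first pause, but only up to the second
    # threshold measured from the end of the awakening phrase.
    deadline = second_threshold_ms // frame_ms
    voice_start = next(
        (i for i in range(scan_from, min(len(vad), deadline)) if vad[i]), None)

    if voice_start is None:
        # Claim 1 branch: the command followed the wake phrase directly.
        return 0, silence_start

    # Claim 7 branch: the user paused, then spoke; wait for the next pause.
    second = first_silence(vad, voice_start, frame_ms, first_threshold_ms)
    if second is None:
        return None  # speech has not ended yet
    return voice_start, second[0]


print(select_segment([True] * 5 + [False] * 30))                # (0, 5)
print(select_segment([False] * 10 + [True] * 6 + [False] * 10))  # (10, 16)
```

The first call models a command spoken immediately after the awakening phrase, so the segment runs from the end of the phrase to the start of the pause. The second models a pause followed by the command, so the segment covers only the later speech.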

Assignees

Beijing Didi Infinity Technology & Dev Co Ltd

Inventors

Classifications

G10L2015/088
Word spotting · CPC title
G10L15/05
Word boundary detection · CPC title
G10L15/22 · Primary
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
G10L2015/223
Execution procedure of a spoken command · CPC title
G10L25/78
Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems H04M9/10) · CPC title

Patent family

Related publications grouped by family.

View patent family 66247160

Related patents

Low-power, always-listening, voice command detection and capture

Voice interaction device, voice interaction method, voice interaction program, and robot

Method and apparatus for evaluating trigger phrase enrollment

Method and apparatus for executing voice command in electronic device

Speech-responsive portable speaker

Device arbitration for listening devices

User interaction with building controller device using a remote server and a duplex connection
