Near-end indication that the end of speech is received by the far end in an audio or video conference

US9525845B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9525845-B2
Application number: US-201314426134-A
Country: US
Kind code: B2
Filing date: Sep 27, 2013
Priority date: Sep 27, 2012
Publication date: Dec 20, 2016
Grant date: Dec 20, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of client device and method for audio or video conferencing are described. An embodiment includes an offset detecting unit, a configuring unit, an estimator and an output unit. The offset detecting unit detects an offset of speech input to the client device. The configuring unit determines a voice latency from the client device to every far end. The estimator estimates a time when a user at the far end perceives the offset based on the voice latency. The output unit outputs a perceivable signal indicating that a user at the far end perceives the offset based on the time estimated for the far end. The perceivable signal is helpful to avoid collision between parties.
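The estimation step described in the abstract can be illustrated with a minimal sketch. It assumes that the estimated perception time is simply the detected offset time plus the one-way voice latency to each far end; the abstract does not give a formula, so this is an illustration rather than the patented method. The names `FarEnd`, `estimate_perception_times`, and `signals_due`, and the latency values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FarEnd:
    name: str
    voice_latency_ms: float  # assumed one-way latency, near end -> far end


def estimate_perception_times(offset_time_ms, far_ends):
    """Map each far end to the time its user is assumed to hear the end of speech.

    Assumption: perception time = local offset time + one-way voice latency.
    """
    return {fe.name: offset_time_ms + fe.voice_latency_ms for fe in far_ends}


def signals_due(perception_times, now_ms):
    """Return the far ends whose estimated perception time has already passed."""
    return sorted(name for name, t in perception_times.items() if t <= now_ms)


# Hypothetical conference with two far ends; speech ends locally at t = 5000 ms.
far_ends = [FarEnd("far_end_a", 120.0), FarEnd("far_end_b", 310.0)]
times = estimate_perception_times(5000.0, far_ends)
due_at_5200 = signals_due(times, 5200.0)
```

In this sketch the near end would show the indication for `far_end_a` at 5200 ms but would still be waiting for `far_end_b`, whose user is assumed not to have heard the end of speech yet.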

First claim

Opening claim text (preview).

We claim:

1. A client device for use in an audio or video conference system, comprising: an offset detecting unit configured to detect an offset of speech input to the client device; a configuring unit configured to, for each of at least one far end, determine a first voice latency from the client device to the far end; an estimator configured to, for each of the at least one far end, estimate a time when a user at the far end perceives the offset, based on the first voice latency; and an output unit configured to, for each of the at least one far end, output a first perceivable signal indicating that a user at the far end perceives the offset based on the time estimated for the far end; wherein the output unit is configured to output one of subtle reverb and noticeable noise field not audible to other parties during a period after the offset detecting unit detects the offset and before the output unit outputs the first perceivable signal.

2. The client device according to claim 1, wherein the at least one far end comprises only one far end having the largest first voice latency among all the far ends involving a conference with the client device.

3. The client device according to claim 1, wherein the configuring unit is further configured to determine the first voice latency at least based on a transmission delay from the client device to the far end.

4. The client device according to claim 3, wherein the configuring unit is further configured to acquire a network delay from the client device to the far end as the transmission delay.

5. The client device according to claim 1, wherein the configuring unit is further configured to determine a network delay of a route from the client device to the at least one far end, further comprising a jitter monitor configured to acquire jitter range of the network delay, and the output unit is further configured to present the network delay of the route and the jitter range.

6. The client device according to claim 1, further comprising a jitter buffer tuner configured to, in response to a user input, adjust the jitter buffer delay of a jitter buffer on a route from the client device to the at least one far end.

7. The client device according to claim 6, further comprising a transmitting unit configured, in response to the adjusting, to transmit to the far end of the corresponding route an indication that the jitter buffer delay of the jitter buffer has been changed.

8. The client device according to claim 3, wherein the output unit is further configured to, for each of the at least one far end, output a second perceivable signal in response to elapsing of a time interval after outputting the first perceivable signal, and wherein the configuring unit is further configured to determine the time interval as not less than a second voice latency from the far end to the client device.

9. The client device according to claim 1, further comprising: a receiving unit configured to receive data frames; and a voice activity detector configured to detect voice activity in the data frames directly output from the receiving unit, wherein the output unit is further configured to output a third perceivable signal indicating that there is incoming speech from a far end.

10. The client device according to claim 9, wherein the voice activity detector is further configured to detect voice activity from local audio input, and the output unit is further configured to output a fourth perceivable signal indicating that there is a collision if both voice activities are detected from the data frames and the local audio input at the same time.

11. A client device for use in an audio or video conference system, comprising: a receiving unit configured to receive data frames; a voice activity detector configured to detect voice activity in the data frames directly output from the receiving unit; and an output unit configured to output a perceivable signal indicating that there is incoming speech from a far end, wherein the voice activity detector is further configured to detect voice activity from local audio input, and the output unit is further configured to output another perceivable signal indicating that there is a collision if both voice activities are detected from the data frames and the local audio input at the same time.

12. A method of audio or video conferencing for use in a client device, comprising: a configuring step of, for each of at least one far end, determining a first voice latency from the client device to the far end; a detecting step of detecting an offset of speech input to the client device; an estimating step of, for each of the at least one far end, estimating a time when a user at the far end perceives the offset, based on the first voice latency; an outputting step of, for each of the at least one far end, outputting a first perceivable signal indicating that a user at the far end perceives the offset based on the time estimated for the far end; and outputting one of subtle reverb and noticeable noise field not audible to other parties during a period after detecting the offset and before outputting the first perceivable signal.

13. The method according to claim 12, wherein the configuring step further comprises determining the first voice latency at least based on a transmission delay from the client device to the far end.

14. The method according to claim 12, further comprising: determining a network delay of a route from the client device to the at least one far end, acquiring jitter range of the network delay, and presenting the network delay of the route and the jitter range.

15. The method according to claim 12, further comprising, in response to a user input, adjusting the jitter buffer delay of a jitter buffer on a route from the client device to the at least one far end.

16. The method according to claim 15, further comprising, in response to the adjusting, transmitting to the far end of the corresponding route an indication that the jitter buffer delay of the jitter buffer has been changed, wherein the indication further comprises the adjusted jitter buffer delay of the jitter buffer.

17. The method according to claim 13, further comprising: for each of the at least one far end, outputting a second perceivable signal in response to elapsing of a time interval after outputting the first perceivable signal, and wherein the time interval is set as not less than a second voice latency from the far end to the client device.

18. The method according to claim 12, further comprising: a receiving step of receiving data frames; and a voice activity detecting step of detecting voice activity in the data frames received through the receiving step, wherein the outputting step further comprises outputting a third perceivable signal indicating that there is incoming speech from a far end, detecting voice activity from local audio input, and outputting a fourth perceivable signal indicating that there is a collision if both voice activities are detected from the data frames and the local audio input at the same time.
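Two of the claimed behaviors lend themselves to a short illustration: the incoming-speech and collision indications (claims 9 to 11 and 18) and the timing of the second perceivable signal (claims 8 and 17). The sketch below is only one possible reading of the claim language. The function names, the signal labels, and the `margin_ms` parameter are hypothetical, and the voice activity flags are assumed to come from a detector that is not shown.

```python
def conference_indicators(remote_voice_active, local_voice_active):
    """Return the set of indications to present for one detection interval.

    remote_voice_active: voice activity found in received data frames.
    local_voice_active: voice activity found in the local audio input.
    """
    signals = set()
    if remote_voice_active:
        signals.add("incoming_speech")  # third perceivable signal
    if remote_voice_active and local_voice_active:
        signals.add("collision")  # fourth perceivable signal
    return signals


def second_signal_time(first_signal_time_ms, return_latency_ms, margin_ms=0.0):
    """Earliest time for the second signal after the first one.

    The claims require the interval to be not less than the voice latency
    from the far end back to the client; margin_ms is an optional,
    hypothetical extra allowance and must not be negative.
    """
    if margin_ms < 0:
        raise ValueError("margin_ms must be non-negative")
    return first_signal_time_ms + return_latency_ms + margin_ms


both_talking = conference_indicators(True, True)
only_remote = conference_indicators(True, False)
only_local = conference_indicators(False, True)
second_at = second_signal_time(5310.0, 290.0)
```

Under these assumptions, the second signal marks the earliest moment at which a reply from the far end could plausibly have reached the near end, which is why the interval is bounded below by the return latency.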

Assignees

Inventors

Classifications

  • H04N7/147 (Primary)

    Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals (selecting H04Q) · CPC title

  • Conference systems · CPC title

  • H04M3/569 (Primary)

    using the instant speaker's algorithm (speech detection per se G10L25/78) · CPC title

  • Delay circuits; Timers · CPC title

  • Displays · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9525845B2 cover?
Embodiments of client device and method for audio or video conferencing are described. An embodiment includes an offset detecting unit, a configuring unit, an estimator and an output unit. The offset detecting unit detects an offset of speech input to the client device. The configuring unit determines a voice latency from the client device to every far end. The estimator estimates a time when a…
Who is the assignee on this patent?
Dolby Laboratories Licensing Corp, Dolby Int Ab, and 1 more
What technology area does this patent fall under?
Primary CPC classification H04N7/147. Mapped technology areas include Electricity.
When was this patent published?
Publication date Dec 20, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).