Transmitting multimedia streams to users
US-2018013982-A1 · Jan 11, 2018 · US
US11729354B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11729354-B2 |
| Application number | US-202117514818-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 29, 2021 |
| Priority date | Oct 29, 2021 |
| Publication date | Aug 15, 2023 |
| Grant date | Aug 15, 2023 |
Official abstract text for this publication.
One example method includes joining, by a first client device, a videoconferencing meeting hosted by a video conference provider, the videoconference meeting including a plurality of participants; providing an audio stream and a video stream to the video conference provider; receiving, from a second client device, an audio focus area associated with the video stream provided by the first client device; determining, based on the audio focus area, a bounding region within an environment shown in the video stream; directing a microphone array to capture audio from the bounding region; and providing the captured audio as an audio stream to the video conference provider.
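The step "determining, based on the audio focus area, a bounding region" can be illustrated with a minimal sketch. It assumes the focus area arrives as a rectangle in normalized frame coordinates and that the camera behaves like an ideal pinhole camera with known horizontal and vertical fields of view. Neither assumption comes from the patent text shown here, and the names `FocusArea`, `pixel_to_angle`, and `focus_area_to_angular_region` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class FocusArea:
    """Hypothetical focus rectangle in normalized frame coordinates.

    (0, 0) is the top-left corner of the video frame, (1, 1) the bottom-right.
    """
    left: float
    top: float
    right: float
    bottom: float


def pixel_to_angle(u: float, fov_deg: float) -> float:
    """Convert a normalized coordinate (0..1) to an angle in degrees
    measured from the optical axis, assuming a pinhole camera."""
    half_fov = math.radians(fov_deg) / 2.0
    # At unit focal distance the image plane spans [-tan(half), +tan(half)].
    x = (2.0 * u - 1.0) * math.tan(half_fov)
    return math.degrees(math.atan(x))


def focus_area_to_angular_region(area: FocusArea, hfov_deg: float, vfov_deg: float):
    """Return (az_min, az_max, el_min, el_max) in degrees relative to the
    camera's optical axis. Positive azimuth is to the right of the frame,
    positive elevation is toward the top of the frame."""
    az_min = pixel_to_angle(area.left, hfov_deg)
    az_max = pixel_to_angle(area.right, hfov_deg)
    # Image y grows downward while elevation grows upward, so flip the sign.
    el_max = -pixel_to_angle(area.top, vfov_deg)
    el_min = -pixel_to_angle(area.bottom, vfov_deg)
    return (az_min, az_max, el_min, el_max)


if __name__ == "__main__":
    # Upper-right quadrant of the frame, seen by a camera with a 90 degree
    # field of view on both axes (illustrative values only).
    area = FocusArea(left=0.5, top=0.0, right=1.0, bottom=0.5)
    print(focus_area_to_angular_region(area, hfov_deg=90.0, vfov_deg=90.0))
```

The result is only an angular window as seen from the camera. Turning it into a region of the room would additionally need the room dimensions and the camera's location and orientation, which claim 5 names as inputs but does not detail in the text shown here.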
Opening claim text (preview).
That which is claimed is:

1. A method comprising: joining, by a first client device, a videoconferencing meeting hosted by a video conference provider, the videoconference meeting including a plurality of participants; providing, by the first client device, an audio stream and a video stream to the video conference provider; receiving, from a second client device, an audio focus area associated with the video stream provided by the first client device; determining, by the first client device based on the audio focus area, a bounding region within an environment shown in the video stream; directing, by the first client device, a microphone array to capture audio from the bounding region; and providing, by the first client device, the captured audio as an audio stream to the video conference provider.

2. The method of claim 1, wherein the audio focus area identifies a portion of a video frame received from the first client device.

3. The method of claim 1, wherein the audio focus area identifies a person in a video frame received from the first client device.

4. The method of claim 1, further comprising: determining an audio focus zone within the bounding region; and wherein directing the microphone array comprises directing the microphone array to capture audio from the audio focus zone.

5. The method of claim 1, wherein determining the bounding region within the environment is based on dimensions of a room and a location and orientation of a camera providing the video stream.

6. The method of claim 1, wherein directing the microphone array comprises changing a position or orientation of the microphone array or one or more microphones in the microphone array.

7. The method of claim 1, wherein directing the microphone array comprises changing one or more beamforming parameters of the microphone array.

8. The method of claim 1, wherein the microphone array is a first microphone array, and further comprising: receiving, from a third client device, a second audio focus area associated with the video stream provided the first client device; determining, based on the second audio focus area, a second bounding region within the environment shown in the video stream; directing a second microphone array to capture second audio from the second bounding region; and providing the captured second audio as a second audio stream to the video conference provider.

9. A client device comprising: a communications interface; a non-transitory computer-readable medium; and one or more processors communicatively coupled to the communications interface and the non-transitory computer-readable medium, the one or more processors configured to execute processor-executable instructions stored in the non-transitory computer-readable medium to cause the one or more processors to: join a videoconferencing meeting hosted by a video conference provider, the videoconference meeting including a plurality of participants; provide an audio stream and a video stream to a video conference provider; receive, from a client device, an audio focus area associated with a video stream provided the client device; determine, based on the audio focus area, a bounding region within an environment shown in the video stream; direct a microphone array to capture audio from the bounding region; and provide the captured audio as an audio stream to the video conference provider.

10. The client device of claim 9, wherein the audio focus area identifies a portion of a video frame provided the client device.

11. The client device of claim 9, wherein the audio focus area identifies a previously received audio focus area.

12. The client device of claim 9, wherein the one or more processors are configured to execute further processor-executable instructions stored in the non-transitory computer-readable medium to cause the one or more processors to: determine an existing bounding region similar to the bounding region, and, determine the existing bounding region as the bounding region.

13. The client device of claim 9, wherein the one or more processors are configured to execute further processor-executable instructions stored in the non-transitory computer-readable medium to cause the one or more processors to change one or more beamforming parameters of the microphone array.

14. The client device of claim 9, wherein the microphone array is a first microphone array, and wherein the one or more processors are configured to execute further processor-executable instructions stored in the non-transitory computer-readable medium to cause the one or more processors to: receive, from a second client device, a second audio focus area associated with the video stream provided the client device; determining, based on the second audio focus area, a second bounding region within the environment shown in the video stream; directing a second microphone array to capture second audio from the second bounding region; and providing the captured second audio as a second audio stream to the video conference provider.

15. A non-transitory computer-readable medium comprising processor-executable instructions configured to cause one or more processors to: join, by a client device, a videoconferencing meeting hosted by a video conference provider, the videoconference meeting including a plurality of participants; provide an audio stream and a video stream to a video conference provider; receive, from a second client device, an audio focus area associated with a video stream provided the client device; determine, based on the audio focus area, a bounding region within an environment shown in the video stream; direct a microphone array to capture audio from the bounding region; and provide the captured audio as an audio stream to the video conference provider.

16. The non-transitory computer-readable medium of claim 15, wherein the audio focus area identifies a plurality of portions of a video frame provided by the client device.

17. The non-transitory computer-readable medium of claim 15, further comprising processor-executable instructions configured to cause the one or more processors to determining the region within the environment based on dimensions of a room and a location and orientation of a camera providing the video stream.

18. The non-transitory computer-readable medium of claim 15, further comprising processor-executable instructions configured to cause the one or more processors to: determine an existing bounding region similar to the bounding region, and, determine the existing bounding region as the bounding region.

19. The non-transitory computer-readable medium of claim 15, further comprising processor-executable instructions configured to cause the one or more processors to change one or more beamforming parameters of the microphone array.

20. The non-transitory computer-readable medium of claim 15, wherein the microphone array is a first microphone array, and further comprising processor-executable instructions configured to cause the one or more processors to: receive, from a third client device, a second audio focus area associated with the video stream provided the client device; determining, based on the second audio focus area, a second bounding region within the environment shown in the video stream; directing a second microphone array to capture second audio from the second bounding region; and providing the captured second audio as a second audio stream to the video conference provider.
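Claims 7, 13, and 19 refer to "changing one or more beamforming parameters" without naming the parameters. As an illustration only, the sketch below computes per-microphone steering delays for a delay-and-sum beamformer on a linear array, which is one common way to point an array at a direction without moving it. The choice of delay-and-sum, the linear geometry, and the name `steering_delays` are assumptions, not details taken from the claims.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at about 20 °C


def steering_delays(mic_positions_m, angle_deg):
    """Return the delay in seconds to apply to each microphone so that a
    plane wave arriving from `angle_deg` adds up in phase.

    mic_positions_m: microphone x-coordinates along the array axis, in meters.
    angle_deg: arrival direction measured from broadside (perpendicular to
               the array), positive toward the +x end of the array.
    """
    s = math.sin(math.radians(angle_deg))
    # Relative arrival time at each microphone: a wave coming from the +x
    # side reaches microphones with larger x earlier.
    arrival = [-x * s / SPEED_OF_SOUND_M_S for x in mic_positions_m]
    # Delay the early microphones so every channel lines up with the
    # latest arrival; the latest microphone gets zero delay.
    latest = max(arrival)
    return [latest - t for t in arrival]


if __name__ == "__main__":
    # Four microphones spaced 5 cm apart, steered 30 degrees off broadside
    # (illustrative geometry only).
    mics = [0.0, 0.05, 0.10, 0.15]
    for x, d in zip(mics, steering_delays(mics, 30.0)):
        print(f"mic at {x:.2f} m: delay {d * 1e6:.1f} microseconds")
```

In a setup like this, the steering angle could be taken from the center of the angular window derived from the audio focus area, and the delayed channels would then be summed to form the captured audio stream.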
Conference systems · CPC title
based on user input or interaction · CPC title
Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items (segmenting video sequences G06V20/49) · CPC title
microphones · CPC title
for combining the signals of two or more microphones (specially adapted for hearing aids H04R25/407) · CPC title