Voice data transmission method and apparatus
US-2024363120-A1 · Oct 31, 2024 · US
US9313336B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9313336-B2 |
| Application number | US-201113187914-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 21, 2011 |
| Priority date | Jul 21, 2011 |
| Publication date | Apr 12, 2016 |
| Grant date | Apr 12, 2016 |
Official abstract text for this publication.
Systems, methods and apparatus for capturing at least one audio signal using a plurality of microphones that generate a plurality of representations of the at least one audio signal. In some embodiments, the plurality of microphones are disposed in a multiple-microphone setting so that the at least one audio signal is captured by at least two of the plurality of microphones. In some embodiments, at least one of the plurality of microphones is a microphone of a mobile device. The plurality of representations of the at least one audio signal may be processed to obtain a processed representation of the at least one audio signal.
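As a rough illustration of what "processed to obtain a processed representation" could mean in the simplest case, the sketch below picks one representation out of several based on a quality score, in the spirit of the assess-and-select step recited in claim 4. The quality measure used here (RMS level) is only a stand-in chosen for the example; the source does not say which quality is assessed, and the names `rms`, `select_best`, `phone_a`, and `phone_b` are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples (assumed quality proxy)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def select_best(representations, quality=rms):
    """Return the (device_id, samples) pair with the highest quality score.

    `representations` maps a hypothetical device identifier to the samples
    received from that device's microphone.
    """
    return max(representations.items(), key=lambda item: quality(item[1]))


# Two hypothetical captures of the same utterance at different levels.
captures = {
    "phone_a": [0.1, -0.1, 0.1, -0.1],
    "phone_b": [0.5, -0.5, 0.5, -0.5],
}
best_device, best_samples = select_best(captures)
```

A real system would more plausibly score something like an estimated signal-to-noise ratio, since a loud capture is not necessarily a clean one; the `quality` parameter is left pluggable for that reason.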
Opening claim text (preview).
What is claimed is:
1. A method comprising acts of:
  receiving, by a server, a plurality of representations of at least one audio signal captured during a meeting attended by a plurality of participants, each representation of the plurality of representations being generated by at least one corresponding microphone of a plurality of microphones and being received via a separate communication path, the plurality of microphones being disposed in a multiple-microphone setting so that the at least one audio signal is captured by at least two of the plurality of microphones, at least one of the plurality of microphones being a microphone of a mobile device, wherein a location of every microphone relative to every other microphone of the plurality of microphones is unknown to the server prior to a beginning of the meeting; and
  processing, by the server, the plurality of representations of the at least one audio signal to obtain a processed representation of the at least one audio signal, wherein the act of processing the plurality of representations comprises:
    detecting, based on an input other than the plurality of representations of the at least one audio signal, movement of at least one of the plurality of microphones; and
    in response to detecting movement of at least one of the plurality of microphones, shifting, in time, at least one first representation of the plurality of representations at least in part by performing auto-correlation processing on the at least one first representation and at least one second representation of the plurality of representations.
2. The method of claim 1, wherein each of the plurality of microphones is associated with at least one corresponding mobile device of a plurality of mobile devices.
3. The method of claim 2, wherein a first mobile device of the plurality of mobile devices is personal to a first user, and a second mobile device of the plurality of mobile devices is personal to a second user different from the first user.
4. The method of claim 1, wherein the act of processing the plurality of representations comprises acts of:
  assessing at least one quality of the plurality of representations of the at least one audio signal; and
  selecting, as the processed representation, at least one representation from the plurality of representations based at least in part on the at least one quality.
5. The method of claim 1, wherein the act of processing the plurality of representations comprises an act of applying at least one signal enhancement technique to the plurality of representations to obtain the processed representation of the at least one audio signal.
6. The method of claim 1, wherein:
  each of the plurality of microphones corresponds to one of a plurality of devices present at the meeting; and
  the plurality of devices are arranged in an ad hoc array that is not constrained to a fixed geometry.
7. The method of claim 6, wherein at least one of the plurality of devices is unknown prior to the beginning of the meeting.
8. The method of claim 6, wherein a total number of devices in the plurality of devices is unknown prior to the beginning of the meeting.
9. The method of claim 6, wherein the ad hoc array has a first geometry at the beginning of the meeting, and wherein the method further comprises:
  rearranging, during the meeting, the plurality of devices so that the ad hoc array has a second geometry different from the first geometry.
10. At least one non-transitory computer readable medium having encoded thereon computer executable instructions for causing at least one computer to perform a method comprising acts of:
  receiving, by a server, a plurality of representations of at least one audio signal captured during a meeting attended by a plurality of participants, each representation of the plurality of representations being generated by at least one corresponding microphone of a plurality of microphones and being received via a separate communication path, the plurality of microphones being disposed in a multiple-microphone setting so that the at least one audio signal is captured by at least two of the plurality of microphones, at least one of the plurality of microphones being a microphone of a mobile device, wherein a location of every microphone relative to every other microphone of the plurality of microphones is unknown to the server prior to a beginning of the meeting; and
  processing, by the server, the plurality of representations of the at least one audio signal to obtain a processed representation of the at least one audio signal, wherein the act of processing the plurality of representations comprises:
    detecting, based on an input other than the plurality of representations of the at least one audio signal, movement of at least one of the plurality of microphones; and
    in response to detecting movement of at least one of the plurality of microphones, shifting, in time, at least one first representation of the plurality of representations at least in part by performing auto-correlation processing on the at least one first representation and at least one second representation of the plurality of representations.
11. The at least one non-transitory computer readable medium of claim 10, wherein the act of processing the plurality of representations comprises acts of:
  assessing at least one quality of the plurality of representations of the at least one audio signal; and
  selecting, as the processed representation, at least one representation from the plurality of representations based at least in part on the at least one quality.
12. The at least one non-transitory computer readable medium of claim 10, wherein the act of processing the plurality of representations comprises an act of applying at least one signal enhancement technique to the plurality of representations to obtain the processed representation of the at least one audio signal.
13. A system comprising:
  a server comprising at least one processor programmed to:
    receive a plurality of representations of at least one audio signal captured during a meeting attended by a plurality of participants, each representation of the plurality of representations being generated by at least one corresponding microphone of a plurality of microphones and being received via a separate communication path, the plurality of microphones being disposed in a multiple-microphone setting so that the at least one audio signal is captured by at least two of the plurality of microphones, at least one of the plurality of microphones being a microphone of a mobile device, wherein a location of every microphone relative to every other microphone of the plurality of microphones is unknown to the server prior to a beginning of the meeting; and
    process the plurality of representations of the at least one audio signal to obtain a processed representation of the at least one audio signal, wherein the at least one processor is programmed to process the plurality of representations at least in part by:
      detecting, based on an input other than the plurality of representations of the at least one audio signal, movement of at least one of the plurality of microphones; and
      in response to detecting movement of at least one of the plurality of microphones, shifting, in time, at least one first representation of the plurality of representations at least in part by performing auto-correlation processing on the at least one first representation and at least one second representation of the plurality of representations.
14. The system of claim 13, wherein the at least one processor is programmed to process the plurality of representations at least in p
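The independent claims recite shifting one representation in time, after microphone movement is detected, by correlation processing on a first and a second representation. The following is a minimal sketch of that idea under stated assumptions, not the patented implementation. The claims use the term "auto-correlation"; because the sketch correlates two different signals, it computes what is usually called a cross-correlation. The `moved` flag stands in for the non-audio movement input (for example, a motion sensor reading), and all function names and the `max_lag` search window are hypothetical.

```python
def best_lag(reference, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `reference`.

    A positive result means `other` is delayed relative to `reference`.
    The score is a plain cross-correlation sum over the overlapping samples.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def shift_in_time(samples, lag):
    """Advance `samples` by `lag` samples, zero-padding to keep the length."""
    n = len(samples)
    if lag >= 0:
        return samples[lag:] + [0.0] * min(lag, n)
    return [0.0] * min(-lag, n) + samples[:max(n + lag, 0)]


def realign_on_movement(reference, other, moved, max_lag=8):
    """Re-estimate and remove the delay of `other` only when movement is flagged.

    `moved` is an assumed boolean derived from a non-audio input.
    """
    if not moved:
        return other
    return shift_in_time(other, best_lag(reference, other, max_lag))


# The second capture hears the same pulse three samples later.
first = [0.0, 0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
second = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 0.0]
aligned = realign_on_movement(first, second, moved=True)
```

Gating the search on the movement flag reflects the claim's "in response to detecting movement" wording: the delay estimate is only recomputed when there is reason to believe the geometry changed. The brute-force search is quadratic in the window size; a practical system would likely use an FFT-based correlation over short frames.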
using the instant speaker's algorithm (speech detection per se G10L25/78) · CPC title
wireless networks · CPC title