Systems and methods for automatic generation and consumption of hypermeetings
US-2016182851-A1 · Jun 23, 2016 · US
US10133538B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10133538-B2 |
| Application number | US-201514671918-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 27, 2015 |
| Priority date | Mar 27, 2015 |
| Publication date | Nov 20, 2018 |
| Grant date | Nov 20, 2018 |
Official abstract text for this publication.
An audio file analyzer computing system includes technologies to, among other things, localize audio events of interest (such as speakers of interest) within an audio file that includes multiple different classes (e.g., different speakers) of audio. The illustrative audio file analyzer computing system uses a seed segment to perform a semi-supervised diarization of the audio file. The seed segment is pre-selected, such as by a human person using an interactive graphical user interface.
Opening claim text (preview).
The invention claimed is:
1. A method for a high precision segmentation of an audio file having an undetermined number of speakers, the method comprising: receiving a selection indicative of an audio event of interest in an electronic file that includes an undetermined number of different audio events; in response to the selection, creating a seed segment that is representative of less than or equal to ten seconds of the audio event of interest; by comparing features of the seed segment to a set of features extracted from the electronic file, separating the features extracted from the electronic file into a first subset that includes the features extracted from the seed segment and a second subset that includes features extracted from a remaining portion of the electronic file that does not include the seed segment; creating a seed model using, as training data, only the first subset and not the second subset; creating a non-seed model using, as training data, only the second subset and not the first subset; for a feature in the second subset, computing a score based on a comparison of the feature to the seed model and a comparison of the feature to the non-seed model; outputting a segment of the electronic file, wherein the segment includes the feature and at least one label indicative of the seed score and the non-seed score.
2. The method of claim 1, comprising displaying, in a window of a graphical user interface, a time-based graphical depiction of the audio event of interest with a time-based graphical depiction of the at least one segment of the remaining portion of the electronic file that is related to the audio event of interest.
3. The method of claim 1, comprising accessing the electronic file through a video player application, and in a graphical user interface, aligning a playing of a video portion of the electronic file with a time-based graphical depiction of the at least one segment of the remaining portion of the electronic file that is related to the audio event of interest.
4. The method of claim 1, comprising displaying, in a graphical user interface, a list of interactive elements including an interactive element representative of the electronic file, and in response to a selection of the interactive element, playing the at least one segment of the remaining portion of the electronic file that is related to the audio event of interest.
5. The method of claim 1, comprising determining an offset value based on a characteristic of the seed segment; adjusting the seed score based on the offset value; comparing the adjusted seed score to the non-seed score.
6. The method of claim 1, comprising computing both the seed score and the non-seed score using a likelihood log ratio.
7. The method of claim 1, wherein the offset value is determined in response to an interaction with a graphical user interface element.
8. The method of claim 1, comprising ranking a plurality of audio events in the electronic file based on comparing the adjusted seed score to the non-seed score.
9. The method of claim 1, wherein the audio event of interest comprises (i) speech or (ii) non-speech or (iii) a combination of (i) and (ii).
10. The method of claim 1, comprising receiving a plurality of user interface-based selections each corresponding to a different segment of the electronic file, and creating the seed segment based on the plurality of user interface-based selections.
11. The method of claim 1, comprising selecting a filter based on at least one of (i) a type associated with the seed segment or (ii) a characteristic of the seed segment and prior to the separating, using the selected filter to determine the set of features of the electronic file.
12. The method of claim 1, comprising creating a new model based on the audio event of interest and at least one segment of the remaining portion of the electronic file that matches the audio event of interest.
13. The method of claim 12, comprising using the new model, performing audio event recognition on a new electronic file.
14. The method of claim 12, comprising using the new model, searching a audio files for audio events of a same type as the audio event of interest, and outputting a list of audio files arranged according to a likelihood that the audio files comprise an audio event of the same type as the audio event of interest.
15. The method of claim 1, wherein the selection of the audio event of interest is received in response to an interaction with a graphical user interface element.
16. The method of claim 1, wherein the audio event of interest comprises a speech segment produced by a person of interest and the method comprises outputting a list of multi-speaker audio files that comprise speech produced by the person of interest.
17. The method of claim 16, comprising ranking each audio file in the list based on a likelihood of the audio file comprising speech produced by the person of interest.
18. The method of claim 1, comprising displaying a graphical representation of the electronic file, displaying a plurality of interactive graphical user interface elements to facilitate user selection of the seed segment and visualization of at least one segment of the remaining portion of the electronic file that matches the audio event of interest.
19. The method of claim 1, comprising displaying, in a graphical user interface, an interactive graphical element representative of the seed segment.
20. The method of claim 1, comprising displaying, in a graphical user interface, an interactive graphical element representative of a segment of the remaining portion of the electronic file that matches the audio event of interest.
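The scoring steps recited in claims 1, 5, and 6 can be illustrated with a minimal sketch. It is not the patented implementation: the claims do not specify a model family, so a diagonal-covariance Gaussian is assumed here as a stand-in, and the names `fit_diag_gaussian`, `log_likelihood`, `score_non_seed_frames`, and the `offset` parameter are hypothetical. Features are assumed to be precomputed per-frame vectors, with the seed segment given as a frame index range.

```python
import math
from statistics import fmean, pvariance


def fit_diag_gaussian(frames, var_floor=1e-3):
    """Fit a diagonal Gaussian (assumed stand-in model) to feature frames."""
    dims = list(zip(*frames))  # transpose: one tuple of values per dimension
    means = [fmean(d) for d in dims]
    # Floor the variance so a very short seed cannot produce a zero variance.
    variances = [max(pvariance(d), var_floor) for d in dims]
    return means, variances


def log_likelihood(frame, model):
    """Log-likelihood of one frame under a diagonal Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, means, variances)
    )


def score_non_seed_frames(features, seed_range, offset=0.0):
    """Score every frame outside the seed against both models.

    seed_range is a (start, end) pair of frame indices, end exclusive.
    offset is a hypothetical bias added to the seed score (compare claim 5).
    """
    start, end = seed_range
    seed_frames = features[start:end]
    rest = [i for i in range(len(features)) if not start <= i < end]

    # Each model is trained only on its own subset, as in claim 1.
    seed_model = fit_diag_gaussian(seed_frames)
    non_seed_model = fit_diag_gaussian([features[i] for i in rest])

    results = []
    for i in rest:
        seed_score = log_likelihood(features[i], seed_model) + offset
        non_seed_score = log_likelihood(features[i], non_seed_model)
        results.append({
            "index": i,
            "seed_score": seed_score,
            "non_seed_score": non_seed_score,
            # Difference of log-likelihoods, i.e. a log-likelihood ratio.
            "llr": seed_score - non_seed_score,
            "label": "seed" if seed_score > non_seed_score else "non-seed",
        })
    return results


# Toy one-dimensional features: frames 0-3 and 10-13 resemble the seed
# event, frames 4-9 come from a different audio event.
features = [[0.0], [0.2], [-0.1], [0.1],
            [5.0], [5.2], [4.9], [5.1], [5.0], [4.8],
            [0.1], [-0.2], [0.0], [0.15]]

for row in score_non_seed_frames(features, seed_range=(0, 4)):
    print(row["index"], row["label"], round(row["llr"], 2))
```

Sorting the returned rows by `llr` gives one possible ranking in the spirit of claim 8, and raising or lowering `offset` trades recall against precision for the event of interest. A real system would more likely use richer features and models than this single Gaussian.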
Clustering; Classification · CPC title
for retrieval · CPC title
characterised by the analysis technique · CPC title
for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range · CPC title
Decision making techniques; Pattern matching strategies · CPC title