US12167077B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12167077-B2 |
| Application number | US-202318362551-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 31, 2023 |
| Priority date | May 29, 2020 |
| Publication date | Dec 10, 2024 |
| Grant date | Dec 10, 2024 |
Official abstract text for this publication.
Methods and apparatus to identify alternate language versions of media based on signature matching are disclosed. Example apparatus disclosed herein include a signature matcher to compare signatures in monitored data with reference signatures to determine signature match strengths associated with portions of the monitored data, the reference signatures associated with reference media assets. Disclosed example apparatus also include a data segmenter to divide the monitored data into first and second segments, the first segments including temporally adjacent portions of the monitored data having signature match strengths that satisfy a threshold, the second segments including temporally adjacent portions of the monitored data having signature match strengths that do not satisfy the threshold. Disclosed example apparatus further include a trend determiner to determine, based on a pattern of the first and second segments, whether the monitored data is associated with an alternative language version of one of the reference media assets.
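The segmenter and trend determiner described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Segment`, `segment_by_threshold`, and `looks_like_alternate_language` are hypothetical, "satisfies the threshold" is assumed to mean "greater than or equal to", and the pattern test (at least `min_runs` matched runs and `min_runs` unmatched runs, each at least `min_length` portions long) is an assumed stand-in for whatever pattern the disclosure actually evaluates.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Sequence


@dataclass
class Segment:
    matched: bool  # True if the portions in this run satisfy the threshold
    start: int     # index of the first portion in the run
    length: int    # number of temporally adjacent portions in the run


def segment_by_threshold(strengths: Sequence[float], threshold: float) -> List[Segment]:
    """Group temporally adjacent portions into matched and unmatched runs.

    Assumption: a portion satisfies the threshold when strength >= threshold.
    """
    segments: List[Segment] = []
    index = 0
    for matched, run in groupby(strengths, key=lambda s: s >= threshold):
        length = len(list(run))
        segments.append(Segment(matched, index, length))
        index += length
    return segments


def looks_like_alternate_language(
    segments: Sequence[Segment], min_runs: int = 2, min_length: int = 2
) -> bool:
    """Hypothetical pattern test, not taken from the patent.

    Matched runs (e.g. shared music and effects) and unmatched runs
    (e.g. re-recorded dialogue) must both recur, ignoring very short runs.
    """
    long_runs = [s for s in segments if s.length >= min_length]
    matched_runs = sum(1 for s in long_runs if s.matched)
    unmatched_runs = sum(1 for s in long_runs if not s.matched)
    return matched_runs >= min_runs and unmatched_runs >= min_runs


if __name__ == "__main__":
    # Illustrative match strengths, one per monitored portion (made-up values).
    strengths = [0.9, 0.9, 0.2, 0.1, 0.95, 0.9, 0.3, 0.2, 0.9, 0.9]
    segs = segment_by_threshold(strengths, threshold=0.8)
    print(segs)
    print(looks_like_alternate_language(segs))
```

A recording that matches throughout produces a single matched run and fails the test, as does one that never matches; only interleaved matched and unmatched runs pass.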
Opening claim text (preview).
What is claimed is:

1. A method comprising: obtaining a sequence of monitored signatures generated from monitored data of monitored media by one or more audience measurement meters; identifying, based on the sequence of monitored signatures, first segments and second segments of the sequence of the monitored signatures based on: (i) ones of the sequence of monitored signatures in the first segment satisfying a matching criterion with respect to reference signatures of a media asset, and (ii) ones of the sequence of monitored signatures in the second segment not satisfying the matching criterion with respect to reference signatures of the media asset; determining that the identified first segments and second segments exhibit a characteristic pattern; determining, based on determining that the identified first segments and second segments exhibit the characteristic pattern, that the monitored media is an alternative language version of the media asset; and crediting exposure to the alternative language version of the media asset, wherein crediting exposure to the alternative language version of the media asset comprises: transmitting identification data about the alternative language version of the media asset to a media exposure creditor implemented at a data center; and outputting, at a media exposure creditor implemented by a data center, data metrics based on the identification data.
2. The method of claim 1, wherein the identification data includes a media identifier that is a name of the media asset or a metadata tag.
3. The method of claim 1, further comprising: determining that the monitored data of monitored media corresponds to the media asset of reference media assets having the reference signatures.
4. The method of claim 3, wherein determining that the monitored data of the monitored media corresponds to the media asset comprises: using linear or hashed matching to match signatures of the monitored data to the reference signatures of the reference media assets; and comparing the sequence of monitored signatures with a sequence of the reference signatures to determine signature matches associated with portions of monitored data corresponding to the monitored media, the reference signatures associated with a primary language version of the reference media assets.
5. The method of claim 1, wherein determining that the identified first segments and second segments exhibit the characteristic pattern comprises: during an evaluation period of the monitored data, (i) determining that signature matches of the ones of the first segments to the reference signatures satisfy the matching criterion; and (ii) determining that signature matches of the ones of the second segments to the reference signatures do not satisfy the matching criterion.
6. The method of claim 1, wherein the monitored media is accessed via a streaming service of a streaming video provider.
7. The method of claim 1, wherein the determination of the monitored media as the alternative language version of the media asset is additionally based on corresponding lengths of the first segments and the second segments.
8. The method of claim 1, wherein the first segments are associated with nondialogue portions of the alternate language version of the media asset and the second segments are associated with dialogue portions of the alternate language version of the media asset.
9. A non-transitory computer-readable storage medium, having stored thereon program instructions that, upon execution by a processor, cause performance of operations comprising: obtaining a sequence of monitored signatures generated from monitored media by one or more audience measurement meters; identifying, based on the sequence of monitored signatures, first segments and second segments of the sequence of the monitored signatures based on: (i) ones of the sequence of monitored signatures in the first segment satisfying a matching criterion with respect to reference signatures of a media asset, and (ii) ones of the sequence of monitored signatures in the second segment not satisfying the matching criterion with respect to reference signatures of the media asset; identifying the monitored media as an alternative language version of the media asset based on the first segments and the second segments exhibiting a characteristic pattern; generating identification data of the alternative language version of the media asset based on the identification of the monitored media as the alternate language version of the media asset; transmitting the identification data to a media exposure creditor implemented at a data center; and outputting, at the media exposure creditor implemented by the data center, data metrics based on the identification data.
10. The non-transitory computer-readable storage medium of claim 9, the operations further comprising: determining that the monitored data corresponds to the media asset of reference media assets, wherein the determining comprises: using linear or hashed matching to match signatures of the monitored data to the reference signatures of reference media assets; and comparing the sequence of monitored signatures with a sequence of the reference signatures to determine signature matches associated with portions of monitored data corresponding to the monitored media, the reference signatures associated with a primary language version of the reference media assets.
11. The non-transitory computer-readable storage medium of claim 9, wherein the data metrics include data that credits audience exposure to the alternative language version of the media asset.
12. The non-transitory computer-readable storage medium of claim 9, wherein outputting, at the media exposure creditor implemented by the data center, data metrics includes operations further comprising: generating a report including the data metrics.
13. The non-transitory computer-readable storage medium of claim 9, wherein identifying the first segments and the second segments of the sequence of the monitored signatures further comprises operations: during an evaluation period of the monitored data, (i) determining that signature matches of the ones of the first segments to the reference signatures satisfy the matching criterion; and (ii) determining that signature matches of the ones of the second segments to the reference signatures do not satisfy the matching criterion.
14. The non-transitory computer-readable storage medium of claim 9, wherein the alternative language version of the media asset includes a dialogue that is different from a dialogue of a primary language version of the reference media assets.
15. The non-transitory computer-readable storage medium of claim 9, wherein the monitored media is accessed via a streaming service of a streaming video provider.
16. The non-transitory computer-readable storage medium of claim 9, wherein the identification data includes a media identifier that is a name of the media asset or a metadata tag.
17. A system comprising: a media meter configured to monitor a media device and to obtain monitored data about the media device having accessed a streaming service of a streaming video provider; a media exposure creditor implemented at a data center and configured to credit audience exposure of the monitored data; a processor; and a non-transitory computer-readable storage medium, having stored thereon program instructions that, upon execution by the processor, cause performance of operations comprising: obtaining a sequence of signatures representative of the
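Claims 4 and 10 recite "linear or hashed matching" of monitored signatures against reference signatures. The difference between the two can be sketched in Python as follows. This is an assumption-laden illustration rather than the claimed method: signatures are modeled as plain integers that match only when exactly equal (real audio signatures typically need error-tolerant comparison), and the names `build_index`, `hashed_match`, and `linear_match`, along with the asset names and values, are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Hit = Tuple[str, int]  # (reference asset name, offset of the signature in that asset)


def build_index(reference: Dict[str, List[int]]) -> Dict[int, List[Hit]]:
    """Map each reference signature value to every place it occurs."""
    index: Dict[int, List[Hit]] = defaultdict(list)
    for asset, signatures in reference.items():
        for offset, signature in enumerate(signatures):
            index[signature].append((asset, offset))
    return dict(index)


def hashed_match(monitored: Sequence[int], index: Dict[int, List[Hit]]) -> List[List[Hit]]:
    """Hashed matching: one dictionary lookup per monitored signature."""
    return [index.get(signature, []) for signature in monitored]


def linear_match(monitored: Sequence[int], reference: Dict[str, List[int]]) -> List[List[Hit]]:
    """Linear matching: compare each monitored signature with every reference signature."""
    results: List[List[Hit]] = []
    for signature in monitored:
        hits = [
            (asset, offset)
            for asset, signatures in reference.items()
            for offset, ref in enumerate(signatures)
            if ref == signature
        ]
        results.append(hits)
    return results


if __name__ == "__main__":
    # Made-up reference signatures for two hypothetical assets.
    reference = {"asset_a": [11, 22, 33], "asset_b": [44, 22]}
    monitored = [22, 99, 33]
    print(hashed_match(monitored, build_index(reference)))
    print(linear_match(monitored, reference))
```

Both functions return the same hits; the hashed variant trades the memory for an index against a lookup cost that does not grow with the size of the reference library.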
by decomposing the content in the time domain, e.g. in time segments · CPC title
Monitoring of content usage, e.g. the number of times a movie has been viewed, copied or the amount which has been watched (monitoring of user activities for profile generation for accessing a video database G06F16/739; protecting generic digital content where the protection is independent of the precise nature of the content G06F21/10; arrangements for monitoring the use made of the broadcast services in broadcast systems H04H60/31) · CPC title
involving watermark {(protecting executable software by watermarking G06F21/16; image watermarking in general G06T1/0021; watermarks inserted in still images for transmission purposes H04N1/32144; inserting watermarks during video coding H04N19/467)} · CPC title
involving special audio data, e.g. different tracks for different languages · CPC title
involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams (arrangements characterised by components specially adapted for monitoring, identification or recognition of audio in broadcast systems H04H60/58) · CPC title