Systems and methods for automatically enabling subtitles based on detecting an accent

US9854324B1 · US · B1

Patent metadata
Field               Value
Publication number  US-9854324-B1
Application number  US-201715419284-A
Country             US
Kind code           B1
Filing date         Jan 30, 2017
Priority date       Jan 30, 2017
Publication date    Dec 26, 2017
Grant date          Dec 26, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods are described for automatically enabling subtitles based on a user profile when a language is spoken with an accent a user has difficulty understanding. For example, a media guidance application may detect a first plurality of user interactions of the user while the given language is being spoken with the accent. Based on the first plurality of interactions, the media guidance application may calculate a first value associated with a user specific level of difficulty indicating how difficult it is for the user to understand the language when spoken with the accent. If the first plurality of user interactions are not being performed again, the media guidance application may update the user specific difficulty with a second value that is lower than the first value. The media guidance application may automatically generate for display subtitles for a media asset based on the user specific level of difficulty.
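The scoring step in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the table contents, the use of a mean to combine values, the 0.5 threshold, and the names `INFO_TABLE`, `difficulty_from_interactions`, and `should_enable_subtitles` are all assumptions introduced here. The source says only that a value is calculated from the detected interactions and that subtitles are generated based on the user specific level of difficulty.

```python
from statistics import mean

# Hypothetical information table: interaction -> general difficulty value.
# The source gives no concrete values; these numbers are illustrative only.
INFO_TABLE = {
    "rewind": 0.8,
    "pause": 0.4,
    "enable_subtitles": 1.0,
    "increase_volume": 0.6,
}


def difficulty_from_interactions(interactions, table=INFO_TABLE):
    """Look up each detected interaction and combine the values.

    Taking the mean is an assumed aggregation; the source only states that
    the value is calculated based on the per-interaction values.
    Interactions missing from the table are ignored.
    """
    values = [table[name] for name in interactions if name in table]
    return mean(values) if values else 0.0


def should_enable_subtitles(difficulty, threshold=0.5):
    """Assumed decision rule: enable subtitles at or above a threshold."""
    return difficulty >= threshold


# Interactions detected while the accent was being spoken (example data).
first_value = difficulty_from_interactions(["rewind", "increase_volume"])
enable = should_enable_subtitles(first_value)
```

A lookup table keeps the per-interaction weights separate from the scoring logic, which matches the abstract's split between detected interactions and the value derived from them.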

First claim

Opening claim text (preview).

What is claimed is:

1. A method for automatically enabling subtitles based on a user profile when a language is spoken with an accent, the method comprising: storing, in a user profile associated with a user, a first data structure indicating a list of one or more languages that the user understands; determining, at a first point in time, that a language of the one or more languages in the list is being spoken with an accent by retrieving the first data structure, extracting the list, and comparing the language to the one or more languages; detecting a first plurality of user interactions of the user while the given language is being spoken with the accent; storing, in the user profile, a data log indicating the first point in time and the first plurality of user interactions; retrieving, from a remote source, an information table associating user interactions with values, wherein the values represent a general level of difficulty, the general level of difficulty being indicative of a measure of difficulty a plurality of users have in understanding accents in audio content; comparing the first plurality of user interactions with the information table to determine a first plurality of values, wherein each value of the first plurality of values is associated with a respective one of the first plurality of user interactions; calculating a first value based on the first plurality of values; creating a second data structure, wherein the second data structure associates the first value with a user specific level of difficulty, the user specific level of difficulty being indicative of a measure of difficulty the user encounters in understanding the given language when spoken with the accent; storing the second data structure in the user profile; detecting that the given language is being spoken with the accent at a second point in time later than the first point in time; based on detecting that the given language is being spoken with the accent at the second point in time, retrieving, from the user profile, the data log; monitoring user interactions of the user while the given language is being spoken with the accent at the second point in time to determine whether the first plurality of user interactions are being performed again while the given language is being spoken with the accent; based on determining that the first plurality of user interactions are not being performed again, updating the second data structure, the second data structure associating a second value that is lower than the first value with the user specific level of difficulty; detecting that a media asset includes the given language spoken with the accent; retrieving, from the user profile, the second data structure; extracting, from the second data structure, the user specific level of difficulty; and automatically generating for display subtitles for the media asset based on the extracted user specific level of difficulty.

2. The method of claim 1, wherein the respective one of the first plurality of user interactions is at least one of a facial expression, rewinding a previous media asset, pausing the previous media asset, enabling subtitles, increasing the volume, a head movement, a user gesture, a vocal utterance, a user setting input, a user geographic input, a user demographic information input, and a social media post.

3. The method of claim 1 further comprising calculating the second value, wherein calculating the second value comprises: extracting, from the data log, the first point in time; comparing the first point in time to the second point in time to determine an elapsed time between the first point in time and the second point in time; retrieving, from the remote source, a decay function, the decay function associating the elapsed time to the user specific level of difficulty; and inputting the elapsed time and the first value into the decay function to determine the second value.

4. The method of claim 1 further comprising calculating the second value, wherein calculating the second value comprises: monitoring user usage of user equipment to determine a plurality of media assets that the user has watched, wherein the plurality of media assets each contain audio with the given language spoken with the accent; storing a usage log indicating the plurality of media assets and when the user watched each of the plurality of media assets; determining the number of media assets in the usage log that the user watched between the first point in time and the second point in time; retrieving, from the remote source, a decay function, the decay function relating the number of media assets to the user specific level of difficulty; and inputting the number of media assets and the first value into the decay function to determine the second value.

5. The method of claim 4, wherein the user is a first user, wherein the data log is a first data log, wherein the usage log is a first usage log, wherein the plurality of media assets is a first plurality of media assets, wherein the number of media assets is a first number of media assets, and wherein the decay function is determined by: receiving, at the remote source, a second data log and a third data log from user equipment associated with a second user, wherein the second data log indicates a third point in time and a second plurality of user interactions, and wherein the third data log indicates a fourth point in time and a third plurality of interactions; comparing the second plurality of user interactions with the information table to determine a second plurality of values; comparing the third plurality of user interactions with the information table to determine a third plurality of values; calculating a third value based on the second plurality of values and a fourth value based on the third plurality of values; receiving, at the remote source, a second usage log indicating a second plurality of media assets, wherein the second plurality of media assets each contain audio with the given language spoken with the accent; determining a second number of media assets in the second usage log the second user watched between the third point in time and the fourth point in time; and based on the calculated third value, the calculated fourth value, and the second number of media assets, determining the decay function.

6. The method of claim 1 further comprising: based on monitoring user interactions of the user while the given language is being spoken with the accent at the second point in time, determining a second plurality of user interactions of the user while the given language is being spoken with the accent; and calculating the second value, wherein calculating the second value comprises: comparing the second plurality of user interactions with the information table to determine a second plurality of values; and calculating the second value based on the second plurality of values; and wherein determining that the first plurality of user interactions are not being performed again comprises comparing the second plurality of user interaction to the first plurality of user interactions.

7. The method of claim 1, wherein the information table is a first information table, and wherein calculating the first value further comprises: determining the geographic location of the user; retrieving, from a remote source, a second information table associating the geographic location and the accent with an augmenting value, wherein the augmenting value is calculated by: receiving, from a plurality of users located in the geographic location, a plurality of data logs indicating a second plurality of user interactions of the plurality of users detected while the given language was being spoken with the accent on user equipm
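Claims 3 and 4 both feed a quantity (elapsed time, or a count of media assets watched) and the first value into a decay function to obtain the lower second value. The claims do not state the form of that function, so the sketch below assumes a simple exponential decay with a half-life. The names `decayed_difficulty` and `elapsed_days` and all numeric values are hypothetical.

```python
from datetime import datetime


def decayed_difficulty(first_value, amount, half_life):
    """Assumed exponential decay: the value halves every `half_life` units.

    `amount` can be elapsed days (as in claim 3) or the number of media
    assets watched between the two points in time (as in claim 4).
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return first_value * 0.5 ** (amount / half_life)


def elapsed_days(first_time, second_time):
    """Elapsed time between the two points in time, in days."""
    return (second_time - first_time).total_seconds() / 86400.0


# Claim 3 style: decay by elapsed time (dates are example data).
first_point = datetime(2017, 1, 30)
second_point = datetime(2017, 3, 1)
second_value_by_time = decayed_difficulty(
    0.8, elapsed_days(first_point, second_point), half_life=30.0
)

# Claim 4 style: decay by number of accented media assets watched.
second_value_by_usage = decayed_difficulty(0.8, amount=10, half_life=5)
```

Because the same function accepts either input, only the unit of `half_life` changes between the time-based and usage-based variants. Claim 5 describes deriving the decay function from other users' logs, which this sketch does not attempt.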

Assignees

Rovi Guides Inc

Inventors

Classifications

  • Language recognition · CPC title

  • Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams · CPC title

  • for displaying subtitles · CPC title

  • Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV programme (methods or arrangements for recognising human body or animal bodies or body parts G06V40/10; methods or arrangements for acquiring or recognising human faces, facial parts, facial sketches, facial expressions G06V40/16; methods or arrangements for recognising movements or behaviour G06V40/20; arrangements for identifying users in broadcast systems H04H60/45) · CPC title

  • involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams (arrangements characterised by components specially adapted for monitoring, identification or recognition of audio in broadcast systems H04H60/58) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Rovi Guides Inc
What technology area does this patent fall under?
Primary CPC classification H04N21/4884. Mapped technology areas include Electricity.
When was this patent published?
Publication date Dec 26, 2017 (B1). Legal status and post-grant events are not shown on this page.