Detection and replacement of burned-in subtitles

US11216684B1 · US · B1

Patent metadata
Publication number: US-11216684-B1
Application number: US-202016781456-A
Country: US
Kind code: B1
Filing date: Feb 4, 2020
Priority date: Feb 4, 2020
Publication date: Jan 4, 2022
Grant date: Jan 4, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are described for detecting and replacing burned-in subtitles in image and video content.
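
To make the "replacing" step concrete: the claims on this page describe approximating the pixels hidden by subtitle text from surrounding non-subtitle pixels, for example by interpolation. The sketch below is only an illustration under stated assumptions. It assumes a grayscale frame stored as nested lists and a subtitle mask that has already been computed; the function name `inpaint_rows` and the choice of row-wise linear interpolation are hypothetical and are not taken from the patent.

```python
from typing import List, Optional

Frame = List[List[float]]  # grayscale intensities, one inner list per row
Mask = List[List[bool]]    # True where a pixel belongs to subtitle text


def inpaint_rows(frame: Frame, mask: Mask) -> Frame:
    """Replace masked pixels by linear interpolation along each row.

    Assumption: interpolating between the nearest unmasked neighbours on
    the same row is a good enough approximation of the hidden content.
    """
    out = [row[:] for row in frame]  # copy so the input stays untouched
    for y, row in enumerate(frame):
        width = len(row)
        x = 0
        while x < width:
            if not mask[y][x]:
                x += 1
                continue
            # Find the extent [start, end) of this run of masked pixels.
            start = x
            while x < width and mask[y][x]:
                x += 1
            end = x
            left: Optional[float] = row[start - 1] if start > 0 else None
            right: Optional[float] = row[end] if end < width else None
            if left is None and right is None:
                continue  # entire row is masked; nothing to interpolate from
            if left is None:
                left = right  # run touches the left edge: extend right value
            if right is None:
                right = left  # run touches the right edge: extend left value
            span = end - start + 1
            for i in range(start, end):
                t = (i - start + 1) / span
                out[y][i] = (1 - t) * left + t * right
    return out


frame = [[10.0, 255.0, 255.0, 255.0, 50.0]]
mask = [[False, True, True, True, False]]
print(inpaint_rows(frame, mask))  # [[10.0, 20.0, 30.0, 40.0, 50.0]]
```

Row-wise interpolation keeps the example short. A practical system would more likely draw on pixels in several directions or in neighbouring frames.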

First claim

Opening claim text (preview).

What is claimed is:

1. A method for detecting and removing burned-in subtitles from video content, comprising: obtaining a video, wherein the video comprises a motion picture or an episode of a television series, wherein the video includes an image frame having burned-in subtitles, and wherein the video has an accompanying audio track; detecting, within the image frame, an area of the image frame that bounds the burned-in subtitles; detecting, within the area of the image frame that bounds the burned-in subtitles, first pixels associated with the burned-in subtitles, wherein the subtitles were burned-into the image frame by removing second pixels associated with non-subtitle image content and inserting, in place of the second pixels, the first pixels associated with the burned-in subtitles, wherein detecting the first pixels includes: processing at least a portion of the audio track through a speech recognition algorithm; processing at least a portion of the image frame through an optical character recognition algorithm; and determining, based on a comparison of outputs of the speech recognition and optical character recognition algorithms, that the image frame contains subtitle text; determining, based at least on third pixels in the image frame associated with non-subtitle image content, approximations of the second pixels associated with non-subtitle image content; and outputting a version of the video in which the first pixels of the image frame have been replaced based on the approximations of the second pixels.

2. The method of claim 1, wherein determining the approximations of the second pixels comprises performing interpolations based on the at least some of the third pixels.

3. The method of claim 1, wherein a fraction of the third pixels lie within the detected area that bounds the burned-in subtitles and wherein determining the approximations of the second pixels comprises performing interpolations based the fraction of the third pixels that lie within the detected area.

4. The method of claim 1, wherein detecting the area of the image frame that bounds the burned-in subtitles comprises calculating a confidence score indicative of the likelihood that the area includes subtitle-text instead of non-subtitle text and determining that the confidence score exceeds a threshold.

5. A method for removing burned-in subtitles from video content, comprising: obtaining a video, wherein the video includes an image frame having burned-in subtitles, and wherein the video has an accompanying audio track; detecting, within the image frame, first pixels associated with the burned-in subtitles wherein detecting the first pixels includes: processing at least a portion of the audio track through a speech recognition algorithm; processing at least a portion of the image frame through an optical character recognition algorithm; and determining, based on a comparison of outputs of the speech recognition and optical character recognition algorithms, that the image frame contains subtitle text; determining replacement pixels for the first pixels; and outputting a version of the video in which the first pixels of the image frame are replaced with the replacement pixels.

6. The method of claim 5, wherein determining the replacement pixels comprises interpolation based at least on second pixels in the image frame.

7. The method of claim 5, wherein the image frame comprises a first image frame, wherein the video includes a second image frame that precedes or follows the first image frame, and wherein determining the replacement pixels comprises analyzing image content in the first image frame.

8. The method of claim 5, further comprising calculating a confidence score indicative of the likelihood that an area of the image frame corresponding to the first pixels includes subtitle-text instead of non-subtitle text and determining that the confidence score exceeds a threshold.

9. The method of claim 5, wherein detecting the first pixels associated with the burned-in subtitles comprises processing the image frame through a neural network configured to detect burned-in subtitles.

10. A system, comprising one or more processors and memory configured to: obtain a video, wherein the video includes an image frame having burned-in subtitles, and wherein the video has an accompanying audio track; detect, within the image frame, first pixels associated with the burned-in subtitles wherein the one or more processors and memory are configured to detect the first pixels by: processing at least a portion of the audio track through a speech recognition algorithm; processing at least a portion of the image frame through an optical character recognition algorithm; and determining, based on a comparison of outputs of the speech recognition and optical character recognition algorithms, that the image frame contains subtitle text; determine replacement pixels for the first pixels; and output a version of the video in which the first pixels of the image frame are replaced with the replacement pixels.

11. The system of claim 10, wherein the processors and memory are configured to determine the replacement pixels by interpolation based at least on second pixels in the image frame.

12. The system of claim 10, wherein the image frame comprises a first image frame, wherein the video includes a second image frame that precedes or follows the first image frame, and wherein the processors and memory are configured to determine the replacement pixels by analyzing image content in the first image frame.

13. The system of claim 10, wherein the processors and memory are further configured to calculate a confidence score indicative of the likelihood that an area of the image frame corresponding to the first pixels includes subtitle-text instead of non-subtitle text and determining that the confidence score exceeds a threshold.

14. The system of claim 10, wherein the processors and memory are configured to detect the first pixels associated with the burned-in subtitles by processing the image frame through a neural network configured to detect burned-in subtitles.
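
The independent claims rely on comparing speech recognition output with optical character recognition output to decide that a frame contains subtitle text, and several dependent claims add a confidence score that must exceed a threshold. The claims do not say how the comparison or the score is computed. The sketch below is one hypothetical way to do it: it takes the two engines' outputs as plain strings (no real OCR or speech engine is called), scores them with a word-level similarity ratio, and uses an arbitrary threshold of 0.6. The names `subtitle_confidence` and `is_subtitle` are illustrative, and it also assumes the subtitles are in the same language as the audio.

```python
import re
from difflib import SequenceMatcher
from typing import List


def normalize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def subtitle_confidence(ocr_text: str, asr_text: str) -> float:
    """Return a score in [0, 1] for how closely on-screen text matches speech.

    Assumption: word-sequence similarity is a usable proxy for the
    "comparison of outputs" that the claims mention.
    """
    ocr_words = normalize(ocr_text)
    asr_words = normalize(asr_text)
    if not ocr_words or not asr_words:
        return 0.0
    return SequenceMatcher(None, ocr_words, asr_words).ratio()


def is_subtitle(ocr_text: str, asr_text: str, threshold: float = 0.6) -> bool:
    """Treat the text as a subtitle when the score exceeds the threshold.

    The 0.6 default is an arbitrary illustrative value, not from the patent.
    """
    return subtitle_confidence(ocr_text, asr_text) > threshold


# Text that mirrors the dialogue scores high; a sign in the scene does not.
print(is_subtitle("Where are you going?", "where are you going"))  # True
print(is_subtitle("EXIT", "where are you going"))                  # False
```

A comparison like this is what lets the method tell subtitle text apart from other text in the picture, such as signs or title cards, which would not match the spoken dialogue.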

Assignees

Amazon Tech Inc

Inventors

Classifications

  • Segmentation of character regions · CPC title

  • G06V20/46 · Primary

    Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames · CPC title

  • using neural networks · CPC title

  • Overlay text, e.g. embedded captions in a TV programme · CPC title

  • Video; Image sequence · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11216684B1 cover?
Techniques are described for detecting and replacing burned-in subtitles in image and video content.
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06V20/46. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 4, 2022 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).