Dynamic detection and recognition of media subjects
US-2022292284-A1 · Sep 15, 2022 · US
US11954893B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11954893-B2 |
| Application number | US-202217843270-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 17, 2022 |
| Priority date | Aug 20, 2019 |
| Publication date | Apr 9, 2024 |
| Grant date | Apr 9, 2024 |
Official abstract text for this publication.
The technology described herein is directed to systems, methods, and software for indexing video. In an implementation, a method comprises identifying one or more regions of interest around target content in a frame of the video. Further, the method includes identifying, in a portion of the frame outside a region of interest, potentially empty regions adjacent to the region of interest. The method continues with identifying at least one empty region of the potentially empty regions that satisfies one or more criteria and classifying at least the one empty region as a negative sample of the target content. In some implementations, the negative sample of the target content is included in a set of negative samples of the target content, with which to train a machine learning model employed to identify instances of the target content.
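The steps in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the `Box` type, the four-rectangle layout in `adjacent_regions`, the use of pixel area as the size test, and the `min_area` parameter are all assumptions introduced here for illustration. The sketch treats a region as empty when it overlaps none of the regions of interest, keeps empty regions that meet the size threshold as negative samples, and drops empty regions that are too small without labelling them.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle (hypothetical type); x1 and y1 are exclusive."""
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def area(self) -> int:
        return max(0, self.x1 - self.x0) * max(0, self.y1 - self.y0)


def overlaps(a: Box, b: Box) -> bool:
    """True when the two rectangles share at least one pixel."""
    return a.x0 < b.x1 and b.x0 < a.x1 and a.y0 < b.y1 and b.y0 < a.y1


def adjacent_regions(frame: Box, roi: Box) -> List[Box]:
    """Candidate rectangles outside the ROI, each with one side on the ROI boundary.

    Assumed layout: full-height strips left and right of the ROI and
    full-width strips above and below it. Zero-area strips are dropped.
    """
    candidates = [
        Box(frame.x0, frame.y0, roi.x0, frame.y1),  # left of the ROI
        Box(roi.x1, frame.y0, frame.x1, frame.y1),  # right of the ROI
        Box(frame.x0, frame.y0, frame.x1, roi.y0),  # above the ROI
        Box(frame.x0, roi.y1, frame.x1, frame.y1),  # below the ROI
    ]
    return [c for c in candidates if c.area > 0]


def negative_samples(frame: Box, rois: List[Box], roi: Box, min_area: int) -> List[Box]:
    """Return the adjacent regions that are empty and meet the size threshold."""
    negatives = []
    for region in adjacent_regions(frame, roi):
        is_empty = not any(overlaps(region, r) for r in rois)
        if is_empty and region.area >= min_area:
            negatives.append(region)
        # Empty but too small: discarded without any label.
        # Not empty: ignored in this simplified sketch.
    return negatives


frame = Box(0, 0, 100, 100)
rois = [Box(40, 40, 60, 60), Box(70, 10, 90, 30)]
samples = negative_samples(frame, rois, roi=rois[0], min_area=1500)
```

In this example, the strips to the right of and above the first region of interest overlap the second one, so only the left and bottom strips are kept.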
Opening claim text (preview).
What is claimed is:
1. A method comprising: identifying a bounding box around a character in an image; identifying multiple sub-frames within a frame around the bounding box; and for at least a sub-frame of the sub-frames: determining whether the sub-frame satisfies a plurality of criteria, wherein the plurality of criteria comprises a size threshold and whether the sub-frame is empty; in response to determining that the sub-frame satisfies the plurality of criteria, designating the sub-frame a negative example of one or more characters in the image; and in response to determining that the sub-frame is empty but does not satisfy the size threshold, discarding the sub-frame without designating the sub-frame as a negative example.
2. The method of claim 1 wherein: the plurality of criteria comprises whether a size of the sub-frame meets or exceeds a size threshold and whether the sub-frame is empty; and the sub-frame satisfies the plurality of criteria if the size of the sub-frame meets or exceeds the threshold and if the sub-frame is empty.
3. The method of claim 2 further comprising, in response to determining that the sub-frame is not empty, identifying other sub-frames within the sub-frame and adjacent to a rectangular portion of the sub-frame that includes at least a portion of another bounding box around another character.
4. The method of claim 3 further comprising, in response to determining that the sub-frame is not empty: identifying at least one other sub-frame of the other sub-frames that satisfies the plurality of criteria; and classifying at least the one other sub-frame as the negative example of the one or more characters in the image.
5. The method of claim 4 further comprising: including the negative example in a set of negative examples of the one or more characters; and training a machine learning model to identify instances of the one or more characters based on training data comprising the set of negative examples and a set of positive examples of the one or more characters.
6. A method for indexing video comprising: identifying one or more regions of interest around target content in a frame of a video; identifying, in a portion of the frame outside a region of interest, potentially empty regions adjacent to the region of interest; identifying at least one empty region of the potentially empty regions that satisfies criteria, wherein the criteria comprise: whether an empty region qualifies as empty for not including any of the one or more regions of interest around the target content; and whether a size of the empty region meets a size threshold; classifying at least the one empty region that satisfies the criteria as a negative sample of the target content; identifying at least one empty region of the potentially empty regions that qualifies as empty but does not meet the size threshold; and discarding the empty region that qualifies as empty but does not meet the size threshold without classifying the empty region as any type of sample of the target content.
7. The method of claim 6 further comprising: including the negative sample of the target content in a set of negative samples of the target content; and training a machine learning model to identify instances of the target content based on training data comprising the set of negative samples.
8. The method of claim 6 wherein: the target content comprises an animated character in the video; and the region of interest comprises a bounding box around the animated character.
9. The method of claim 8 wherein the potentially empty regions adjacent to the region of interest comprise rectangles, each with one side adjacent to the bounding box.
10. The method of claim 9 wherein: the rectangles comprise empty rectangles that do not overlap with any of the one or more regions of interest around the target content; and identifying at least the one empty region that satisfies the criteria comprises identifying a largest one of the empty rectangles.
11. The method of claim 10, wherein discarding the empty region that qualifies as empty but does not meet the size threshold comprises discarding a rectangle that qualifies as empty but does not meet the size threshold without classifying the rectangle as any type of sample of the target content.
12. The method of claim 10 further comprising, for a rectangle that does not qualify as empty, identifying potentially empty rectangles adjacent to a rectangular portion of the rectangle that includes at least a portion of another bounding box around another animated character.
13. The method of claim 12 further comprising, for a rectangle that does not qualify as empty: identifying at least one empty rectangle of the potentially empty rectangles that qualifies as empty and meets the size threshold; and classifying at least the one empty rectangle as a negative sample of the target content.
14. The method of claim 6 wherein the target content comprises animated characters in the video.
15. The method of claim 14 wherein: the one or more regions of interest comprise bounding boxes drawn around instances of the animated characters in the frame; the portion of the frame outside the region of interest comprises a border area defined by a boundary of the region of interest and a boundary of the frame; and the region of interest comprises a central most one of the bounding boxes.
16. A computing apparatus comprising: one or more computer readable storage media; one or more processors operatively coupled with the one or more computer readable storage media; and program instructions stored on the one or more computer readable storage media that, when executed by the one or more processors, direct the computing apparatus to at least: identify a bounding box around a character in an image; identify multiple sub-frames within a frame around the bounding box; and for at least a sub-frame of the sub-frames: determine whether the sub-frame satisfies a plurality of criteria, wherein the plurality of criteria comprises a size threshold and whether the sub-frame is empty; in response to determining that the sub-frame satisfies the plurality of criteria, designate the sub-frame a negative example of one or more characters in the image; and in response to determining that the sub-frame is empty but does not satisfy the size threshold, discard the sub-frame without designating the sub-frame as a negative example.
17. The computing apparatus of claim 16 wherein: the plurality of criteria comprises whether a size of the sub-frame meets or exceeds a size threshold and whether the sub-frame is empty; and the sub-frame satisfies the plurality of criteria if the size of the sub-frame meets or exceeds the threshold and if the sub-frame is empty.
18. The computing apparatus of claim 17 wherein the program instructions, when executed by the one or more processors, further direct the computing apparatus to, in response to determining that the sub-frame is not empty, identify other sub-frames within the sub-frame and adjacent to a rectangular portion of the sub-frame that includes at least a portion of another bounding box around another animated character.
19. The computing apparatus of claim 18 wherein the program instructions, when executed by the one or more processors, further direct the computing apparatus to, in response to determining that the sub-frame is not empty: identify at least one other sub-frame of the other sub-frames that satisfies the plurality of criteria; and
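
The claims also describe how a sub-frame that is not empty is handled (claims 3, 4, 12, 13, 18, 19), how the starting bounding box can be chosen (claim 15), and how a largest empty rectangle can be picked (claim 10). The following Python sketch shows one way these pieces could fit together. It is an illustration under assumptions, not the claimed method itself: the tuple-based `Box`, the helper names, the four-way split in `split_around`, the centre-distance rule in `central_most`, the area-based `min_area` test, and the early skip of sub-frames that are already smaller than the threshold are all choices made here.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1); x1 and y1 are exclusive


def area(b: Box) -> int:
    return max(0, b[2] - b[0]) * max(0, b[3] - b[1])


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Return the overlapping rectangle of a and b, or None if they do not overlap."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    return (x0, y0, x1, y1) if x0 < x1 and y0 < y1 else None


def split_around(frame: Box, inner: Box) -> List[Box]:
    """Sub-frames of `frame` to the left, right, top and bottom of `inner` (assumed layout)."""
    fx0, fy0, fx1, fy1 = frame
    ix0, iy0, ix1, iy1 = inner
    parts = [
        (fx0, fy0, ix0, fy1),  # left
        (ix1, fy0, fx1, fy1),  # right
        (fx0, fy0, fx1, iy0),  # above
        (fx0, iy1, fx1, fy1),  # below
    ]
    return [p for p in parts if area(p) > 0]


def central_most(frame: Box, boxes: List[Box]) -> Box:
    """Pick the box whose centre is closest to the frame centre (one reading of claim 15)."""
    cx, cy = (frame[0] + frame[2]) / 2, (frame[1] + frame[3]) / 2

    def dist2(b: Box) -> float:
        return ((b[0] + b[2]) / 2 - cx) ** 2 + ((b[1] + b[3]) / 2 - cy) ** 2

    return min(boxes, key=dist2)


def collect_negatives(frame: Box, inner: Box, boxes: List[Box], min_area: int) -> List[Box]:
    """Recursively gather empty sub-frames around `inner` that meet the size threshold."""
    negatives: List[Box] = []
    for sub in split_around(frame, inner):
        if area(sub) < min_area:
            # Too small: an empty sub-frame is discarded unlabelled, and a
            # non-empty one cannot contain a large enough region (assumed shortcut).
            continue
        hits = [h for h in (intersect(sub, b) for b in boxes) if h is not None]
        if not hits:
            negatives.append(sub)  # empty and large enough: negative example
        else:
            # Not empty: split again around the portion covered by another box.
            negatives.extend(collect_negatives(sub, hits[0], boxes, min_area))
    return negatives


frame = (0, 0, 100, 100)
boxes = [(40, 40, 60, 60), (70, 10, 90, 30)]
anchor = central_most(frame, boxes)
negatives = collect_negatives(frame, anchor, boxes, min_area=1500)
largest = max(negatives, key=area)
```

Because the candidate sub-frames overlap at the frame corners, the returned rectangles can overlap one another. A caller that needs disjoint negative examples would have to deduplicate or trim them, which this sketch does not attempt.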
Determination of region of interest [ROI] or a volume of interest [VOI] · CPC title
characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling · CPC title
Classification techniques · CPC title
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items (segmenting video sequences G06V20/49) · CPC title