Who is the assignee on this patent?

Microsoft Technology Licensing, LLC

What technology area does this patent fall under?

Primary CPC classification G06V10/25 (Determination of region of interest [ROI] or a volume of interest [VOI]). Mapped technology areas include Physics.

When was this patent published?

Publication date Apr 9, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Negative sampling algorithm for enhanced image classification

US11954893B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11954893-B2
Application number	US-202217843270-A
Country	US
Kind code	B2
Filing date	Jun 17, 2022
Priority date	Aug 20, 2019
Publication date	Apr 9, 2024
Grant date	Apr 9, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The technology described herein is directed to systems, methods, and software for indexing video. In an implementation, a method comprises identifying one or more regions of interest around target content in a frame of the video. Further, the method includes identifying, in a portion of the frame outside a region of interest, potentially empty regions adjacent to the region of interest. The method continues with identifying at least one empty region of the potentially empty regions that satisfies one or more criteria and classifying at least the one empty region as a negative sample of the target content. In some implementations, the negative sample of the target content is included in a set of negative samples of the target content, with which to train a machine learning model employed to identify instances of the target content.
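The steps in the abstract can be sketched in a few lines of Python. This is an illustrative sketch, not the patented implementation: the `Rect` type, the function names, the choice of four full-width or full-height strips as the "potentially empty regions", and the use of pixel area as the size criterion are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle: (x0, y0) inclusive, (x1, y1) exclusive. Hypothetical helper."""
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def area(self) -> int:
        return max(0, self.x1 - self.x0) * max(0, self.y1 - self.y0)

    def overlaps(self, other: Rect) -> bool:
        return (self.x0 < other.x1 and other.x0 < self.x1
                and self.y0 < other.y1 and other.y0 < self.y1)


def adjacent_regions(frame: Rect, roi: Rect) -> list[Rect]:
    """Candidate regions outside the ROI, each sharing one side with it (assumed geometry)."""
    candidates = [
        Rect(frame.x0, frame.y0, roi.x0, frame.y1),  # strip left of the ROI
        Rect(roi.x1, frame.y0, frame.x1, frame.y1),  # strip right of the ROI
        Rect(frame.x0, frame.y0, frame.x1, roi.y0),  # strip above the ROI
        Rect(frame.x0, roi.y1, frame.x1, frame.y1),  # strip below the ROI
    ]
    return [r for r in candidates if r.area > 0]


def negative_samples(frame: Rect, rois: list[Rect], roi: Rect, min_area: int) -> list[Rect]:
    """Keep candidates that contain no ROI and meet the size threshold; discard the rest."""
    negatives = []
    for region in adjacent_regions(frame, roi):
        is_empty = not any(region.overlaps(r) for r in rois)
        if is_empty and region.area >= min_area:
            negatives.append(region)
    return negatives


frame = Rect(0, 0, 100, 100)
rois = [Rect(40, 40, 60, 60), Rect(70, 10, 90, 30)]
print(negative_samples(frame, rois, rois[0], min_area=1000))
# The left and bottom strips survive; the right and top strips contain the second ROI.
```

Regions that are empty but too small are simply dropped, so they are never labeled as either a positive or a negative sample.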

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: identifying a bounding box around a character in an image; identifying multiple sub-frames within a frame around the bounding box; and for at least a sub-frame of the sub-frames: determining whether the sub-frame satisfies a plurality of criteria, wherein the plurality of criteria comprises a size threshold and whether the sub-frame is empty; in response to determining that the sub-frame satisfies the plurality of criteria, designating the sub-frame a negative example of one or more characters in the image; and in response to determining that the sub-frame is empty but does not satisfy the size threshold, discarding the sub-frame without designating the sub-frame as a negative example.

2. The method of claim 1 wherein: the plurality of criteria comprises whether a size of the sub-frame meets or exceeds a size threshold and whether the sub-frame is empty; and the sub-frame satisfies the plurality of criteria if the size of the sub-frame meets or exceeds the threshold and if the sub-frame is empty.

3. The method of claim 2 further comprising, in response to determining that the sub-frame is not empty, identifying other sub-frames within the sub-frame and adjacent to a rectangular portion of the sub-frame that includes at least a portion of another bounding box around another character.

4. The method of claim 3 further comprising, in response to determining that the sub-frame is not empty: identifying at least one other sub-frame of the other sub-frames that satisfies the plurality of criteria; and classifying at least the one other sub-frame as the negative example of the one or more characters in the image.

5. The method of claim 4 further comprising: including the negative example in a set of negative examples of the one or more characters; and training a machine learning model to identify instances of the one or more characters based on training data comprising the set of negative examples and a set of positive examples of the one or more characters.

6. A method for indexing video comprising: identifying one or more regions of interest around target content in a frame of a video; identifying, in a portion of the frame outside a region of interest, potentially empty regions adjacent to the region of interest; identifying at least one empty region of the potentially empty regions that satisfies criteria, wherein the criteria comprise: whether an empty region qualifies as empty for not including any of the one or more regions of interest around the target content; and whether a size of the empty region meets a size threshold; classifying at least the one empty region that satisfies the criteria as a negative sample of the target content; identifying at least one empty region of the potentially empty regions that qualifies as empty but does not meet the size threshold; and discarding the empty region that qualifies as empty but does not meet the size threshold without classifying the empty region as any type of sample of the target content.

7. The method of claim 6 further comprising: including the negative sample of the target content in a set of negative samples of the target content; and training a machine learning model to identify instances of the target content based on training data comprising the set of negative samples.

8. The method of claim 6 wherein: the target content comprises an animated character in the video; and the region of interest comprises a bounding box around the animated character.

9. The method of claim 8 wherein the potentially empty regions adjacent to the region of interest comprise rectangles, each with one side adjacent to the bounding box.

10. The method of claim 9 wherein: the rectangles comprise empty rectangles that do not overlap with any of the one of more regions of interest around the target content; and identifying at least the one empty region that satisfies the criteria comprises identifying a largest one of the empty rectangles.

11. The method of claim 10, wherein discarding the empty region that qualifies as empty but does not meet the size threshold comprises discarding a rectangle that qualifies as empty but does not meet the size threshold without classifying the rectangle as any type of sample of the target content.

12. The method of claim 10 further comprising, for a rectangle that does not qualify as empty, identifying potentially empty rectangles adjacent to a rectangular portion of the rectangle that includes at least a portion of another bounding box around another animated character.

13. The method of claim 12 further comprising, for a rectangle that does not qualify as empty: identifying at least one empty rectangle of the potentially empty rectangles that qualifies as empty and meets the size threshold; and classifying at least the one empty rectangle as a negative sample of the target content.

14. The method of claim 6 wherein the target content comprises animated characters in the video.

15. The method of claim 14 wherein: the one or more regions of interest comprise bounding boxes drawn around instances of the animated characters in the frame; the portion of the frame outside the region of interest comprises a border area defined by a boundary of the region of interest and a boundary of the frame; and the region of interest comprises a central most one of the bounding boxes.

16. A computing apparatus comprising: one or more computer readable storage media; one or more processors operatively coupled with the one or more computer readable storage media; and program instructions stored on the one or more computer readable storage media that, when executed by the one or more processors, direct the computing apparatus to at least: identify a bounding box around a character in an image; identify multiple sub-frames within a frame around the bounding box; and for at least a sub-frame of the sub-frames: determine whether the sub-frame satisfies a plurality of criteria, wherein the plurality of criteria comprises a size threshold and whether the sub-frame is empty; in response to determining that the sub-frame satisfies the plurality of criteria, designate the sub-frame a negative example of one or more characters in the image; and in response to determining that the sub-frame is empty but does not satisfy the size threshold, discard the sub-frame without designating the sub-frame as a negative example.

17. The computing apparatus of claim 16 wherein: the plurality of criteria comprises whether a size of the sub-frame meets or exceeds a size threshold and whether the sub-frame is empty; and the sub-frame satisfies the plurality of criteria if the size of the sub-frame meets or exceeds the threshold and if the sub-frame is empty.

18. The computing apparatus of claim 17 wherein the program instructions, when executed by the one or more processors, further direct the computing apparatus to, in response to determining that the sub-frame is not empty, identify other sub-frames within the sub-frame and adjacent to a rectangular portion of the sub-frame that includes at least a portion of another bounding box around another animated character.

19. The computing apparatus of claim 18 wherein the program instructions, when executed by the one or more processors, further direct the computing apparatus to, in response to determining that the sub-frame is not empty: identify at least one other sub-frame of the other sub-frames that satisfies the plurality of criteria; and
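
Claims 3, 4, 12, and 13 describe what happens when a candidate sub-frame is not empty: it is subdivided around the portion of the other bounding box that falls inside it, and the resulting smaller sub-frames are tested against the same criteria. A minimal recursive sketch of that idea follows. The `Rect` type, the function names, the four-strip subdivision, and the use of area as the size measure are assumptions for illustration; the claims do not prescribe them.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle: (x0, y0) inclusive, (x1, y1) exclusive. Hypothetical helper."""
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def area(self) -> int:
        return max(0, self.x1 - self.x0) * max(0, self.y1 - self.y0)

    def overlaps(self, other: Rect) -> bool:
        return (self.x0 < other.x1 and other.x0 < self.x1
                and self.y0 < other.y1 and other.y0 < self.y1)

    def clip(self, other: Rect) -> Rect:
        """Portion of `other` that lies inside this rectangle."""
        return Rect(max(self.x0, other.x0), max(self.y0, other.y0),
                    min(self.x1, other.x1), min(self.y1, other.y1))


def split_around(region: Rect, box: Rect) -> list[Rect]:
    """Sub-frames of `region`, each sharing one side with `box` (assumed geometry)."""
    return [
        Rect(region.x0, region.y0, box.x0, region.y1),  # left of the box
        Rect(box.x1, region.y0, region.x1, region.y1),  # right of the box
        Rect(region.x0, region.y0, region.x1, box.y0),  # above the box
        Rect(region.x0, box.y1, region.x1, region.y1),  # below the box
    ]


def collect_negatives(region: Rect, boxes: list[Rect], min_area: int) -> list[Rect]:
    """Return empty sub-frames of `region` whose area meets the threshold."""
    # Too small: discard. Smaller sub-frames cannot meet the threshold either.
    if region.area <= 0 or region.area < min_area:
        return []
    blocking = next((b for b in boxes if region.overlaps(b)), None)
    if blocking is None:
        return [region]  # empty and large enough: a negative example
    # Not empty: subdivide around the part of the box inside this region and recurse.
    found: list[Rect] = []
    for sub in split_around(region, region.clip(blocking)):
        found.extend(collect_negatives(sub, boxes, min_area))
    return list(dict.fromkeys(found))  # drop duplicates, keep discovery order


frame = Rect(0, 0, 100, 100)
boxes = [Rect(40, 40, 60, 60), Rect(70, 10, 90, 30)]
negatives = collect_negatives(frame, boxes, min_area=1000)
largest = max(negatives, key=lambda r: r.area)  # cf. "a largest one of the empty rectangles" in claim 10
```

In this sketch the returned rectangles may overlap one another; whether overlapping negatives are acceptable, and how ties for the largest rectangle are broken, is left open here.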

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

G06V10/25 · Primary
Determination of region of interest [ROI] or a volume of interest [VOI] · CPC title
G06F18/2155
characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling · CPC title
G06F18/24
Classification techniques · CPC title
G06V10/774
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
G06V20/41
Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items (segmenting video sequences G06V20/49) · CPC title

Patent family

Related publications grouped by family.

View patent family 74646271
