Computerized system and method for automatic highlight detection from live streaming media and rendering within a specialized media player
US-2018020243-A1 · Jan 18, 2018 · US
US10740394B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10740394-B2 |
| Application number | US-201815941437-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 30, 2018 |
| Priority date | Jan 18, 2018 |
| Publication date | Aug 11, 2020 |
| Grant date | Aug 11, 2020 |
Official abstract text for this publication.
Disclosed are systems and methods for improving interactions with and between computers in content searching, hosting and/or providing systems supported by or configured with devices, servers and/or platforms. The disclosed systems and methods provide a novel machine-in-the-loop, image-to-video bootstrapping framework that harnesses a training set built upon an image dataset and a video dataset in order to efficiently produce an accurate training set to be applied to frames of videos. The disclosed systems and methods reduce the amount of time required to build the training dataset, and also provide mechanisms to apply the training dataset to any type of content and for any type of recognition task.
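To make the image-to-video bootstrapping idea more concrete, the following is a minimal sketch and not the patent's implementation. It assumes images and video frames have already been reduced to small feature vectors, and it swaps in a nearest-centroid cosine match as a toy stand-in for the object detection software. The names `annotate`, `bootstrap`, `AnnotatedFrame`, and the `min_score` cutoff are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class AnnotatedFrame:
    """A video frame labeled as depicting the search term (hypothetical record)."""
    video_id: str
    frame_index: int
    label: str
    score: float


def centroid(vectors):
    """Component-wise mean of equally sized feature vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def annotate(videos, exemplars, label, min_score=0.9):
    """Label frames that resemble the exemplars.

    Assumption: a centroid match stands in for real object detection.
    `videos` maps a video id to a list of per-frame feature vectors.
    """
    prototype = centroid(exemplars)
    annotated = []
    for video_id, frames in videos.items():
        for index, features in enumerate(frames):
            score = cosine(features, prototype)
            if score >= min_score:
                annotated.append(AnnotatedFrame(video_id, index, label, score))
    return annotated


def bootstrap(label, image_features, first_videos, second_videos):
    """Two bootstrapping rounds: images seed round one, frames seed round two."""
    # Round 1: information learned from the image set labels the first video set.
    round_one = annotate(first_videos, image_features, label)
    # Round 2: frames annotated in round 1 become the exemplars for a second video set.
    frame_features = [first_videos[f.video_id][f.frame_index] for f in round_one]
    round_two = annotate(second_videos, frame_features, label) if frame_features else []
    # The combined list plays the role of the growing training dataset.
    return round_one + round_two
```

In this sketch the training dataset grows with each round because the output of one round supplies the exemplars for the next, which is the sense in which the image set "bootstraps" the video annotations. A real system would replace `annotate` with an actual detector and would then train a recognizer on the returned frames.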
Opening claim text (preview).
What is claimed is:
1. A method comprising the steps of: receiving, at a computing device, a search query comprising a search term; searching, via the computing device, a collection of images, and based on said searching, identifying a set of images, said set of images comprising content depicting said search term; searching, via the computing device, a collection of videos, and based on said searching, identifying a set of videos, each video in said set of videos comprising at least one video frame comprising content depicting said search term; executing, via the computing device, object detection software on said image set and said video set, said execution comprising analyzing the image set and identifying information related to said content that depicts said search term within each image in the image set, and based on said analysis, performing visual object detection on frames of the videos in the video set based on the identified information from said image set; generating, via the computing device, a set of annotated video frames based on said visual object detection, said generation comprising annotating video frames of the videos in the video set that comprise said content depicting said search term with information indicating that a depiction of said search term is depicted therein; and training, via the computing device, visual recognizer software with said generated set of annotated video frames.
2. The method of claim 1, further comprising: searching said collection of videos, and based on said searching, identifying a second video set of videos, each video in said second video set comprising at least one video frame comprising content depicting said search term; executing said object detection software on said second video set and said set of annotated video frames, said execution comprising performing visual object detection on frames of the videos in the second video set based on the annotated information in said annotated video frame set; generating a second set of annotated video frames based on said visual object detection, said generation comprising annotating a set of video frames of the videos in the second video set that comprise said content depicting said search term with information indicating that a depiction of said search term is depicted therein; and adding said second set of annotated video frames to a training dataset comprising the annotated video frames.
3. The method of claim 2, further comprising training the visual recognizer software based on said addition of the second set of annotated video frames to the training dataset.
4. The method of claim 1, further comprising: causing a video file to be rendered over a network on a device of a user; analyzing the video file as it is rendered on the user device, said analysis comprising identifying a frame set of the video that is currently being rendered; applying the trained visual recognizer software to said identified frame set; and identifying, based on said application of the trained visual recognizer software, an object depicted within said frame set that corresponds to said search term.
5. The method of claim 4, further comprising: searching, over a network, for content associated with said object; identifying, based on said search, said content; and communicating said content for display when said object is displayed within said video, said content display comprising information augmenting a depiction of the object within said video.
6. The method of claim 1, further comprising: sampling each of the videos identified in said video set, and based on said sampling, identifying a frame set for each of the videos in said video set.
7. The method of claim 6, wherein said sampling comprises applying neural network region proposal software on said videos in said video set.
8. The method of claim 1, further comprising: determining a confidence value for each annotated video frame, said confidence value indicating a quality of the object in each video frame.
9. The method of claim 8, wherein said annotated video frame is automatically added to a training dataset when said confidence value for said frame satisfies a threshold.
10. The method of claim 8, wherein said annotated video frame is verified by an editor when said confidence value does not satisfy a threshold, wherein said annotated video frame is added to a training dataset after said verification.
11. The method of claim 1, further comprising: downloading and storing said image set upon identifying said image set from said image search; and downloading and storing said video set upon identifying said video set from said video search.
12. A non-transitory computer-readable storage medium tangibly encoded with computer-executable instructions, that when executed by a processor associated with a computing device, performs a method comprising: receiving, at the computing device, a search query comprising a search term; searching, via the computing device, a collection of images, and based on said searching, identifying a set of images, said set of images comprising content depicting said search term; searching, via the computing device, a collection of videos, and based on said searching, identifying a set of videos, each video in said set of videos comprising at least one video frame comprising content depicting said search term; executing, via the computing device, object detection software on said image set and said video set, said execution comprising analyzing the image set and identifying information related to said content that depicts said search term within each image in the image set, and based on said analysis, performing visual object detection on frames of the videos in the video set based on the identified information from said image set; generating, via the computing device, a set of annotated video frames based on said visual object detection, said generation comprising annotating video frames of the videos in the video set that comprise said content depicting said search term with information indicating that a depiction of said search term is depicted therein; and training, via the computing device, visual recognizer software with said generated set of annotated video frames.
13. The non-transitory computer-readable storage medium of claim 12, further comprising: searching said collection of videos, and based on said searching, identifying a second video set of videos, each video in said second video set comprising at least one video frame comprising content depicting said search term; executing said object detection software on said second video set and said set of annotated video frames, said execution comprising performing visual object detection on frames of the videos in the second video set based on the annotated information in said annotated video frame set; generating a second set of annotated video frames based on said visual object detection, said generation comprising annotating a set of video frames of the videos in the second video set that comprise said content depicting said search term with information indicating that a depiction of said search term is depicted therein; and adding said second set of annotated video frames to a training dataset comprising the annotated video frames.
14. The non-transitory computer-readable storage medium of claim 13, further comprising training the visual recognizer software based on said addition of the second set of annotated video frames to the training dataset.
15. The non-transitory computer-readable storage medium of claim 12, further comprising: causing a video
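The confidence handling in claims 8 to 10 can be illustrated with a short sketch. This is an illustration under stated assumptions, not the claimed implementation: "satisfies a threshold" is read here as `confidence >= threshold`, the editor is modeled as a callback, and frames the editor does not verify are simply set aside (the claims do not say what happens to them). The names `Frame`, `route_frames`, and `editor_approves` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Frame:
    """An annotated video frame with a confidence value (hypothetical record)."""
    frame_id: str
    confidence: float


def route_frames(
    frames: List[Frame],
    threshold: float,
    editor_approves: Callable[[Frame], bool],
) -> Tuple[List[Frame], List[Frame]]:
    """Split annotated frames into (training_dataset, set_aside).

    Assumption: "satisfies a threshold" means confidence >= threshold.
    """
    training_dataset: List[Frame] = []
    set_aside: List[Frame] = []
    for frame in frames:
        if frame.confidence >= threshold:
            # Claim 9: high-confidence frames are added automatically.
            training_dataset.append(frame)
        elif editor_approves(frame):
            # Claim 10: low-confidence frames are added after editor verification.
            training_dataset.append(frame)
        else:
            # Assumption: unverified frames are kept out of the dataset.
            set_aside.append(frame)
    return training_dataset, set_aside
```

Passing the editor as a callback keeps the human step replaceable, for example by a review queue in a real system. Only frames below the threshold reach the editor, which is how the machine-in-the-loop design limits manual effort.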
using objects detected or recognised in the video content · CPC title
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Validation; Performance evaluation · CPC title
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Validation; Performance evaluation; Active pattern learning techniques · CPC title