What technology area does this patent fall under?

Primary CPC classification G06N5/022. Mapped technology areas include Physics.

When was this patent published?

Publication date: Aug 17, 2021 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Video segmentation based on weighted knowledge graph

US11093755B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11093755-B2
Application number	US-201916688356-A
Country	US
Kind code	B2
Filing date	Nov 19, 2019
Priority date	Nov 19, 2019
Publication date	Aug 17, 2021
Grant date	Aug 17, 2021
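
The page shows the publication number in two spellings (US11093755B2 and US-11093755-B2) and the dates as short month strings. The following is a minimal sketch of normalizing both, assuming only the formats visible on this page; the helper names `parse_publication_number` and `parse_page_date` are hypothetical and not part of any patentsdb API.

```python
import re
from datetime import datetime


def parse_publication_number(raw):
    """Split 'US-11093755-B2' or 'US11093755B2' into country, number, kind code."""
    match = re.fullmatch(r"([A-Z]{2})-?(\d+)-?([A-Z]\d?)", raw.strip())
    if match is None:
        raise ValueError(f"unrecognized publication number: {raw!r}")
    country, number, kind = match.groups()
    return {"country": country, "number": number, "kind": kind}


def parse_page_date(text):
    """Parse dates written like 'Aug 17, 2021'.

    Note: %b depends on the process locale; this assumes English month names.
    """
    return datetime.strptime(text, "%b %d, %Y").date()


compact = parse_publication_number("US11093755B2")
dashed = parse_publication_number("US-11093755-B2")

filed = parse_page_date("Nov 19, 2019")
granted = parse_page_date("Aug 17, 2021")
days_from_filing_to_grant = (granted - filed).days
```

Both spellings parse to the same country, number, and kind code, so either can serve as a lookup key once normalized.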

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system, method, and computer program product for segmenting videos. The system includes at least one processing component, at least one memory component, a video, an extraction component, and a graphing component. The extraction component is configured to extract image and text data from the video, identify entities in the image data, assign at least one entity relation to the entities in the image data, identifying entities in the text data, and assign at least one entity relation to the entities in the text data. The graphing component is configured to generate an image knowledge graph for the entity relations assigned to the entities in the image data, generate a text knowledge graph for the entity relations assigned to the at least two entities in the text data, and generate a weighted knowledge graph based on the image and text knowledge graphs.
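
The abstract does not say how the weighted knowledge graph is computed from the image and text knowledge graphs. As an illustration only, the sketch below assumes each graph maps an entity relation (a pair of entities) to the set of pictures in which it was observed, and that a relation's weight is a weighted sum of its occurrence counts in the two graphs. The function names, the `image_weight` and `text_weight` parameters, and the sample entities are all hypothetical.

```python
from collections import defaultdict


def relation_key(entity_a, entity_b):
    """Order-independent key for a relation between two entities."""
    return tuple(sorted((entity_a, entity_b)))


def build_graph(observations):
    """Map each entity relation to the set of pictures where it was observed.

    `observations` is an iterable of (picture_index, entity_a, entity_b).
    """
    graph = defaultdict(set)
    for picture, entity_a, entity_b in observations:
        graph[relation_key(entity_a, entity_b)].add(picture)
    return dict(graph)


def weighted_graph(image_graph, text_graph, image_weight=0.5, text_weight=0.5):
    """Combine two graphs into {relation: weight}.

    Assumed weighting (not taken from the patent): a weighted sum of the
    number of pictures in which each modality observed the relation.
    """
    relations = set(image_graph) | set(text_graph)
    return {
        rel: image_weight * len(image_graph.get(rel, ()))
        + text_weight * len(text_graph.get(rel, ()))
        for rel in relations
    }


# Hypothetical observations: (picture index, entity, entity).
image_graph = build_graph(
    [(1, "Alice", "Bob"), (2, "Alice", "Bob"), (3, "Bob", "Carol")]
)
text_graph = build_graph(
    [(1, "Bob", "Alice"), (2, "Alice", "Bob"), (4, "Carol", "Bob")]
)
weights = weighted_graph(image_graph, text_graph)
```

In this toy input the Alice/Bob relation appears in two pictures in each modality and so receives a higher weight than the Bob/Carol relation, which appears once in each.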

First claim

Opening claim text (preview).

What is claimed is:
1. A system for segmenting videos, comprising: at least one processing component; at least one memory component; a video; an extraction component configured to: extract image data and text data from the video; identify at least two entities in the image data; assign at least one entity relation to the at least two entities in the image data; identify at least two entities in the text data; and assign at least one entity relation to the two or more entities in the text data; and a graphing component configured to: generate an image knowledge graph for the at least one entity relation assigned to the at least two entities in the image data; generate a text knowledge graph for the at least one entity relation assigned to the at least two entities in the text data; and generate a weighted knowledge graph based on the image knowledge graph and the text knowledge graph.
2. The system of claim 1, wherein the weighted knowledge graph includes relation weights for the at least one entity relation assigned to the at least two entities in the image data and the at least one entity relation assigned to the at least two entities in the text data.
3. The system of claim 2, further comprising a grouping component configured to: identify a top relation in the at least one entity relation assigned to the at least two entities in the image data and the at least one entity relation assigned to the at least two entities in the text data, wherein the top relation is an entity relation having a relation weight greater than a threshold relation weight; select frames of the video that correspond to the top relation; and group the frames into a video segment.
4. The system of claim 3, wherein the grouping component is further configured to: determine that there are remaining frames of the video that do not include the top relation; determine that the frames in the video segment are nearest to the remaining frames; and group the remaining frames with the video segment.
5. The system of claim 1, wherein the video is divided into pictures, wherein each picture includes a set of frames.
6. The system of claim 1, wherein the text data is captions.
7. The system of claim 1, wherein the text data is extracted from speech data.
8. The system of claim 1, wherein the at least two entities in the image data are identified based on facial recognition.
9. A method, comprising: receiving a video; extracting image data and text data from the video; identifying at least two entities in the image data; assigning at least one entity relation to the at least two entities in the image data; identifying at least two entities in the text data; assigning at least one entity relation to the at least two entities in the text data; generating an image knowledge graph for the at least one entity relation assigned to the at least two entities in the image data; generating a text knowledge graph for the at least one entity relation assigned to the at least two entities in the text data; and generating a weighted knowledge graph based on the image knowledge graph and the text knowledge graph.
10. The method of claim 9, wherein the weighted knowledge graph includes relation weights for the at least one entity relation assigned to the at least two entities in the image data and the at least one entity relation assigned to the at least two entities in the text data.
11. The method of claim 10, further comprising: identifying a top relation in the at least one entity relation assigned to the at least two entities in the image data and the at least one entity relation assigned to the at least two entities in the text data, wherein the top relation is an entity relation having a relation weight greater than a threshold relation weight; selecting frames of the video that correspond to the top relation; and grouping the frames into a video segment.
12. The method of claim 11, further comprising: determining that there are remaining frames of the video that do not include the top relation; determining that the frames in the video segment are nearest to the remaining frames; and grouping the remaining frames with the video segment.
13. The method of claim 9, wherein the video is divided into pictures, wherein each picture includes a set of frames.
14. The method of claim 9, wherein the text data is captions.
15. The method of claim 9, wherein the at least two entities in the image data are identified based on facial recognition.
16. A computer program product for segmenting videos, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause a device to perform a method, the method comprising: receiving a video; extracting image data and text data from the video; identifying at least two entities in the image data; assigning at least one entity relation to the at least two entities in the image data; identifying at least two entities in the text data; assigning at least one entity relation to the at least two entities in the text data; generating an image knowledge graph for the at least one entity relation assigned to the at least two entities in the image data; generating a text knowledge graph for the at least one entity relation assigned to the at least two entities in the text data; and generating a weighted knowledge graph based on the image knowledge graph and the text knowledge graph.
17. The computer program product of claim 16, wherein the weighted knowledge graph includes relation weights for the at least one entity relation assigned to the at least two entities in the image data and the at least one entity relation assigned to the at least two entities in the text data.
18. The computer program product of claim 17, further comprising: identifying a top relation in the at least one entity relation assigned to the at least two entities in the image data and the at least one entity relation assigned to the at least two entities in the text data, wherein the top relation is an entity relation having a relation weight greater than a threshold relation weight; selecting frames of the video that correspond to the top relation; and grouping the frames into a video segment.
19. The computer program product of claim 18, further comprising: determining that there are remaining frames of the video that do not include the top relation; determining that the frames in the video segment are nearest to the remaining frames; and grouping the remaining frames with the video segment.
20. The computer program product of claim 16, wherein the at least two entities in the image data are identified based on facial recognition.
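
The grouping steps recited in claims 3 and 4 (and mirrored in claims 11, 12, 18, and 19) can be sketched as follows. This is one possible reading, not the patented implementation: it assumes frames are identified by integer indices, that "nearest" means the smallest difference in frame index, and that every relation above the threshold starts its own segment. The function name `segment_frames` and the sample data are hypothetical.

```python
def segment_frames(relation_frames, weights, threshold, total_frames):
    """Group frames into segments keyed by top relation.

    relation_frames: {relation: iterable of frame indices showing it}
    weights:         {relation: relation weight}
    threshold:       a relation is "top" if its weight is greater than this
    total_frames:    number of frames in the video (indices 0..total_frames-1)
    """
    # Top relations: weight strictly greater than the threshold.
    anchors = {
        rel: frozenset(frames)
        for rel, frames in relation_frames.items()
        if weights.get(rel, 0.0) > threshold and frames
    }
    if not anchors:
        return {}

    # Each top relation starts a segment made of its own frames.
    segments = {rel: set(frames) for rel, frames in anchors.items()}
    assigned = set().union(*anchors.values())

    # Remaining frames join the segment whose original frames are nearest
    # (assumption: distance is the difference in frame index; ties go to
    # the relation that comes first in `relation_frames`).
    for frame in range(total_frames):
        if frame in assigned:
            continue
        nearest = min(
            anchors,
            key=lambda rel: min(abs(frame - f) for f in anchors[rel]),
        )
        segments[nearest].add(frame)

    return {rel: sorted(frames) for rel, frames in segments.items()}


# Hypothetical input: three relations, two of them above the threshold.
relation_frames = {
    ("Alice", "Bob"): [0, 1, 2],
    ("Bob", "Carol"): [6, 7],
    ("Alice", "Carol"): [4],
}
weights = {
    ("Alice", "Bob"): 2.0,
    ("Bob", "Carol"): 1.5,
    ("Alice", "Carol"): 0.5,
}
segments = segment_frames(relation_frames, weights, threshold=1.0, total_frames=9)
```

Here frames 3 and 4 fall to the Alice/Bob segment and frames 5 and 8 to the Bob/Carol segment. Frame 4 shows only a below-threshold relation, so it is treated as a remaining frame and is equidistant from both segments; the tie rule stated in the code decides its placement.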

Assignees

Inventors

Classifications

G06N5/022 (Primary)
Knowledge engineering; Knowledge acquisition · CPC title
G06V10/70
using pattern recognition or machine learning (optical pattern recognition or electronic computations therefor G06V10/88) · CPC title
G06V10/82
using neural networks · CPC title
G06V20/49 (Primary)
Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes · CPC title
G06V10/40
Extraction of image or video features · CPC title
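
Each CPC symbol listed above encodes a hierarchy: a section letter, a two-digit class, a subclass letter, a main group, and a subgroup after the slash. The sketch below splits a symbol into those parts and looks up the section title, which is where the "Physics" technology area for section G comes from. The function name `parse_cpc` and the regular expression are illustrative assumptions, not part of an official CPC tool.

```python
import re

# CPC section letters and their titles.
CPC_SECTIONS = {
    "A": "Human Necessities",
    "B": "Performing Operations; Transporting",
    "C": "Chemistry; Metallurgy",
    "D": "Textiles; Paper",
    "E": "Fixed Constructions",
    "F": "Mechanical Engineering; Lighting; Heating; Weapons; Blasting",
    "G": "Physics",
    "H": "Electricity",
    "Y": "General Tagging of New Technological Developments",
}

# section, class digits, subclass letter, main group, subgroup
CPC_PATTERN = re.compile(r"([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})")


def parse_cpc(symbol):
    """Split a CPC symbol such as 'G06N5/022' into its hierarchy levels."""
    match = CPC_PATTERN.fullmatch(symbol.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a CPC symbol: {symbol!r}")
    section, cls, subclass, main_group, subgroup = match.groups()
    return {
        "section": section,
        "section_title": CPC_SECTIONS[section],
        "class": section + cls,
        "subclass": section + cls + subclass,
        "main_group": main_group,
        "subgroup": subgroup,
    }


primary = parse_cpc("G06N5/022")
segmentation = parse_cpc("G06V20/49")
```

All five symbols on this page start with G, so they share the same section even though they sit in two different subclasses (G06N and G06V).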

Patent family

Related publications grouped by family.

View patent family 75909550

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11093755B2 cover?: A system, method, and computer program product for segmenting videos. The system includes at least one processing component, at least one memory component, a video, an extraction component, and a graphing component. The extraction component is configured to extract image and text data from the video, identify entities in the image data, assign at least one entity relation to the entities in the…
Who is the assignee on this patent?: IBM