Video segmentation based on weighted knowledge graph

US11093755B2 · US · B2

Patent metadata
Publication number: US-11093755-B2
Application number: US-201916688356-A
Country: US
Kind code: B2
Filing date: Nov 19, 2019
Priority date: Nov 19, 2019
Publication date: Aug 17, 2021
Grant date: Aug 17, 2021

Abstract

A system, method, and computer program product for segmenting videos. The system includes at least one processing component, at least one memory component, a video, an extraction component, and a graphing component. The extraction component is configured to extract image and text data from the video, identify entities in the image data, assign at least one entity relation to the entities in the image data, identify entities in the text data, and assign at least one entity relation to the entities in the text data. The graphing component is configured to generate an image knowledge graph for the entity relations assigned to the entities in the image data, generate a text knowledge graph for the entity relations assigned to the entities in the text data, and generate a weighted knowledge graph based on the image and text knowledge graphs.
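The abstract does not say how the two knowledge graphs are combined or how relation weights are computed. The following is only a minimal sketch of one plausible reading, assuming that a relation's weight is the number of pictures in which it is observed and that the image and text counts are merged by a weighted sum. The function names `build_graph` and `merge_weighted`, the `image_factor` and `text_factor` parameters, and the sample entities are all hypothetical and do not come from the patent.

```python
from collections import Counter


def build_graph(observations):
    """Count how often each entity relation is observed.

    `observations` is an iterable of (entity_a, entity_b) pairs, one per
    picture in which the relation was detected. Relations are stored as
    unordered pairs, so ("Alice", "Bob") equals ("Bob", "Alice").
    """
    graph = Counter()
    for a, b in observations:
        graph[frozenset((a, b))] += 1
    return graph


def merge_weighted(image_graph, text_graph, image_factor=0.5, text_factor=0.5):
    """Merge two relation-count graphs into one weighted graph.

    Assumption (not from the patent): the merged weight is a weighted sum
    of the image count and the text count for each relation.
    """
    relations = set(image_graph) | set(text_graph)
    return {
        rel: image_factor * image_graph.get(rel, 0)
        + text_factor * text_graph.get(rel, 0)
        for rel in relations
    }


# Hypothetical detections: relations seen in frames and in captions.
image_obs = [("Alice", "Bob"), ("Alice", "Bob"), ("Bob", "Carol")]
text_obs = [("Bob", "Alice"), ("Alice", "Carol")]

image_graph = build_graph(image_obs)
text_graph = build_graph(text_obs)
weighted = merge_weighted(image_graph, text_graph)

for rel, weight in sorted(weighted.items(), key=lambda kv: -kv[1]):
    print(sorted(rel), weight)
```

Storing relations as unordered pairs lets a relation found in the image data line up with the same relation found in the text data regardless of the order in which the entities were detected.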

First claim

What is claimed is:

1. A system for segmenting videos, comprising: at least one processing component; at least one memory component; a video; an extraction component configured to: extract image data and text data from the video; identify at least two entities in the image data; assign at least one entity relation to the at least two entities in the image data; identify at least two entities in the text data; and assign at least one entity relation to the two or more entities in the text data; and a graphing component configured to: generate an image knowledge graph for the at least one entity relation assigned to the at least two entities in the image data; generate a text knowledge graph for the at least one entity relation assigned to the at least two entities in the text data; and generate a weighted knowledge graph based on the image knowledge graph and the text knowledge graph.

2. The system of claim 1, wherein the weighted knowledge graph includes relation weights for the at least one entity relation assigned to the at least two entities in the image data and the at least one entity relation assigned to the at least two entities in the text data.

3. The system of claim 2, further comprising a grouping component configured to: identify a top relation in the at least one entity relation assigned to the at least two entities in the image data and the at least one entity relation assigned to the at least two entities in the text data, wherein the top relation is an entity relation having a relation weight greater than a threshold relation weight; select frames of the video that correspond to the top relation; and group the frames into a video segment.

4. The system of claim 3, wherein the grouping component is further configured to: determine that there are remaining frames of the video that do not include the top relation; determine that the frames in the video segment are nearest to the remaining frames; and group the remaining frames with the video segment.

5. The system of claim 1, wherein the video is divided into pictures, wherein each picture includes a set of frames.

6. The system of claim 1, wherein the text data is captions.

7. The system of claim 1, wherein the text data is extracted from speech data.

8. The system of claim 1, wherein the at least two entities in the image data are identified based on facial recognition.

9. A method, comprising: receiving a video; extracting image data and text data from the video; identifying at least two entities in the image data; assigning at least one entity relation to the at least two entities in the image data; identifying at least two entities in the text data; assigning at least one entity relation to the at least two entities in the text data; generating an image knowledge graph for the at least one entity relation assigned to the at least two entities in the image data; generating a text knowledge graph for the at least one entity relation assigned to the at least two entities in the text data; and generating a weighted knowledge graph based on the image knowledge graph and the text knowledge graph.

10. The method of claim 9, wherein the weighted knowledge graph includes relation weights for the at least one entity relation assigned to the at least two entities in the image data and the at least one entity relation assigned to the at least two entities in the text data.

11. The method of claim 10, further comprising: identifying a top relation in the at least one entity relation assigned to the at least two entities in the image data and the at least one entity relation assigned to the at least two entities in the text data, wherein the top relation is an entity relation having a relation weight greater than a threshold relation weight; selecting frames of the video that correspond to the top relation; and grouping the frames into a video segment.

12. The method of claim 11, further comprising: determining that there are remaining frames of the video that do not include the top relation; determining that the frames in the video segment are nearest to the remaining frames; and grouping the remaining frames with the video segment.

13. The method of claim 9, wherein the video is divided into pictures, wherein each picture includes a set of frames.

14. The method of claim 9, wherein the text data is captions.

15. The method of claim 9, wherein the at least two entities in the image data are identified based on facial recognition.

16. A computer program product for segmenting videos, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause a device to perform a method, the method comprising: receiving a video; extracting image data and text data from the video; identifying at least two entities in the image data; assigning at least one entity relation to the at least two entities in the image data; identifying at least two entities in the text data; assigning at least one entity relation to the at least two entities in the text data; generating an image knowledge graph for the at least one entity relation assigned to the at least two entities in the image data; generating a text knowledge graph for the at least one entity relation assigned to the at least two entities in the text data; and generating a weighted knowledge graph based on the image knowledge graph and the text knowledge graph.

17. The computer program product of claim 16, wherein the weighted knowledge graph includes relation weights for the at least one entity relation assigned to the at least two entities in the image data and the at least one entity relation assigned to the at least two entities in the text data.

18. The computer program product of claim 17, further comprising: identifying a top relation in the at least one entity relation assigned to the at least two entities in the image data and the at least one entity relation assigned to the at least two entities in the text data, wherein the top relation is an entity relation having a relation weight greater than a threshold relation weight; selecting frames of the video that correspond to the top relation; and grouping the frames into a video segment.

19. The computer program product of claim 18, further comprising: determining that there are remaining frames of the video that do not include the top relation; determining that the frames in the video segment are nearest to the remaining frames; and grouping the remaining frames with the video segment.

20. The computer program product of claim 16, wherein the at least two entities in the image data are identified based on facial recognition.
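The grouping steps recited in the dependent claims (pick top relations above a threshold weight, group the frames that show them, then attach leftover frames to the nearest segment) can be illustrated with a short sketch. This is not the patented implementation: the function name `segment_frames`, the string relation labels, the sample weights, the rule that a frame showing several top relations goes to the highest-weighted one, and the use of frame-index distance as "nearest" are all assumptions made for illustration.

```python
def segment_frames(frame_relations, weights, threshold):
    """Group frame indices into segments keyed by top relation.

    frame_relations: list where item i is the set of relations seen in frame i.
    weights: dict mapping each relation to its relation weight.
    threshold: a relation is a "top relation" if its weight exceeds this.
    """
    # Top relations: weight strictly greater than the threshold.
    top = sorted(rel for rel, w in weights.items() if w > threshold)
    segments = {rel: [] for rel in top}
    remaining = []

    for idx, rels in enumerate(frame_relations):
        hits = [rel for rel in top if rel in rels]
        if hits:
            # Assumption: a frame with several top relations joins the
            # segment of the highest-weighted one.
            best = max(hits, key=lambda rel: weights[rel])
            segments[best].append(idx)
        else:
            remaining.append(idx)

    # Attach each remaining frame to the segment whose originally selected
    # frames are closest in frame index (assumed meaning of "nearest").
    anchors = {rel: list(frames) for rel, frames in segments.items() if frames}
    for idx in remaining:
        if not anchors:
            break
        nearest = min(
            anchors,
            key=lambda rel: min(abs(idx - f) for f in anchors[rel]),
        )
        segments[nearest].append(idx)

    return {rel: sorted(frames) for rel, frames in segments.items()}


# Hypothetical weights and per-frame relations.
weights = {"Alice-Bob": 1.5, "Bob-Carol": 0.5, "Carol-Dave": 2.0}
frame_relations = [
    {"Alice-Bob"},
    {"Alice-Bob", "Bob-Carol"},
    set(),
    {"Bob-Carol"},
    {"Carol-Dave"},
    {"Carol-Dave"},
]

result = segment_frames(frame_relations, weights, threshold=1.0)
print(result)
# {'Alice-Bob': [0, 1, 2], 'Carol-Dave': [3, 4, 5]}
```

In this example frames 2 and 3 contain no top relation ("Bob-Carol" falls below the threshold), so each is attached to whichever segment has a selected frame closest to it.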

Assignees

IBM

Inventors

Classifications

  • G06N5/022 (Primary)

    Knowledge engineering; Knowledge acquisition

  • using pattern recognition or machine learning (optical pattern recognition or electronic computations therefor G06V10/88)

  • using neural networks

  • G06V20/49 (Primary)

    Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes

  • Extraction of image or video features

Frequently asked questions

What does patent US11093755B2 cover?
A system, method, and computer program product for segmenting videos. The system includes at least one processing component, at least one memory component, a video, an extraction component, and a graphing component. The extraction component is configured to extract image and text data from the video, identify entities in the image data, assign at least one entity relation to the entities in the…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N5/022. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 17, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).