What technology area does this patent fall under?

Primary CPC classification H04N21/23418. Mapped technology areas include Electricity.

When was this patent published?

Publication date Nov 10, 2020 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Dynamic audiovisual segment padding for machine learning

US10832734B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10832734-B2
Application number	US-201916283912-A
Country	US
Kind code	B2
Filing date	Feb 25, 2019
Priority date	Feb 25, 2019
Publication date	Nov 10, 2020
Grant date	Nov 10, 2020
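
The page shows the publication number in two forms, hyphenated (US-10832734-B2) and compact (US10832734B2). As a rough sketch, the snippet below splits either form into country, serial number, and kind code. The function name `parse_publication_number` and the regular expression are assumptions based only on the forms visible on this page, not an official numbering specification.

```python
import re
from typing import NamedTuple


class PublicationNumber(NamedTuple):
    country: str  # two-letter country code, e.g. "US"
    serial: str   # numeric part, kept as a string
    kind: str     # kind code, e.g. "B2"


# Assumed pattern: 2 letters, digits, then a letter with an optional digit,
# with optional hyphens between the parts.
_PATTERN = re.compile(r"^([A-Z]{2})-?(\d+)-?([A-Z]\d?)$")


def parse_publication_number(text: str) -> PublicationNumber:
    """Split a publication number into country, serial, and kind code."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized publication number: {text!r}")
    return PublicationNumber(*match.groups())


print(parse_publication_number("US-10832734-B2"))
print(parse_publication_number("US10832734B2"))
```

Both calls should yield the same three fields, which match the Country and Kind code rows in the table above.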

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques for padding audiovisual clips (for example, audiovisual clips of sporting events) for the purpose of causing the clip to have a predetermined duration so that the padded clip can be evaluated for viewer interest by a machine learning (ML) algorithm. The unpadded clip is padded with audiovisual segment(s) that will cause the padded clip to have a level of viewer interest that it would have if the unpadded clip had been longer. In some embodiments the padded segments are synthetic images generated by a generative adversarial network such that the synthetic images would have the same level of viewer interest (as adjudged by an ML algorithm) as if the unpadded clip had been shot to be longer.
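
To make the duration arithmetic in the abstract concrete, the sketch below computes how much padding a clip needs to reach a predetermined duration. The names `plan_padding` and `PaddingPlan` are hypothetical, and splitting the shortfall equally before and after the clip is only one arrangement (the one described in claim 4), assumed here for illustration rather than taken as the patent's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaddingPlan:
    before_s: float  # seconds of padding placed before the clip
    after_s: float   # seconds of padding placed after the clip


def plan_padding(clip_s: float, target_s: float) -> PaddingPlan:
    """Split the shortfall between clip length and target length in half."""
    if clip_s < 0 or target_s <= 0:
        raise ValueError("durations must be non-negative, target must be positive")
    # A clip that already meets the target duration needs no padding.
    shortfall = max(0.0, target_s - clip_s)
    half = shortfall / 2
    return PaddingPlan(before_s=half, after_s=half)


print(plan_padding(4.0, 6.0))  # PaddingPlan(before_s=1.0, after_s=1.0)
```

A four-second clip evaluated by a model that expects six-second inputs would therefore receive one second of padding on each side under this assumption.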

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving a first unpadded audiovisual segment data set including information indicative of a first unpadded audiovisual segment; determining a set of padding time interval(s) occurring before and/or after the first unpadded segment; for each given padding time interval of the set of padding time interval(s): determining a respectively corresponding viewer interest value that would characterize the given padding time interval if the first unpadded audiovisual segment continued through the given padding time interval and had its viewer interest value determined by a machine learning (ML) algorithm, and generating a synthetic audiovisual segment for the given padding time interval so that the synthetic audiovisual segment for the given padding time interval is characterized by the viewer interest value determined for the given padding time interval; assembling the first unpadded audiovisual segment with the synthetic audiovisual segment(s) corresponding to each padding time interval of the set of padding time interval(s) to obtain a first padded audiovisual segment data set including information indicative of a first padded audiovisual segment; and determining, by the ML algorithm, a viewer interest value for the first padded audiovisual segment considered as a whole; wherein the generation of the synthetic audiovisual segment for each given padding time interval is performed by a generative adversarial network (GAN).

2. The method of claim 1 further comprising: selecting the first unpadded audiovisual segment for inclusion in a larger video presentation based, at least in part, upon the viewer interest value for the first padded audiovisual segment considered as a whole.

3. The method of claim 1 wherein the synthetic audiovisual segment(s) are not understandable to human viewers.

4. The method of claim 1 wherein: there are two padding time intervals as follows: (i) a first padding time interval occurring immediately before the first unpadded audiovisual segment, and (ii) a second padding time interval occurring immediately after the first unpadded audiovisual segment; and the first and second padding time intervals are at least substantially of equal duration.

5. The method of claim 1 further comprising: training the ML algorithm with a plurality of training data sets, with each training data set including: (i) an audiovisual segment data set including information indicative of an audiovisual segment, and (ii) a viewer interest value; wherein the generation of the synthetic audiovisual segment for each given padding time interval is based upon the plurality of training data sets.

6. A computer program product (CPP) comprising: a computer readable storage medium; and computer code stored on the machine readable storage device, with the computer code including instructions for causing a processor(s) set to perform operations including the following: receiving a first unpadded audiovisual segment data set including information indicative of a first unpadded audiovisual segment, determining a set of padding time interval occurring before and/or after the first unpadded segment, for each given padding time interval of the set of padding time interval(s): determining a respectively corresponding viewer interest value that would characterize the given padding time interval if the first unpadded audiovisual segment continued through the given padding time interval and had its viewer interest value determined by a machine learning (ML) algorithm, and generating a synthetic audiovisual segment for the given padding time interval so that the synthetic audiovisual segment for the given padding time interval is characterized by the viewer interest value determined for the given padding time interval, assembling the first unpadded audiovisual segment with the synthetic audiovisual segment(s) corresponding to each padding time interval of the set of padding time interval(s) to obtain a first padded audiovisual segment data set including information indicative of a first padded audiovisual segment, and determining, by the ML algorithm, a viewer interest value for the first padded audiovisual segment considered as a whole, wherein the generation of the synthetic audiovisual segment for each given padding time interval is performed by a generative adversarial network (GAN).

7. The CPP of claim 6, wherein the computer code further includes instructions for causing the processor(s) set to perform the following operations: selecting the first unpadded audiovisual segment for inclusion in a larger video presentation based, at least in part, upon the viewer interest value for the first padded audiovisual segment considered as a whole.

8. The CPP of claim 6 wherein the synthetic audiovisual segment(s) are not understandable to human viewers.

9. The CPP of claim 6 wherein: there are two padding time intervals as follows: (i) a first padding time interval occurring immediately before the first unpadded audiovisual segment, and (ii) a second padding time interval occurring immediately after the first unpadded audiovisual segment; and the first and second padding time intervals are at least substantially of equal duration.

10. The CPP of claim 6, wherein the computer code further includes instructions for causing the processor(s) set to perform the following operations: training the ML algorithm with a plurality of training data sets, with each training data set including: (i) an audiovisual segment data set including information indicative of an audiovisual segment, and (ii) a viewer interest value; wherein the generation of the synthetic audiovisual segment for each given padding time interval is based upon the plurality of training data sets.

11. A computer system (CS) comprising: a processor(s) set; a machine readable storage device; and computer code stored on the machine readable storage device, with the computer code including instructions for causing the processor(s) set to perform operations including the following: receiving a first unpadded audiovisual segment data set including information indicative of a first unpadded audiovisual segment, determining a set of padding time interval(s) occurring before and/or after the first unpadded segment, for each given padding time interval of the set of padding time interval(s): determining a respectively corresponding viewer interest value that would characterize the given padding time interval if the first unpadded audiovisual segment continued through the given padding time interval and had its viewer interest value determined by a machine learning (ML) algorithm, and generating a synthetic audiovisual segment for the given padding time interval so that the synthetic audiovisual segment for the given padding time interval is characterized by the viewer interest value determined for the given padding time interval, assembling the first unpadded audiovisual segment with the synthetic audiovisual segment(s) corresponding to each padding time interval of the set of padding time interval(s) to obtain a first padded audiovisual segment data set including information indicative of a first padded audiovisual segment, and determining, by the ML algorithm, a viewer interest value for the first padded audiovisual segment considered as a whole, wherein the generation of the synthetic audiovisual segment for each given padding time interval is performed by a generative adversarial network (GAN).

12. The CS of claim 11, wherein the computer code further includes instructions for causing the processor(s) set to perform the following operations: selecting the first unpadded audio
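
The steps recited in claim 1 can be pictured with a toy pipeline. Everything in the sketch below is a stand-in chosen for illustration: a frame is reduced to a single number, `score_interest` (a plain mean) plays the role of the ML algorithm, `generate_synthetic` plays the role of the GAN, and the interest value assigned to each padding interval is simply assumed to equal the unpadded clip's own score. None of these choices comes from the patent, which does not disclose an implementation on this page.

```python
from statistics import fmean
from typing import List, Sequence, Tuple

Frame = float  # hypothetical: one scalar feature per frame


def score_interest(frames: Sequence[Frame]) -> float:
    """Toy stand-in for the ML viewer-interest model: mean frame feature."""
    return fmean(frames)


def generate_synthetic(n_frames: int, target_interest: float) -> List[Frame]:
    """Toy stand-in for the GAN: frames that score exactly the target."""
    return [target_interest] * n_frames


def pad_and_score(unpadded: Sequence[Frame], target_len: int) -> Tuple[List[Frame], float]:
    """Pad a clip to target_len frames, then score the padded clip as a whole."""
    if not unpadded:
        raise ValueError("unpadded clip must contain at least one frame")

    # Determine the padding intervals before and after the clip.
    shortfall = max(0, target_len - len(unpadded))
    interval_lengths = {"before": shortfall // 2, "after": shortfall - shortfall // 2}

    # For each interval, pick an interest value and generate a matching segment.
    segments = {}
    for name, n_frames in interval_lengths.items():
        interest = score_interest(unpadded)  # assumption: clip "continues" unchanged
        segments[name] = generate_synthetic(n_frames, interest)

    # Assemble the padded clip and score it as a whole.
    padded = segments["before"] + list(unpadded) + segments["after"]
    return padded, score_interest(padded)


padded, interest = pad_and_score([0.2, 0.8, 0.5], target_len=7)
print(len(padded), round(interest, 3))
```

In this toy setting the padding leaves the clip's score unchanged, which is the property the claims aim for: the padded clip is rated as the unpadded clip would have been had it run longer.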

Assignees

IBM

Inventors

Classifications

H04N21/23418 · Primary
involving operations for analysing video streams, e.g. detecting features or characteristics (television picture signal circuitry for scene change detection H04N5/147; filtering for image enhancement G06T5/00; methods or arrangements for recognising scenes G06V20/00; arrangements characterised by components specially adapted for monitoring, identification or recognition of video in broadcast systems H04H60/59) · CPC title
G06V10/82
using neural networks · CPC title
G06V10/764
using classification, e.g. of video objects · CPC title
G11B27/036 · Primary
Insert-editing · CPC title
G06N3/045
Combinations of networks · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 72143024

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10832734B2 cover?: Techniques for padding audiovisual clips (for example, audiovisual clips of sporting events) for the purpose of causing the clip to have a predetermined duration so that the padded clip can be evaluated for viewer interest by a machine learning (ML) algorithm. The unpadded clip is padded with audiovisual segment(s) that will cause the padded clip to have a level of viewer interest that it would…
Who is the assignee on this patent?: IBM

Related patents

Face recognition using larger pose face frontalization

Deep learning medical systems and methods for image reconstruction and quality evaluation

Attribute similarity-based search

Super resolution using a generative adversarial network

System and method for brand monitoring and trend analysis based on deep-content-classification

Signature generation for multimedia deep-content-classification by a large-scale matching system and method thereof
