US10832734B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10832734-B2 |
| Application number | US-201916283912-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 25, 2019 |
| Priority date | Feb 25, 2019 |
| Publication date | Nov 10, 2020 |
| Grant date | Nov 10, 2020 |
Official abstract text for this publication.
Techniques for padding audiovisual clips (for example, audiovisual clips of sporting events) for the purpose of causing the clip to have a predetermined duration so that the padded clip can be evaluated for viewer interest by a machine learning (ML) algorithm. The unpadded clip is padded with audiovisual segment(s) that will cause the padded clip to have a level of viewer interest that it would have if the unpadded clip had been longer. In some embodiments the padded segments are synthetic images generated by a generative adversarial network such that the synthetic images would have the same level of viewer interest (as adjudged by an ML algorithm) as if the unpadded clip had been shot to be longer.
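As a rough illustration of the "predetermined duration" idea in the abstract, the sketch below computes where padding would have to go for a short clip to reach a fixed length. It assumes the missing time is split evenly before and after the clip, which is only one of the arrangements the claims allow. The names `Interval` and `padding_intervals` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Interval:
    """A time span in seconds, measured from the start of the padded clip."""
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def padding_intervals(clip_duration: float, target_duration: float) -> List[Interval]:
    """Return the padding intervals needed to reach target_duration.

    Assumption (not from the patent): the missing time is split evenly
    into one interval before the clip and one interval after it.
    """
    if clip_duration <= 0 or target_duration <= 0:
        raise ValueError("durations must be positive")
    missing = target_duration - clip_duration
    if missing <= 0:
        return []  # the clip is already long enough; nothing to pad
    half = missing / 2
    return [
        Interval(0.0, half),                              # padding before the clip
        Interval(half + clip_duration, target_duration),  # padding after the clip
    ]


# Example: a 6-second clip padded to a 10-second target
intervals = padding_intervals(6.0, 10.0)
```

Splitting the padding evenly keeps the original clip centered in the padded result, but one-sided padding (only before or only after) would fit the abstract equally well.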
Opening claim text (preview).
What is claimed is:

1. A method comprising: receiving a first unpadded audiovisual segment data set including information indicative of a first unpadded audiovisual segment; determining a set of padding time interval(s) occurring before and/or after the first unpadded segment; for each given padding time interval of the set of padding time interval(s): determining a respectively corresponding viewer interest value that would characterize the given padding time interval if the first unpadded audiovisual segment continued through the given padding time interval and had its viewer interest value determined by a machine learning (ML) algorithm, and generating a synthetic audiovisual segment for the given padding time interval so that the synthetic audiovisual segment for the given padding time interval is characterized by the viewer interest value determined for the given padding time interval; assembling the first unpadded audiovisual segment with the synthetic audiovisual segment(s) corresponding to each padding time interval of the set of padding time interval(s) to obtain a first padded audiovisual segment data set including information indicative of a first padded audiovisual segment; and determining, by the ML algorithm, a viewer interest value for the first padded audiovisual segment considered as a whole; wherein the generation of the synthetic audiovisual segment for each given padding time interval is performed by a generative adversarial network (GAN).

2. The method of claim 1 further comprising: selecting the first unpadded audiovisual segment for inclusion in a larger video presentation based, at least in part, upon the viewer interest value for the first padded audiovisual segment considered as a whole.

3. The method of claim 1 wherein the synthetic audiovisual segment(s) are not understandable to human viewers.

4. The method of claim 1 wherein: there are two padding time intervals as follows: (i) a first padding time interval occurring immediately before the first unpadded audiovisual segment, and (ii) a second padding time interval occurring immediately after the first unpadded audiovisual segment; and the first and second padding time intervals are at least substantially of equal duration.

5. The method of claim 1 further comprising: training the ML algorithm with a plurality of training data sets, with each training data set including: (i) an audiovisual segment data set including information indicative of an audiovisual segment, and (ii) a viewer interest value; wherein the generation of the synthetic audiovisual segment for each given padding time interval is based upon the plurality of training data sets.

6. A computer program product (CPP) comprising: a computer readable storage medium; and computer code stored on the machine readable storage device, with the computer code including instructions for causing a processor(s) set to perform operations including the following: receiving a first unpadded audiovisual segment data set including information indicative of a first unpadded audiovisual segment, determining a set of padding time interval occurring before and/or after the first unpadded segment, for each given padding time interval of the set of padding time interval(s): determining a respectively corresponding viewer interest value that would characterize the given padding time interval if the first unpadded audiovisual segment continued through the given padding time interval and had its viewer interest value determined by a machine learning (ML) algorithm, and generating a synthetic audiovisual segment for the given padding time interval so that the synthetic audiovisual segment for the given padding time interval is characterized by the viewer interest value determined for the given padding time interval, assembling the first unpadded audiovisual segment with the synthetic audiovisual segment(s) corresponding to each padding time interval of the set of padding time interval(s) to obtain a first padded audiovisual segment data set including information indicative of a first padded audiovisual segment, and determining, by the ML algorithm, a viewer interest value for the first padded audiovisual segment considered as a whole, wherein the generation of the synthetic audiovisual segment for each given padding time interval is performed by a generative adversarial network (GAN).

7. The CPP of claim 6, wherein the computer code further includes instructions for causing the processor(s) set to perform the following operations: selecting the first unpadded audiovisual segment for inclusion in a larger video presentation based, at least in part, upon the viewer interest value for the first padded audiovisual segment considered as a whole.

8. The CPP of claim 6 wherein the synthetic audiovisual segment(s) are not understandable to human viewers.

9. The CPP of claim 6 wherein: there are two padding time intervals as follows: (i) a first padding time interval occurring immediately before the first unpadded audiovisual segment, and (ii) a second padding time interval occurring immediately after the first unpadded audiovisual segment; and the first and second padding time intervals are at least substantially of equal duration.

10. The CPP of claim 6, wherein the computer code further includes instructions for causing the processor(s) set to perform the following operations: training the ML algorithm with a plurality of training data sets, with each training data set including: (i) an audiovisual segment data set including information indicative of an audiovisual segment, and (ii) a viewer interest value; wherein the generation of the synthetic audiovisual segment for each given padding time interval is based upon the plurality of training data sets.

11. A computer system (CS) comprising: a processor(s) set; a machine readable storage device; and computer code stored on the machine readable storage device, with the computer code including instructions for causing the processor(s) set to perform operations including the following: receiving a first unpadded audiovisual segment data set including information indicative of a first unpadded audiovisual segment, determining a set of padding time interval(s) occurring before and/or after the first unpadded segment, for each given padding time interval of the set of padding time interval(s): determining a respectively corresponding viewer interest value that would characterize the given padding time interval if the first unpadded audiovisual segment continued through the given padding time interval and had its viewer interest value determined by a machine learning (ML) algorithm, and generating a synthetic audiovisual segment for the given padding time interval so that the synthetic audiovisual segment for the given padding time interval is characterized by the viewer interest value determined for the given padding time interval, assembling the first unpadded audiovisual segment with the synthetic audiovisual segment(s) corresponding to each padding time interval of the set of padding time interval(s) to obtain a first padded audiovisual segment data set including information indicative of a first padded audiovisual segment, and determining, by the ML algorithm, a viewer interest value for the first padded audiovisual segment considered as a whole, wherein the generation of the synthetic audiovisual segment for each given padding time interval is performed by a generative adversarial network (GAN).

12. The CS of claim 11, wherein the computer code further includes instructions for causing the processor(s) set to perform the following operations: selecting the first unpadded audio
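The sequence of operations recited in claim 1 (estimate an interest value for each padding interval, generate a synthetic segment with that value, assemble, then score the whole) can be sketched in a few lines. This is a toy illustration under heavy assumptions and not the patented implementation: each frame is reduced to a single hypothetical scalar, `mean_scorer` stands in for the trained ML algorithm, and `constant_generator` stands in for the GAN that the claims actually require. Using the clip's own score as the target for each padding interval is also an assumption made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Frame = float  # hypothetical: one scalar "interest feature" per frame
Scorer = Callable[[Sequence[Frame]], float]
Generator = Callable[[int, float], List[Frame]]


def mean_scorer(frames: Sequence[Frame]) -> float:
    """Placeholder for the ML interest model: the average of the frame features."""
    return sum(frames) / len(frames)


def constant_generator(n_frames: int, target_interest: float) -> List[Frame]:
    """Placeholder for the GAN: frames whose mean score equals target_interest."""
    return [target_interest] * n_frames


def pad_and_score(
    clip: Sequence[Frame],
    frames_before: int,
    frames_after: int,
    scorer: Scorer = mean_scorer,
    generator: Generator = constant_generator,
) -> Tuple[List[Frame], float]:
    """Pad a clip with synthetic frames and score the padded result."""
    # Assumption: the interest the clip would have had, had it continued
    # through a padding interval, is approximated by the clip's own score.
    target = scorer(clip)
    before = generator(frames_before, target)  # synthetic segment before the clip
    after = generator(frames_after, target)    # synthetic segment after the clip
    padded = before + list(clip) + after       # assemble the padded segment
    return padded, scorer(padded)              # score the padded segment as a whole


padded, interest = pad_and_score([0.25, 0.75], frames_before=2, frames_after=2)
```

Because the synthetic frames are built to carry the same interest value as the clip, the padded clip's score stays close to the unpadded clip's score, which is the effect the claims aim for.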
involving operations for analysing video streams, e.g. detecting features or characteristics (television picture signal circuitry for scene change detection H04N5/147; filtering for image enhancement G06T5/00; methods or arrangements for recognising scenes G06V20/00; arrangements characterised by components specially adapted for monitoring, identification or recognition of video in broadcast systems H04H60/59) · CPC title
using neural networks · CPC title
using classification, e.g. of video objects · CPC title
Insert-editing · CPC title
Combinations of networks · CPC title