US9514353B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9514353-B2 |
| Application number | US-201114347280-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 31, 2011 |
| Priority date | Oct 31, 2011 |
| Publication date | Dec 6, 2016 |
| Grant date | Dec 6, 2016 |
Official abstract text for this publication.
A method for finding a temporal face sequence (412) includes, with a physical computing system (100), analyzing frames within a shot within a video, applying a face detection function to the frames, and in response to detecting a face within one of the frames, tracing a person associated with the face both backwards and forwards through frames within the shot. A temporal face sequence (412) is then defined as a sequence of frames that include frames within the shot spanning when the person is shown.
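The flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name `find_temporal_face_sequence` and the callables `detect_face` and `person_visible` are hypothetical stand-ins for a real face detector and person tracker, and the tracker is simplified to a per-frame predicate rather than a stateful model.

```python
from typing import Callable, Optional, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def find_temporal_face_sequence(
    frames: Sequence[Frame],
    detect_face: Callable[[Frame], bool],
    person_visible: Callable[[Frame], bool],
) -> Optional[Tuple[int, int]]:
    """Return (first_index, last_index) of the frames showing the person.

    detect_face and person_visible are hypothetical callables supplied by
    the caller; they stand in for a face detector and a person tracker.
    """
    # Face detection instant: the first frame in which a face is detected.
    detection_index = next(
        (i for i, frame in enumerate(frames) if detect_face(frame)), None
    )
    if detection_index is None:
        return None  # no face detected anywhere in the shot

    # Backward tracking: walk toward the start of the shot while the
    # person is still visible, to find when the person enters.
    start = detection_index
    while start > 0 and person_visible(frames[start - 1]):
        start -= 1

    # Forward tracking: walk toward the end of the shot while the
    # person is still visible, to find when the person leaves.
    end = detection_index
    while end < len(frames) - 1 and person_visible(frames[end + 1]):
        end += 1

    return start, end


# Toy shot: each "frame" is a label describing what is visible.
shot = ["empty", "back", "profile", "face", "profile", "empty"]
sequence = find_temporal_face_sequence(
    shot,
    detect_face=lambda frame: frame == "face",
    person_visible=lambda frame: frame != "empty",
)
```

In the toy shot the face is only detectable in one frame, but the sequence extends to the neighboring frames where the person is visible without a detectable face, which is the reason for tracking in both directions from the detection instant.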
Opening claim text (preview).
What is claimed is:
1. A method for finding a temporal face sequence, the method comprising: applying, by a computing system, a face detection function to a shot within a video, wherein the shot comprises a series of frames; determining, by the computing system, a face detection instant, wherein the face detection instant is an instant the face detection function detects a face of a person within one frame of the frames; determining, by the computer system, a time the person enters the shot by backward tracking the person through the frames from the face detection instant; determining, by the computer system, a time the person leaves the shot by forward tracking the person through the frames from the face detection instant; determining, by the computing system, a time range from the time the person enters the shot to the time the person leaves the shot; and identifying a face temporal sequence for the person, wherein the face temporal sequence comprises the determined time range.
2. The method of claim 1, further comprising determining, by the computing system, an optimal frame within the temporal face sequence, wherein the optimal frame comprises an optimal image of the face to be used for face clustering, wherein determining the optimal frame includes determining at least one of an angle of the face to a camera, eye localization, size of the face, an illumination condition of the face, and detection confidence values.
3. The method of claim 2, further comprising applying, by the computing system, a face clustering function to a number of temporal face sequences from a number of shots from the video.
4. The method of claim 3, wherein the face clustering function comprises an agglomerative face clustering function.
5. The method of claim 3, wherein the face clustering function is subject to at least one of: a must-link constraint and a cannot-link constraint.
6. The method of claim 1, wherein forward tracking and backward tracking the person comprises using a head-and-shoulder model.
7. The method of claim 1, wherein the forward tracking and the backward tracking of the person comprises reapplying the face detection function every set number of frames.
8. The method of claim 1, wherein the face detection instant is a first instant the face detection function detects the face of the person and not another feature of the person.
9. A computing system comprising: at least one processor; a memory communicatively coupled to the at least one processor, the memory comprising computer executable instructions that, when executed by the at least one processor, causes the at least one processor to: apply a face detection function to a shot within a video, wherein the shot comprises a series of frames; determine a face detection instant, wherein the face detection instant is an instant the face detection function detects a face within one frame of the frames determine a time the person enters the shot by backward tracking the person through the frames from the face detection instant; determine a time the person leaves the shot by forward tracking the person through the frames from the face detection instant; determine a time range from the time the person enters the shot to the time the person leaves the shot; and identify a face temporal sequence for the person, wherein the face temporal sequence comprises the determined time range.
10. The system of claim 9, wherein the computer readable program code further comprises computer executable instructions that, when executed, cause the at least one processor to determine an optimal frame within the temporal face sequence, the optimal frame comprising an optimal image of the face to be used for face clustering, wherein to determine the optimal frame, the computer executable instructions, when executed, cause the at least one processor to determine at least one of an angle of the face to a camera, eye localization, size of the face, an illumination condition of the face, and detection confidence values.
11. The system of claim 10, wherein the computer readable program code further comprises computer executable instructions that, when executed, cause the at least one processor to apply a face clustering function to a number of temporal face sequences from a number of shots from the video and optimal frames within the number of temporal face sequences.
12. The system of claim 11, wherein the face clustering function comprises an agglomerative face clustering function.
13. The system of claim 11, wherein the face clustering function is subject to at least one of: a must-link constraint and a cannot-link constraint.
14. The system of claim 9, wherein to backward track and forward track the person, the computer executable instructions are to cause the at least one processor to use a head-and-shoulder model.
15. The system of claim 9, wherein to backward track and forward track the person, the computer executable instructions are to cause the at least one processor to reapply the face detection function every set number of frames.
16. The system of claim 9, wherein the face detection instant is a first instant the face detection function detects the face of the person and not another feature of the person.
17. A method for video face clustering executed by a computing system, the method comprising: applying, by the computing system, a face detection function to a series of frames of a shot within a video; determining a face detection instant, wherein the face detection instant is an instant the face detection function detects a face within one frame of the frames; determining a time a person associated with the face enters the shot by backward tracking the person through the series of frames within the shot; determining a time the person leaves the shot by forward tracking the person through the frames from the face detection instant; determining a time range from the time the person enters the shot to the time the person leaves the shot; defining a temporal face sequence identifying a subset of frames in which the person appears based on the determined time range; determining a set of optimal frames within the temporal face sequence, the optimal frames comprising an optimal image of the face to be used for face clustering, wherein determining the set of optimal frames includes determining at least one of an angle of the face to a camera, eye localization, size of the face, an illumination condition of the face, and detection confidence values; and applying a face clustering function to a number of temporal face sequences from a number of shots from the video based on the determined set of optimal frames.
18. The method of claim 17, wherein the face clustering function is subject to at least one of: a must-link constraint and a cannot-link constraint.
19. The method of claim 17, wherein the face detection instant is a first instant a face of the person, and not another feature of the person, is first detected within one frame of the frames.
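The optimal-frame and clustering steps named in the claims can be sketched as follows. This is an illustration under stated assumptions, not the claimed method itself: the `FaceObservation` fields, the scoring formula in `pick_optimal_frame`, the Euclidean feature distance, the average-linkage rule, and the `threshold` stopping criterion are all hypothetical choices. The claims only name the criteria (angle, size, detection confidence, and so on) and the constraint types (must-link, cannot-link).

```python
import math
from dataclasses import dataclass
from itertools import combinations
from typing import List, Sequence, Set, Tuple


@dataclass
class FaceObservation:
    frame_index: int
    yaw_degrees: float  # hypothetical: 0 means the face looks at the camera
    face_area: float    # hypothetical: face size in pixels
    confidence: float   # hypothetical: detector confidence in [0, 1]


def pick_optimal_frame(observations: Sequence[FaceObservation]) -> int:
    """Pick the frame with the best score (assumed scoring formula)."""
    def score(o: FaceObservation) -> float:
        frontalness = max(0.0, math.cos(math.radians(o.yaw_degrees)))
        return o.confidence * o.face_area * frontalness
    return max(observations, key=score).frame_index


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def constrained_agglomerative(
    features: Sequence[Sequence[float]],
    must_link: Sequence[Tuple[int, int]],
    cannot_link: Sequence[Tuple[int, int]],
    threshold: float,
) -> List[List[int]]:
    """Cluster sequence indices bottom-up while honoring both constraints."""
    clusters: List[Set[int]] = [{i} for i in range(len(features))]

    # Must-link pairs are merged before any distance is considered.
    for a, b in must_link:
        ca = next(c for c in clusters if a in c)
        cb = next(c for c in clusters if b in c)
        if ca is not cb:
            clusters = [c for c in clusters if c is not ca and c is not cb]
            clusters.append(ca | cb)

    forbidden = {frozenset(pair) for pair in cannot_link}

    def violates(c1: Set[int], c2: Set[int]) -> bool:
        return any(frozenset((i, j)) in forbidden for i in c1 for j in c2)

    def linkage(c1: Set[int], c2: Set[int]) -> float:
        # Average linkage over all cross-cluster pairs (an assumption).
        total = sum(distance(features[i], features[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while True:
        best = None
        for c1, c2 in combinations(clusters, 2):
            if violates(c1, c2):
                continue  # a cannot-link pair would end up together
            d = linkage(c1, c2)
            if d <= threshold and (best is None or d < best[0]):
                best = (d, c1, c2)
        if best is None:
            break  # no admissible merge remains
        _, c1, c2 = best
        clusters = [c for c in clusters if c is not c1 and c is not c2]
        clusters.append(c1 | c2)

    return sorted(sorted(c) for c in clusters)


observations = [
    FaceObservation(10, 60.0, 4000.0, 0.90),
    FaceObservation(11, 5.0, 3600.0, 0.95),
    FaceObservation(12, 0.0, 900.0, 0.99),
]
best_frame = pick_optimal_frame(observations)

# One feature vector per temporal face sequence (toy 2-D values).
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.05, 0.0)]
groups = constrained_agglomerative(
    features, must_link=[(2, 3)], cannot_link=[(0, 4)], threshold=1.0
)
```

In the toy data, sequences 0 and 4 are close in feature space but carry a cannot-link constraint, so sequence 0 stays in its own cluster once sequence 4 has merged with sequence 1. A greedy scheme like this one depends on merge order, so a different linkage rule or threshold could produce a different grouping.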
in video content (extracting overlay text G06V20/62; video retrieval G06F16/70; processing of video elementary streams in video servers H04N21/234; processing of video elementary streams in video clients H04N21/44) · CPC title
Human faces, e.g. facial parts, sketches or expressions · CPC title
Hierarchical techniques, i.e. dividing or merging pattern sets so as to obtain a dendrogram · CPC title
Clustering techniques · CPC title
Physics · mapped topic