Unsupervised video segmentation

US10402986B2 · US · B2

Patent metadata
Publication number: US-10402986-B2
Application number: US-201715849341-A
Country: US
Kind code: B2
Filing date: Dec 20, 2017
Priority date: Dec 20, 2017
Publication date: Sep 3, 2019
Grant date: Sep 3, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one embodiment, a method includes a computing system accessing a first training data comprising a first image and a second image and an associated optical flow estimation. The system may input (1) the first image into a first machine-learning model configured to generate a first output and (2) the optical flow estimation into a second machine-learning model configured to generate a second output. The first output of the first machine-learning model is associated with first image segments of a predetermined number, and the second output of the second machine-learning model is associated with transformations of the predetermined number. The first output, the transformations, and the first image are configured to generate an estimated image. The system trains the first machine-learning model and the second machine-learning model based on at least a comparison of the estimated image and the second image.
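
The training signal described in the abstract can be illustrated with a small NumPy sketch. It is not the patented implementation: the two machine-learning models are replaced by hand-written masks and shifts, the transformations are reduced to integer translations applied with `np.roll` (which wraps at the image border), and mean squared error is assumed as the "comparison". The function names `estimate_second_image` and `reconstruction_loss` are hypothetical.

```python
import numpy as np

def estimate_second_image(first_image, masks, shifts):
    """Move each masked segment by its own (dy, dx) shift and sum the results.

    Assumption: each transformation is a pure integer translation.
    """
    estimate = np.zeros_like(first_image, dtype=float)
    for mask, (dy, dx) in zip(masks, shifts):
        segment = first_image * mask  # the mask selects one image segment
        estimate += np.roll(segment, shift=(dy, dx), axis=(0, 1))
    return estimate

def reconstruction_loss(estimate, second_image):
    """Assumed comparison: mean squared error between estimate and target."""
    return float(np.mean((estimate - second_image) ** 2))

# Toy frame pair: a bright 2x2 square on a black background moves 2 pixels right.
first_image = np.zeros((6, 6))
first_image[1:3, 1:3] = 1.0
second_image = np.roll(first_image, shift=(0, 2), axis=(0, 1))

# Predetermined number of segments K = 2: background and moving object.
object_mask = (first_image > 0).astype(float)
masks = np.stack([1.0 - object_mask, object_mask])

# Correct per-segment motion reproduces the second image; wrong motion does not.
loss_good = reconstruction_loss(
    estimate_second_image(first_image, masks, [(0, 0), (0, 2)]), second_image
)
loss_bad = reconstruction_loss(
    estimate_second_image(first_image, masks, [(0, 0), (0, 0)]), second_image
)
print(loss_good, loss_bad)
```

In a trained system the masks and shifts would come from the two models, and the loss would be lowered only when the masks isolate regions that move together. That is why no segmentation labels are needed in this sketch of the idea.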

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: by a computing system, accessing a first training data in a set of training data, the first training data comprising a sequence of images with a first image and a second image; by the computing system, accessing an optical flow estimation associated with the first training data; by the computing system, inputting (1) the first image into a first machine-learning model configured to generate a first output and (2) the optical flow estimation into a second machine-learning model configured to generate a second output, wherein the first output of the first machine-learning model is associated with first image segments of a predetermined number, wherein the second output of the second machine-learning model is associated with transformations of the predetermined number, wherein the first output, the transformations, and the first image are configured to generate an estimated image; and by the computing system, training the first machine-learning model and the second machine-learning model based on at least a comparison of the estimated image and the second image; wherein the trained first machine-learning model is configured to segment images into the predetermined number of segments.

2. The method of claim 1, wherein: the first image precedes the second image in the sequence of images; or the second image precedes the first image in the sequence of images.

3. The method of claim 1, further comprising: by the computing system, generating the optical flow estimation using the first image and the second image.

4. The method of claim 1, wherein the first output of the first machine-learning model comprises masks of the predetermined number, wherein applying the masks to the first image generates, respectively, the first image segments of the predetermined number.

5. The method of claim 4, wherein the estimated image is generated by: transforming each of the first image segments using an associated one of the transformations; and combining the transformed first image segments.

6. The method of claim 5, wherein the transforming of each of the first image segments comprises: applying the associated transformation to the associated mask; applying the associated transformation to the first image; and applying the transformed associated mask to the transformed first image.

7. The method of claim 5, wherein the combining of the transformed first image segments comprises sequentially layering the transformed first image segments.

8. The method of claim 1, wherein the transformations of the predetermined number are rigid transformations, wherein the rigid transformations are respectively associated with the first image segments.

9. The method of claim 1, wherein the transformations of the predetermined number are warping operators, wherein the warping operators are respectively associated with the first image segments.

10. The method of claim 9, wherein the second output of the second machine-learning model comprises rigid transformations of the predetermined number, wherein the rigid transformations are used to respectively generate the warping operators.

11. The method of claim 1, wherein the first image is a combination of the first image segments.

12. The method of claim 1, further comprising: by the computing system, accessing a second training data in a second set of training data, the second training data comprising a third image with one or more segments, wherein at least one of the one or more segments is associated with an identified object; by the computing system, further training the trained first machine-learning model using the second training data; wherein the further-trained first machine-learning model is configured to segment and identify objects in images.

13. One or more computer-readable non-transitory storage media embodying software that is operable when executed by a computing system to cause the computing system to perform operations comprising: accessing a first training data in a set of training data, the first training data comprising a sequence of images with a first image and a second image; accessing an optical flow estimation associated with the first training data; inputting (1) the first image into a first machine-learning model configured to generate a first output and (2) the optical flow estimation into a second machine-learning model configured to generate a second output, wherein the first output of the first machine-learning model is associated with first image segments of a predetermined number, wherein the second output of the second machine-learning model is associated with transformations of the predetermined number, wherein the first output, the transformations, and the first image are configured to generate an estimated image; and training the first machine-learning model and the second machine-learning model based on at least a comparison of the estimated image and the second image; wherein the trained first machine-learning model is configured to segment images into the predetermined number of segments.

14. The media of claim 13, wherein the first output of the first machine-learning model comprises masks of the predetermined number, wherein applying the masks to the first image generates, respectively, the first image segments of the predetermined number.

15. The media of claim 14, wherein the estimated image is generated by: transforming each of the first image segments using an associated one of the transformations; and combining the transformed first image segments.

16. The media of claim 15, wherein the transforming of each of the first image segments comprises: applying the associated transformation to the associated mask; applying the associated transformation to the first image; and applying the transformed associated mask to the transformed first image.

17. A system comprising: one or more processors; and one or more computer-readable non-transitory storage media coupled to one or more of the processors and comprising instructions operable when executed by one or more of the processors to cause the system to perform operations comprising: accessing a first training data in a set of training data, the first training data comprising a sequence of images with a first image and a second image; accessing an optical flow estimation associated with the first training data; inputting (1) the first image into a first machine-learning model configured to generate a first output and (2) the optical flow estimation into a second machine-learning model configured to generate a second output, wherein the first output of the first machine-learning model is associated with first image segments of a predetermined number, wherein the second output of the second machine-learning model is associated with transformations of the predetermined number, wherein the first output, the transformations, and the first image are configured to generate an estimated image; and training the first machine-learning model and the second machine-learning model based on at least a comparison of the estimated image and the second image; wherein the trained first machine-learning model is configured to segment images into the predetermined number of segments.

18. The system of claim 17, wherein the first output of the first machine-learning model comprises masks of the predetermined number, wherein applying the masks to the first image generates, respectively, the first image segments of the predetermined number.

19. The system of c
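
The per-segment steps recited in claims 5 to 7 and the rigid-transformation-to-warp step of claim 10 can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: `rigid_warp` and `layer_segments` are hypothetical names, the warp uses nearest-neighbour inverse sampling with zero fill, and "sequentially layering" is interpreted here as standard "over" compositing in which later segments cover earlier ones.

```python
import numpy as np

def rigid_warp(array, angle_rad, shift_yx):
    """Warp a 2D array by a rotation about its centre followed by a translation.

    Assumption: nearest-neighbour inverse mapping; pixels whose source falls
    outside the array are set to 0.
    """
    height, width = array.shape
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    ys, xs = np.mgrid[0:height, 0:width].astype(float)
    # Undo the translation, then the rotation, to locate each output pixel's source.
    y0 = ys - shift_yx[0] - cy
    x0 = xs - shift_yx[1] - cx
    cos_a, sin_a = np.cos(angle_rad), np.sin(angle_rad)
    src_y = np.rint(cos_a * y0 + sin_a * x0 + cy).astype(int)
    src_x = np.rint(-sin_a * y0 + cos_a * x0 + cx).astype(int)
    inside = (src_y >= 0) & (src_y < height) & (src_x >= 0) & (src_x < width)
    warped = np.zeros_like(array, dtype=float)
    warped[inside] = array[src_y[inside], src_x[inside]]
    return warped

def layer_segments(first_image, masks, transforms):
    """Warp mask and image per segment, mask the warped image, then layer in order."""
    estimate = np.zeros_like(first_image, dtype=float)
    for mask, (angle_rad, shift_yx) in zip(masks, transforms):
        warped_mask = rigid_warp(mask, angle_rad, shift_yx)          # claim 6, step 1
        warped_image = rigid_warp(first_image, angle_rad, shift_yx)  # claim 6, step 2
        # Claim 6, step 3 and claim 7: the masked segment is laid over earlier layers.
        estimate = warped_mask * warped_image + (1.0 - warped_mask) * estimate
    return estimate

# Toy frame: grey background (0.2) with a bright 2x2 square that moves 2 pixels right.
first_image = np.full((6, 6), 0.2)
first_image[1:3, 1:3] = 1.0
object_mask = np.zeros((6, 6))
object_mask[1:3, 1:3] = 1.0
masks = [1.0 - object_mask, object_mask]            # background first, object on top
transforms = [(0.0, (0, 0)), (0.0, (0, 2))]         # (angle in radians, (dy, dx))

estimate = layer_segments(first_image, masks, transforms)
print(estimate)
```

In this toy case the moved square overwrites the background instead of adding to it, and the square's original location is left empty because no segment supplies pixels for the newly uncovered area.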

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10402986B2 cover?
In one embodiment, a method includes a computing system accessing a first training data comprising a first image and a second image and an associated optical flow estimation. The system may input (1) the first image into a first machine-learning model configured to generate a first output and (2) the optical flow estimation into a second machine-learning model configured to generate a second ou…
Who is the assignee on this patent?
Facebook Inc
What technology area does this patent fall under?
Primary CPC classification G06T7/215. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 3, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).