Edge-guided ranking loss for monocular depth prediction

US2021256717A1 · US · A1

Patent metadata
Publication number: US-2021256717-A1
Application number: US-202016790056-A
Country: US
Kind code: A1
Filing date: Feb 13, 2020
Priority date: Feb 13, 2020
Publication date: Aug 19, 2021
Grant date: not listed

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In order to provide monocular depth prediction, a trained neural network may be used. To train the neural network, edge detection on a digital image may be performed to determine at least one edge of the digital image, and then a first point and a second point of the digital image may be sampled, based on the at least one edge. A relative depth between the first point and the second point may be predicted, and the neural network may be trained to perform monocular depth prediction using a loss function that compares the predicted relative depth with a ground truth relative depth between the first point and the second point.
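The loss described in the abstract compares a predicted relative depth with a ground truth relative depth for a sampled point pair. The page does not give a formula, so the following is only a minimal sketch that assumes a common pairwise ranking formulation: a logistic penalty when the ground truth depths differ and a squared penalty when they are roughly equal. The function names, the tolerance `tau`, and the convention that a larger value means farther away are all illustrative assumptions, not taken from the patent. Ground truth depths are assumed to be strictly positive.

```python
import math


def ordinal_label(depth_a, depth_b, tau=0.03):
    """Ground truth ordinal relation between two depths (assumed positive).

    Returns +1 if a is farther than b, -1 if b is farther than a,
    and 0 if the two depths are within the (assumed) tolerance tau.
    """
    ratio = depth_a / depth_b
    if ratio >= 1.0 + tau:
        return 1
    if ratio <= 1.0 / (1.0 + tau):
        return -1
    return 0


def pair_ranking_loss(pred_a, pred_b, label):
    """Loss for one point pair given the ground truth ordinal label."""
    diff = pred_a - pred_b
    if label == 0:
        # Roughly equal ground truth depths: pull the predictions together.
        return diff * diff
    # Ordered ground truth depths: penalize predictions ordered the wrong way.
    return math.log1p(math.exp(-label * diff))


def ranking_loss(pred, gt, pairs, tau=0.03):
    """Mean pairwise loss over sampled pairs of (row, col) pixel coordinates."""
    total = 0.0
    for (ya, xa), (yb, xb) in pairs:
        label = ordinal_label(gt[ya][xa], gt[yb][xb], tau)
        total += pair_ranking_loss(pred[ya][xa], pred[yb][xb], label)
    return total / len(pairs)
```

In this sketch only the ordering of the two predictions matters for ordered pairs, which is why a loss of this kind can be trained against relative rather than metric ground truth depth.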

First claim

Claim text as published (claims 1 to 20). Claim 1 is the first independent claim.

What is claimed is:

1. A computer program product, the computer program product being tangibly embodied on a non-transitory computer-readable storage medium and comprising instructions that, when executed by at least one computing device, are configured to cause the at least one computing device to: sample a first point and a second point of a digital image, based on at least one edge of the digital image determined by an edge detection process; predict a relative depth between the first point and the second point; and train a neural network to perform monocular depth prediction using a loss function that compares the predicted relative depth with a ground truth relative depth between the first point and the second point.

2. The computer program product of claim 1, wherein the instructions, when executed, are further configured to cause the at least one computing device to: identify an anchor point on the at least one edge; and sample the first point and the second point, based on the anchor point.

3. The computer program product of claim 2, wherein the instructions, when executed, are further configured to cause the at least one computing device to: determine a normal direction of a gradient of the at least one edge that extends on a first side and a second side of the at least one edge and through the anchor point; and sample the first point and the second point from the gradient.

4. The computer program product of claim 3, wherein the instructions, when executed, are further configured to cause the at least one computing device to: sample the first point and the second point from the first side of the at least one edge.

5. The computer program product of claim 3, wherein the instructions, when executed, are further configured to cause the at least one computing device to: sample the first point from the first side of the at least one edge; and sample the second point from the second side of the at least one edge.

6. The computer program product of claim 5, wherein the instructions, when executed, are further configured to cause the at least one computing device to: sample a third point from the first side of the at least one edge; sample a fourth point from the second side of the at least one edge; and predict the relative depth based on point pairs that include the first and third points, the first and second points, and the second and fourth points.

7. The computer program product of claim 1, wherein the instructions, when executed, are further configured to cause the at least one computing device to: sample a third point and a fourth point at random from the digital image; and train the neural network to perform monocular depth prediction using the loss function to compare a predicted relative depth between the third point and the fourth point with a ground truth relative depth between the third point and the fourth point.

8. The computer program product of claim 1, wherein the instructions, when executed, are further configured to cause the at least one computing device to: implement the loss function as a ranking loss function based on an ordinal relation between the first point and the second point, as compared to the ground truth relative depth between the first point and the second point.

9. The computer program product of claim 1, wherein the first point and the second point are a first pixel and a second pixel, respectively, of the digital image.

10. A computer-implemented method, the method comprising: sampling a first point and a second point of a digital image, based on at least one edge of the digital image, as determined by an edge detection process; predicting a relative depth between the first point and the second point; and training a neural network to perform monocular depth prediction using a loss function that compares the predicted relative depth with a ground truth relative depth between the first point and the second point.

11. The method of claim 10, further comprising: identifying an anchor point on the at least one edge; and sampling the first point and the second point, based on the anchor point.

12. The method of claim 11, further comprising: determining a normal direction of a gradient of the at least one edge that extends on a first side and a second side of the at least one edge and through the anchor point; and sampling the first point and the second point from the gradient.

13. The method of claim 12, further comprising: sampling the first point and the second point from the first side of the at least one edge.

14. The method of claim 12, further comprising: sampling the first point from the first side of the at least one edge; and sampling the second point from the second side of the at least one edge.

15. The method of claim 14, further comprising: sample a third point from the first side of the at least one edge; sample a fourth point from the second side of the at least one edge; and predict the relative depth based on point pairs that include the first and third points, the first and second points, and the second and fourth points.

16. The method of claim 10, further comprising: sampling a third point and a fourth point at random from the digital image; and training the neural network to perform monocular depth prediction using the loss function to compare a predicted relative depth between the third point and the fourth point with a ground truth relative depth between the third point and the fourth point.

17. The method of claim 10, further comprising: implementing the loss function as a ranking loss function based on an ordinal relation between the first point and the second point, as compared to the ground truth relative depth between the first point and the second point.

18. The method of claim 10, wherein the first point and the second point are a first pixel and a second pixel, respectively, of the digital image.

19. A system comprising: at least one memory including instructions; and at least one processor that is operably coupled to the at least one memory and that is arranged and configured to execute instructions that, when executed, cause the at least one processor to execute a neural network to predict monocular depth prediction for a digital image, the neural network being trained by determining a ground truth depth map for a ground truth digital image; identifying at least one edge within the ground truth digital image; identifying a direction of a gradient passing through the at least one edge; sampling a point pair along the gradient; and training the neural network to predict a relative depth between points of the point pair, based on the ground truth depth map.

20. The system of claim 19, wherein the point pair includes a first pixel along the gradient and on a first side of the at least one edge, and a second pixel along the gradient and on a second side of the at least one edge.
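The sampling steps in claims 2 to 7 (anchor point on an edge, points on both sides along the gradient direction, plus random pairs) can be illustrated with a small standard-library sketch. It is not the patented implementation: the thresholded central-difference gradient stands in for an unspecified edge detector, and the helper names, `threshold`, and `max_offset` are hypothetical. The image is assumed to be a grayscale list of rows, and `max_offset` is assumed to be at least 2.

```python
import math
import random


def gradient(img, y, x):
    """Central-difference intensity gradient (gx, gy) at an interior pixel."""
    gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
    gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
    return gx, gy


def edge_anchors(img, threshold):
    """Interior pixels whose gradient magnitude reaches the threshold.

    Stand-in for an edge detection process; any detector could be used.
    """
    h, w = len(img), len(img[0])
    anchors = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx, gy = gradient(img, y, x)
            if math.hypot(gx, gy) >= threshold:
                anchors.append((y, x))
    return anchors


def sample_pairs_at_anchor(img, anchor, max_offset, rng):
    """Sample two points per side of the edge along the gradient direction.

    Returns the pairs (first, third), (first, second), (second, fourth),
    where first/third lie on one side and second/fourth on the other.
    """
    h, w = len(img), len(img[0])
    y, x = anchor
    gx, gy = gradient(img, y, x)
    norm = math.hypot(gx, gy)
    if norm == 0.0:
        return []  # no gradient direction at this pixel
    ux, uy = gx / norm, gy / norm

    def point_at(offset):
        # Step along the unit gradient direction and clamp to the image.
        py = min(max(int(round(y + offset * uy)), 0), h - 1)
        px = min(max(int(round(x + offset * ux)), 0), w - 1)
        return (py, px)

    near_a, far_a = sorted(rng.sample(range(1, max_offset + 1), 2))
    near_b, far_b = sorted(rng.sample(range(1, max_offset + 1), 2))
    first, third = point_at(-near_a), point_at(-far_a)
    second, fourth = point_at(near_b), point_at(far_b)
    return [(first, third), (first, second), (second, fourth)]


def sample_random_pairs(img, count, rng):
    """Uniformly random point pairs, as a complement to edge-guided pairs."""
    h, w = len(img), len(img[0])
    return [((rng.randrange(h), rng.randrange(w)),
             (rng.randrange(h), rng.randrange(w))) for _ in range(count)]
```

For example, on an image with a vertical step edge, `edge_anchors` returns pixels next to the step, and `sample_pairs_at_anchor(img, anchor, 3, random.Random(0))` returns three pairs whose middle pair straddles the edge.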

Assignees

Adobe Inc

Inventors

Classifications

  • Combinations of networks
  • Activation functions
  • Convolutional networks [CNN, ConvNet]
  • Supervised learning
  • Auto-encoder networks; Encoder-decoder networks

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021256717A1 cover?
It covers training a neural network for monocular depth prediction. Edge detection determines at least one edge of a digital image, a first point and a second point are sampled based on that edge, a relative depth between the two points is predicted, and a loss function compares the predicted relative depth with the ground truth relative depth between them.
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06T7/50. Mapped technology areas include Physics.
When was this patent published?
Published Aug 19, 2021 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).