Compression of images having overlapping fields of view using machine-learned models

US11019364B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11019364-B2
Application number: US-202016825095-A
Country: US
Kind code: B2
Filing date: Mar 20, 2020
Priority date: Mar 23, 2019
Publication date: May 25, 2021
Grant date: May 25, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A machine-learned image compression model includes a first encoder configured to generate a first image code based at least in part on first image data. The first encoder includes a first series of convolutional layers configured to generate a first series of respective feature maps based at least in part on the first image. A second encoder is configured to generate a second image code based at least in part on second image data and includes a second series of convolutional layers configured to generate a second series of respective feature maps based at least in part on the second image and disparity-warped feature data. Respective parametric skip functions associated with convolutional layers of the second series are configured to generate disparity-warped feature data based at least in part on disparity associated with the first series of respective feature maps and the second series of respective feature maps.
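The idea of "disparity-warped feature data" can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the patent describes learned (parametric) components, whereas this sketch assumes a hand-written negative squared-distance matching score, horizontal-only disparity, and a softmax-weighted warp. The names `shift_left`, `disparity_warp`, and `max_disp`, and the convention that column x of the second image matches column x + d of the first, are all assumptions made for illustration.

```python
import numpy as np


def shift_left(feat, d):
    """Return a copy with out[..., x] = feat[..., x + d], zero-padded on the right."""
    out = np.zeros_like(feat)
    if d == 0:
        out[:] = feat
    else:
        out[..., :-d] = feat[..., d:]
    return out


def disparity_warp(feat_first, feat_second, max_disp):
    """Warp first-image features toward the second image (illustrative only).

    feat_first, feat_second: arrays of shape (C, H, W).
    Returns (warped, weights) with shapes (C, H, W) and (max_disp + 1, H, W).
    """
    # Candidate alignments: one shifted copy of the first feature map per disparity.
    shifted = np.stack([shift_left(feat_first, d) for d in range(max_disp + 1)])

    # Cost volume (assumed form): negative squared feature distance per disparity.
    scores = -np.sum((shifted - feat_second[None]) ** 2, axis=1)

    # Softmax over the disparity axis turns scores into per-pixel weights.
    scores = scores - scores.max(axis=0, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=0, keepdims=True)

    # Warped features: weighted sum of the shifted copies.
    warped = np.sum(weights[:, None] * shifted, axis=0)
    return warped, weights
```

If `feat_second` is an exact shifted copy of `feat_first`, the weights concentrate on the true shift wherever the shifted columns are valid. In the model described by the abstract, the second encoder would consume such warped features alongside its own feature maps.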

First claim

Opening claim text (preview).

What is claimed is:

1. A computing system, comprising: one or more processors; and one or more non-transitory computer-readable media that collectively store a machine-learned image compression model configured to generate compressed image data in response to image data associated with at least two image sensors having at least partially overlapping fields of view, the machine-learned image compression model comprising: a first encoder configured to generate a first image code based at least in part on first image data indicative of a first image associated with a first image sensor, wherein the first encoder includes a first series of convolutional layers configured to generate a first series of respective feature maps based at least in part on the first image; a second encoder configured to generate a second image code based at least in part on second image data indicative of a second image associated with a second image sensor, wherein the second encoder includes a second series of convolutional layers configured to generate a second series of respective feature maps based at least in part on the second image and disparity-warped feature data associated with the first image; and a plurality of respective parametric skip functions associated with at least a subset of convolutional layers of the second series of convolutional layers and configured to generate the disparity-warped feature data based at least in part on disparity associated with the first series of respective feature maps and the second series of respective feature maps.

2. The computing system of claim 1, wherein the respective parametric skip function associated with each convolutional layer of the subset of convolutional layers of the second series of convolutional layers is configured to generate a respective disparity-warped feature map based at least in part on disparity between the respective feature map from a previous convolutional layer of the second series of convolutional layers and the respective feature map from a previous convolutional layer of the first series of convolutional layers.

3. The computing system of claim 2, wherein the respective parametric skip function associated with each convolutional layer of the subset of convolutional layers of the second series of convolutional layers is configured to generate the respective disparity-warped feature map based at least in part on the first image code.

4. The computing system of claim 1, wherein the machine-learned image compression model comprises: a conditional entropy model including one or more neural networks configured to model a probabilistic dependence between the first image code and the second image code.

5. The computing system of claim 4, wherein: the conditional entropy model models a probability of the second image conditioned on the image data indicative of the first image from the first image sensor.

6. The computing system of claim 4, wherein the disparity-warped feature data is first disparity-warped feature data, the plurality of respective parametric skip functions is a first plurality of respective parametric skip functions, and the machine-learned image compression model comprises: a first decoder configured to generate first reconstructed image data including a reconstruction of the first image based at least in part on the first image code, wherein the first decoder includes a third series of convolutional layers configured to generate a third series of respective feature maps based at least in part on the first image code; a second decoder configured to generate second reconstructed image data including a reconstruction of the second image based at least in part on the second image code, wherein the second decoder includes a fourth series of convolutional layers configured to generate a fourth series of respective feature maps based at least in part on the second image code and second disparity-warped feature data associated with the first image; and a second plurality of respective parametric skip functions associated with at least a subset of convolutional layers of the fourth series of convolutional layers and configured to generate the second disparity-warped feature data based at least in part on disparity associated with the third series of respective feature maps and the fourth series of respective feature maps.

7. The computing system of claim 6, wherein: the machine-learned image compression model is trained end-to-end to minimize an objective function including a first term that encodes a reconstruction quality of the first image and the second image and a second term that is associated with a bitrate predicted by the conditional entropy model.

8. The computing system of claim 1, wherein the respective parametric skip function associated with at least a subset of convolutional layers includes: a fully convolutional global context encoding component configured to encode the first image code to a feature descriptor in order to capture global context information associated with the first image.

9. The computing system of claim 8, wherein the respective parametric skip function associated with at least a subset of convolutional layers includes: a stereo cost volume component configured to estimate a cost volume based at least in part on the respective feature map from the first series of respective feature maps, the respective feature map from the second series of respective feature maps, and the global context information.

10. The computing system of claim 9, wherein the respective parametric skip function associated with at least a subset of convolutional layers includes: a feature warping component configured to warp features associated with the first image to align with the second image based at least in part on the cost volume.

11. The computing system of claim 8, wherein the respective parametric skip function associated with at least a subset of convolutional layers includes: an aggregation function configured to generate a respective predicted feature map for the second image based at least in part on the disparity-warped feature data associated with the first image and the respective feature map from a previous convolutional layer of the second series of convolutional layers.

12. The computing system of claim 1, wherein the one or more non-transitory computer-readable media collectively store instructions that, when executed by the one or more processors, cause the one or more processors to perform operations, the operations comprising: obtaining the first image data indicative of the first image and the second image data indicative of the second image; inputting the first image data indicative of the first image and the second image data indicative of the second image to the machine-learned image compression model; and receiving compressed image data indicative of the first image code and the second image code as an output of the machine-learned image compression model in response to the first image data indicative of the first image and the second image data indicative of the second image.

13. A computer-implemented method of digital image compression, the method comprising: obtaining, by a computing system comprising one or more computing devices, first image data indicative of a first image associated with a first image sensor and second image data indicative of a second image associated with a second image sensor; encoding, by the computing system using a first series of convolutional layers of a machine-learned image compression model, the first image data indicative of the first image into a first series of respective feature maps and a first image
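Claim 7 describes a training objective with a reconstruction-quality term and a bitrate term. A minimal sketch of an objective with that general shape follows. The claim text does not give a formula, so everything specific here is an assumption: mean squared error as the quality measure, code length estimated as the negative log2 of the symbol probabilities, and a hypothetical trade-off weight `beta`. In the claimed system the probabilities would come from the conditional entropy model; here they are plain input arrays.

```python
import numpy as np


def rate_bits(probabilities):
    """Estimated code length in bits: -sum(log2 p) over the code symbols."""
    p = np.clip(np.asarray(probabilities, dtype=float), 1e-12, 1.0)
    return float(-np.sum(np.log2(p)))


def rd_objective(x1, x1_hat, x2, x2_hat, p_code1, p_code2_given_1, beta=0.01):
    """Rate-distortion style objective (illustrative assumption, not the patent's formula).

    x1, x2: original images; x1_hat, x2_hat: reconstructions.
    p_code1: probabilities assigned to the symbols of the first image code.
    p_code2_given_1: probabilities of the second code's symbols, conditioned
        on the first code.
    beta: hypothetical weight trading bitrate against reconstruction error.
    Returns (loss, distortion, rate).
    """
    # First term: reconstruction quality of both images (MSE assumed).
    distortion = float(np.mean((x1 - x1_hat) ** 2) + np.mean((x2 - x2_hat) ** 2))

    # Second term: predicted bitrate of the first code plus the conditional
    # bitrate of the second code.
    rate = rate_bits(p_code1) + rate_bits(p_code2_given_1)

    return distortion + beta * rate, distortion, rate
```

The better the conditional model predicts the second code from the first, the higher the probabilities it assigns and the fewer bits the second term charges, which is the reason for modeling the dependence between the two codes.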

Assignees

Inventors

Classifications

  • Combinations of networks · CPC title

  • Auto-encoder networks; Encoder-decoder networks · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Supervised learning · CPC title

  • Entropy coding, e.g. variable length coding [VLC] or arithmetic coding · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11019364B2 cover?
A machine-learned image compression model includes a first encoder configured to generate a first image code based at least in part on first image data. The first encoder includes a first series of convolutional layers configured to generate a first series of respective feature maps based at least in part on the first image. A second encoder is configured to generate a second image code based a…
Who is the assignee on this patent?
UATC, LLC
What technology area does this patent fall under?
Primary CPC classification H04N19/597. Mapped technology areas include Electricity.
When was this patent published?
Publication date May 25, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).