Spatial transformer modules

US10032089B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10032089-B2
Application number: US-201615174133-A
Country: US
Kind code: B2
Filing date: Jun 6, 2016
Priority date: Jun 5, 2015
Publication date: Jul 24, 2018
Grant date: Jul 24, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing inputs using an image processing neural network system that includes a spatial transformer module. One of the methods includes receiving an input feature map derived from the one or more input images, and applying a spatial transformation to the input feature map to generate a transformed feature map, comprising: processing the input feature map to generate spatial transformation parameters for the spatial transformation, and sampling from the input feature map in accordance with the spatial transformation parameters to generate the transformed feature map.
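
The two steps named in the abstract (generate transformation parameters from the input feature map, then sample the input according to them) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation described in the patent: the affine parameterisation, the bilinear kernel, the normalised [-1, 1] coordinate convention, and all function names are hypothetical choices made here, and centroid_localisation is a hand-written stand-in for what would normally be a learned subnetwork.

```python
import math
from typing import Callable, List, Tuple

FeatureMap = List[List[float]]           # H x W, single channel for brevity
Grid = List[List[Tuple[float, float]]]   # per output location: (x, y) in [-1, 1]


def _norm(index: int, size: int) -> float:
    """Map a pixel index to a normalised coordinate in [-1, 1]."""
    return 0.0 if size == 1 else -1.0 + 2.0 * index / (size - 1)


def centroid_localisation(fmap: FeatureMap) -> List[float]:
    """Hypothetical stand-in for a localisation subnetwork (assumption).

    Returns affine parameters [a, b, tx, c, d, ty] that describe a half-size
    window centred on the activation-weighted centroid of the input.
    """
    h, w = len(fmap), len(fmap[0])
    total = sum(sum(row) for row in fmap)
    if total == 0:
        return [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]  # fall back to the identity
    cx = sum(v * _norm(j, w) for row in fmap for j, v in enumerate(row)) / total
    cy = sum(v * _norm(i, h) for i, row in enumerate(fmap) for v in row) / total
    return [0.5, 0.0, cx, 0.0, 0.5, cy]


def affine_grid(theta: List[float], out_h: int, out_w: int) -> Grid:
    """For each output location, compute where to read from in the input."""
    a, b, tx, c, d, ty = theta
    grid = []
    for i in range(out_h):
        yt = _norm(i, out_h)
        row = []
        for j in range(out_w):
            xt = _norm(j, out_w)
            row.append((a * xt + b * yt + tx, c * xt + d * yt + ty))
        grid.append(row)
    return grid


def bilinear_sample(fmap: FeatureMap, grid: Grid) -> FeatureMap:
    """Read the input at each grid point with bilinear interpolation."""
    h, w = len(fmap), len(fmap[0])

    def at(i: int, j: int) -> float:
        # Locations outside the input contribute zero.
        return fmap[i][j] if 0 <= i < h and 0 <= j < w else 0.0

    out = []
    for row in grid:
        out_row = []
        for xs, ys in row:
            px = (xs + 1.0) * (w - 1) / 2.0   # back to pixel coordinates
            py = (ys + 1.0) * (h - 1) / 2.0
            x0, y0 = math.floor(px), math.floor(py)
            fx, fy = px - x0, py - y0
            out_row.append(
                at(y0, x0) * (1 - fx) * (1 - fy)
                + at(y0, x0 + 1) * fx * (1 - fy)
                + at(y0 + 1, x0) * (1 - fx) * fy
                + at(y0 + 1, x0 + 1) * fx * fy
            )
        out.append(out_row)
    return out


def spatial_transformer(
    fmap: FeatureMap,
    localisation: Callable[[FeatureMap], List[float]],
    out_h: int,
    out_w: int,
) -> FeatureMap:
    """Parameters from the input, then a grid, then sampling."""
    theta = localisation(fmap)
    return bilinear_sample(fmap, affine_grid(theta, out_h, out_w))
```

Because the grid is built for an arbitrary out_h by out_w, the same sketch covers an output with the same dimensions as the input or with different ones. In a trained system the localisation function would be a neural network whose weights are learned together with the rest of the model.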

First claim

Opening claim text (preview).

What is claimed is:

1. An image processing neural network system implemented by one or more computers, wherein the image processing neural network system is configured to receive one or more input images and to process the one or more input images to generate a neural network output from the one or more input images, the image processing neural network system comprising: a spatial transformer module, wherein the spatial transformer module is configured to perform operations comprising: receiving an input feature map derived from the one or more input images, and applying a spatial transformation to the input feature map to generate a transformed feature map, comprising: processing the input feature map to generate, based on the input feature map, spatial transformation parameters that define the spatial transformation to be applied to the input feature map, and sampling from the input feature map in accordance with the spatial transformation parameters generated based on the input feature map to generate the transformed feature map.

2. The image processing neural network system of claim 1, the operations further comprising: providing the transformed feature map as input to another component of the image processing neural network system.

3. The image processing neural network system of claim 1, wherein the input feature map is an output generated by another component of the image processing neural network system.

4. The image processing neural network system of claim 3, wherein the other component of the image processing neural network system is another spatial transformer module.

5. The image processing neural network system of claim 4, wherein the other spatial transformer module performs a different type of spatial transformation than the spatial transformer module.

6. The image processing neural network system of claim 3, wherein the other component of the image processing neural network system is a neural network layer.

7. The image processing neural network system of claim 1, wherein the input feature map is one of the one or more input images.

8. The image processing neural network system of claim 1, wherein the spatial transformer module comprises: a localisation subnetwork comprising one or more neural network layers, wherein the localisation subnetwork is configured to process the input feature map to generate the spatial transformation parameters in accordance with current values of a set of parameters of the localisation subnetwork, and wherein processing the input feature map to generate the spatial transformation parameters comprises processing the input feature map using the localisation subnetwork.

9. The image processing neural network system of claim 1, wherein sampling from the input feature map in accordance with the spatial transformation parameters to generate the transformed feature map comprises: generating, using the transformation parameters, a sampling grid that defines, for each of a plurality of locations in the transformed feature map, how a value of the location should be derived from values in the input feature map; and sampling from the input feature map in accordance with the sampling grid to generate the transformed feature map.

10. The image processing neural network system of claim 1, wherein the sampling mechanism is differentiable.

11. The image processing neural network system of claim 10, wherein the spatial transformer module has been trained using backpropagation during training of the image processing neural network system.

12. The image processing neural network system of claim 1, wherein the transformed feature map has the same dimensions as the input feature map.

13. The image processing neural network system of claim 1, wherein the transformed feature map has different dimensions from the input feature map.

14. One or more non-transitory computer storage media encoded with instructions that, when executed by one or more computers, cause the one or more computers to implement an image processing neural network that is configured to receive one or more input images and to process the one or more input images to generate a neural network output from the one or more input images, the image processing neural network comprising: a spatial transformer module, wherein the spatial transformer module is configured to perform operations comprising: receiving an input feature map derived from the one or more input images, and applying a spatial transformation to the input feature map to generate a transformed feature map, comprising: processing the input feature map to generate, based on the input feature map, spatial transformation parameters that define the spatial transformation to be applied to the input feature map, and sampling from the input feature map in accordance with the spatial transformation parameters generated based on the input feature map to generate the transformed feature map.

15. The computer storage media of claim 14, the operations further comprising: providing the transformed feature map as input to another component of the image processing neural network system.

16. The computer storage media of claim 14, wherein the input feature map is an output generated by another component of the image processing neural network system.

17. The computer storage media of claim 14, wherein the input feature map is one of the one or more input images.

18. The computer storage media of claim 14, wherein the spatial transformer module comprises: a localisation subnetwork comprising one or more neural network layers, wherein the localisation subnetwork is configured to process the input feature map to generate the spatial transformation parameters in accordance with current values of a set of parameters of the localisation subnetwork, and wherein processing the input feature map to generate the spatial transformation parameters comprises processing the input feature map using the localisation subnetwork.

19. The computer storage media of claim 14, wherein sampling from the input feature map in accordance with the spatial transformation parameters to generate the transformed feature map comprises: generating, using the transformation parameters, a sampling grid that defines, for each of a plurality of locations in the transformed feature map, how a value of the location should be derived from values in the input feature map; and sampling from the input feature map in accordance with the sampling grid to generate the transformed feature map.

20. A method comprising: training an image processing neural network on training images, wherein the image processing neural network is configured to receive one or more input images and to process the one or more input images to generate a neural network output from the one or more input images, wherein the image processing neural network comprises a spatial transformer module, wherein the spatial transformer module is configured to perform operations comprising: receiving an input feature map derived from the one or more input images, and applying a spatial transformation to the input feature map to generate a transformed feature map, comprising: processing the input feature map to generate, based on the input feature map, spatial transformation parameters that define the spatial transformation to be applied to the input feature map, and sampling from the input feature map in accordance with the spatial transformation parameters generated based on the input feature map to generate the t
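
Claims 10 and 11 refer to a differentiable sampling mechanism and to training with backpropagation. The claims do not name a sampling kernel, so the following sketch assumes bilinear interpolation purely for illustration. It computes one sampled value together with its partial derivatives with respect to the sampling coordinates, and includes a finite-difference helper for checking them. The function names and the pixel-coordinate convention are hypothetical.

```python
import math
from typing import List, Tuple

FeatureMap = List[List[float]]  # H x W, single channel


def bilinear_value_and_grad(
    fmap: FeatureMap, px: float, py: float
) -> Tuple[float, float, float]:
    """Return (value, d value / d px, d value / d py) at pixel coords (px, py).

    Assumes a bilinear kernel with zero padding. At integer coordinates the
    function is only piecewise differentiable, so the result there is a
    one-sided derivative.
    """
    h, w = len(fmap), len(fmap[0])

    def at(i: int, j: int) -> float:
        return fmap[i][j] if 0 <= i < h and 0 <= j < w else 0.0

    x0, y0 = math.floor(px), math.floor(py)
    fx, fy = px - x0, py - y0
    v00, v01 = at(y0, x0), at(y0, x0 + 1)
    v10, v11 = at(y0 + 1, x0), at(y0 + 1, x0 + 1)

    value = (
        v00 * (1 - fx) * (1 - fy)
        + v01 * fx * (1 - fy)
        + v10 * (1 - fx) * fy
        + v11 * fx * fy
    )
    # Differentiate the expression above with respect to fx and fy.
    d_px = (v01 - v00) * (1 - fy) + (v11 - v10) * fy
    d_py = (v10 - v00) * (1 - fx) + (v11 - v01) * fx
    return value, d_px, d_py


def finite_difference(
    fmap: FeatureMap, px: float, py: float, eps: float = 1e-6
) -> Tuple[float, float]:
    """Central-difference estimate of the same two partial derivatives."""
    vxp = bilinear_value_and_grad(fmap, px + eps, py)[0]
    vxm = bilinear_value_and_grad(fmap, px - eps, py)[0]
    vyp = bilinear_value_and_grad(fmap, px, py + eps)[0]
    vym = bilinear_value_and_grad(fmap, px, py - eps)[0]
    return (vxp - vxm) / (2 * eps), (vyp - vym) / (2 * eps)
```

Since the sampling coordinates are themselves a function of the transformation parameters, these derivatives are what would let a gradient flow back from the transformed feature map into whatever produced the parameters. The derivative with respect to each input value is simply its interpolation weight.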

Assignees

Deepmind Tech Ltd

Inventors

Classifications

  • G06N3/084 · Primary

    Backpropagation, e.g. using gradient descent · CPC title

  • G06N3/045 · Primary

    Combinations of networks · CPC title

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title

  • Non-supervised learning, e.g. competitive learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10032089B2 cover?
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing inputs using an image processing neural network system that includes a spatial transformer module. One of the methods includes receiving an input feature map derived from the one or more input images, and applying a spatial transformation to the input feature map to generate a transfor…
Who is the assignee on this patent?
Deepmind Tech Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/084. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 24, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).