Apparatus and methods for training robots utilizing gaze-based saliency maps
US-2015339589-A1 · Nov 26, 2015 · US
US9830529B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9830529-B2 |
| Application number | US-201615138821-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 26, 2016 |
| Priority date | Apr 26, 2016 |
| Publication date | Nov 28, 2017 |
| Grant date | Nov 28, 2017 |
Official abstract text for this publication.
A method for generating a system for predicting saliency in an image and method of use of the prediction system are described. Attention maps for each of a set of training images are used to train the system. The training includes passing the training images through a neural network and optimizing an objective function over the training set which is based on a distance measure computed between a first probability distribution computed for a saliency map output by the neural network and a second probability distribution computed for the attention map for the respective training image. The trained neural network is suited to generation of saliency maps for new images.
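The objective described in the abstract can be illustrated with a minimal sketch. It assumes softmax normalization and the Bhattacharyya distance, both of which appear only as options in the claims, and it treats each map as a flat list of per-pixel scores. The function names (`softmax`, `bhattacharyya_distance`, `saliency_loss`) are hypothetical and are not taken from the patent; no neural network or gradient update is shown.

```python
import math


def softmax(values):
    """Turn a flat list of per-pixel scores into a probability distribution."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def bhattacharyya_distance(p, q, eps=1e-12):
    """Return -ln(sum_i sqrt(p_i * q_i)) for two distributions of equal length."""
    coefficient = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    # Clamp to (0, 1] so rounding error cannot produce a negative distance.
    coefficient = min(max(coefficient, eps), 1.0)
    return -math.log(coefficient)


def saliency_loss(saliency_scores, attention_scores):
    """Distance between the predicted and ground-truth pixel distributions.

    Assumption: both maps are normalized with softmax before comparison.
    """
    predicted = softmax(saliency_scores)
    target = softmax(attention_scores)
    return bhattacharyya_distance(predicted, target)


if __name__ == "__main__":
    # A prediction that peaks on the same pixel as the attention map scores
    # lower (better) than one that peaks on a different pixel.
    print(saliency_loss([5.0, 0.0, 0.0], [4.0, 0.0, 0.0]))
    print(saliency_loss([5.0, 0.0, 0.0], [0.0, 0.0, 5.0]))
```

In a training loop, this value would be averaged over the training set and minimized with respect to the network parameters; that step is outside the scope of this sketch.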
Opening claim text (preview).
What is claimed is:
1. A method for generating a system for predicting saliency in an image, comprising: for each of a set of training images: generating an attention map; and representing the attention map as a first probability distribution which includes, for each of a set of pixels, a respective value corresponding to a probability of the pixel being fixated upon; and with a processor, training a neural network to output a saliency map for an input image, the training including updating parameters of the neural network to optimize an objective function over the training set, the objective function being based on a distance measure computed between a second probability distribution computed for a saliency map output by the neural network, given an input training image, and the first probability distribution computed for the attention map of the respective training image, the second probability distribution including, for each of the set of pixels, a respective probability.
2. The method of claim 1, wherein the generation of the attention map comprises: acquiring eye gaze data for a set of observers observing the training image; generating a binary fixation map based on the eye gaze data; and smoothing the binary map.
3. The method of claim 1, wherein the probability distributions are computed with a softmax function.
4. The method of claim 1, wherein the distance measure is selected from the χ² distance, the total-variation distance, the cosine distance, and the Bhattacharyya distance.
5. The method of claim 4, wherein the distance measure is the Bhattacharyya distance.
6. The method of claim 1, wherein the training of the neural network includes updating weights of convolutional layers of the neural network as a function of a derivative of the distance measure.
7. The method of claim 1 wherein the neural network is a fully-convolutional neural network.
8. The method of claim 1, wherein the training includes receiving layers of a pretrained neural network, adding additional layers, and updating weights of at least the additional layers.
9. The method of claim 1, wherein the neural network includes at least five layers, each layer outputting a set of activation maps.
10. The method of claim 1, wherein at least some of the eye gaze data is acquired from mouse clicks on the training images.
11. The method of claim 1, further comprising predicting a saliency map for a new image with the trained neural network.
12. The method of claim 11, further comprising outputting information based on the predicted saliency map.
13. A system comprising memory which stores instructions for performing the method of claim 1 and a processor in communication with the memory which executes the instructions.
14. A computer program product comprising a non-transitory storage medium storing instructions, which when executed by a computer, perform the method of claim 1.
15. A training system for generating a prediction system for predicting saliency in an image, comprising: memory which stores an attention map for each of a set of training images, the attention map having been generated based on eye gaze data, the attention map being modeled as a first probability distribution which includes, for each of a set of pixels, a respective value corresponding to a probability of the pixel being fixated upon; a training component which trains a neural network with the training images and the attention maps by optimizing an objective function over the training set, the objective function being based on a distance measure computed between a second probability distribution computed for a saliency map output by the neural network, when input with a training image, and the first probability distribution for the respective training image; and a hardware processor which implements the training component.
16. The training system of claim 15, further comprising an attention map generator which generates the attention map for each of the set of training images.
17. A method for predicting saliency in an image, comprising: providing a trained neural network trained by a method comprising: generating an attention map for each of a set of training images, modeling each attention map as a first probability distribution which includes, for each of a set of pixels, a respective value corresponding to a probability of the pixel being fixated upon, and training a neural network to output a saliency map for an input image, the training comprising updating parameters of the neural network to optimize an objective function over the training set, which is based on a distance measure computed between the first probability distribution for the attention map and a second probability distribution computed for a saliency map output by the neural network for the respective training image, the second probability distribution including, for each of the set of pixels, a respective probability; receiving an image; and passing the image through the neural network to generate a saliency map for the image.
18. The method of claim 17, further comprising extracting information from a salient region of the image based on the saliency map.
19. The method of claim 17, further comprising outputting the saliency map or information based thereon.
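The attention-map generation recited in the claims (a binary fixation map built from eye gaze data, then smoothed) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: fixations are given as hypothetical `(row, col)` pixel coordinates, the smoothing is a separable Gaussian blur with zero padding, and the `sigma` value and all function names are invented for the example. The claims do not specify the smoothing kernel.

```python
import math


def binary_fixation_map(height, width, fixations):
    """Mark each fixated (row, col) pixel with 1.0; all other pixels stay 0.0."""
    grid = [[0.0] * width for _ in range(height)]
    for row, col in fixations:
        if 0 <= row < height and 0 <= col < width:
            grid[row][col] = 1.0
    return grid


def gaussian_kernel_1d(sigma, radius):
    """Normalized 1-D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(grid, sigma=1.0):
    """Separable Gaussian blur with zero padding (assumed smoothing choice)."""
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel_1d(sigma, radius)
    height, width = len(grid), len(grid[0])

    # Horizontal pass: blur each row.
    tmp = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                cc = c + k - radius
                if 0 <= cc < width:
                    acc += w * grid[r][cc]
            tmp[r][c] = acc

    # Vertical pass: blur each column of the intermediate result.
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                rr = r + k - radius
                if 0 <= rr < height:
                    acc += w * tmp[rr][c]
            out[r][c] = acc
    return out


def attention_map(height, width, fixations, sigma=1.0):
    """Binary fixation map followed by smoothing."""
    return smooth(binary_fixation_map(height, width, fixations), sigma)


if __name__ == "__main__":
    for row in attention_map(5, 5, [(2, 2)], sigma=1.0):
        print(" ".join(f"{v:.3f}" for v in row))
```

The resulting map would still need to be converted to a probability distribution over pixels (for example with softmax) before it is compared with the network output.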
using neural networks · CPC title
Salient features, e.g. scale invariant feature transforms [SIFT] · CPC title
Backpropagation, e.g. using gradient descent · CPC title
Combinations of networks · CPC title
Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title