Generating an enriched knowledge base from annotated images

US10002311B1 · US · B1

Patent metadata
Publication number: US-10002311-B1
Application number: US-201715429735-A
Country: US
Kind code: B1
Filing date: Feb 10, 2017
Priority date: Feb 10, 2017
Publication date: Jun 19, 2018
Grant date: Jun 19, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A knowledge base is generated based on eye tracking, audio monitoring and image annotations, for determining image features from given images and sequences of image features to focus on in analyzing an image. An eye tracker monitors eye movements of a user analyzing an image and generates a sequence of eye movements. A user interface receives annotations on the image. Audio data received via a microphone is translated into text and keywords are extracted. The sequence of eye movements, the annotations and the keywords are correlated according to their time of occurrence. Image features are extracted from the image and mapped with the sequence of eye movements, the annotations and the keywords that are correlated. A recurrent neural network model is generated based on the mapped image features and predicts a likelihood of an expert image analyzer focusing on a feature in a given new image.
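The correlation step described in the abstract (aligning eye movements, annotations and keywords by their time of occurrence) can be illustrated with a minimal Python sketch. The abstract does not state the matching rule, so nearest-in-time matching with a tolerance window is an assumption made here, and the names `Fixation`, `correlate` and `window` are hypothetical.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class Fixation:
    """One eye-tracker sample: a timestamp (seconds) and a gaze position (pixels)."""
    t: float
    x: float
    y: float
    annotations: list = field(default_factory=list)
    keywords: list = field(default_factory=list)


def correlate(fixations, annotations, keywords, window=1.0):
    """Attach each timed annotation or keyword to the fixation nearest in time.

    Assumption (not from the patent): an event is matched only if it lies
    within `window` seconds of a fixation; otherwise it is left unmatched.
    `fixations` must be sorted by time; events are (timestamp, label) pairs.
    """
    if not fixations:
        return fixations
    times = [f.t for f in fixations]

    def nearest(t):
        # Compare the fixations on either side of the insertion point.
        i = bisect_left(times, t)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        return min(candidates, key=lambda j: abs(times[j] - t))

    for events, attr in ((annotations, "annotations"), (keywords, "keywords")):
        for t, label in events:
            j = nearest(t)
            if abs(times[j] - t) <= window:
                getattr(fixations[j], attr).append(label)
    return fixations
```

Each fixation then carries the annotations and spoken keywords that occurred around the same moment, which is the kind of record the abstract says is later mapped to extracted image features.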

First claim

Opening claim text (preview).

We claim:

1. A system of generating a knowledge base from annotated images, comprising: a hardware processor executing a user interface, the hardware processor retrieving an image from a database of images and presenting the image on the user interface displayed on a display device; an eye tracker comprising at least a camera and coupled to the hardware processor, the eye tracker monitoring eye movements of a user analyzing the image and generating a sequence of eye movements; the user interface receiving annotations on the image input by the user; a microphone coupled to the hardware processor; the hardware processor receiving via the microphone audio data associated with the image spoken by the user, the hardware processor translating the audio data into text, the hardware processor extracting keywords from the text; the hardware processor correlating the sequence of eye movements, the annotations and the keywords according to their time of occurrence; the hardware processor extracting image features from the image and mapping the image features with the sequence of eye movements, the annotations and the keywords that are correlated; the hardware processor generating a recurrent neural network model that predicts a likelihood of an expert image analyzer focusing on a feature in a given new image, the hardware processor generating the recurrent neural network model based on mappings of the image features with the sequence of eye movements, the annotations and the keywords that are correlated; and a knowledgebase storing the recurrent neural network model and the mappings of the image features with the sequence of eye movements, the annotations and the keywords that are correlated.

2. The system of claim 1, wherein the image features are labeled with disease labels, wherein based on the disease labels, the recurrent neural network model autonomously identifies a region in the given new image associated with a probable disease.

3. The system of claim 2, wherein the hardware processor generates a location coordinate system definition associated with the image for mapping the image features in the image with the sequence of eye movements, the location coordinate system defined by identifying an optical optic disk center and a fovea location of a fundus in the image, and defining an x-axis in a direction from the optic disk center to the fovea, and a y-axis perpendicular to the x-axis, the origin of the x-axis defined by the optical optic disk center, and a unit of the coordinate system is defined as a distance between the optic disc center and the fovea.

4. The system of claim 1, wherein the recurrent neural network model is generated based on monitoring multiples users analyzing multiples of images.

5. The system of claim 1, wherein the recurrent neural network model, given the new image, predicts a sequence of image features in the new image the expert image analyzer would follow in analyzing the new image.

6. The system of claim 1, wherein the hardware processor extracting image features comprises the hardware processor training a convolutional neural network model to learn convolutional filters that recognize the image features, and executing the convolutional neural network that is trained to extract the image features.

7. The system of claim 6, wherein the convolutional neural network is trained to learn convolutional filters that distinguish identifiable regions and ambiguous regions in the given new image.

8. The system of claim 1, wherein the hardware processor further determines time spent by the user on the image features that are mapped, and based on the time spent, predicts whether an image feature is ambiguous or clearly identifiable.

9. A method of generating a knowledge base from annotated images, the method performed by at least one hardware processor, the method comprising: retrieving an image from a database of images and presenting the image on a user interface displayed on a display device; transmitting a signal to an eye tracker comprising at least a camera coupled to the hardware processor, the signal representing a notification to the eye tracker to monitor eye movements of a user analyzing the image and generating a sequence of eye movements based on the eye tracker monitor the eye movements; receiving via the user interface, annotations on the image input by the user; receiving via a microphone coupled to the hardware processor, audio data associated with the image spoken by the user, and translating the audio data into text, and extracting keywords from the text; correlating the sequence of eye movements, the annotations and the keywords according to their time of occurrence; extracting image features from the image and mapping the image features with the sequence of eye movements, the annotations and the keywords that are correlated; generating a recurrent neural network model that predicts a likelihood of an expert image analyzer focusing on a feature in a given new image, the generating the recurrent neural network model based on mappings of the image features with the sequence of eye movements, the annotations and the keywords that are correlated; and storing in a knowledgebase, the recurrent neural network model and the mappings of the image features with the sequence of eye movements, the annotations and the keywords that are correlated.

10. The method of claim 9, further comprising: receiving the given new image; and predicting by the recurrent neural network model executing on the hardware processor, a sequence of image features in the new image the expert image analyzer would follow in analyzing the new image.

11. The method of claim 9, wherein the image features are labeled with disease labels, wherein based on the disease labels, the recurrent neural network model autonomously identifies a region in the given new image associated with a probable disease.

12. The method of claim 11, further comprising generating a location coordinate system definition associated with the image for mapping the image features in the image with the sequence of eye movements, the location coordinate system defined by identifying an optical optic disk center and a fovea location of a fundus in the image, and defining an x-axis in a direction from the optic disk center to the fovea, and a y-axis perpendicular to the x-axis, the origin of the x-axis defined by the optical optic disk center, and a unit of the coordinate system is defined as a distance between the optic disc center and the fovea.

13. The method of claim 9, wherein the recurrent neural network model is generated based on monitoring multiples users analyzing multiples of images.

14. The method of claim 9, wherein the extracting of the image features comprises the training a convolutional neural network model to learn convolutional filters that recognize the image features, and executing the convolutional neural network that is trained to extract the image features.

15. The method of claim 14, wherein the convolutional neural network is trained to learn convolutional filters that distinguish identifiable regions and ambiguous regions in the given new image.

16. The method of claim 9, further comprising determining time spent by the user on the image features that are mapped, and based on the time spent, predicts whether an image feature is ambiguous or clearly identifiable.

17. A computer readable storage device storing a program of instructions executable by a machine to perform a method of generating a knowledge base from annotated images, the method comprising: retrieving an
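Claims 3 and 12 define a location coordinate system for fundus images: the origin at the optic disc center, the x-axis pointing from the optic disc center toward the fovea, a perpendicular y-axis, and one unit equal to the disc-to-fovea distance. The following is a minimal Python sketch of that change of coordinates. The function name `to_fundus_coords`, the `(x, y)` pixel-tuple inputs, and the choice of which perpendicular direction counts as positive y are assumptions for illustration; the claims only require that the y-axis be perpendicular to the x-axis.

```python
import math


def to_fundus_coords(point, disc_center, fovea):
    """Map a pixel location into the disc/fovea coordinate system.

    Origin: optic disc center. x-axis: from disc center toward the fovea.
    Unit length: disc-to-fovea distance. The positive y direction is the
    x-axis rotated by +90 degrees in pixel space (an assumed convention).
    """
    dx, dy = fovea[0] - disc_center[0], fovea[1] - disc_center[1]
    unit = math.hypot(dx, dy)
    if unit == 0:
        raise ValueError("optic disc center and fovea must be distinct points")
    px, py = point[0] - disc_center[0], point[1] - disc_center[1]
    # Project onto the x-axis and its perpendicular, then divide by the unit
    # length; both steps together give a division by unit squared.
    x = (px * dx + py * dy) / unit ** 2
    y = (-px * dy + py * dx) / unit ** 2
    return x, y
```

Under this definition the fovea always maps to (1, 0) and the optic disc center to (0, 0), so gaze positions and image features from differently sized or rotated fundus images land in a shared frame.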

Assignees

IBM

Inventors

Classifications

  • based on feedback from supervisors · CPC title

  • using classification, e.g. of video objects · CPC title

  • Combinations of networks · CPC title

  • based on feedback of a supervisor · CPC title

  • Distances to prototypes · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10002311B1 cover?
A knowledge base is generated based on eye tracking, audio monitoring and image annotations, for determining image features from given images and sequences of image features to focus on in analyzing an image. An eye tracker monitors eye movements of a user analyzing an image and generates a sequence of eye movements. A user interface receives annotations on the image. Audio data received via a …
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06V10/82. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 19, 2018 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).