Viewfinder assistant for visually impaired

US11417079B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11417079-B2
Application number  US-202016928455-A
Country             US
Kind code           B2
Filing date         Jul 14, 2020
Priority date       Jul 14, 2020
Publication date    Aug 16, 2022
Grant date          Aug 16, 2022

Abstract

Official abstract text for this publication.

In an approach for guiding a visually impaired user to position a mobile device appropriately in relation to a screen so that dynamic information on the screen can be reliably extracted and conveyed to the visually impaired user, a processor receives an image captured by a camera of a mobile device. A processor performs object recognition on the image to identify a digital screen and a location of the digital screen in the image. A processor retrieves a template of the digital screen. A processor performs angle-sensitive optical character recognition (OCR) on the location of the digital screen in the image. Responsive to a processor determining text on the digital screen can be extracted, a processor conveys the text to a user. Responsive to a processor determining text on the digital screen cannot be extracted, a processor guides the user to re-orient the mobile device to capture a better image.
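
The abstract says the user is guided to re-orient the device when text cannot be extracted, and the claims tie that guidance to an angle measured relative to a horizontal axis. The following is a minimal Python sketch of that idea, not the patented implementation. The function names, the 5-degree tolerance, the use of the region's top edge, and the mapping from the sign of the angle to a rotation direction are all assumptions made for illustration.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (x, y) in image pixels, y grows downward


def region_angle_degrees(top_left: Point, top_right: Point) -> float:
    """Angle of a text region's top edge relative to the horizontal axis.

    Hypothetical helper: a positive result means the text slopes downward
    to the right in the image (image coordinates, y pointing down).
    """
    dx = top_right[0] - top_left[0]
    dy = top_right[1] - top_left[1]
    return math.degrees(math.atan2(dy, dx))


def rotation_hint(angle_deg: float, tolerance_deg: float = 5.0) -> str:
    """Turn a measured skew angle into a spoken-style instruction.

    Assumption: text that appears tilted clockwise in the image is corrected
    by rotating the device clockwise; the source does not state this mapping.
    """
    if abs(angle_deg) <= tolerance_deg:
        return "Hold the device steady."
    direction = "clockwise" if angle_deg > 0 else "counterclockwise"
    return f"Rotate the device about {abs(angle_deg):.0f} degrees {direction}."


if __name__ == "__main__":
    angle = region_angle_degrees((100.0, 200.0), (300.0, 240.0))
    print(rotation_hint(angle))
```

In a real pipeline the corner points would come from the text detector, and the returned string would be passed to a text-to-speech engine rather than printed.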

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: receiving, by one or more processors, an image captured by a camera of a user mobile device; performing, by the one or more processors, object recognition on the image to identify a digital screen and a location of the digital screen in the image, wherein the digital screen is identified to be of a known type, brand, or model of digital screen; retrieving, by the one or more processors, a template of the digital screen based on the known type, brand, or model of digital screen; performing, by the one or more processors, angle-sensitive optical character recognition (OCR) on the location of the digital screen in the image to detect rectangular regions in the image that contain text and calculate an angle of the rectangular regions relative to a horizontal axis; determining, by the one or more processors, whether, within the image, the text on the digital screen can be extracted based on whether the rectangular regions overlap within a pre-defined threshold with expected text locations based on the template; and responsive to determining the text cannot be extracted, guiding, by the one or more processors, a user of the user mobile device to re-orient at least one of a position and a rotation of the user mobile device based on the angle calculated using the angle-sensitive OCR to capture another image.

2. The computer-implemented method of claim 1, further comprising: responsive to determining the text can be extracted, audibly conveying, by the one or more processors, the text to a user of the user mobile device using text-to-speech.

3. The computer-implemented method of claim 1, wherein performing angle-sensitive OCR on the location of the digital screen in the image comprises: detecting, by the one or more processors, rectangular regions in the image that contain text; calculating, by the one or more processors, angles of the rectangular regions relative to a horizontal axis; and converting, by the one or more processors, text detected in these rectangular regions from image data to text data.

4. The computer-implemented method of claim 3, further comprising: wherein the template indicates a set of locations of where text would be located on the digital screen; and comparing, by the one or more processors, locations of the rectangular regions in the image to the set of locations of where text would be located on the digital screen based on the template.

5. The computer-implemented method of claim 4, wherein determining whether the text on the digital screen can be extracted is based on comparing the locations of the rectangular regions in the image to the set of locations of where text would be located on the digital screen based on the template.

6. A computer program product comprising: one or more computer readable storage media and program instructions stored on the one or more computer readable storage media, the program instructions comprising: program instructions to receive an image captured by a camera of a user mobile device; program instructions to perform object recognition on the image to identify a digital screen and a location of the digital screen in the image, wherein the digital screen is identified to be of a known type, brand, or model of digital screen; program instructions to retrieve a template of the digital screen based on the known type, brand, or model of digital screen; program instructions to perform angle-sensitive optical character recognition (OCR) on the location of the digital screen in the image to detect rectangular regions in the image that contain text and calculate an angle of the rectangular regions relative to a horizontal axis; program instructions to determine whether, within the image, text on the digital screen can be extracted based on whether the rectangular regions overlap within a pre-defined threshold with expected text locations based on the template; and responsive to determining the text cannot be extracted, program instructions to guide a user of the user mobile device to re-orient at least one of a position and a rotation of the user mobile device based on the angle calculated using the angle-sensitive OCR to capture another image.

7. The computer program product of claim 6, further comprising: responsive to determining the text can be extracted, program instructions to audibly convey the text to a user of the user mobile device using text-to-speech.

8. The computer program product of claim 6, wherein the program instructions to perform angle-sensitive OCR on the location of the digital screen in the image comprise: program instructions to detect rectangular regions in the image that contain text; program instructions to calculate angles of the rectangular regions relative to a horizontal axis; and program instructions to convert text detected in these rectangular regions from image data to text data.

9. The computer program product of claim 8, further comprising: wherein the template indicates a set of locations of where text would be located on the digital screen; and program instructions to compare locations of the rectangular regions in the image to the set of locations of where text would be located on the digital screen based on the template.

10. The computer program product of claim 9, wherein the program instructions to determine whether the text on the digital screen can be extracted is based on comparing the locations of the rectangular regions in the image to the set of locations of where text would be located on the digital screen based on the template.

11. A computer system comprising: one or more computer processors; one or more computer readable storage media; program instructions stored on the computer readable storage media for execution by at least one of the one or more processors, the program instructions comprising: program instructions to receive an image captured by a camera of a user mobile device; program instructions to perform object recognition on the image to identify a digital screen and a location of the digital screen in the image, wherein the digital screen is identified to be of a known type, brand, or model of digital screen; program instructions to retrieve a template of the digital screen based on the known type, brand, or model of digital screen; program instructions to perform angle-sensitive optical character recognition (OCR) on the location of the digital screen in the image to detect rectangular regions in the image that contain text and calculate an angle of the rectangular regions relative to a horizontal axis; program instructions to determine whether, within the image, text on the digital screen can be extracted based on whether the rectangular regions overlap within a pre-defined threshold with expected text locations based on the template; and responsive to determining the text cannot be extracted, program instructions to guide a user of the user mobile device to re-orient at least one of a position and a rotation of the user mobile device based on the angle calculated using the angle-sensitive OCR to capture another image.

12. The computer system of claim 11, further comprising: responsive to determining the text can be extracted, program instructions to audibly convey the text to a user of the user mobile device using text-to-speech.

13. The computer system of claim 11, wherein the program instructions to perform angle-sensitive OCR on the location of the digital screen in the image comprise: program instructions to detect rectangular regions in the image that contain text; program instructions to calculate angles of the rectangul
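
The independent claims decide whether text can be extracted by checking whether detected rectangular regions overlap, within a pre-defined threshold, with the text locations the template expects. The claims do not name an overlap measure, so the Python sketch below assumes intersection-over-union (IoU) on axis-aligned boxes, a threshold of 0.5, and a rule that every expected location must be matched. All names here (`Box`, `iou`, `text_extractable`) are hypothetical.

```python
from typing import List, Tuple

# Axis-aligned box as (x_min, y_min, x_max, y_max), in the same coordinate
# frame for detected regions and template locations (an assumption).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes; 0.0 when they do not overlap."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def text_extractable(
    detected: List[Box], expected: List[Box], threshold: float = 0.5
) -> bool:
    """True when each expected text location is matched by a detected region.

    Assumed decision rule: an empty template never counts as extractable.
    """
    if not expected:
        return False
    return all(any(iou(d, e) >= threshold for d in detected) for e in expected)


if __name__ == "__main__":
    template = [(10.0, 10.0, 110.0, 30.0), (10.0, 50.0, 110.0, 70.0)]
    found = [(12.0, 11.0, 108.0, 31.0)]
    # Only the first expected location is covered, so guidance would follow.
    print(text_extractable(found, template))
```

A stricter or looser rule (for example, requiring only a fraction of expected locations to match) would be an equally plausible reading of the claim language.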

Assignees

Inventors

Classifications

  • Matching criteria, e.g. proximity measures · CPC title

  • Text, e.g. of license plates, overlay texts or captions on TV images · CPC title

  • G09B21/001 · Primary

    Teaching or communicating with blind persons (G09B21/02 - G09B21/06 take precedence) · CPC title

  • Speech synthesis; Text to speech systems · CPC title

  • Inclination or skew detection or correction of characters or of image to be recognised · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11417079B2 cover?
In an approach for guiding a visually impaired user to position a mobile device appropriately in relation to a screen so that dynamic information on the screen can be reliably extracted and conveyed to the visually impaired user, a processor receives an image captured by a camera of a mobile device. A processor performs object recognition on the image to identify a digital screen and a location…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G09B21/001. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 16, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).