Who is the assignee on this patent?

Toshiba Kk, Toshiba Digital Solutions Corp

What technology area does this patent fall under?

Primary CPC classification G10L15/22. Mapped technology areas include Physics.

When was this patent published?

Publication date Dec 9, 2025 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Viewing assistance system, viewing assistance method, and nonvolatile recording medium storing program

US12494201B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12494201-B2
Application number	US-202318333176-A
Country	US
Kind code	B2
Filing date	Jun 12, 2023
Priority date	Dec 14, 2020
Publication date	Dec 9, 2025
Grant date	Dec 9, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A viewing assistance system includes a display unit, a captured image generation unit, a voice recognition unit, and an association storage unit. The display unit includes a display screen on which a content having multiple pages are displayable. The captured image generation unit that generates a captured image of the content that is displayed on the display screen. The voice recognition unit that recognizes a voice included in the content. The association storage unit that associates a voice recognition result that is a result of recognizing the voice included in the content by the voice recognition unit with the captured image generated by the captured image generation unit to store.
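
The association the abstract describes can be pictured as a small store keyed by captured image. The following is a minimal Python sketch for illustration only and is not taken from the patent; `StoredPage`, `AssociationStore`, and the image identifiers are hypothetical names, and recognized speech is assumed to arrive as plain text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StoredPage:
    """One captured image plus the recognized speech associated with it."""
    image_id: str
    transcripts: List[str] = field(default_factory=list)


class AssociationStore:
    """Hypothetical stand-in for the 'association storage unit'."""

    def __init__(self) -> None:
        self._pages: Dict[str, StoredPage] = {}

    def associate(self, image_id: str, recognized_text: str) -> StoredPage:
        # Create the page record on first use, then append the transcript.
        page = self._pages.setdefault(image_id, StoredPage(image_id))
        page.transcripts.append(recognized_text)
        return page

    def transcripts_for(self, image_id: str) -> List[str]:
        # Return a copy so callers cannot mutate the stored list.
        page = self._pages.get(image_id)
        return list(page.transcripts) if page else []


store = AssociationStore()
store.associate("capture-001", "Welcome to the quarterly review.")
store.associate("capture-001", "Here is today's agenda.")
store.associate("capture-002", "Revenue grew in the second quarter.")
```

Keying by captured image means a viewer can later look up what was said while a given page was on screen, which is the behavior the abstract summarizes.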

First claim

Opening claim text (preview).

What is claimed is:

1. A viewing assistance system, comprising: a display unit including a display screen on which a content having multiple pages are displayable; a captured image generation unit that generates a captured image of the content that is displayed on the display screen; a voice recognition unit that recognizes a voice included in the content; a page turning detection unit that detects a page turning of the content that is displayed on the display screen; and an association storage unit that associates a voice recognition result that is a result of recognizing the voice included in the content by the voice recognition unit with the captured image generated by the captured image generation unit to store based on a detection result of the page turning by the page turning detection unit, wherein in a case in which the captured image generation unit generates a pre-page-turning captured image as the captured image of the content displayed on the display screen before the timing of the page turning and a post-page-turning captured image as the captured image of the content displayed on the display screen after the timing of the page turning, and the voice recognized by the voice recognition unit straddles the timing of the page turning, the viewing assistance system further comprises a sorting unit that sorts a straddling voice recognition result that is a result of recognizing the voice straddling the timing of the page turning by the voice recognition unit to either of the pre-page-turning captured image or the post-page-turning captured image, and the sorting unit has a function of determining to which of the pre-page-turning captured image or the post-page-turning captured image the straddling voice recognition result is sorted to.

2. The viewing assistance system according to claim 1, wherein the sorting unit comprises an utterance voice content recognition unit that recognizes a content of an utterance voice corresponding to the straddling voice recognition result; an image content recognition unit that recognizes contents of the pre-page-turning captured image and the post-page-turning captured image; and a similarity calculation unit that calculates a first similarity that is a similarity between the content of the utterance voice recognized by the utterance voice content recognition unit and the content of the pre-page-turning captured image that is recognized by the image content recognition unit, and a second similarity that is a similarity between the content of the utterance voice recognized by the utterance voice content recognition unit and the content of the post-page-turning captured image that is recognized by the image content recognition unit, in a case in which the first similarity is higher than the second similarity, the sorting unit sorts the straddling voice recognition result to the pre-page-turning captured image and the association storage unit associates the straddling voice recognition result with the pre-page-turning captured image to store, and in a case in which the first similarity is lower than the second similarity, the sorting unit sorts the straddling voice recognition result to the post-page-turning captured image and the association storage unit associates the straddling voice recognition result with the post-page-turning captured image to store.

3. The viewing assistance system according to claim 1, wherein the sorting unit comprises a keyword determination unit that determines whether a predetermined keyword is included in an utterance voice corresponding to the straddling voice recognition result, in a case in which the keywords is determined to be included in the utterance voice by the keyword determination unit, the sorting unit sorts the straddling voice recognition result to either of the pre-page-turning captured image or the post-page-turning captured image, and the association storage unit associates and stores the straddling voice recognition result to the captured image to which the straddling voice recognition result is sorted.

4. The viewing assistance system according to claim 1, wherein the page turning detection unit includes a video determination unit that determines whether a video is displayed on the display screen.

5. The viewing assistance system according to claim 4, wherein in a case in which the video is determined to be included in the display screen by the video determination unit, the page turning detection unit halts the function of detecting the page turning.

6. The viewing assistance system according to claim 4, wherein in a case in which the video is determined to be included in the display screen by the video determination unit, the page turning detection unit detects the page turning by excluding a part of the display screen in which the video is included.

7. The viewing assistance system according to claim 1, wherein the captured image generation unit generates a first captured image that is the captured image of the content displayed on the display screen at a first time and a second captured image that is the captured image of the content displayed on the display screen at a second time when a predetermined captured image generation interval is elapsed from the first time, and in a case in which a change amount of the second captured image with respect to the first captured image exceeds a threshold value, the page turning detection unit detects the page turning, and the association storage unit associates the voice recognition result with the second captured image to store while associating the voice recognition result with the first captured image based on a detection result of the page turning by the page turning detection unit.

8. The viewing assistance system according to claim 1 wherein the captured image generation unit generates multiple captured images at a predetermined captured image generation interval, the page turning detection unit includes a character string area determination unit, the character string area determination unit has a function of calculating a word count included in each of the multiple captured images generated by the captured image generation unit, and a function of selecting the captured image having the most word count among the multiple captured images generated by the captured image generation unit, and the association storage unit associates the voice recognition result with the captured image that is selected by the character string area determination unit to store.

9. The viewing assistance system according to claim 1, wherein the captured image generation unit generates multiple captured images by a predetermined captured image generation interval, the association storage unit includes a recorded page generation unit and a recorded page deletion unit, the recorded page generation unit generates multiple recorded pages by associating the voice recognition result to each of the multiple captured images generated by the captured image generation unit as candidates of a storage page in which the voice recognition result is associated with the captured image stored in the association storage unit, the recorded page deletion unit has a function of deleting a part of the multiple recorded pages generated by the recorded page generation unit, and in a case in which the recorded page deletion unit deletes a part of the multiple recorded pages, the association storage unit associates the voice recognition result associated with the captured image configuring the recorded page to be deleted with the captured image configuring the recorded page that is not deleted and stores as the storage page.

10. The viewing assistance system according
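
Two of the decisions recited in the claims above lend themselves to a short illustration: the similarity comparison of claim 2 and the change-amount threshold of claim 7. The Python sketch below is a reading aid under stated assumptions, not the patented implementation. The claims do not name a similarity measure or define the change amount, so word-overlap (Jaccard) similarity and the fraction of differing pixels are swapped in as stand-ins; page text is assumed to be already extracted from the captured images; all function names are hypothetical.

```python
from typing import Optional, Sequence, Set


def tokenize(text: str) -> Set[str]:
    """Lowercased whitespace tokens; a deliberately simple assumption."""
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Word-overlap similarity in [0, 1]; a stand-in, not the patent's measure."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def sort_straddling(utterance: str, pre_page_text: str,
                    post_page_text: str) -> Optional[str]:
    """Pick the page an utterance spanning a page turn belongs to (claim 2 style)."""
    words = tokenize(utterance)
    first = jaccard(words, tokenize(pre_page_text))    # "first similarity"
    second = jaccard(words, tokenize(post_page_text))  # "second similarity"
    if first > second:
        return "pre"
    if first < second:
        return "post"
    return None  # tie: the quoted claim text does not cover this case


def page_turned(first_image: Sequence[int], second_image: Sequence[int],
                threshold: float) -> bool:
    """Report a page turn when the changed-pixel fraction exceeds the threshold
    (claim 7 style; the pixel-fraction metric is an assumption)."""
    if len(first_image) != len(second_image):
        raise ValueError("captured images must have the same size")
    if not first_image:
        return False
    changed = sum(1 for p, q in zip(first_image, second_image) if p != q)
    return changed / len(first_image) > threshold


choice = sort_straddling(
    "next we look at revenue growth",
    "agenda and introduction",
    "revenue growth by quarter",
)
turned = page_turned([0, 0, 0, 0], [0, 1, 1, 1], threshold=0.5)
```

Here `choice` is `"post"` because the utterance shares words only with the later page, and `turned` is `True` because three of four pixels differ. A real system would likely use richer text and image comparisons than these placeholders.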

Assignees

Toshiba Kk, Toshiba Digital Solutions Corp

Inventors

Classifications

Code	CPC title
G10L15/26	Speech to text systems (G10L15/08 takes precedence)
G10L2015/228	of application context
H04N5/91	Television signal processing therefor
G06V30/00	Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition (scanning, transmission or reproduction of documents or the like H04N1/00)
G06F40/169	Annotation, e.g. comment data or footnotes

Patent family

Related publications grouped by family.

View patent family 82057829
