Theme detection for object-recognition-based notifications
US-12183330-B2 · Dec 31, 2024 · US
US12494201B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12494201-B2 |
| Application number | US-202318333176-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 12, 2023 |
| Priority date | Dec 14, 2020 |
| Publication date | Dec 9, 2025 |
| Grant date | Dec 9, 2025 |
Official abstract text for this publication.
A viewing assistance system includes a display unit, a captured image generation unit, a voice recognition unit, and an association storage unit. The display unit includes a display screen on which content having multiple pages is displayable. The captured image generation unit generates a captured image of the content that is displayed on the display screen. The voice recognition unit recognizes a voice included in the content. The association storage unit associates a voice recognition result, which is the result of the voice recognition unit recognizing the voice included in the content, with the captured image generated by the captured image generation unit, and stores the association.
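The association described in the abstract can be pictured as a small data model. The following is a minimal sketch, not the patented implementation: the names `StoredPage` and `AssociationStore`, and the idea of referring to a captured image by a string identifier, are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StoredPage:
    """One captured image plus the recognized speech associated with it."""
    image_id: str  # hypothetical identifier for a captured image
    transcripts: List[str] = field(default_factory=list)


class AssociationStore:
    """Keeps voice recognition results grouped by captured image (illustrative only)."""

    def __init__(self) -> None:
        self._pages: Dict[str, StoredPage] = {}

    def associate(self, image_id: str, text: str) -> StoredPage:
        # Create the page entry on first use, then append the recognized text.
        page = self._pages.setdefault(image_id, StoredPage(image_id))
        page.transcripts.append(text)
        return page

    def transcripts_for(self, image_id: str) -> List[str]:
        # Return a copy so callers cannot mutate the stored list.
        page = self._pages.get(image_id)
        return list(page.transcripts) if page else []
```

A caller would invoke `associate` once per recognized utterance, passing the identifier of whichever captured image was current when the utterance was spoken.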
Opening claim text (preview).
What is claimed is:
1. A viewing assistance system, comprising: a display unit including a display screen on which a content having multiple pages are displayable; a captured image generation unit that generates a captured image of the content that is displayed on the display screen; a voice recognition unit that recognizes a voice included in the content; a page turning detection unit that detects a page turning of the content that is displayed on the display screen; and an association storage unit that associates a voice recognition result that is a result of recognizing the voice included in the content by the voice recognition unit with the captured image generated by the captured image generation unit to store based on a detection result of the page turning by the page turning detection unit, wherein in a case in which the captured image generation unit generates a pre-page-turning captured image as the captured image of the content displayed on the display screen before the timing of the page turning and a post-page-turning captured image as the captured image of the content displayed on the display screen after the timing of the page turning, and the voice recognized by the voice recognition unit straddles the timing of the page turning, the viewing assistance system further comprises a sorting unit that sorts a straddling voice recognition result that is a result of recognizing the voice straddling the timing of the page turning by the voice recognition unit to either of the pre-page-turning captured image or the post-page-turning captured image, and the sorting unit has a function of determining to which of the pre-page-turning captured image or the post-page-turning captured image the straddling voice recognition result is sorted to.
2. The viewing assistance system according to claim 1, wherein the sorting unit comprises an utterance voice content recognition unit that recognizes a content of an utterance voice corresponding to the straddling voice recognition result; an image content recognition unit that recognizes contents of the pre-page-turning captured image and the post-page-turning captured image; and a similarity calculation unit that calculates a first similarity that is a similarity between the content of the utterance voice recognized by the utterance voice content recognition unit and the content of the pre-page-turning captured image that is recognized by the image content recognition unit, and a second similarity that is a similarity between the content of the utterance voice recognized by the utterance voice content recognition unit and the content of the post-page-turning captured image that is recognized by the image content recognition unit, in a case in which the first similarity is higher than the second similarity, the sorting unit sorts the straddling voice recognition result to the pre-page-turning captured image and the association storage unit associates the straddling voice recognition result with the pre-page-turning captured image to store, and in a case in which the first similarity is lower than the second similarity, the sorting unit sorts the straddling voice recognition result to the post-page-turning captured image and the association storage unit associates the straddling voice recognition result with the post-page-turning captured image to store.
3. The viewing assistance system according to claim 1, wherein the sorting unit comprises a keyword determination unit that determines whether a predetermined keyword is included in an utterance voice corresponding to the straddling voice recognition result, in a case in which the keywords is determined to be included in the utterance voice by the keyword determination unit, the sorting unit sorts the straddling voice recognition result to either of the pre-page-turning captured image or the post-page-turning captured image, and the association storage unit associates and stores the straddling voice recognition result to the captured image to which the straddling voice recognition result is sorted.
4. The viewing assistance system according to claim 1, wherein the page turning detection unit includes a video determination unit that determines whether a video is displayed on the display screen.
5. The viewing assistance system according to claim 4, wherein in a case in which the video is determined to be included in the display screen by the video determination unit, the page turning detection unit halts the function of detecting the page turning.
6. The viewing assistance system according to claim 4, wherein in a case in which the video is determined to be included in the display screen by the video determination unit, the page turning detection unit detects the page turning by excluding a part of the display screen in which the video is included.
7. The viewing assistance system according to claim 1, wherein the captured image generation unit generates a first captured image that is the captured image of the content displayed on the display screen at a first time and a second captured image that is the captured image of the content displayed on the display screen at a second time when a predetermined captured image generation interval is elapsed from the first time, and in a case in which a change amount of the second captured image with respect to the first captured image exceeds a threshold value, the page turning detection unit detects the page turning, and the association storage unit associates the voice recognition result with the second captured image to store while associating the voice recognition result with the first captured image based on a detection result of the page turning by the page turning detection unit.
8. The viewing assistance system according to claim 1 wherein the captured image generation unit generates multiple captured images at a predetermined captured image generation interval, the page turning detection unit includes a character string area determination unit, the character string area determination unit has a function of calculating a word count included in each of the multiple captured images generated by the captured image generation unit, and a function of selecting the captured image having the most word count among the multiple captured images generated by the captured image generation unit, and the association storage unit associates the voice recognition result with the captured image that is selected by the character string area determination unit to store.
9. The viewing assistance system according to claim 1, wherein the captured image generation unit generates multiple captured images by a predetermined captured image generation interval, the association storage unit includes a recorded page generation unit and a recorded page deletion unit, the recorded page generation unit generates multiple recorded pages by associating the voice recognition result to each of the multiple captured images generated by the captured image generation unit as candidates of a storage page in which the voice recognition result is associated with the captured image stored in the association storage unit, the recorded page deletion unit has a function of deleting a part of the multiple recorded pages generated by the recorded page generation unit, and in a case in which the recorded page deletion unit deletes a part of the multiple recorded pages, the association storage unit associates the voice recognition result associated with the captured image configuring the recorded page to be deleted with the captured image configuring the recorded page that is not deleted and stores as the storage page.
10. The viewing assistance system according
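Two of the mechanisms recited in the claims lend themselves to a short illustration: the threshold test on the change amount between two captured images (claim 7) and the similarity comparison used to sort a straddling voice recognition result (claim 2). The sketch below is a rough illustration under stated assumptions, not the method disclosed in the patent. The claims do not define how the change amount or the similarity is computed; here the change amount is assumed to be the fraction of differing pixels between two equally sized grayscale frames, the similarity is assumed to be Jaccard overlap of word sets, and the image content is assumed to be already available as text. All function names and the default threshold of 0.5 are hypothetical.

```python
from typing import List, Optional

Frame = List[List[int]]  # assumed representation: rows of grayscale pixel values


def change_amount(first: Frame, second: Frame) -> float:
    """Fraction of pixels that differ between two equally sized frames (assumed metric)."""
    total = 0
    changed = 0
    for row_a, row_b in zip(first, second):
        for a, b in zip(row_a, row_b):
            total += 1
            if a != b:
                changed += 1
    return changed / total if total else 0.0


def page_turned(first: Frame, second: Frame, threshold: float = 0.5) -> bool:
    """Report a page turn when the change amount exceeds the threshold (cf. claim 7)."""
    return change_amount(first, second) > threshold


def jaccard(a: str, b: str) -> float:
    """Word-set overlap, standing in for the unspecified similarity calculation."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def sort_straddling(utterance: str, pre_text: str, post_text: str) -> Optional[str]:
    """Choose the image for an utterance that straddles a page turn (cf. claim 2)."""
    first_similarity = jaccard(utterance, pre_text)
    second_similarity = jaccard(utterance, post_text)
    if first_similarity > second_similarity:
        return "pre"
    if first_similarity < second_similarity:
        return "post"
    return None  # claim 2 does not say what happens on a tie
```

For example, `sort_straddling("revenue grew last quarter", "quarterly revenue chart revenue grew", "team photo and thanks")` selects the pre-page-turning image, because the utterance shares words only with the earlier page. A real system would likely need a more robust similarity measure and a frame comparison that tolerates noise; both are left open here.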
Speech to text systems (G10L15/08 takes precedence) · CPC title
of application context · CPC title
Television signal processing therefor · CPC title
Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition (scanning, transmission or reproduction of documents or the like H04N1/00) · CPC title
Annotation, e.g. comment data or footnotes · CPC title