Efficient content extraction from unstructured dialog text
US-2024338398-A1 · Oct 10, 2024 · US
US12555491B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12555491-B2 |
| Application number | US-202318466230-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 13, 2023 |
| Priority date | Sep 13, 2023 |
| Publication date | Feb 17, 2026 |
| Grant date | Feb 17, 2026 |
Official abstract text for this publication.
A computer-implemented method of providing information to a user is provided. The method comprises receiving input of a selected operating mode and extracting, via an extraction pipeline, data from a number of data sources according to the selected operating mode. The extracted data is fed into a large language model (LLM), and the LLM generates verbal and auditory information for the user based on the data. The LLM conveys the verbal and auditory information to the user via a number of interface products according to the selected operating mode.
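The flow in the abstract (select a mode, extract from mode-appropriate sources, pass the data to an LLM, convey the result) can be sketched roughly as follows. This is only an illustration under stated assumptions: the four mode names come from claim 1, but the mapping of modes to sources, the `Record` type, and the `readers`, `llm`, and `speak` callables are hypothetical stand-ins, not anything the patent specifies.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class OperatingMode(Enum):
    # Mode names follow claim 1.
    PEOPLE_DETECTION = "people_detection"
    WORKPLACE_NAVIGATION = "workplace_navigation"
    TEXT_TO_SPEECH = "text_to_speech"
    WORK = "work"


# Assumption: which sources serve which mode is invented for illustration.
SOURCES_BY_MODE: Dict[OperatingMode, List[str]] = {
    OperatingMode.PEOPLE_DETECTION: ["camera", "smart_glasses"],
    OperatingMode.WORKPLACE_NAVIGATION: ["camera", "smart_glasses"],
    OperatingMode.TEXT_TO_SPEECH: ["camera", "mobile_device"],
    OperatingMode.WORK: ["productivity_platform", "mobile_device"],
}


@dataclass
class Record:
    source: str
    content: str


def extract(mode: OperatingMode, readers: Dict[str, Callable[[], str]]) -> List[Record]:
    """Read only the sources associated with the selected mode."""
    return [
        Record(name, readers[name]())
        for name in SOURCES_BY_MODE[mode]
        if name in readers
    ]


def run_pipeline(
    mode: OperatingMode,
    readers: Dict[str, Callable[[], str]],
    llm: Callable[[str], str],      # hypothetical: any text-in, text-out model
    speak: Callable[[str], None],   # hypothetical: any audio output device
) -> str:
    """Extract by mode, build a prompt, generate a message, and convey it."""
    records = extract(mode, readers)
    prompt = f"mode={mode.value}\n" + "\n".join(
        f"[{r.source}] {r.content}" for r in records
    )
    message = llm(prompt)
    speak(message)
    return message
```

Passing `llm` and `speak` as plain callables keeps the sketch independent of any particular model or speech API; a real system would substitute its own.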
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method of providing digital workplace accessibility to a visually impaired user, the method comprising: using a number of processors to perform:
   receiving input of a selected operating mode from a plurality of context-specific operating modes, wherein the plurality of context-specific operating modes comprises: a people detection mode, a workplace navigation mode for detecting obstacles in a path of the visually impaired user, a text-to-speech mode for converting visual content to auditory content, and work mode for assisting the visually impaired user to access functions of computer applications;
   extracting, via an extraction pipeline, multimodal data from a number of heterogeneous data sources according to the selected operating mode, wherein the multimodal data comprises at least two of visual data, text data, optical character recognition images, data tables, graphs, and charts and wherein the heterogeneous data sources include at least one of a productivity platform, a mobile device, a camera, and a smart glasses;
   performing data transformation operations on the extracted multimodal data, comprising at least vectorization and indexing of the extracted multimodal data;
   feeding the transformed multimodal extracted data into a large language model (LLM) that has been trained for generating accessibility output for visually impaired users according to the selected operating mode;
   generating, by the LLM, context-appropriate verbal and auditory information for the visually impaired user based on the transformed multimodal data; and
   conveying, by the LLM, the verbal and auditory information to the visually impaired user via a number of interface products according to the selected operating mode, thereby enabling the visually impaired user to interact with digital workplace content without visual perception.
2. The method of claim 1, wherein the operating mode is selected manually by the user.
3. The method of claim 1, wherein the operating mode is selected according to a calendar schedule.
4. The method of claim 1, wherein the selected operating mode comprises one of: people detection to identify persons in proximity to the user; workplace navigation to detect obstacles in the path of the user; text-to-speech; or work to assist the user to access functions of computer applications and issue verbal commands to the applications.
5. The method of claim 1, wherein the LLM learns to generate verbal and auditory information for the user according to the selected operating mode.
6. The method of claim 1, wherein the data sources comprise at least one of: productivity platforms; mobile devices; cameras; or smart glasses.
7. The method of claim 1, wherein the extracted data comprises at least one of: visual data; text data; optical character recognition images; data tables; graphs; charts; structured data; or unstructured data.
8. The method of claim 1, further comprising extracting the data from the data sources according to a user profile.
9. The method of claim 1, wherein the data is extracted according to user permissions.
10. The method of claim 1, further comprising: receiving voice commands from the user; and performing input or control functions in response to the voice commands.
11. The method of claim 1, further comprising storing the extracted data in a data lake.
12. The method of claim 11, further comprising: indexing the extracted data in the data lake; and vectorizing the extracted data in the data lake.
13. A system for providing digital workplace accessibility to a visually impaired user, the system comprising: a storage device that stores program instructions; one or more processors operably connected to the storage device and configured to execute the program instructions to cause the system to:
    receive input of a selected operating mode from a plurality of context-specific operating modes, wherein the plurality of context-specific operating modes comprises: a people detection mode, a workplace navigation mode for detecting obstacles in a path of the visually impaired user, a text-to-speech mode for converting visual content to auditory content, and work mode for assisting the visually impaired user to access functions of computer applications;
    extract, via an extraction pipeline, multimodal data from a number of heterogeneous data sources according to the selected operating mode, wherein the multimodal data comprises at least two of visual data, text data, optical character recognition images, data tables, graphs, and charts and wherein the heterogeneous data sources include at least one of a productivity platform, a mobile device, a camera, and a smart glasses;
    performing data transformation operations on the extracted multimodal data, comprising at least vectorization and indexing of the extracted multimodal data;
    feed the transformed multimodal extracted data into a large language model (LLM) that has been trained for generating accessibility output for visually impaired users according to the selected operating mode;
    generate, by the LLM, context-appropriate verbal and auditory information for the visually impaired user based on the transformed multimodal data; and
    convey, by the LLM, the verbal and auditory information to the user via a number of interface products according to the selected operating mode.
14. The system of claim 13, wherein the operating mode is selected manually by the user.
15. The system of claim 13, wherein the operating mode is selected according to a calendar schedule.
16. The system of claim 13, wherein the selected operating mode comprises one of: people detection to identify persons in proximity to the user; workplace navigation to detect obstacles in the path of the user; text-to-speech; or work to assist the user to access functions of computer applications and issue verbal commands to the applications.
17. The system of claim 13, wherein the LLM learns to generate verbal and auditory information for the user according to the selected operating mode.
18. The system of claim 13, wherein the data sources comprise at least one of: productivity platforms; mobile devices; cameras; or smart glasses.
19. The system of claim 13, wherein the extracted data comprises at least one of: visual data; text data; optical character recognition images; data tables; graphs; charts; structured data; or unstructured data.
20. The system of claim 13, wherein the processors further execute instructions to cause the system to extract the data from the data sources according to a user profile.
21. The system of claim 13, wherein the data is extracted according to user permissions.
22. The system of claim 13, wherein the processors further execute instructions to cause the system to: receive voice commands from the user; and perform input or control functions in response to the voice commands.
23. The system of claim 13, further comprising storing the extracted data in a data lake.
24. The system of claim 23, wherein the processors further execute instructions to cause the system to: index the extracted data in the data lake; and vectorize the extracted data in the data lake.
25. A computer program product for providing digital workplace accessibility to a vi
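Claims 1, 12, and 24 recite vectorizing and indexing the extracted data but do not name a technique. As one possible reading, the sketch below uses term-count vectors, an inverted index, and cosine similarity for lookup. All of these choices, along with every function name and the sample records, are assumptions made for illustration; a production system would more likely use learned embeddings and a vector store.

```python
import math
import re
from collections import Counter, defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def vectorize(text: str) -> Counter:
    """Represent text as a sparse term-count vector (assumed technique)."""
    return Counter(tokenize(text))


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Build an inverted index mapping each token to the ids containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in set(tokenize(text)):
            index[token].add(doc_id)
    return index


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, docs: Dict[str, str], index: Dict[str, Set[str]]) -> List[str]:
    """Use the index to find candidates, then rank them by cosine similarity."""
    q = vectorize(query)
    candidates: Set[str] = set().union(*(index.get(t, set()) for t in q))
    return sorted(
        candidates,
        key=lambda doc_id: cosine(q, vectorize(docs[doc_id])),
        reverse=True,
    )
```

The index narrows the search to records sharing at least one token with the query, so similarity is computed only for those candidates rather than for every stored record.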
Training · CPC title
using context dependencies, e.g. language models · CPC title
Architecture of speech synthesisers · CPC title
Execution procedure of a spoken command · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title