Automatically generating context-based alternative text using artificial intelligence techniques

US12182525B2 · US · B2

Patent metadata
Publication number: US-12182525-B2
Application number: US-202217580951-A
Country: US
Kind code: B2
Filing date: Jan 21, 2022
Priority date: Jan 21, 2022
Publication date: Dec 31, 2024
Grant date: Dec 31, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, apparatus, and processor-readable storage media for automatically generating context-based alternative text using artificial intelligence techniques are provided herein. An example computer-implemented method includes generating text captions for an image derived from a web page by processing the image using an artificial intelligence-based image captioning model; determining context information pertaining to the image by processing the image using an artificial intelligence-based context and emotion recognition library; generating context-based alternative text for at least a portion of the image by processing, using at least one artificial intelligence-based alternative text generation model, at least a portion of one or more of the generated text caption(s) for the image and the determined context information pertaining to at least a portion of the image; and performing one or more automated actions based on the generated context-based alternative text.
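
The flow described in the abstract (caption the image, determine context, then combine both into alternative text) can be sketched as three pluggable stages. This is a minimal illustration only, not the patented implementation: the names `ContextInfo`, `generate_alt_text`, and the three `stub_*` functions are hypothetical, and the stubs return fixed values where real AI models would be called.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ContextInfo:
    """Hypothetical container for context determined from an image."""
    emotions: List[str]
    actions: List[str]
    scenery: List[str]


# Hypothetical stage signatures; the patent does not specify these interfaces.
CaptionModel = Callable[[bytes], List[str]]
ContextLibrary = Callable[[bytes], ContextInfo]
AltTextModel = Callable[[List[str], ContextInfo], str]


def generate_alt_text(
    image: bytes,
    caption_model: CaptionModel,
    context_library: ContextLibrary,
    alt_text_model: AltTextModel,
) -> str:
    """Run the three stages in the order the abstract lists them."""
    captions = caption_model(image)            # stage 1: text captions
    context = context_library(image)           # stage 2: context and emotion
    return alt_text_model(captions, context)   # stage 3: context-based alt text


# Stubs standing in for real models (assumptions for illustration only).
def stub_caption_model(image: bytes) -> List[str]:
    return ["two people shaking hands"]


def stub_context_library(image: bytes) -> ContextInfo:
    return ContextInfo(emotions=["smiling"], actions=["greeting"], scenery=["office"])


def stub_alt_text_model(captions: List[str], context: ContextInfo) -> str:
    details = context.emotions + context.actions + context.scenery
    return f"{captions[0]} ({', '.join(details)})"


alt = generate_alt_text(
    b"", stub_caption_model, stub_context_library, stub_alt_text_model
)
print(alt)
```

Passing the stages in as callables keeps the sketch independent of any particular captioning or emotion-recognition library, none of which the abstract names.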

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: generating one or more text captions for an image relating to a web page by processing at least a portion of the image using at least one artificial intelligence-based image captioning model; determining context information pertaining to at least a portion of the image by processing one or more portions of the image using at least one artificial intelligence-based context and emotion recognition library; generating context-based alternative text for at least a portion of the image by processing, using at least one artificial intelligence-based alternative text generation model, at least a portion of one or more of the one or more generated text captions for the image and the determined context information pertaining to at least a portion of the image; and performing one or more automated actions based at least in part on the generated context-based alternative text, wherein performing one or more automated actions comprises: automatically inserting at least a portion of the generated context-based alternative text into at least one particular portion of application code of the web page, and wherein the at least one particular portion of the application code is determined by parsing content contained within the application code for one or more code-related identifiers; and automatically updating at least a portion of the one or more code-related identifiers associated with the at least one particular portion of the application code; wherein the method is performed by at least one processing device comprising a processor coupled to a memory.

2. The computer-implemented method of claim 1, further comprising: automatically training the at least one artificial intelligence-based alternative text generation model using at least one of one or more supervised learning techniques and one or more unsupervised learning techniques.

3. The computer-implemented method of claim 1, wherein performing one or more automated actions comprises: obtaining user feedback pertaining to the generated context-based alternative text; and automatically training, using at least a portion of the user feedback, one or more of the at least one artificial intelligence-based image captioning model, the at least one artificial intelligence-based alternative text generation model, and the at least one artificial intelligence-based context and emotion recognition library.

4. The computer-implemented method of claim 1, wherein generating context-based alternative text for at least a portion of the image comprises updating an existing set of alternative text for the at least a portion of the image.

5. The computer-implemented method of claim 1, wherein determining context information pertaining to at least a portion of the image comprises: identifying at least one of one or more facial gestures and one or more body gestures in the image; and determining one or more emotional indications derived from the at least one of one or more identified facial gestures and one or more identified body gestures.

6. The computer-implemented method of claim 1, wherein processing at least a portion of the image using at least one artificial intelligence-based image captioning model comprises processing the at least a portion of the image using one or more deep learning models.

7. The computer-implemented method of claim 6, wherein processing the at least a portion of the image using one or more deep learning models comprises processing the at least a portion of the image using at least one of one or more convolutional neural networks, one or more residual neural networks, and one or more deep neural networks.

8. The computer-implemented method of claim 1, wherein determining context information pertaining to at least a portion of the image comprises automatically identifying one or more actions depicted in the at least a portion of the image.

9. The computer-implemented method of claim 1, wherein determining context information pertaining to at least a portion of the image comprises automatically identifying one or more scenery variables depicted in the at least a portion of the image.

10. The computer-implemented method of claim 1, wherein determining context information pertaining to at least a portion of the image comprises automatically identifying one or more event types depicted in the at least a portion of the image.

11. The computer-implemented method of claim 1, further comprising: extracting text from the image by processing one or more portions of the image using at least one artificial intelligence-based optical character recognition model; and automatically training the at least one artificial intelligence-based alternative text generation model using at least a portion of one or more of the one or more generated text captions for the image, the extracted text from the image, and the determined context information pertaining to at least a portion of the image.

12. A non-transitory processor-readable storage medium having stored therein program code of one or more software programs, wherein the program code when executed by at least one processing device causes the at least one processing device: to generate one or more text captions for an image relating to a web page by processing at least a portion of the image using at least one artificial intelligence-based image captioning model; to determine context information pertaining to at least a portion of the image by processing one or more portions of the image using at least one artificial intelligence-based context and emotion recognition library; to generate context-based alternative text for at least a portion of the image by processing, using at least one artificial intelligence-based alternative text generation model, at least a portion of one or more of the one or more generated text captions for the image and the determined context information pertaining to at least a portion of the image; and to perform one or more automated actions based at least in part on the generated context-based alternative text, wherein performing one or more automated actions comprises: automatically inserting at least a portion of the generated context-based alternative text into at least one particular portion of application code of the web page, wherein the at least one particular portion of the application code is determined by parsing content contained within the application code for one or more code-related identifiers; and automatically updating at least a portion of the one or more code-related identifiers associated with the at least one particular portion of the application code.

13. The non-transitory processor-readable storage medium of claim 12, wherein performing one or more automated actions comprises: obtaining user feedback pertaining to the generated context-based alternative text; and automatically training, using at least a portion of the user feedback, one or more of the at least one artificial intelligence-based image captioning model, the at least one artificial intelligence-based alternative text generation model, and the at least one artificial intelligence-based context and emotion recognition library.

14. The non-transitory processor-readable storage medium of claim 12, wherein generating context-based alternative text for at least a portion of the image comprises updating an existing set of alternative text for the at least a portion of the image.

15. An apparatus comprising: at least one processing device comprising a processor coupled to a memory; the at least one processing device being configured
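
Claim 1 recites inserting the generated alternative text into a portion of the web page's application code that is located by parsing for code-related identifiers. The sketch below shows one possible reading of that step, under loud assumptions: the "code-related identifier" is taken to be the `id` attribute of an `<img>` tag, the insertion target is the tag's `alt` attribute, and a regular expression stands in for a real HTML parser. The function name `insert_alt_text` is hypothetical, and the claimed step of updating the identifiers themselves is not illustrated.

```python
import html
import re

# Matches a whole <img ...> tag (simplification: no ">" inside attribute values).
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
# Matches an existing alt attribute, including its leading whitespace.
ALT_ATTR = re.compile(r"""\salt\s*=\s*("[^"]*"|'[^']*')""", re.IGNORECASE)


def insert_alt_text(page_source: str, image_id: str, alt_text: str) -> str:
    """Set the alt attribute of the <img> whose id equals image_id.

    Assumption: the image is identified by its id attribute. Existing
    alt text is replaced (compare claim 4); otherwise a new attribute
    is appended to the tag.
    """
    id_attr = re.compile(
        r"""\sid\s*=\s*["']""" + re.escape(image_id) + r"""["']""",
        re.IGNORECASE,
    )
    # Escape quotes and angle brackets so the text is a safe attribute value.
    new_attr = f' alt="{html.escape(alt_text, quote=True)}"'

    def rewrite(match: re.Match) -> str:
        tag = match.group(0)
        if not id_attr.search(tag):
            return tag  # a different image: leave untouched
        if ALT_ATTR.search(tag):
            # A function replacement avoids backslash interpretation.
            return ALT_ATTR.sub(lambda _m: new_attr, tag, count=1)
        if tag.endswith("/>"):
            return tag[:-2].rstrip() + new_attr + " />"
        return tag[:-1].rstrip() + new_attr + ">"

    return IMG_TAG.sub(rewrite, page_source)


page = '<p><img id="hero" src="a.png"><img id="logo" src="b.png" alt="old"></p>'
updated = insert_alt_text(page, "hero", "Two people shaking hands in an office")
updated = insert_alt_text(updated, "logo", "Company logo")
print(updated)
```

For production use, a proper HTML parser would be more robust than a regular expression; the regex is used here only to keep the example short and dependency-free.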

Assignees

Dell Products Lp

Inventors

Classifications

  • characterised by the processing or recognition method (segmentation of character regions G06V30/148)

  • Region-based matching

  • G06F40/56 (Primary): Natural language generation

  • Semantic analysis

  • using neural networks

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12182525B2 cover?
Methods, apparatus, and processor-readable storage media for automatically generating context-based alternative text using artificial intelligence techniques are provided herein. An example computer-implemented method includes generating text captions for an image derived from a web page by processing the image using an artificial intelligence-based image captioning model; determining context i…
Who is the assignee on this patent?
Dell Products Lp
What technology area does this patent fall under?
Primary CPC classification G06F40/56. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 31, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).