Photorealistic Text Inpainting for Augmented Reality Using Generative Models
US-2024104312-A1 · Mar 28, 2024 · US
US12499332B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12499332-B2 |
| Application number | US-202217954845-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 28, 2022 |
| Priority date | Sep 28, 2022 |
| Publication date | Dec 16, 2025 |
| Grant date | Dec 16, 2025 |
Official abstract text for this publication.
Methods, systems, and computer program products for translating text using generated visual representations and artificial intelligence are provided herein. A computer-implemented method includes generating a tokenized form of at least a portion of input text in a first language; generating at least one visual representation of at least a portion of the input text using a first set of artificial intelligence techniques; generating a tokenized form of at least a portion of the at least one visual representation; and generating an output including a translated version of the input text into at least a second language by processing, using a second set of artificial intelligence techniques, at least a portion of the tokenized form of the at least a portion of the input text and at least a portion of the tokenized form of the at least a portion of the at least one visual representation.
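The four steps in the abstract can be read as a simple data flow. The sketch below is illustrative only: every function name, the word-level tokenizer, the nearest-value image tokenizer, and the two placeholder "models" are assumptions made for this example, not components disclosed in the publication. It shows only how the text tokens and the image tokens are both passed to the second model.

```python
from typing import Callable, Dict, List

Tokens = List[int]
Image = List[float]  # toy stand-in for pixel data (assumption for illustration)


def tokenize_text(text: str, vocab: Dict[str, int]) -> Tokens:
    """Toy tokenizer: one integer id per whitespace-separated word."""
    return [vocab.setdefault(word, len(vocab)) for word in text.lower().split()]


def tokenize_image(image: Image, codebook: List[float]) -> Tokens:
    """Toy image tokenizer: index of the nearest codebook value per pixel."""
    return [
        min(range(len(codebook)), key=lambda i: abs(codebook[i] - px))
        for px in image
    ]


def translate(
    text: str,
    vocab: Dict[str, int],
    codebook: List[float],
    generate_image: Callable[[Tokens], Image],
    multimodal_translate: Callable[[Tokens, Tokens], str],
) -> str:
    """Run the four steps from the abstract in order."""
    text_tokens = tokenize_text(text, vocab)        # 1. tokenize the input text
    image = generate_image(text_tokens)             # 2. first model: text -> visual representation
    image_tokens = tokenize_image(image, codebook)  # 3. tokenize the visual representation
    # 4. second model: consume both token sequences and emit the translation
    return multimodal_translate(text_tokens, image_tokens)


# Hypothetical placeholder models, present only so the sketch runs end to end.
def fake_generate_image(text_tokens: Tokens) -> Image:
    return [0.1 * t for t in text_tokens]


def fake_translate(text_tokens: Tokens, image_tokens: Tokens) -> str:
    return f"<{len(text_tokens)} text tokens, {len(image_tokens)} image tokens>"


if __name__ == "__main__":
    print(translate("a red car", {}, [0.0, 0.5, 1.0], fake_generate_image, fake_translate))
```

In a real system the two callables would be trained networks (the claims name an autoregressive transformer and a multimodal translation transformer); passing them in as parameters keeps the data flow visible without committing to any particular model.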
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising: generating a tokenized form of at least a portion of input text, wherein the input text is in a first language; generating, as output from utilizing a first set of one or more artificial intelligence techniques, at least one image representation of at least a portion of the input text by mapping portions of stored image data, the portions of the stored image data selected in connection with processing the input text using the first set of one or more artificial intelligence techniques, into portions of the at least one image representation of the at least a portion of the input text, wherein the first set of one or more artificial intelligence techniques comprises at least one neural network-based autoregressive transformer trained on the stored image data; generating a tokenized form of at least a portion of the at least one image representation; generating an output comprising a translated version of the input text into at least a second language by processing, using a second set of one or more artificial intelligence techniques, at least a portion of the tokenized form of the at least a portion of the input text and at least a portion of the tokenized form of the at least a portion of the at least one image representation, wherein the second set of one or more artificial intelligence techniques comprises at least one neural network-based multimodal translation transformer; automatically training, using feedback related to the generated output, at least one of the at least one neural network-based autoregressive transformer and the at least one neural network-based multimodal translation transformer; and automatically executing, subsequent to the automatic training, one or more machine translation operations using the at least one of the at least one neural network-based autoregressive transformer and the at least one neural network-based multimodal translation transformer; wherein the method is carried out by at least one computing device.
2. The computer-implemented method of claim 1, further comprising: automatically training the first set of one or more artificial intelligence techniques by processing a training set of text data and processing image data corresponding to the training set of text data.
3. The computer-implemented method of claim 2, wherein automatically training the first set of one or more artificial intelligence techniques comprises using visual representation loss-related techniques.
4. The computer-implemented method of claim 2, further comprising: automatically training the second set of one or more artificial intelligence techniques using (i) a tokenized combination of a visual representation of the training set of text data and the training set of text data, and (ii) a tokenized combination of the image data and the training set of text data.
5. The computer-implemented method of claim 4, wherein automatically training the second set of one or more artificial intelligence techniques comprises using translation loss-related techniques and consistency loss-related techniques.
6. The computer-implemented method of claim 1, wherein generating the output comprises mapping, using the second set of one or more artificial intelligence techniques, one or more portions of the tokenized form of at least a portion of input text to one or more portions of the tokenized form of at least a portion of the at least one image representation.
7. The computer-implemented method of claim 1, wherein software implementing the method is provided as a service in a cloud environment.
8. A computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computing device to cause the computing device to: generate a tokenized form of at least a portion of input text, wherein the input text is in a first language; generate, as output from utilizing a first set of one or more artificial intelligence techniques, at least one image representation of at least a portion of the input text by mapping portions of stored image data, the portions of the stored image data selected in connection with processing the input text using the first set of one or more artificial intelligence techniques, into portions of the at least one image representation of the at least a portion of the input text, wherein the first set of one or more artificial intelligence techniques comprises at least one neural network-based autoregressive transformer trained on the stored image data; generate a tokenized form of at least a portion of the at least one image representation; generate an output comprising a translated version of the input text into at least a second language by processing, using a second set of one or more artificial intelligence techniques, at least a portion of the tokenized form of the at least a portion of the input text and at least a portion of the tokenized form of the at least a portion of the at least one image representation, wherein the second set of one or more artificial intelligence techniques comprises at least one neural network-based multimodal translation transformer; automatically train, using feedback related to the generated output, at least one of the at least one neural network-based autoregressive transformer and the at least one neural network-based multimodal translation transformer; and automatically execute, subsequent to the automatic training, one or more machine translation operations using the at least one of the at least one neural network-based autoregressive transformer and the at least one neural network-based multimodal translation transformer.
9. The computer program product of claim 8, wherein the program instructions is further executable by the computing device to cause the computing device to: automatically train the first set of one or more artificial intelligence techniques by processing a training set of text data and processing image data corresponding to the training set of text data.
10. The computer program product of claim 9, wherein the program instructions is further executable by the computing device to cause the computing device to: automatically train the second set of one or more artificial intelligence techniques using (i) a tokenized combination of a visual representation of the training set of text data and the training set of text data, and (ii) a tokenized combination of the image data and the training set of text data.
11. The computer program product of claim 9, wherein automatically training the first set of one or more artificial intelligence techniques comprises using visual representation loss-related techniques.
12. The computer program product of claim 10, wherein automatically training the second set of one or more artificial intelligence techniques comprises using translation loss-related techniques and consistency loss-related techniques.
13. The computer program product of claim 8, wherein generating the output comprises mapping, using the second set of one or more artificial intelligence techniques, one or more portions of the tokenized form of at least a portion of input text to one or more portions of the tokenized form of at least a portion of the at least one image representation.
14. A system comprising: a memory configured to store program instructions; and a processor operatively coupled to the memory to execute the program instructions to: generate a tokenized form of at least a portion of input text, wherein the input text is in a first language;
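Claims 4 and 5 describe training the translation model on two tokenized combinations (text with a generated visual representation, and text with the paired image data) using translation and consistency losses. The claims do not define either loss, so the sketch below is one hypothetical reading, not the patented formulation: it assumes the translation loss is a cross-entropy against the reference token for each of the two inputs, and that the consistency loss is a KL divergence pulling the two predicted distributions together. The function names and the `weight` parameter are invented for this example.

```python
import math
from typing import List


def cross_entropy(probs: List[float], target: int) -> float:
    """Negative log-probability assigned to the reference token."""
    return -math.log(probs[target])


def kl_divergence(p: List[float], q: List[float]) -> float:
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def training_loss(
    probs_generated: List[float],  # prediction given text + generated visual tokens
    probs_real: List[float],       # prediction given text + real image tokens
    target: int,                   # index of the reference translation token
    weight: float = 1.0,           # assumed trade-off between the two terms
) -> float:
    # Assumed translation term: both input combinations should predict the reference.
    translation = cross_entropy(probs_generated, target) + cross_entropy(probs_real, target)
    # Assumed consistency term: the two predictions should agree with each other.
    consistency = kl_divergence(probs_real, probs_generated)
    return translation + weight * consistency
```

Under these assumptions the consistency term vanishes when both input combinations produce the same distribution, so it only penalizes disagreement between the generated and real visual context.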
Lexical analysis, e.g. tokenisation or collocates · CPC title
Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation · CPC title
Learning methods · CPC title
Recurrent networks, e.g. Hopfield networks · CPC title
Combinations of networks · CPC title