Efficient Index Lookup Using Language-Agnostic Vectors and Context Vectors
US-2023119161-A1 · Apr 20, 2023 · US
US2024354317A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2024354317-A1 |
| Application number | US-202318137944-A |
| Country | US |
| Kind code | A1 |
| Filing date | Apr 21, 2023 |
| Priority date | Apr 21, 2023 |
| Publication date | Oct 24, 2024 |
| Grant date | — |
Official abstract text for this publication.
A technique uses an encoder system to produce an index of target item embeddings. Each target item embedding is input-agnostic and universal in the sense that different expressions of a target concept, produced using different combinations of input modes, map to the same target item embedding in the index. The encoder system throttles the amount of computations it performs based on the assessed capabilities of an execution platform. A retrieval system processes a multimodal input query by first generating a candidate set of target item embeddings in the index that match the input query, and then using a filtering operation to identify those target item embeddings that are most likely to match the input query. The encoder system and the retrieval system rely on language-based components having weights that are held constant during a training operation. Other weights of these systems are updated during the training operation.
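The two-stage retrieval flow described in the abstract (candidate generation followed by filtering) can be sketched as follows. This is a minimal illustration under stated assumptions, not the publication's implementation: the names `find_candidates`, `filter_candidates`, `retrieve`, and `negative_distance` are hypothetical, cosine similarity is an assumed matching criterion, and the `rerank` callable is a simple stand-in for the language-based filtering operation.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_candidates(
    query: Vector, index: Dict[str, Vector], k: int
) -> List[Tuple[str, float]]:
    """Stage 1: return the k index entries most similar to the query embedding."""
    scored = [(item_id, cosine(query, emb)) for item_id, emb in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def filter_candidates(
    query: Vector,
    candidates: List[Tuple[str, float]],
    index: Dict[str, Vector],
    rerank: Callable[[Vector, Vector], float],
    keep: int,
) -> List[str]:
    """Stage 2: rescore only the candidates and keep the best `keep` of them.

    `rerank` is an assumed placeholder for the language-based filter.
    """
    rescored = [(item_id, rerank(query, index[item_id])) for item_id, _ in candidates]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return [item_id for item_id, _ in rescored[:keep]]


def retrieve(
    query: Vector,
    index: Dict[str, Vector],
    rerank: Callable[[Vector, Vector], float],
    k: int = 3,
    keep: int = 1,
) -> List[str]:
    """Run candidate generation, then filtering."""
    candidates = find_candidates(query, index, k)
    return filter_candidates(query, candidates, index, rerank, keep)


def negative_distance(a: Vector, b: Vector) -> float:
    """Stand-in reranker: a smaller Euclidean distance gives a higher score."""
    return -math.dist(a, b)


# Toy index of made-up two-dimensional target item embeddings.
toy_index = {
    "dog": [1.0, 0.0],
    "puppy": [0.9, 0.1],
    "wolf": [0.7, 0.3],
    "car": [0.0, 1.0],
}
print(retrieve([1.0, 0.05], toy_index, negative_distance, k=2, keep=1))
```

The split keeps the expensive step small: the similarity search touches the whole index, while the reranker only sees the `k` candidates that survive the first stage.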
Opening claim text (preview).
What is claimed is:

1. A method for creating an index for item retrieval, comprising: receiving an input item, the input item having first content provided by a first input mode and second content provided by a second input mode, the second input mode differing from the first input mode; mapping, using an input-embedding system, the first content and the second content to an input-system embedding; mapping, using a language-based embedding-mapping system, the input-system embedding to a target item embedding that represents the input item; and storing the target item embedding in the index, the input-embedding system including weights that are updated by a training system during a training operation, and the language-based embedding-mapping system including language model weights that are held fixed during the training operation.

2. The method of claim 1, wherein the target item embedding in the index represents a particular target concept, and wherein plural expressions of the target concept, that have been generated using different input modes and different combinations of input modes, map to the same target item embedding.

3. The method of claim 1, wherein the first input mode and the second input mode are any two different input modes selected from a group that includes: a text input mode, an image input mode, an audio input mode, and a video input mode.

4. The method of claim 1, wherein the input item has third content provided by a third input mode that differs from the first input mode and the second input mode.

5. The method of claim 1, wherein the input-embedding system includes a first input-embedding subsystem and a second input-embedding subsystem, wherein the method includes: mapping, using the first input-embedding subsystem, the first content to a first input embedding; and mapping, using the second input-embedding subsystem, the second content to a second input embedding, the input-system embedding including the first input embedding and the second input embedding, and at least one of the input-embedding subsystems including weights that are updated by the training system during the training operation.

6. The method of claim 1, wherein the language-based embedding-mapping system operates by: mapping, using a language-based encoding operation, the input-system embedding to a first-stage embedding; and mapping, using an embedding conversion operation, the first-stage embedding to the target item embedding, in a vector space of the index, the embedding conversion operation using weights that are updated by the training system during the training operation, and the language-based encoding operation using language model weights that are held fixed during the training operation.

7. The method of claim 1, further comprising: assessing a processing capability of an execution platform; and setting an amount of processing operations to be performed by the language-based encoder system based on the processing capability.

8. The method of claim 1, further comprising, in retrieval operation: receiving an input query; mapping the input query to a query embedding using the input-embedding system and the language-based embedding-mapping system; and finding a candidate set of target item embeddings in the index that match the query embedding.

9. The method of claim 8, further comprising, in a language-based filtering operation, identifying one or more target item embeddings in the candidate set of target items embeddings that are most likely to match the input query.

10. The method of claim 9, wherein the language-based filtering operation comprises: receiving prompt information; and mapping, in a language-based encoding operation, the prompt information, the query embedding, and the candidate set of target item embeddings to output results, the output results identifying the one or more target item embeddings that are most likely to match the input query, the language-based encoding operation using language model weights that are held fixed during the training operation.

11. The method of claim 10, further comprising: mapping, in a preliminary mapping operation prior to the language-based encoding operation, the candidate set of target item embeddings to a transformed set target item embeddings in a vector space of the language-based encoding operation, the preliminary mapping operation using weights that are updated during the training operation.

12. The method of claim 9, wherein the language-based encoder system and/or the language-based filtering operation use transformer-based machine-trained logic.

13. A computing system for performing a retrieval operation, comprising: an instruction store for storing computer-readable instructions; an index store for storing target item embeddings produced by a language-based encoder system; a processing system for executing the computer-readable instructions to perform operations that include: receiving an input query; mapping the input query to a query embedding using the language-based encoder system; matching the query embedding against the target item embeddings in the index store, to identify a candidate set of target item embeddings; and identifying, in a language-based filtering operation, one or more target item embeddings in the candidate set of item target item embeddings that are most likely to match the input query, the language-based encoder system and the language-based filtering operation using language model weights that are held fixed during a training operation.

14. The computing system of claim 13, wherein a target item embedding in the index store represents a particular target concept, and wherein plural expressions of the target concept, that have been generated using different input modes and different combinations of input modes, map to the same target item embedding.

15. The computing system of claim 13, wherein the input query includes first content provided by a first input mode and second content provided by a second input mode, the second input mode differing from the first input mode.

16. The computing system of claim 15, wherein the first input mode and the second input mode are any two different input modes selected from a group that includes: a text input mode, an image input mode, an audio input mode, and a video input mode.

17. The computing system of claim 13, wherein the language-based filtering operation includes: receiving prompt information; and mapping, in a language-based encoding operation, the query embedding, the prompt information, and the candidate set of target item embeddings to output results, the output results identifying the one or more target item embeddings that are most likely to match the input query, the language-based encoding operation using language model weights that are held fixed during the training operation.

18. The computing system of claim 17, wherein the operations further include: mapping, in a preliminary mapping operation prior to the language-based encoding operation, the candidate set of target item embeddings to a transformed set target item embeddings in a vector space of the language-based encoding operation, the preliminary mapping operation using weights that are updated during the training operation.

19. The computing system of claim 13, wherein the language-based encoder system and/or the language-based filtering operation use transformer-based machine-trained logic.

20.
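The claims distinguish weights that a training system updates (for example in the input-embedding system) from language model weights that are held fixed during training. A toy sketch of that split follows. It assumes a two-parameter scalar model and plain gradient descent on squared error, neither of which is taken from the publication; `Param`, `forward`, `train_step`, and the parameter names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Param:
    value: float
    trainable: bool  # False marks a weight that training must not change


def forward(params: Dict[str, Param], x: float) -> float:
    """Toy pipeline: a trainable input-embedding weight, then a frozen language weight."""
    hidden = params["input_embedding"].value * x
    return params["language_model"].value * hidden


def train_step(params: Dict[str, Param], x: float, target: float, lr: float) -> float:
    """One gradient-descent step on squared error; returns the loss before the update.

    Gradients are computed for both weights, but only trainable ones are updated.
    """
    hidden = params["input_embedding"].value * x
    error = params["language_model"].value * hidden - target
    grads = {
        "input_embedding": 2.0 * error * params["language_model"].value * x,
        "language_model": 2.0 * error * hidden,
    }
    for name, param in params.items():
        if param.trainable:
            param.value -= lr * grads[name]
    return error * error


# Made-up starting values for illustration only.
params = {
    "input_embedding": Param(value=0.5, trainable=True),
    "language_model": Param(value=2.0, trainable=False),
}
for _ in range(50):
    train_step(params, x=1.0, target=3.0, lr=0.05)
print(params)
```

In this sketch the frozen weight still shapes the gradient of the trainable weight, so the trainable part learns to produce inputs that the fixed part maps to the target.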
using natural language analysis · CPC title
using vector based model · CPC title
Selection or weighting of terms for indexing · CPC title