Using Fixed-Weight Language Models to Create and Interact with a Retrieval Index

US2024354317A1 · US · A1

Patent metadata
Publication number: US-2024354317-A1
Application number: US-202318137944-A
Country: US
Kind code: A1
Filing date: Apr 21, 2023
Priority date: Apr 21, 2023
Publication date: Oct 24, 2024
Grant date: (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A technique uses an encoder system to produce an index of target item embeddings. Each target item embedding is input-agnostic and universal in the sense that different expressions of a target concept, produced using different combinations of input modes, map to the same target item embedding in the index. The encoder system throttles the amount of computations it performs based on the assessed capabilities of an execution platform. A retrieval system processes a multimodal input query by first generating a candidate set of target item embeddings in the index that match the input query, and then using a filtering operation to identify those target item embeddings that are most likely to match the input query. The encoder system and the retrieval system rely on language-based components having weights that are held constant during a training operation. Other weights of these systems are updated during the training operation.
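The two-stage retrieval described in the abstract (a candidate set first, then a filtering pass) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the toy two-dimensional embeddings, the use of cosine similarity for candidate generation, and `dot_score`, which is only a stand-in for the language-based filtering operation that the publication describes but does not specify in this excerpt.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def candidate_set(query_emb, index, k):
    """Stage 1: ids of the k target item embeddings closest to the query."""
    ranked = sorted(index,
                    key=lambda item_id: cosine(query_emb, index[item_id]),
                    reverse=True)
    return ranked[:k]

def filter_candidates(query_emb, index, candidates, score_fn, keep):
    """Stage 2: re-score only the candidates and keep the best `keep` ids."""
    rescored = sorted(candidates,
                      key=lambda item_id: score_fn(query_emb, index[item_id]),
                      reverse=True)
    return rescored[:keep]

def dot_score(a, b):
    """Hypothetical stand-in for the language-based filter's relevance score."""
    return sum(x * y for x, y in zip(a, b))

# Toy index: item id -> target item embedding (values invented for the example).
index = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
query = [1.0, 0.0]

candidates = candidate_set(query, index, k=2)
best = filter_candidates(query, index, candidates, dot_score, keep=1)
print(candidates, best)  # ['a', 'b'] ['a']
```

The point of the split is that the cheap similarity search narrows the index to a few candidates, so the more expensive filtering step only has to examine that short list.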

First claim

Opening claim text (preview).

What is claimed is:

1. A method for creating an index for item retrieval, comprising: receiving an input item, the input item having first content provided by a first input mode and second content provided by a second input mode, the second input mode differing from the first input mode; mapping, using an input-embedding system, the first content and the second content to an input-system embedding; mapping, using a language-based embedding-mapping system, the input-system embedding to a target item embedding that represents the input item; and storing the target item embedding in the index, the input-embedding system including weights that are updated by a training system during a training operation, and the language-based embedding-mapping system including language model weights that are held fixed during the training operation.

2. The method of claim 1, wherein the target item embedding in the index represents a particular target concept, and wherein plural expressions of the target concept, that have been generated using different input modes and different combinations of input modes, map to the same target item embedding.

3. The method of claim 1, wherein the first input mode and the second input mode are any two different input modes selected from a group that includes: a text input mode, an image input mode, an audio input mode, and a video input mode.

4. The method of claim 1, wherein the input item has third content provided by a third input mode that differs from the first input mode and the second input mode.

5. The method of claim 1, wherein the input-embedding system includes a first input-embedding subsystem and a second input-embedding subsystem, wherein the method includes: mapping, using the first input-embedding subsystem, the first content to a first input embedding; and mapping, using the second input-embedding subsystem, the second content to a second input embedding, the input-system embedding including the first input embedding and the second input embedding, and at least one of the input-embedding subsystems including weights that are updated by the training system during the training operation.

6. The method of claim 1, wherein the language-based embedding-mapping system operates by: mapping, using a language-based encoding operation, the input-system embedding to a first-stage embedding; and mapping, using an embedding conversion operation, the first-stage embedding to the target item embedding, in a vector space of the index, the embedding conversion operation using weights that are updated by the training system during the training operation, and the language-based encoding operation using language model weights that are held fixed during the training operation.

7. The method of claim 1, further comprising: assessing a processing capability of an execution platform; and setting an amount of processing operations to be performed by the language-based encoder system based on the processing capability.

8. The method of claim 1, further comprising, in retrieval operation: receiving an input query; mapping the input query to a query embedding using the input-embedding system and the language-based embedding-mapping system; and finding a candidate set of target item embeddings in the index that match the query embedding.

9. The method of claim 8, further comprising, in a language-based filtering operation, identifying one or more target item embeddings in the candidate set of target items embeddings that are most likely to match the input query.

10. The method of claim 9, wherein the language-based filtering operation comprises: receiving prompt information; and mapping, in a language-based encoding operation, the prompt information, the query embedding, and the candidate set of target item embeddings to output results, the output results identifying the one or more target item embeddings that are most likely to match the input query, the language-based encoding operation using language model weights that are held fixed during the training operation.

11. The method of claim 10, further comprising: mapping, in a preliminary mapping operation prior to the language-based encoding operation, the candidate set of target item embeddings to a transformed set target item embeddings in a vector space of the language-based encoding operation, the preliminary mapping operation using weights that are updated during the training operation.

12. The method of claim 9, wherein the language-based encoder system and/or the language-based filtering operation use transformer-based machine-trained logic.

13. A computing system for performing a retrieval operation, comprising: an instruction store for storing computer-readable instructions; an index store for storing target item embeddings produced by a language-based encoder system; a processing system for executing the computer-readable instructions to perform operations that include: receiving an input query; mapping the input query to a query embedding using the language-based encoder system; matching the query embedding against the target item embeddings in the index store, to identify a candidate set of target item embeddings; and identifying, in a language-based filtering operation, one or more target item embeddings in the candidate set of item target item embeddings that are most likely to match the input query, the language-based encoder system and the language-based filtering operation using language model weights that are held fixed during a training operation.

14. The computing system of claim 13, wherein a target item embedding in the index store represents a particular target concept, and wherein plural expressions of the target concept, that have been generated using different input modes and different combinations of input modes, map to the same target item embedding.

15. The computing system of claim 13, wherein the input query includes first content provided by a first input mode and second content provided by a second input mode, the second input mode differing from the first input mode.

16. The computing system of claim 15, wherein the first input mode and the second input mode are any two different input modes selected from a group that includes: a text input mode, an image input mode, an audio input mode, and a video input mode.

17. The computing system of claim 13, wherein the language-based filtering operation includes: receiving prompt information; and mapping, in a language-based encoding operation, the query embedding, the prompt information, and the candidate set of target item embeddings to output results, the output results identifying the one or more target item embeddings that are most likely to match the input query, the language-based encoding operation using language model weights that are held fixed during the training operation.

18. The computing system of claim 17, wherein the operations further include: mapping, in a preliminary mapping operation prior to the language-based encoding operation, the candidate set of target item embeddings to a transformed set target item embeddings in a vector space of the language-based encoding operation, the preliminary mapping operation using weights that are updated during the training operation.

19. The computing system of claim 13, wherein the language-based encoder system and/or the language-based filtering operation use transformer-based machine-trained logic.

20.
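The split in claim 1 between weights that are updated during training and language model weights that are held fixed can be sketched as follows. This is an illustrative toy, not the method of the publication: the 2x2 matrices, the element-wise sum used to combine the two input modes, the squared-error loss, and all function names are assumptions made so the example stays small and runnable.

```python
def matvec(W, v):
    """Multiply matrix W (sequence of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

# Stand-in for the language-based embedding-mapping system.
# Stored as tuples so the sketch cannot modify it: these weights stay fixed.
FROZEN_LM = ((0.5, 0.5), (-0.5, 0.5))

def input_system_embedding(W, first, second):
    """Trainable part: combine content from two input modes, then project with W."""
    combined = [a + b for a, b in zip(first, second)]  # assumed fusion rule
    return matvec(W, combined)

def target_item_embedding(W, first, second):
    """Full pipeline: trainable input embedding followed by the frozen mapping."""
    return matvec(FROZEN_LM, input_system_embedding(W, first, second))

def loss(W, first, second, target):
    """Assumed training objective: half the squared error against a target vector."""
    out = target_item_embedding(W, first, second)
    return 0.5 * sum((o - t) ** 2 for o, t in zip(out, target))

def train_step(W, first, second, target, lr=0.1):
    """One gradient step that updates W in place and never touches FROZEN_LM."""
    combined = [a + b for a, b in zip(first, second)]
    out = target_item_embedding(W, first, second)
    err = [o - t for o, t in zip(out, target)]
    # Gradient flows back through the frozen mapping: FROZEN_LM^T @ err.
    grad_in = [sum(FROZEN_LM[r][c] * err[r] for r in range(len(err)))
               for c in range(len(W))]
    for i in range(len(W)):
        for j in range(len(combined)):
            W[i][j] -= lr * grad_in[i] * combined[j]

# Toy usage: text and image features are invented values.
W = [[1.0, 0.0], [0.0, 1.0]]
text_feat, image_feat, target = [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]

before = loss(W, text_feat, image_feat, target)
train_step(W, text_feat, image_feat, target)
after = loss(W, text_feat, image_feat, target)

index = {"item-1": target_item_embedding(W, text_feat, image_feat)}
print(before > after, FROZEN_LM)
```

The gradient still passes through the frozen mapping, which is what lets the trainable input embedding adapt to it; only the update is restricted to `W`.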

Assignees

Microsoft Technology Licensing LLC

Classifications

  • using natural language analysis · CPC title

  • using vector based model · CPC title

  • G06F16/313 · Primary

    Selection or weighting of terms for indexing · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024354317A1 cover?
A technique uses an encoder system to produce an index of target item embeddings. Each target item embedding is input-agnostic and universal in the sense that different expressions of a target concept, produced using different combinations of input modes, map to the same target item embedding in the index. The encoder system throttles the amount of computations it performs based on the assessed…
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/3347. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 24, 2024 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).