US2023350936A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2023350936-A1 |
| Application number | US-202318141337-A |
| Country | US |
| Kind code | A1 |
| Filing date | Apr 28, 2023 |
| Priority date | Apr 28, 2022 |
| Publication date | Nov 2, 2023 |
| Grant date | — |
Official abstract text for this publication.
A query processing system is described which receives a query input comprising an input token string and also at least one data item having a second, different modality, and generates a corresponding output token string.
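As a rough illustration of the query input described in the abstract, the sketch below models a query as an ordered mix of text and non-token data items, then splits it into a token-only prompt and a list of data items. Everything named here is hypothetical: the `DataItem` and `build_prompt` names, the `<data_item>` marker string, and the whitespace split standing in for a real tokenizer do not come from the publication. Leaving a marker where each data item sat follows the marker items recited in claims 11 and 12.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# Hypothetical marker token; the publication does not name one.
MARKER = "<data_item>"


@dataclass
class DataItem:
    """A non-token input such as an image; the payload is left opaque."""
    payload: bytes


# A query input is an ordered mix of text segments and data items.
QueryPart = Union[str, DataItem]


def build_prompt(query: List[QueryPart]) -> Tuple[List[str], List[DataItem]]:
    """Split a query into a token-only prompt and its data items.

    Each data item is replaced in the prompt by MARKER, so the marker's
    position records where the item appeared in the query.
    """
    prompt: List[str] = []
    items: List[DataItem] = []
    for part in query:
        if isinstance(part, DataItem):
            prompt.append(MARKER)
            items.append(part)
        else:
            # Whitespace splitting is a stand-in for a real tokenizer.
            prompt.extend(part.split())
    return prompt, items


query = [DataItem(b"img-1"), "What is shown here?"]
prompt, items = build_prompt(query)
print(prompt)
```

In this sketch the data items would go to the modality network and the prompt to the token-processing stack; how the two are wired together is set out in the claims.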
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method of generating an output token string based on a query input comprising an input token string and one or more data items, the input token string and output token string being strings of tokens selected from a token vocabulary, and the data items being of a modality other than tokens selected from the token vocabulary, the method comprising: inputting each data item of the query input into a modality network trained, upon receiving a data item of the modality, to generate one or more compressed representations of each data item; generating a prompt input comprising the input token string of the query input; and inputting the prompt input to a data-item-token processing model having a plurality of processing layers arranged as a stack, the output token string being an output of the data-item-token processing model, the processing layers including a plurality of token processing layers and a plurality of gated cross-attention layers, each gated cross-attention layer being arranged to receive at least one of the compressed representations, the token processing layers being interleaved with the gated cross-attention layers.

2. The computer-implemented method of claim 1 in which the token processing layers are operative to provide together, in the absence of the gated cross-attention layers, a token string processing model, to receive input token strings and to generate corresponding output token strings.

3. The computer-implemented method of claim 1, comprising: generating an output token string based on a query input; and at least once performing the set of steps of: based on the query input and the output token string, forming a new query input; and generating a new output token string based on the new query input.

4. A computer-implemented method of training a query processing system, the query processing system being for generating an output token string based on a query input comprising an input token string and one or more data items, the input token string and output token string being strings of tokens selected from a token vocabulary, and the data items being of a modality other than tokens selected from the token vocabulary, the method employing a token processing model comprising a stack of token processing layers, the stack of token processing layer being configured to receive input token strings and to generate corresponding output token strings, and a database of training examples, each training example comprising at least one data item and at least one token string; the method comprising: forming a data-item-token processing model by interleaving token processing layers from a token processing model with gated cross-attention layers, the data-item-token processing model being configured to generate an output token string upon receiving a prompt input which is a token string, the token processing model comprising a stack of the token processing layers, the stack of token processing layers being configured to receive input token strings and to generate corresponding output token strings, and a database of training examples, each training example comprising at least one data item and at least one token string; forming the query processing system, the query processing system comprising: (a) a modality network configured to receive the data items of the query input, to generate one or more compressed representations of each data item; and (b) the data-item-token processing model, the data-item-token processing model being configured to receive a prompt input comprising the input token string of the query input, and each gated cross-attention layer being arranged to receive at least one of the compressed representations; and using the training database, training: the modality network, and the plurality of gated cross-attention layers.

5. The computer-implemented method of claim 4 in which the training trains the query processing system, upon an encoder of the modality network receiving the at least one data item of any of the training examples, and the data-item-token processing model receiving a prompt input comprising a first portion of the token string of the training example, to generate an output of the query processing system which is positively statistically correlated with a subsequent portion of the token string of the training example.

6. The computer-implemented method of claim 4 in which the modality network comprises: an encoder configured to encode a data item received by the encoder to generate an encoded data item, and a compressed representation generation system arranged to receive the encoded data item and generate an output, the output of the modality network being based on the output of the compressed representation generation system.

7. The computer-implemented method of claim 6, in which the encoder has been trained to encode a data item received by the encoder to generate an encoded data item, and the training of the modality network and the plurality of gated cross-attention layers comprises training the compressed representation generation system without further training the encoder.

8. The computer-implemented method of claim 6, in which the compressed representation generation system comprises a stack of one or more resampler layers, each resampler layer being adapted to perform an attention operation which employs a key vector, a value vector and a query vector, a subset of the key vector, value vector and query vector being based on the encoded data item, and the remainder of the key vector, value vector and query vector being based on either an output of the preceding one of the resampler layers or, in the case of the first resampler layer of the stack, a set of input latent values, the output of the modality network being based on an output of the last resampler layer of the stack of resampler layers.

9. The computer-implemented method of claim 8 in which the key vector and value vector of each resampler layer are based on the encoded data item and a latent input which is either the output of the preceding one of the resampler layers or, in the case of the first resampler layer of the stack, the set of input latent values, and the query vector is based on the latent input.

10. The computer-implemented method of claim 8 in which each resampler layer further comprises a perceptron arranged to receive the output of the attention operation, and to generate an output, the output of the modality network being based on the output of the perceptron of the last resampler layer of the stack.

11. The computer-implemented method of claim 4, in which the prompt input further comprises one or more corresponding marker items for each data item in the query input, the one or more marker items being indicative of the presence of the data item in the query input.

12. The computer-implemented method of claim 11 in which a position of each marker item in the prompt input is indicative of a position of the corresponding data item in the query input.

13. The computer-implemented method of claim 4 in which each gated cross-attention layer generates its output as a component-wise sum of: a first input which is the output of the preceding processing layer in the stack of processing layers or, in the case that the gated cross-attention layer is the first processing layer of the stack of processing layers, the prompt input, and an interaction term based on the output of the compressed representation generation system received by the gated cross-attention layer, and at least part of the first input to the gated cross-attention lay
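The data flow recited in claims 1, 9 and 13 can be pictured with a small pure-Python sketch. It is illustrative only, not the patented implementation, and rests on several assumptions: there are no learned projection matrices, the perceptron of claim 10 is omitted, `token_layer` is a hypothetical stand-in for a pretrained token processing layer, and the `tanh(alpha)` gate is assumed because the claim preview is cut off before it says how the interaction term is gated. All function and variable names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]  # one row per vector


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries: Matrix, keys: Matrix, values: Matrix) -> Matrix:
    """Scaled dot-product attention without learned projections."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        weights = softmax(
            [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        )
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(len(values[0]))])
    return out


def resampler_layer(encoded: Matrix, latents: Matrix) -> Matrix:
    """Claim 9 pattern: keys and values come from the encoded item plus
    the latent input; queries come from the latent input alone."""
    context = encoded + latents
    return attend(latents, context, context)


def gated_cross_attention(x: Matrix, compressed: Matrix, alpha: float) -> Matrix:
    """Claim 13 pattern: component-wise sum of the first input and an
    interaction term. The tanh gate is an assumption, not source text."""
    gate = math.tanh(alpha)
    interaction = attend(x, compressed, compressed)
    return [[xi + gate * ii for xi, ii in zip(x_row, i_row)]
            for x_row, i_row in zip(x, interaction)]


def token_layer(x: Matrix) -> Matrix:
    """Hypothetical stand-in for a pretrained token processing layer."""
    return [[0.5 * v for v in row] for row in x]


def run_stack(x: Matrix, compressed: Matrix, alphas: List[float]) -> Matrix:
    """Interleave one gated cross-attention layer before each token layer."""
    for alpha in alphas:
        x = gated_cross_attention(x, compressed, alpha)
        x = token_layer(x)
    return x


# Five encoded vectors are compressed to two, the number of latents.
encoded = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
latents = [[0.1, 0.2], [0.3, 0.4]]
compressed = resampler_layer(encoded, latents)

prompt_embeddings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
gated_off = run_stack(prompt_embeddings, compressed, alphas=[0.0, 0.0])
gated_on = run_stack(prompt_embeddings, compressed, alphas=[1.0, 1.0])
print(len(compressed), gated_off == gated_on)
```

Two properties of the sketch mirror the claims. The resampler returns as many vectors as there are latents regardless of how many encoded vectors it receives, which is what makes the representation compressed. With every gate at zero, the stack gives the same result as the token layers alone, in line with claim 2's statement that the token processing layers form a token string processing model in the absence of the gated cross-attention layers.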
Presentation of query results · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Learning methods · CPC title
Combinations of networks · CPC title
Feedforward networks · CPC title