Identifying content items in response to a text-based request

US11841897B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11841897-B2
Application number: US-202218148386-A
Country: US
Kind code: B2
Filing date: Dec 29, 2022
Priority date: Aug 20, 2020
Publication date: Dec 12, 2023
Grant date: Dec 12, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for responding to a subscriber's text-based request for content items are presented. In response to a request from a subscriber, word pieces are generated from the text-based terms of the request. A request embedding vector of the word pieces is obtained from a trained machine learning model. Using the request embedding vector, a set of content items, from a corpus of content items, is identified. At least some content items of the set of content items are returned to the subscriber in response to the text-based request for content items.
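
The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the word-piece vocabulary, the toy 3-dimensional vectors, the item names, and the function names are all hypothetical, and a plain lookup table with averaging stands in for the trained machine learning model.

```python
import math

# Hypothetical word-piece vocabulary with toy 3-d embeddings (illustrative values only).
# In the abstract, the request embedding comes from a trained model; this table is a stand-in.
PIECE_VECTORS = {
    "sun": [0.9, 0.1, 0.0],
    "set": [0.7, 0.3, 0.0],
    "beach": [0.6, 0.0, 0.4],
    "cat": [0.0, 0.9, 0.1],
    "s": [0.0, 0.0, 0.0],
}

# Hypothetical non-text items (e.g. images) already embedded in the same space.
ITEM_VECTORS = {
    "img_sunset_beach": [0.8, 0.1, 0.1],
    "img_kitten": [0.0, 1.0, 0.1],
    "img_city": [0.1, 0.1, 0.9],
}


def word_pieces(request):
    """Greedy longest-match split of each term into known word pieces."""
    pieces = []
    for term in request.lower().split():
        start = 0
        while start < len(term):
            for end in range(len(term), start, -1):
                if term[start:end] in PIECE_VECTORS:
                    pieces.append(term[start:end])
                    start = end
                    break
            else:
                start += 1  # skip a character that no known piece covers
    return pieces


def request_embedding(pieces):
    """Combine word-piece vectors by simple averaging (one possible combination)."""
    dim = len(next(iter(PIECE_VECTORS.values())))
    if not pieces:
        return [0.0] * dim
    return [sum(PIECE_VECTORS[p][i] for p in pieces) / len(pieces) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_items(request, k=2):
    """Return the k item names whose embeddings are most similar to the request."""
    query = request_embedding(word_pieces(request))
    ranked = sorted(ITEM_VECTORS, key=lambda name: cosine(query, ITEM_VECTORS[name]), reverse=True)
    return ranked[:k]
```

With these toy values, `find_items("sunset beach", k=1)` ranks the hypothetical `img_sunset_beach` item first, because the averaged word-piece vector points in nearly the same direction as that item's embedding.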

First claim

Opening claim text (preview).

What is claimed:

1. A non-transitory computer-readable medium bearing computer executable instructions which, when executed on a computing system comprising at least a processor executing the computer executable instructions, carries out a method, comprising: maintaining a corpus of non-text content items comprising non-text content items, wherein each non-text content item of the corpus is associated with an embedding vector that projects the non-text content item into a non-text content item embedding space; receiving a text-based request for content items of the corpus of non-text content items; processing the text-based request to generate a request embedding vector for the text-based request that projects the request embedding vector into the non-text content item embedding space; determining a first non-text content item of the corpus of non-text content items according to a projection of the request embedding vector into the non-text content item embedding space; and returning the first non-text content item in response to the text-based request for content items.

2. The non-transitory computer-readable medium of claim 1, wherein processing the text-based request to generate the request embedding vector is based at least in part on a plurality of word pieces determined from the text-based request.

3. The non-transitory computer-readable medium of claim 2, wherein the plurality of word pieces are determined based at least in part on at least one of a morphological analysis of the text-based request or a byte pair encoding technique analysis of the text-based request.

4. The non-transitory computer-readable medium of claim 2, wherein processing the text-based request to generate the request embedding vector includes: determining a plurality of word piece embedding vectors corresponding to the plurality of word pieces, each of the plurality of word piece embedding vectors projecting a respective word piece of the plurality of word pieces into the non-text content embedding space; and combining the plurality of word piece embedding vectors to generate the request embedding vector.

5. The non-transitory computer-readable medium of claim 1, wherein determining the first non-text content item of the corpus of non-text content items is based at least in part on a distance between the request embedding vector and the embedding vector associated with the first non-text content item in the non-text content item embedding space.

6. A computer-implemented method, comprising: generating, for a text-based request for content items, a request embedding vector that is representative of the text-based request and projects the request embedding vector into a non-text content item embedding space; determining, based at least in part on the projection of the request embedding vector into the non-text content embedding space, a non-text content item from a plurality of non-text content items, each of the plurality of non-text content items associated with a respective content item embedding vector that projects the plurality of non-text content items into the non-text content item embedding space; and providing the non-text content item in response to the text-based request for content items.

7. The computer-implemented method of claim 6, wherein generating the request embedding vector is based at least in part on a plurality of word pieces determined from the text-based request.

8. The computer-implemented method of claim 7, wherein the plurality of word pieces are determined based at least in part on at least one of a morphological analysis of the text-based request or a byte pair encoding technique analysis of the text-based request.

9. The computer-implemented method of claim 7, wherein: determining the plurality of word pieces includes: processing the text-based request to determine a set of text terms included in the text-based request; and performing a morphological analysis of the set of text terms to determine the plurality of word pieces; and each of the plurality of word pieces includes a respective morpheme.

10. The computer implemented method of claim 7, wherein each of the plurality of word pieces corresponds to one more word parts of text-based terms of the text-based request and includes one of a prefix of one of the text-based term, a suffix of one of the text-based term, or a root of one of the text-based term.

11. The computer-implemented method of claim 7, wherein generating the request embedding vector includes: determining a plurality of word piece embedding vectors corresponding to the plurality of word pieces, each of the plurality of word piece embedding vectors projecting a respective word piece of the plurality of word pieces into the non-text content embedding space; determining a plurality of weightings corresponding to the plurality of word pieces; and combining the plurality of word piece embedding vectors in accordance with the plurality of weightings to generate the request embedding vector.

12. The computer-implemented method of claim 11, further comprising: conducting a semantic analysis of the text-based request to determine at least one of a topic associated with the text-based request or an intent associated with the text-based request, wherein the plurality of weightings is determined at least in part on at least one of the topic or the intent associated with the text-based request.

13. The computer-implemented method of claim 6, wherein determining the non-text content item from the plurality of non-text content items is based at least in part on a distance between the request embedding vector and the respective content item embedding vector associated with the non-text content item in the non-text content item embedding space based on a cosine similarity of the request embedding vector and the respective content item embedding vector associated with the non-text content item.

14. The computer-implemented method of claim 6, further comprising: identifying a closest content item from the plurality of non-text content items that is closest to the request embedding vector in the non-text content item embedding space; conducting a random walk originating from the closest content item in a content item graph representing relationships between the plurality of non-text content items; determining, based at least in part on the random walk, a second plurality of non-text content items from the plurality of non-text content items; and providing at least a portion of the second plurality of non-text content items in response to the text-based request for content items.

15. A computing system, comprising: one or more processors; and a memory including program instructions that, when executed by the one or more processors, cause the one or more processors to at least: maintain a content item graph for a corpus of non-text content items, wherein each non-text content item of the corpus of non-text content items is associated with a respective embedding vector that projects the corpus of non-text content items into a non-text content item embedding space; generate, for a text-based request for content items using a trained machine learning model configured to generate output embedding vectors that project text-based inputs into the non-text content item embedding space, a request embedding vector that is representative of the text-based request and projects the request embedding vector into the non-text content item embedding space; determine, based at least in part on the projection of the request embedding vector into the non-text content embedding space, a non-t
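
Two steps recited in the claims preview, the weighted combination of word-piece vectors (claim 11) and the random walk from the closest item in a content item graph (claim 14), might look something like the sketch below. Everything named here is an assumption for illustration: the graph contents, the function names, the step count, and in particular the restart probability, since the claim text only says "a random walk" and does not specify how it is conducted or how items are selected from it.

```python
import random
from collections import Counter

# Hypothetical content item graph: item -> related items (illustrative only).
ITEM_GRAPH = {
    "img_a": ["img_b", "img_c"],
    "img_b": ["img_a", "img_c"],
    "img_c": ["img_a", "img_b", "img_d"],
    "img_d": ["img_c"],
    "img_e": [],
}


def weighted_combine(vectors, weights):
    """Weighted average of word-piece vectors (weights need not sum to 1)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) / total for i in range(dim)]


def random_walk_items(graph, start, steps=1000, restart_prob=0.15, top_k=3, seed=0):
    """Random walk with restart from `start`; rank other items by visit count.

    The restart step is an assumption of this sketch, not something the claims state.
    """
    rng = random.Random(seed)  # seeded so repeated calls give the same ranking
    visits = Counter()
    current = start
    for _ in range(steps):
        neighbors = graph.get(current, [])
        if not neighbors or rng.random() < restart_prob:
            current = start  # dead end or restart: jump back to the origin
        else:
            current = rng.choice(neighbors)
        if current != start:
            visits[current] += 1
    return [item for item, _ in visits.most_common(top_k)]
```

Here `start` would be the item found closest to the request embedding. Counting visits is one plausible way to turn a walk into the "second plurality" of items; items unreachable from the start, such as the hypothetical `img_e`, never appear in the result.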

Assignees

Inventors

Classifications

  • G06F16/483 · Primary

    using metadata automatically derived from the content · CPC title

  • Spatial browsing, e.g. 2D maps, 3D or virtual spaces · CPC title

  • Clustering; Classification · CPC title

  • Machine learning · CPC title

  • Natural language generation · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11841897B2 cover?
Systems and methods for responding to a subscriber's text-based request for content items are presented. In response to a request from a subscriber, word pieces are generated from the text-based terms of the request. A request embedding vector of the word pieces is obtained from a trained machine learning model. Using the request embedding vector, a set of content items, from a corpus of conten…
Who is the assignee on this patent?
Pinterest Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/483. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 12, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).