Identifying content items in response to a text-based request

US2025307306A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2025307306-A1
Application number: US-202519234699-A
Country: US
Kind code: A1
Filing date: Jun 11, 2025
Priority date: Aug 20, 2020
Publication date: Oct 2, 2025
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for responding to a subscriber's text-based request for content items are presented. In response to a request from a subscriber, word pieces are generated from the text-based terms of the request. A request embedding vector of the word pieces is obtained from a trained machine learning model. Using the request embedding vector, a set of content items, from a corpus of content items, is identified. At least some content items of the set of content items are returned to the subscriber in response to the text-based request for content items.
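
The request-time flow in the abstract (word pieces, then a request embedding vector, then lookup of content items) can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the toy vocabulary `PIECE_VECTORS`, the greedy longest-match splitting, the `##` continuation prefix, the `ITEM_VECTORS` corpus, and the use of cosine similarity are all assumptions made for the example. In particular, the abstract obtains the request embedding from a trained machine learning model; the plain average used here (the operation named in claims 3 and 10 for the representative embedding vector) is only a stand-in for that model.

```python
import math

# Hypothetical toy vocabulary: word piece -> 2-D embedding vector.
# A real system would use a trained model; this table is a stand-in.
PIECE_VECTORS = {
    "red": [1.0, 0.0],
    "shoe": [0.0, 1.0],
    "##s": [0.0, 0.2],
    "[UNK]": [0.0, 0.0],
}

# Hypothetical corpus: content item id -> embedding in the same space.
ITEM_VECTORS = {
    "red_sneaker": [0.6, 0.8],
    "blue_hat": [-1.0, 0.1],
    "red_boot": [0.9, 0.5],
}

def word_pieces(text):
    """Greedy longest-match split of each term into known word pieces."""
    pieces = []
    for term in text.lower().split():
        start = 0
        while start < len(term):
            end = len(term)
            while end > start:
                chunk = term[start:end]
                candidate = chunk if start == 0 else "##" + chunk
                if candidate in PIECE_VECTORS:
                    break
                end -= 1
            if end == start:  # no known piece starts at this position
                pieces.append("[UNK]")
                break
            pieces.append(candidate)
            start = end
    return pieces

def request_embedding(text):
    """Average the word piece vectors (stand-in for the trained model)."""
    vectors = [PIECE_VECTORS[p] for p in word_pieces(text)]
    if not vectors:
        return [0.0, 0.0]
    dims = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dims)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_items(text, item_vectors, k=2):
    """Return the k item ids whose vectors are most similar to the request."""
    query = request_embedding(text)
    ranked = sorted(item_vectors,
                    key=lambda item: cosine(query, item_vectors[item]),
                    reverse=True)
    return ranked[:k]

print(word_pieces("red shoes"))              # ['red', 'shoe', '##s']
print(top_items("red shoes", ITEM_VECTORS))  # ['red_sneaker', 'red_boot']
```

A production system would more likely replace the exhaustive sort with an approximate nearest-neighbor index, since scoring every item in a large corpus per request is expensive; the source does not say which lookup method is used.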

First claim

Opening claim text (preview).

What is claimed:

1. A method comprising: generating a collection of training data comprising: obtaining a plurality of text-based requests and a plurality of content items, each text-based request of the plurality of text-based requests being associated with one or more content items of the plurality of content items; and for each text-based request: generating a representative embedding vector for the text-based request; projecting the one or more content items associated with the text-based request into a content item embedding space; clustering the projected one or more content items into a neighborhood of positive representations of the text-based request; generating an instance of positive training data comprising the text-based request, the representative embedding vector, and cluster data; and generating one or more instances of negative training data comprising an embedding vector projected into the content item embedding space outside of the cluster; training an embedding vector generator using the collection of training data, the embedding vector generator configured to generate embedding vectors into a content item embedding space for a text-based request; and generating, in response to a first text-based request received from a user, an embedding vector for the first text-based request in the content item embedding space.

2. The method of claim 1, wherein generating the representative embedding vector for each text-based request further comprises: generating a set of one or more word pieces from the text-based request; generating one or more word piece embedding vectors, each word piece embedding vector corresponding to a respective word piece of the one or more word pieces; and generating the representative embedding vector from the one or more word piece embedding vectors.

3. The method of claim 2, wherein generating the representative embedding vector from the one or more word piece embedding vectors comprises averaging the one or more word piece embedding vectors.

4. The method of claim 1, wherein the cluster data comprise a centroid for the cluster and dimensional information of the cluster.

5. The method of claim 1, wherein obtaining the collection of the plurality of text-based requests and the plurality of content items comprises: obtaining a collection of text-based request and content item pairs, each pair corresponding to a text-based request by a user and a content item with which the user interacted; and aggregating the collection of text-based request and content item pairs so that each text-based request is associated with one or more of the content items.

6. The method of claim 1, further comprising: pre-generating a collection of embedding vectors from a collection of text-based requests, the pre-generating comprising: for each text-based request of the collection of text-based requests: generating one or more word pieces for the text-based request; generating a representative embedding vector for the text-based request from the word pieces; generating a request embedding vector that projects the representative embedding vector into the content item embedding space; and storing the text-based request and the generated request embedding vector.

7. The method of claim 1, further comprising: maintaining a plurality of content items, wherein the plurality of content items are associated with a plurality of content item embedding vectors that project the plurality of content items into a content item embedding space; in response to a text-based query, projecting a request embedding vector generated from the text-based query into the content item embedding space; determining, based at least on the plurality of content item embedding vectors and the request embedding vector, a content item from the plurality of content items; and providing the content item in response to the text-based request.

8. A system comprising: one or more processors; and a memory storing program instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising: generating a collection of training data comprising: obtaining a plurality of text-based requests and a plurality of content items, each text-based request of the plurality of text-based requests being associated with one or more content items of the plurality of content items; and for each text-based request: generating a representative embedding vector for the text-based request; projecting the one or more content items associated with the text-based request into a content item embedding space; clustering the projected one or more content items into a neighborhood of positive representations of the text-based request; generating an instance of positive training data comprising the text-based request, the representative embedding vector, and cluster data; and generating one or more instances of negative training data comprising an embedding vector projected into the content item embedding space outside of the cluster; training an embedding vector generator using the collection of training data, the embedding vector generator configured to generate embedding vectors into a content item embedding space for a text-based request; and generating, in response to a first text-based request received from a user, an embedding vector for the first text-based request in the content item embedding space.

9. The system of claim 8, wherein generating the representative embedding vector for each text-based request further comprises: generating a set of one or more word pieces from the text-based request; generating one or more word piece embedding vectors, each word piece embedding vector corresponding to a respective word piece of the one or more word pieces; and generating the representative embedding vector from the one or more word piece embedding vectors.

10. The system of claim 9, wherein generating the representative embedding vector from the one or more word piece embedding vectors comprises averaging the one or more word piece embedding vectors.

11. The system of claim 8, wherein the cluster data comprise a centroid for the cluster and dimensional information of the cluster.

12. The system of claim 8, wherein obtaining the collection of the plurality of text-based requests and the plurality of content items comprises: obtaining a collection of text-based request and content item pairs, each pair corresponding to a text-based request by a user and a content item with which the user interacted; and aggregating the collection of text-based request and content item pairs so that each text-based request is associated with one or more of the content items.

13. The system of claim 8, wherein the program instructions further include instructions that, when executed by the one or more processors, further cause the one or more processors to perform operations comprising: pre-generating a collection of embedding vectors from a collection of text-based requests, the pre-generating comprising: for each text-based request of the collection of text-based requests: generating one or more word pieces for the text-based request; generating a representative embedding vector for the text-based request from the word pieces; generating a request embedding vector that projects the representative embedding vector into the content item embedding space; and storing the text-based request and the generated request embedding vector.

14. The system of claim 8, wherein the program instructions further include instructi
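
The training-data steps recited in the independent claims (with the pair aggregation of claims 5 and 12 and the centroid of claims 4 and 11) might be sketched along these lines. It is an illustration under stated assumptions, not the claimed method itself: every function and field name is hypothetical, each request's items are treated as a single cluster, the "dimensional information" is reduced to one radius (the largest centroid-to-member distance), and negatives are drawn from other requests' items that fall outside that radius. The claims do not specify any of these choices.

```python
import math
from collections import defaultdict

def aggregate_pairs(pairs):
    """Group (request, item_id) interaction pairs by request text."""
    grouped = defaultdict(list)
    for request, item_id in pairs:
        if item_id not in grouped[request]:  # drop repeated interactions
            grouped[request].append(item_id)
    return dict(grouped)

def cluster_data(vectors):
    """Summarize item vectors as a centroid plus a radius (assumed form)."""
    dims = len(vectors[0])
    centroid = [sum(v[d] for v in vectors) / len(vectors) for d in range(dims)]
    radius = max(math.dist(centroid, v) for v in vectors)
    return {"centroid": centroid, "radius": radius}

def build_training_data(pairs, item_vectors, request_vectors):
    """Build positive and negative training instances per request."""
    positives, negatives = [], []
    for request, item_ids in aggregate_pairs(pairs).items():
        cluster = cluster_data([item_vectors[i] for i in item_ids])
        positives.append({
            "request": request,
            "representative": request_vectors[request],
            "cluster": cluster,
        })
        # Negatives: vectors of unrelated items lying outside the cluster.
        for item_id, vector in item_vectors.items():
            outside = math.dist(cluster["centroid"], vector) > cluster["radius"]
            if item_id not in item_ids and outside:
                negatives.append({"request": request, "negative": vector})
    return positives, negatives

# Hypothetical sample data, made up for this example only.
PAIRS = [("red shoes", "a"), ("red shoes", "b"),
         ("blue hat", "c"), ("red shoes", "a")]
ITEM_VECTORS = {"a": [1.0, 0.0], "b": [1.0, 2.0], "c": [-3.0, 1.0]}
REQUEST_VECTORS = {"red shoes": [0.3, 0.4], "blue hat": [-0.5, 0.1]}

positives, negatives = build_training_data(PAIRS, ITEM_VECTORS, REQUEST_VECTORS)
print(positives[0]["cluster"])  # {'centroid': [1.0, 1.0], 'radius': 1.0}
print(len(negatives))           # 3
```

The resulting instances would then feed whatever loss the embedding vector generator is trained with; the claim preview does not name one, so none is shown here.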

Assignees

Pinterest Inc

Inventors

Classifications

  • Clustering; Classification · CPC title

  • Spatial browsing, e.g. 2D maps, 3D or virtual spaces · CPC title

  • Natural language generation · CPC title

  • Machine learning · CPC title

  • Non-supervised learning, e.g. competitive learning · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025307306A1 cover?
Systems and methods for responding to a subscriber's text-based request for content items are presented. In response to a request from a subscriber, word pieces are generated from the text-based terms of the request. A request embedding vector of the word pieces is obtained from a trained machine learning model. Using the request embedding vector, a set of content items, from a corpus of conten…
Who is the assignee on this patent?
Pinterest Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/483. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 2, 2025 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).