Search platform for unstructured interaction summaries

US11620319B2 · US · B2

Patent metadata
Publication number: US-11620319-B2
Application number: US-202117319940-A
Country: US
Kind code: B2
Filing date: May 13, 2021
Priority date: May 13, 2021
Publication date: Apr 4, 2023
Grant date: Apr 4, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, and computer program products for search platforms for unstructured interaction summaries. An application executing on a processor may receive a query comprising a term. The application may generate, based on an embedding vector and the term, an expanded query comprising a plurality of additional terms. The application may generate, based on a term frequency inverse document frequency model, a vector for the expanded query and generate an entity vector for the query. The application may generate a combined vector for the query based on the entity vector and the vector for the expanded query. The application may compute, based on the combined vector for the query and a feature matrix of a corpus, a respective cosine similarity score for a plurality of results in the corpus. The application may return one or more of the plurality of results as responsive to the query based on the similarity scores.
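The scoring flow in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the whitespace tokenizer, the smoothed IDF formula, the token-level entity matching, and the sample summaries and entity names are all hypothetical, and the query is assumed to have been expanded already.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for the sketch.
    return text.lower().split()


def build_tfidf(corpus_tokens):
    """Return (vocabulary, idf weights) for a tokenized corpus."""
    vocab = sorted({t for doc in corpus_tokens for t in doc})
    n_docs = len(corpus_tokens)
    doc_freq = Counter(t for doc in corpus_tokens for t in set(doc))
    # Assumption: smoothed IDF; the source does not specify a weighting scheme.
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0 for t in vocab}
    return vocab, idf


def tfidf_vector(tokens, vocab, idf):
    # One feature per vocabulary term: raw count times IDF weight.
    counts = Counter(tokens)
    return [counts[t] * idf[t] for t in vocab]


def entity_vector(tokens, entities):
    # One entry per known entity: 1.0 if present, 0.0 otherwise.
    present = set(tokens)
    return [1.0 if e in present else 0.0 for e in entities]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(expanded_query, summaries, entities, top_k=2):
    docs = [tokenize(s) for s in summaries]
    vocab, idf = build_tfidf(docs)
    # Feature matrix: TF-IDF features concatenated with entity indicators.
    feature_matrix = [
        tfidf_vector(d, vocab, idf) + entity_vector(d, entities) for d in docs
    ]
    # Combined query vector built the same way as each corpus row.
    query_vec = tfidf_vector(expanded_query, vocab, idf) + entity_vector(
        expanded_query, entities
    )
    scores = [cosine(query_vec, row) for row in feature_matrix]
    ranked = sorted(range(len(summaries)), key=lambda i: scores[i], reverse=True)
    return [(summaries[i], scores[i]) for i in ranked[:top_k]]


# Hypothetical interaction summaries and entities, for illustration only.
summaries = [
    "customer disputed a card charge",
    "customer asked about mortgage rates",
    "card was lost and replaced",
]
entities = ["card", "mortgage"]
results = search(["charge", "dispute", "card"], summaries, entities)
```

Concatenating the term features and the entity indicators before scoring means a single cosine computation accounts for both lexical overlap and shared entities, which is one plausible reading of the "combined vector" in the abstract.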

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: generating, by an application executing on a processor, a first vector for each of a plurality of text summaries in a corpus, wherein the first vector represents each term in the respective text summary as a respective feature of a plurality of features; generating, by the application, a second vector for the plurality of text summaries, wherein the second vector indicates whether each of a plurality of entities is present in the respective text summary; combining, by the application, the first vector and the second vector to produce a feature matrix for the corpus; receiving, by the application, a query comprising a term; generating, by the application based on an embedding vector and the term, an expanded query comprising a plurality of additional terms and the term; generating, by the application based on a term frequency-inverse document frequency (TF-IDF) model, a vector for the expanded query; generating, by the application, an entity vector for the query; generating, by the application, a combined vector for the query based on the entity vector and the vector for the expanded query; computing, by the application based on the combined vector for the query and the feature matrix for the corpus, a respective cosine similarity score for a plurality of results in the corpus; and returning, by the application, one or more of the plurality of results as responsive to the query based on the cosine similarity scores.

2. A non-transitory computer-readable storage medium, the computer-readable storage medium storing instructions that when executed by a processor, cause the processor to: generate, by an application executing on the processor, a first vector for each of a plurality of text summaries in a corpus, wherein the first vector represents each term in the respective text summary as a respective feature of a plurality of features; generate, by the application, a second vector for the plurality of text summaries, wherein the second vector indicates whether each of a plurality of entities is present in the respective text summary; combine, by the application, the first vector and the second vector to produce a feature matrix for the corpus; receive, by the application, a query comprising a term; generate, by the application based on an embedding vector and the term, an expanded query comprising a plurality of additional terms and the term; generate, by the application based on a term frequency-inverse document frequency (TF-IDF) model, a vector for the expanded query; generate, by the application, an entity vector for the query; generate, by the application, a combined vector for the query based on the entity vector and the vector for the expanded query; compute, by the application based on the combined vector for the query and the feature matrix for the corpus, a respective cosine similarity score for a plurality of results in the corpus; and return, by the application, one or more of the plurality of results as responsive to the query based on the cosine similarity scores.

3. A computing apparatus comprising: a processor; and a memory storing instructions that, when executed by the processor, cause the processor to: generate, by an application executing on the processor, a first vector for each of a plurality of text summaries in a corpus, wherein the first vector represents each term in the respective text summary as a respective feature of a plurality of features; generate, by the application, a second vector for the plurality of text summaries, wherein the second vector indicates whether each of a plurality of entities is present in the respective text summary; combine, by the application, the first vector and the second vector to produce a feature matrix for the corpus; receive, by the application, a query comprising a term; generate, by the application based on an embedding vector and the term, an expanded query comprising a plurality of additional terms and the term; generate, by the application based on a term frequency-inverse document frequency (TF-IDF) model, a vector for the expanded query; generate, by the application, an entity vector for the query; generate, by the application, a combined vector for the query based on the entity vector and the vector for the expanded query; compute, by the application based on the combined vector for the query and the feature matrix for the corpus, a respective cosine similarity score for a plurality of results in the corpus; and return, by the application, one or more of the plurality of results as responsive to the query based on the cosine similarity scores.

4. The computer-implemented method of claim 1, wherein generating the entity vector comprises: identifying, by the application, a first entity of the plurality of entities in the corpus; and storing, by the application in the entity vector for the query, an indication that the query is associated with the first entity of the plurality of entities in the corpus.

5. The computer-implemented method of claim 1, wherein generating the expanded query comprises: identifying, by the application based on the embedding vector and the term, a respective score for each of the plurality of additional terms; determining, by the application, a subset of the plurality of additional terms for which the score exceeds an expansion threshold; and adding, by the application, the subset of the plurality of additional terms having the score exceeding the expansion threshold to the query.

6. The computer-implemented method of claim 1, wherein the combined vector for the query comprises a plurality of features, the method further comprising: receiving, by the application, input labeling a first feature of the plurality of features as relevant to the query; receiving, by the application, input labeling a second feature of the plurality of features as not relevant to the query; removing, by the application, the second feature from the combined vector for the query; and updating, by the application, the combined vector based on the remaining plurality of features and a respective weight for each remaining feature.

7. The computer-implemented method of claim 1, wherein the cosine similarity scores are computed based on a product of the combined vector for the query and the feature matrix of the corpus.

8. The computer-implemented method of claim 1, wherein the embedding vector comprises a plurality of entries, wherein each entry of the embedding vector is associated with a respective one of the additional terms and comprises a respective score for the additional term, wherein each score is based on a similarity between the respective additional term and the term, wherein the entity vector comprises a plurality of entries, wherein each entry of the entity vector is associated with a respective entity of the plurality of entities, wherein a value of the respective entry of the entity vector indicates that the respective entity is present in the query or that the respective entity is not present in the query.

9. The computer-readable storage medium of claim 2, wherein the instructions to generate the entity vector comprise instructions that when executed by the processor cause the processor to: identify, by the application, a first entity of the plurality of entities in the corpus; and store, by the application in the entity vector for the query, an indication that the query is associated with the first entity of the plurality of entities in the corpus.

10. The computer-readable storage medium of claim 2, wherein the instructions to generate the expanded query comprises instruction…
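The query expansion of claim 5 and the relevance feedback of claim 6 lend themselves to a short illustration. The sketch below is a hedged example rather than the claimed implementation: the function names, the dictionary representation of the embedding scores and feature weights, the threshold value of 0.6, and all sample terms and scores are hypothetical.

```python
def expand_query(term, similarity_scores, threshold=0.6):
    """Return the original term plus every additional term whose
    similarity score exceeds the expansion threshold (claim 5)."""
    extra = [
        candidate
        for candidate, score in similarity_scores.items()
        if score > threshold and candidate != term
    ]
    # Highest-scoring additions first; the ordering is an assumption.
    extra.sort(key=lambda candidate: similarity_scores[candidate], reverse=True)
    return [term] + extra


def apply_feedback(query_features, not_relevant):
    """Remove features labelled not relevant and keep the weights of the
    remaining features (claim 6)."""
    return {
        feature: weight
        for feature, weight in query_features.items()
        if feature not in not_relevant
    }


# Hypothetical embedding scores: similarity of each candidate to "dispute".
scores = {"fraud": 0.82, "chargeback": 0.71, "weather": 0.12}
expanded = expand_query("dispute", scores, threshold=0.6)
# expanded == ["dispute", "fraud", "chargeback"]

# Hypothetical feedback: the user marks "chargeback" as not relevant.
weights = {"dispute": 1.0, "fraud": 0.82, "chargeback": 0.71}
updated = apply_feedback(weights, not_relevant={"chargeback"})
# updated == {"dispute": 1.0, "fraud": 0.82}
```

Keeping the expansion threshold as a parameter makes the trade-off explicit: a lower value broadens recall with loosely related terms, while a higher value keeps the expanded query close to the original term.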

Assignees

Capital One Services, LLC

Inventors

Classifications

CPC titles:

  • Machine learning

  • Query expansion

  • Presentation of query results

  • using vector based model

  • Architecture, e.g. interconnection topology

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11620319B2 cover?
It covers systems, methods, and computer program products for search platforms for unstructured interaction summaries. An application expands a query term using an embedding vector, builds a vector for the expanded query with a term frequency-inverse document frequency (TF-IDF) model, combines it with an entity vector, and computes cosine similarity scores against a feature matrix of the corpus to decide which results to return.
Who is the assignee on this patent?
Capital One Services, LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/3338. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 4, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).