What technology area does this patent fall under?

Primary CPC classification G06F16/3344. Mapped technology areas include Physics.

When was this patent published?

Publication date Nov 30, 2023 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Search tool for identifying and sizing customer issues through interaction summaries and call transcripts

US2023385316A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2023385316-A1
Application number	US-202217970019-A
Country	US
Kind code	A1
Filing date	Oct 20, 2022
Priority date	May 26, 2022
Publication date	Nov 30, 2023
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The exemplary embodiments may provide a search tool that can locate customer issues in call transcripts and agent notes and can provide an accurate count of how often such issues appear in the call transcripts and agent notes. The exemplary embodiments may improve the speed with which the search of documents is performed. The exemplary embodiments rely upon a document matrix that is computed once for a given corpus of documents and a given vocabulary of the documents. The document matrix may be used across multiple queries. The exemplary embodiments also account for similar terms in processing a query. The exemplary embodiments may use a word coverage factor to improve the relevance of the search results returned by the search tool. The word coverage factor acts as a multiple factor that computes the fraction a query terms that are present in a document.
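The core idea in the abstract (build the document matrix once, then reuse it for every query) can be sketched as follows. This is a minimal illustration and not the patent's implementation: whitespace tokenization, the tf × idf contribution value, the dict-of-dicts sparse layout, and the names `build_document_matrix`, `score`, and `rank` are all assumptions made for the example.

```python
import math
from collections import Counter


def build_document_matrix(docs):
    """Build a sparse document matrix once for a corpus.

    Rows are documents and columns are vocabulary terms. Each row is a dict
    {term: contribution}, so terms absent from a document take no space.
    ASSUMPTION: contribution = term count * smoothed idf.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(1 + n_docs / df) for term, df in doc_freq.items()}
    return [
        {term: count * idf[term] for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]


def score(matrix, query):
    """Matrix-vector product with a binary query vector.

    The query vector has 1 for each vocabulary term in the query and 0
    elsewhere, so each score is the sum of the row's entries for those terms.
    """
    query_terms = set(query.lower().split())
    return [sum(row.get(term, 0.0) for term in query_terms) for row in matrix]


def rank(matrix, query, top_k=3):
    """Sort documents by score and return (document index, score) pairs."""
    scores = score(matrix, query)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:top_k]]


# Hypothetical three-document corpus, invented for illustration.
docs = [
    "card declined at checkout",
    "late fee charged on card",
    "app login failed",
]
matrix = build_document_matrix(docs)  # computed once
for query in ("card declined", "login failed"):  # reused across queries
    print(query, "->", rank(matrix, query, top_k=2))
```

Because `matrix` depends only on the corpus and its vocabulary, each new query costs one sparse product plus a sort, which is the speed argument the abstract makes.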

First claim

Opening claim text (preview).

1. A method performed by a processor of a computing device, the method comprising: receiving a query, the query containing one or more terms; processing a corpus of documents with the processor to determine how relevant the documents are to the query, wherein: the processing comprises scoring the documents in the corpus with the processor for relevance and the scoring is a product of at least a sparse document matrix and a query vector, each entry in the sparse document matrix holds a contribution value of an associated term in an associated one of the documents in the corpus, and the query vector holds values for terms in the query; and sorting the documents in the corpus with the processor by scores assigned by the scoring; and responsive to the query, generating output, based on the sorting, identifying best scoring ones of the documents.
2. The method of claim 1, wherein columns in the sparse document matrix are associated with terms that are part of documents and rows are associated with documents in the corpus of the documents.
3. The method of claim 1, wherein the contribution value specifies a measure of a contribution the associated term contributes to a relevance of the associated one of the documents to the query.
4. The method of claim 1, wherein the contribution value is based in part on an inverse document frequency weight of the associated term.
5. The method of claim 1, wherein the query vector includes values for the terms in the vocabulary of the documents indicating if the terms are in the query.
6. The method of claim 1, wherein the query vector has a row per term in the vocabulary of the documents.
7. A method performed by a processor of a computing device, the method comprising: receiving a query, the query containing one or more terms; processing a corpus of documents with the processor to determine relevance of the documents to the query, wherein: the processing comprises scoring the documents in the corpus with the processor for relevance and the scoring is a product of at least a sparse document matrix, a similarity matrix, and a query vector, each entry in the sparse document matrix holds a contribution value of an associated term in an associated one of the documents in the corpus, the similarity matrix holds values indicating a degree of similarity between term pairs, and the query vector holds values for terms in the query; and sorting the documents in the corpus with the processor by scores assigned by the scoring; and responsive to the query, generating output, based on the sorting, identifying best scoring ones of the documents.
8. The method of claim 7, wherein the values in the similarity matrix range from 0 to 1, wherein a value of 1 indicates that terms in the term pair are the same and 0 indicates that the terms in the term pair are dissimilar.
9. The method of claim 7, wherein the term pairs are in the vocabulary of the documents.
10. The method of claim 7, wherein the values in the similarity matrix are based on cosine similarity values, Levenshtein distance values and/or edit distance values.
11. The method of claim 7, wherein rows of the similarity matrix are associated with the terms in the vocabulary and the columns in the similarity matrix are associated with the terms in the vocabulary.
12. The method of claim 7, wherein columns in the sparse document matrix are associated with terms that are part of the vocabulary and rows in the sparse document matrix are associated with documents in the corpus of documents.
13. The method of claim 7, wherein the query vector includes values for the terms in the vocabulary of the documents indicating if the terms are in the query.
14. A method performed by a processor of a computing device, the method comprising: receiving a query, the query containing one or more terms; processing a corpus of documents with the processor to determine if each of the documents is relevant to the query, wherein: the processing comprises scoring the documents in the corpus with the processor for relevance and the scoring is a product of at least a sparse document matrix, a query vector and word coverage factor vector, each entry in the sparse document matrix holds a contribution value of an associated term in an associated one of the documents in the corpus, and the query vector holds values for terms in the query, the word coverage factor vector holds a value for each of the documents in the corpus; and sorting the documents in the corpus with the processor by scores assigned by the scoring; and responsive to the query, generating output, based on the sorting, identifying best scoring ones of the documents.
15. The method of claim 14, wherein the values in the word coverage factor vector identify what fraction of terms in the query appear in the associated documents in the corpus.
16. The method of claim 14, wherein a one of the values in the word coverage factor vector for a selected document in the corpus is a sum of an incidence of each of the terms in the query in the selected document divided by a number of terms in the query.
17. The method of claim 14 wherein the scoring is a product of the sparse document matrix, a similarity matrix, the query vector, and the word coverage factor vector and wherein the similarity matrix holds values indicating a degree of similarity between term pairs.
18. The method of claim 17, wherein the values in the similarity matrix range from 0 to 1, wherein a value of 1 indicates that terms in the term pair are the same and 0 indicates that the terms in the term pair are dissimilar.
19. The method of claim 18, wherein a one of the values in the word coverage factor vector for a given document in the corpus is a sum of an incidence of each of the terms in the query in the given document and each of the terms having a non-zero value in the similarity matrix with terms in the query in the given document divided by a number of terms in the query.
20. The method of claim 14, wherein the contribution value specifies a measure of a contribution the associated term contributes to a relevance of the associated one of the documents to the query.
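Claims 7 to 19 add two factors to the basic product: a term-to-term similarity matrix and a per-document word coverage factor. The sketch below shows one possible reading of how they combine; it is not taken from the patent's description. The sparse layouts, the helper names `similarity_row` and `score_with_similarity`, the sample values, and the choice to count a query term as covered when the term itself or any term with non-zero similarity appears in the document are all assumptions.

```python
def similarity_row(term, similarity):
    """Return {other_term: value} for one row of the similarity matrix.

    `similarity` is a sparse, symmetric mapping {(term_a, term_b): value}
    with values in (0, 1]. The diagonal (a term with itself) is implicitly 1.
    """
    row = {term: 1.0}
    for (a, b), value in similarity.items():
        if a == term:
            row[b] = value
        elif b == term:
            row[a] = value
    return row


def score_with_similarity(doc_matrix, similarity, query_terms):
    """Score each document as coverage * (document row . (S . q)).

    `doc_matrix` is a list of sparse rows {term: contribution value};
    `query_terms` is a list of distinct query terms (a binary query vector).
    """
    # S . q: weight of each vocabulary term after similarity expansion.
    expanded = {}
    for q in query_terms:
        for term, value in similarity_row(q, similarity).items():
            expanded[term] = expanded.get(term, 0.0) + value

    scores = []
    for row in doc_matrix:
        raw = sum(row.get(term, 0.0) * weight for term, weight in expanded.items())
        # Word coverage factor: fraction of query terms present in the
        # document, directly or through a similar term (ASSUMED reading).
        covered = sum(
            1 for q in query_terms
            if any(term in row for term in similarity_row(q, similarity))
        )
        scores.append(raw * covered / len(query_terms))
    return scores


# Hypothetical contribution values and similarity entry for illustration.
doc_matrix = [
    {"card": 1.0, "declined": 2.0},
    {"card": 1.0, "rejected": 2.0},
    {"login": 2.0, "failed": 2.0},
]
similarity = {("declined", "rejected"): 0.8}
query_terms = ["card", "declined"]

print(score_with_similarity(doc_matrix, similarity, query_terms))
print(score_with_similarity(doc_matrix, {}, query_terms))
```

With the similarity entry, the second document scores close to the first even though it never contains "declined". With an empty similarity mapping, the same document matches only "card", so its coverage factor drops to one half and its score falls accordingly.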

Assignees

Capital One Services, LLC

Inventors

Classifications

G06F16/3344 · Primary
using natural language analysis · CPC title
G06F16/316
Indexing structures · CPC title
G06F16/313
Selection or weighting of terms for indexing · CPC title
G06F40/20
Natural language analysis (semantic analysis of natural language G06F40/30) · CPC title
G06F16/3334
Selection or weighting of terms from queries, including natural language queries · CPC title

Patent family

Related publications grouped by family.

View patent family 88877340

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2023385316A1 cover?: The exemplary embodiments may provide a search tool that can locate customer issues in call transcripts and agent notes and can provide an accurate count of how often such issues appear in the call transcripts and agent notes. The exemplary embodiments may improve the speed with which the search of documents is performed. The exemplary embodiments rely upon a document matrix that is computed on…
Who is the assignee on this patent?: Capital One Services, LLC