What technology area does this patent fall under?

Primary CPC classification G06F40/30. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jan 13, 2022 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Determining semantic similarity of texts based on sub-sections thereof

US2022012431A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2022012431-A1
Application number	US-202117448667-A
Country	US
Kind code	A1
Filing date	Sep 23, 2021
Priority date	Mar 22, 2019
Publication date	Jan 13, 2022
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods are provided to compare a target sample of text to a set of textual records, each textual record including a sample of text and an indication of one or more segments of text within the sample of text. Semantic similarity values between the target sample of text and each of the textual records are determined. Determining a particular semantic similarity value between the target sample of text and a particular textual record of the corpus includes: (i) determining individual semantic similarity values between the target sample of text and each of the segments of text indicated by the particular textual record, and (ii) generating the particular semantic similarity value between the target sample of text and the particular textual record based on the individual semantic similarity values. A textual record is then selected based on the semantic similarities.
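The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the token-overlap function `jaccard` is a toy stand-in for a semantic similarity measure, the choice of `max` as the aggregation step is an assumption (the abstract only says the record value is "based on" the segment values), and all function and variable names are hypothetical.

```python
from typing import Callable, Dict, List, Sequence


def jaccard(a: str, b: str) -> float:
    """Toy stand-in for semantic similarity: overlap of lowercase token sets."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def record_similarity(
    target: str,
    segments: Sequence[str],
    segment_sim: Callable[[str, str], float] = jaccard,
    aggregate: Callable[[List[float]], float] = max,
) -> float:
    """Score one record: (i) score each segment, (ii) aggregate the scores."""
    scores = [segment_sim(target, segment) for segment in segments]
    # A record with no indicated segments gets the lowest possible score.
    return aggregate(scores) if scores else 0.0


def select_record(target: str, corpus: Dict[str, Sequence[str]]) -> str:
    """Return the id of the record with the highest record-level score."""
    return max(corpus, key=lambda rid: record_similarity(target, corpus[rid]))


# Hypothetical corpus: record id -> segments of text indicated for that record.
corpus = {
    "r1": ["printer is offline", "replace the toner"],
    "r2": ["reset your password", "vpn will not connect"],
}
best = select_record("vpn will not connect", corpus)
```

Here `best` is `"r2"`, because one of its segments matches the target exactly while no segment of `r1` shares a token with it. Swapping `jaccard` for an embedding-based measure would leave the rest of the sketch unchanged.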

First claim

Opening claim text (preview).

What is claimed is:
1. A system comprising: a processor; and a memory, accessible by the processor, the memory storing instructions that, when executed by the processor, cause the processor to perform operations comprising: accessing a corpus comprising a plurality of textual records; generating, via a machine learning model, indications of one or more respective segments of text within each of the textual records in the corpus; obtaining, from a client device, a target sample of text; generating respective record semantic similarity values between the target sample of text and each of the textual records in the corpus, comprising, for each of the textual records in the corpus: determining one or more respective segment semantic similarity values between the target sample of text and the one or more segments of text within the textual record; and generating the respective record semantic similarity value between the target sample of text and the textual record based on the one or more respective segment semantic similarity values; selecting from the corpus, based on the generated record semantic similarity values, a particular textual record having the highest respective record semantic similarity value for the target sample of text; and providing, to the client device, a representation of the particular textual record.
2. The system of claim 1, wherein determining the one or more respective segment semantic similarity values between the target sample of text and the one or more segments of text within the textual record comprises: receiving a vector representation of the target sample of text, wherein the vector representation of the target sample of text includes word vectors that describe, in a first semantically-encoded vector space, a meaning of respective words of the target sample of text, or a paragraph vector that describes, in a second semantically-encoded vector space, a meaning of multiple words of the target sample of text, or both; receiving one or more vector representations of the one or more segments of text within the textual record, wherein the one or more vector representations of the one or more segments of text within the textual record comprises word vectors that describe, in the first semantically-encoded vector space, a meaning of respective words of the one or more segments of text within the textual record, or a paragraph vector that describes, in the second semantically-encoded vector space, a meaning of multiple words within the one or more segments of text within the textual record; and determining a vector semantic similarity value between the vector representation of the target sample of text and the vector representation of the one or more segments of text within the textual record.
3. The system of claim 1, wherein generating the respective record semantic similarity value between the target sample of text and the textual record based on the one or more respective segment semantic similarity values comprises: comparing, to a threshold similarity level, each of the one or more segment semantic similarity values between the target sample of text and each of the one or more segments of text within the textual record; and determining a number of the one or more segment semantic similarity values that exceed the threshold similarity level as the respective record semantic similarity value.
4. The system of claim 1, wherein the indications of the one or more respective segments of text within each of the textual records comprise non-overlapping segments of text.
5. The system of claim 1, wherein the one or more respective segments of text within each of the textual records comprise one or more discrete sentences.
6. The system of claim 1, wherein generating the respective record semantic similarity value between the target sample of text and the textual record based on the one or more respective segment semantic similarity values comprises: weighting the one or more respective segment semantic similarity values based on a ranking of the one or more respective segment semantic similarity values; and generating a sum of the weighted one or more respective segment semantic similarity values between the target sample of text and each of the one or more segments of text within the textual record.
7. The system of claim 1, wherein each of the textual records comprises an indication of a time stamp within a predetermined time threshold.
8. A computer-implemented method comprising: accessing, by a server device, a corpus comprising a plurality of textual records; generating indications of one or more respective segments of text within each of the textual records in the corpus; receiving, by the server device and from a client device, a target sample of text; generating respective record, by the server device, semantic similarity values between the target sample of text and each of the textual records in the corpus, comprising, for each of the textual records in the corpus: determining one or more respective segment semantic similarity values between the target sample of text and the one or more segments of text within the textual record; and generating the respective record semantic similarity value between the target sample of text and the textual record based on the one or more respective segment semantic similarity values; selecting from the corpus, based on the generated record semantic similarity values, a particular textual record having the highest respective semantic similarity value for the target sample of text; and providing, by the server device and to the client device, a representation of the particular textual record.
9. The computer-implemented method of claim 8, wherein determining the one or more respective segment semantic similarity values between the target sample of text and the one or more segments of text within the textual record comprises: receiving a vector representation of the target sample of text, wherein the vector representation of the target sample of text includes word vectors that describe, in a first semantically-encoded vector space, a meaning of respective words of the target sample of text, or a paragraph vector that describes, in a second semantically-encoded vector space, a meaning of multiple words of the target sample of text, or both; receiving one or more vector representations of the one or more segments of text within the textual record, wherein the one or more vector representations of the one or more segments of text within the textual record comprises word vectors that describe, in the first semantically-encoded vector space, a meaning of respective words of the one or more segments of text within the textual record, or a paragraph vector that describes, in the second semantically-encoded vector space, a meaning of multiple words within the one or more segments of text within the textual record; and determining a vector semantic similarity value between the vector representation of the target sample of text and the vector representation of the one or more segments of text within the textual record.
10. The computer-implemented method of claim 8, wherein generating the respective record semantic similarity value between the target sample of text and the textual record based on the one or more respective segment semantic similarity values comprises: comparing, to a threshold similarity level, each of the one or more segment semantic similarity values between the target sample of text and each of the one or more segments of text within the textual record; and determining a number of the one or more segment semantic similarity values that exceed the threshold similarity level as t
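Claims 3 and 6 describe two different ways to turn segment-level values into a record-level value. The sketch below illustrates both under stated assumptions: the function names are hypothetical, and the `1 / rank` weighting is only one possible choice, since claim 6 says the weights are based on a ranking without fixing a formula.

```python
from typing import Callable, Sequence


def threshold_count(scores: Sequence[float], threshold: float) -> int:
    """Claim 3 style: count the segment scores that exceed the threshold."""
    return sum(1 for score in scores if score > threshold)


def rank_weighted_sum(
    scores: Sequence[float],
    weight_for_rank: Callable[[int], float] = lambda rank: 1.0 / rank,
) -> float:
    """Claim 6 style: weight each score by its rank, then sum.

    Rank 1 is the highest score. The default 1/rank weighting is an
    assumption made for illustration only.
    """
    ranked = sorted(scores, reverse=True)
    return sum(weight_for_rank(rank) * score
               for rank, score in enumerate(ranked, start=1))


# Hypothetical segment-level scores for a single textual record.
segment_scores = [0.9, 0.4, 0.75]
count_value = threshold_count(segment_scores, threshold=0.5)
weighted_value = rank_weighted_sum(segment_scores)
```

With these inputs, `count_value` is 2 because only 0.9 and 0.75 exceed 0.5. The count-based value rewards records with many moderately relevant segments, while a rank-weighted sum lets the best-matching segment dominate the record's value.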

Assignees

Servicenow Inc

Inventors

Classifications

G06F16/36
Creation of semantic tools, e.g. ontology or thesauri · CPC title
G06F40/30 · Primary
Semantic analysis · CPC title
G06F16/3347 · Primary
using vector based model · CPC title
G06F40/284
Lexical analysis, e.g. tokenisation or collocates · CPC title
G06F40/205
Parsing · CPC title

Patent family

Related publications grouped by family.

View patent family 70289482

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022012431A1 cover?: Systems and methods are provided to compare a target sample of text to a set of textual records, each textual record including a sample of text and an indication of one or more segments of text within the sample of text. Semantic similarity values between the target sample of text and each of the textual records are determined. Determining a particular semantic similarity value between the targ…
Who is the assignee on this patent?: Servicenow Inc

Related patents

Retraining a lexical analysis model leveraging process of annotation operations created by a user

Method and system for assessing similarity of documents

Automatic questioning and answering processing method and automatic questioning and answering system

Text analyzing method and device, server and computer-readable storage medium

Machine reading comprehension system for answering queries related to a document

Systems and methods for categorizing content
