What technology area does this patent fall under?

Primary CPC classification G06F16/93. Mapped technology areas include Physics.

When was this patent published?

Publication date Dec 29, 2022 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Document management using clause clusters

US2022414153A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2022414153-A1
Application number	US-202117362664-A
Country	US
Kind code	A1
Filing date	Jun 29, 2021
Priority date	Jun 29, 2021
Publication date	Dec 29, 2022
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A document management system analyzes document clauses using document clause clusters. The document management system uses measures of similarity between document clauses from different documents to assign clauses to clause clusters. Clause clusters may be used to perform various analyses, such as to assign clauses a classification corresponding to a relevant clause cluster. The document management system provides analyses performed using document clause clusters for user review, such as to approve clause clusters, classify clause clusters, modify clause clusters, or some combination thereof.
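As an illustration of the assignment step the abstract describes, here is a minimal Python sketch. It is not the patented implementation: it assumes a plain bag-of-words cosine similarity as the similarity measure and a hypothetical threshold of 0.6, and the names `cosine_similarity`, `assign_clauses`, and the sample clusters are invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two clauses using raw term counts (assumed measure)."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_clauses(clauses, clusters, threshold=0.6):
    """Assign each clause to the most similar cluster, or leave it unclustered.

    clusters maps a classification label to a list of member clauses.
    The threshold value is a hypothetical choice, not taken from the patent.
    """
    assigned, unclustered = {}, []
    for clause in clauses:
        best_label, best_score = None, 0.0
        for label, members in clusters.items():
            # Score a cluster by its closest member clause.
            score = max((cosine_similarity(clause, m) for m in members), default=0.0)
            if score > best_score:
                best_label, best_score = label, score
        if best_label is not None and best_score >= threshold:
            assigned[clause] = best_label
        else:
            # Candidates for user review and a possible new cluster.
            unclustered.append(clause)
    return assigned, unclustered


clusters = {
    "governing_law": ["This agreement is governed by the laws of the State of California."],
    "confidentiality": ["Each party shall keep the confidential information of the other party secret."],
}
clauses = [
    "This agreement is governed by the laws of the State of New York.",
    "Payment is due within thirty days of the invoice date.",
]
assigned, unclustered = assign_clauses(clauses, clusters)
# The first clause is assigned to "governing_law"; the second stays unclustered.
```

Raw term counts let common words such as "the" and "of" inflate the score, which is one reason a weighted scheme like TF-IDF (listed in claim 12) may be preferable in practice.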

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: accessing a plurality of documents each including a plurality of clauses; comparing the plurality of clauses included in the plurality of documents to a plurality of clusters of clauses, each cluster of clauses including clauses having a shared classification; identifying, based on the comparison of the plurality of clauses to the plurality of clusters of clauses, a set of clauses from the plurality of clauses that cannot be clustered with at least one of the plurality of clusters, each clause in the set of clauses within a threshold similarity measure to each other clause in the set of clauses, wherein identifying the set of clauses includes: determining a similarity measure for a clause of the plurality of clauses that is below the threshold similarity measure to each other clause in the set of clauses, the similarity measure below the threshold similarity measure based on least in part on the term of contextual significance; and excluding the clause from the set of clauses based on the similarity measure being below the threshold similarity measure; providing an interface for display to a client device including a clause from the set of clauses; determining, based on a user input received via the interface, a classification of the clause from the set of clauses; applying the determined classification to each clause in the set of clauses; and generating a new cluster in the plurality of clusters using the set of clauses having the applied classification.
2. The method of claim 1, further comprising: generating the plurality of clusters of clauses by: receiving a plurality of training documents; extracting a plurality of training clauses from the plurality of training documents; determining similarity measures between pairs of the plurality of training clauses; and determining the plurality of clusters of clauses based on the determined similarity measures between pairs of the plurality of training clauses.
3. The method of claim 2, further comprising: after the plurality of clusters of clauses is generated, identifying an unclassified cluster of the plurality of clusters of clauses; identifying a template clause corresponding to the unclassified cluster; and assigning a classification to the unclassified cluster using the template clause.
4. The method of claim 1, wherein identifying the set of clauses from the plurality of clauses comprises: identifying a template clause that is not clustered with one of the plurality of clusters; comparing an additional clause from the plurality of clauses that cannot be clustered with one of the plurality of clusters to the template clause; determining, based on the comparison of the additional clause to the template clause, a similarity measure between the template clause and the additional clause; and responsive to the similarity measure between the template clause and the additional clause exceeding a similarity measure threshold, adding the additional clause to the set of clauses.
5. The method of claim 4, wherein the interface includes the template clause, and wherein determining the classification of the clause from the set of clauses comprises: receiving user input via the interface indicating a classification for the template clause.
6. The method of claim 5, wherein the interface includes the additional clause, and wherein determining the classification of the clause from the set of clauses comprises: receiving user input via the interface indicating an approval to cluster the additional clause with the template clause.
7. The method of claim 1, wherein a cluster of the plurality of clusters includes a primary stack of clauses and a secondary stack of clauses, and further comprising: based on the comparison of the plurality of clauses to the plurality of clusters of clauses: identifying a first set of clauses matching the primary stack of clauses; and identifying a second of set of clauses matching the secondary stack of clauses; responsive to identifying the first set of clauses, assigning the shared classification of the cluster to the first set of clauses; and responsive to identifying the second set of clauses, providing an additional interface for display to the client device including a request for approval to assign the shared classification of the cluster to the second set of clauses.
8. The method of claim 7, wherein the primary stack of clauses includes clauses that are exact matches and the secondary stack of clauses includes clauses within a threshold similarity measure of the clauses in the primary stack of clauses.
9. The method of claim 1, further comprising: for a document of the plurality of documents: comparing one or more clauses of the document to one or more clauses of a template document; and determining, based on the comparison of the one or more clauses of the document to the one or more clauses of the template document, a measure of commonality indicative of a degree to which the one or more clauses of the document match the one or more clauses of the template document.
10. The method of claim 9, wherein determining the measure of commonality comprises: determining one or more similarity measures based on the comparison of the one or more clauses of the document to the one or more clauses of the template document; and computing the measure of commonality using the one or more similarity measures, the measure of commonality indicative of a percentage of the one or more similarity measures that are within an additional threshold similarity measure.
11. The method of claim 1, wherein a similarity measure of the clause of the set of clauses is determined by: performing one or more similarity analyses on the clause of the set of clauses and a template clause; and calculating the similarity measure using results of the one or more similarity analyses.
12. The method of claim 11, wherein the one or more similarity analyses include term frequency-inverse document frequency (TF-IDF), text near duplication, conceptual clustering, conceptual searching, or text redlining.
13. The method of claim 1, wherein determining the similarity measure for the clause of the plurality of clauses comprises: determining an initial similarity measure for the clause of the plurality of clause that is above the threshold similarity measure, the initial similarity measure determined without respect to the term of contextual significance; modifying the initial similarity measure for the clause based on the term of contextual significance, the modified similarity measure below the threshold similarity measure.
14. The method of claim 1, further comprising: determining, based on the comparison of the plurality of clauses to the plurality of clusters of clauses, one or more recommended template documents for the received documents; and providing the one or more recommended template documents to the client device.
15. A system comprising a hardware processor and a non-transitory computer-readable storage medium storing instructions that, when executed by the hardware processor, cause the processor to perform steps comprising: accessing a plurality of documents each including a plurality of clauses; comparing the plurality of clauses included in the plurality of documents to a plurality of clusters of clauses, each cluster of clauses including clauses having a shared classification; identifying, based on the comparison of the plurality of clauses to the plurality of clusters of clauses, a set of clauses from the pl
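The measure of commonality in claims 9 and 10 can be read as the percentage of a document's clauses whose similarity to a template clause falls within a threshold. The Python sketch below is one possible reading, not the claimed method: it assumes `difflib.SequenceMatcher.ratio()` as a stand-in similarity measure, scores each document clause against its closest template clause, and uses a hypothetical threshold of 0.8. The function names and sample clauses are invented for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in similarity measure in [0, 1] (an assumption, not from the patent)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def commonality(document_clauses, template_clauses, threshold=0.8):
    """Percentage of document clauses that closely match some template clause."""
    if not document_clauses or not template_clauses:
        return 0.0
    # One similarity measure per document clause: its best match in the template.
    best_scores = [
        max(similarity(clause, t) for t in template_clauses)
        for clause in document_clauses
    ]
    matching = sum(1 for score in best_scores if score >= threshold)
    return 100.0 * matching / len(best_scores)


template = [
    "The term of this agreement is one year.",
    "Either party may terminate this agreement with thirty days notice.",
]
document = [
    "The term of this agreement is one year.",
    "Either party may terminate this agreement with thirty days notice.",
    "Fees due.",
    "No assignment.",
]
print(commonality(document, template))  # 50.0: two of four clauses match the template
```

A document scoring near 100 would be almost entirely template language, while a low score flags heavily negotiated or non-standard text for review.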

Assignees

Docusign Inc

Inventors

Classifications

Classification	Description	Label type
G06F18/22	Matching criteria, e.g. proximity measures	CPC title
G06F16/93 (Primary)	Document management systems	CPC title
G06F16/906 (Primary)	Clustering; Classification	CPC title
G06F18/214	Generating training patterns; Bootstrap methods, e.g. bagging or boosting	CPC title
G06K9/6256	Physics	mapped topic

Patent family

Related publications grouped by family.

View patent family 84542207

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022414153A1 cover?: A document management system analyzes document clauses using document clause clusters. The document management system uses measures of similarity between document clauses from different documents to assign clauses to clause clusters. Clause clusters may be used to perform various analyses, such as to assign clauses a classification corresponding to a relevant clause cluster. The document manage…
Who is the assignee on this patent?: Docusign Inc