US2022414153A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2022414153-A1 |
| Application number | US-202117362664-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jun 29, 2021 |
| Priority date | Jun 29, 2021 |
| Publication date | Dec 29, 2022 |
| Grant date | — |
Official abstract text for this publication.
A document management system analyzes document clauses using document clause clusters. The document management system uses measures of similarity between document clauses from different documents to assign clauses to clause clusters. Clause clusters may be used to perform various analyses, such as to assign clauses a classification corresponding to a relevant clause cluster. The document management system provides analyses performed using document clause clusters for user review, such as to approve clause clusters, classify clause clusters, modify clause clusters, or some combination thereof.
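The assignment step described in the abstract can be illustrated with a minimal sketch. It assumes a stand-in similarity measure (`difflib.SequenceMatcher`), a hypothetical threshold of 0.8, and hypothetical names (`similarity`, `assign_clauses`); the publication does not specify any of these, so this is an illustration rather than the disclosed implementation.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in similarity measure in [0, 1] (an assumption, not the patent's)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def assign_clauses(clauses, clusters, threshold=0.8):
    """Assign each clause to its best-matching cluster, or leave it unclustered.

    `clusters` maps a classification label to a list of member clauses.
    Returns (assigned, unclustered), where `assigned` maps clause -> label.
    """
    assigned, unclustered = {}, []
    for clause in clauses:
        best_label, best_score = None, 0.0
        for label, members in clusters.items():
            # Score a cluster by its most similar member clause.
            score = max((similarity(clause, m) for m in members), default=0.0)
            if score > best_score:
                best_label, best_score = label, score
        if best_label is not None and best_score >= threshold:
            assigned[clause] = best_label
        else:
            # Left for user review, e.g. to form a new cluster.
            unclustered.append(clause)
    return assigned, unclustered
```

Clauses that land in `unclustered` correspond to the ones the abstract says are surfaced for user review, where a user may approve, classify, or modify clusters.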
Opening claim text (preview).
What is claimed is:

1. A method comprising: accessing a plurality of documents each including a plurality of clauses; comparing the plurality of clauses included in the plurality of documents to a plurality of clusters of clauses, each cluster of clauses including clauses having a shared classification; identifying, based on the comparison of the plurality of clauses to the plurality of clusters of clauses, a set of clauses from the plurality of clauses that cannot be clustered with at least one of the plurality of clusters, each clause in the set of clauses within a threshold similarity measure to each other clause in the set of clauses, wherein identifying the set of clauses includes: determining a similarity measure for a clause of the plurality of clauses that is below the threshold similarity measure to each other clause in the set of clauses, the similarity measure below the threshold similarity measure based at least in part on the term of contextual significance; and excluding the clause from the set of clauses based on the similarity measure being below the threshold similarity measure; providing an interface for display to a client device including a clause from the set of clauses; determining, based on a user input received via the interface, a classification of the clause from the set of clauses; applying the determined classification to each clause in the set of clauses; and generating a new cluster in the plurality of clusters using the set of clauses having the applied classification.

2. The method of claim 1, further comprising: generating the plurality of clusters of clauses by: receiving a plurality of training documents; extracting a plurality of training clauses from the plurality of training documents; determining similarity measures between pairs of the plurality of training clauses; and determining the plurality of clusters of clauses based on the determined similarity measures between pairs of the plurality of training clauses.

3. The method of claim 2, further comprising: after the plurality of clusters of clauses is generated, identifying an unclassified cluster of the plurality of clusters of clauses; identifying a template clause corresponding to the unclassified cluster; and assigning a classification to the unclassified cluster using the template clause.

4. The method of claim 1, wherein identifying the set of clauses from the plurality of clauses comprises: identifying a template clause that is not clustered with one of the plurality of clusters; comparing an additional clause from the plurality of clauses that cannot be clustered with one of the plurality of clusters to the template clause; determining, based on the comparison of the additional clause to the template clause, a similarity measure between the template clause and the additional clause; and responsive to the similarity measure between the template clause and the additional clause exceeding a similarity measure threshold, adding the additional clause to the set of clauses.

5. The method of claim 4, wherein the interface includes the template clause, and wherein determining the classification of the clause from the set of clauses comprises: receiving user input via the interface indicating a classification for the template clause.

6. The method of claim 5, wherein the interface includes the additional clause, and wherein determining the classification of the clause from the set of clauses comprises: receiving user input via the interface indicating an approval to cluster the additional clause with the template clause.

7. The method of claim 1, wherein a cluster of the plurality of clusters includes a primary stack of clauses and a secondary stack of clauses, and further comprising: based on the comparison of the plurality of clauses to the plurality of clusters of clauses: identifying a first set of clauses matching the primary stack of clauses; and identifying a second set of clauses matching the secondary stack of clauses; responsive to identifying the first set of clauses, assigning the shared classification of the cluster to the first set of clauses; and responsive to identifying the second set of clauses, providing an additional interface for display to the client device including a request for approval to assign the shared classification of the cluster to the second set of clauses.

8. The method of claim 7, wherein the primary stack of clauses includes clauses that are exact matches and the secondary stack of clauses includes clauses within a threshold similarity measure of the clauses in the primary stack of clauses.

9. The method of claim 1, further comprising: for a document of the plurality of documents: comparing one or more clauses of the document to one or more clauses of a template document; and determining, based on the comparison of the one or more clauses of the document to the one or more clauses of the template document, a measure of commonality indicative of a degree to which the one or more clauses of the document match the one or more clauses of the template document.

10. The method of claim 9, wherein determining the measure of commonality comprises: determining one or more similarity measures based on the comparison of the one or more clauses of the document to the one or more clauses of the template document; and computing the measure of commonality using the one or more similarity measures, the measure of commonality indicative of a percentage of the one or more similarity measures that are within an additional threshold similarity measure.

11. The method of claim 1, wherein a similarity measure of the clause of the set of clauses is determined by: performing one or more similarity analyses on the clause of the set of clauses and a template clause; and calculating the similarity measure using results of the one or more similarity analyses.

12. The method of claim 11, wherein the one or more similarity analyses include term frequency-inverse document frequency (TF-IDF), text near duplication, conceptual clustering, conceptual searching, or text redlining.

13. The method of claim 1, wherein determining the similarity measure for the clause of the plurality of clauses comprises: determining an initial similarity measure for the clause of the plurality of clauses that is above the threshold similarity measure, the initial similarity measure determined without respect to the term of contextual significance; modifying the initial similarity measure for the clause based on the term of contextual significance, the modified similarity measure below the threshold similarity measure.

14. The method of claim 1, further comprising: determining, based on the comparison of the plurality of clauses to the plurality of clusters of clauses, one or more recommended template documents for the received documents; and providing the one or more recommended template documents to the client device.

15. A system comprising a hardware processor and a non-transitory computer-readable storage medium storing instructions that, when executed by the hardware processor, cause the processor to perform steps comprising: accessing a plurality of documents each including a plurality of clauses; comparing the plurality of clauses included in the plurality of documents to a plurality of clusters of clauses, each cluster of clauses including clauses having a shared classification; identifying, based on the comparison of the plurality of clauses to the plurality of clusters of clauses, a set of clauses from the pl
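As an illustration of how claims 1, 4, and 13 fit together, the following is a minimal sketch under stated assumptions. The similarity measure (`difflib.SequenceMatcher`), the multiplicative penalty of 0.5, the whitespace tokenization, the 0.8 threshold, and all function names are hypothetical choices made for this example; the claims do not prescribe them. The sketch uses the template-based grouping of claim 4 rather than the all-pairs condition of claim 1.

```python
from difflib import SequenceMatcher


def base_similarity(a: str, b: str) -> float:
    """Initial similarity, ignoring terms of contextual significance."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def adjusted_similarity(a, b, significant_terms, penalty=0.5):
    """Reduce the score when a significant term appears in only one clause.

    The multiplicative penalty is an assumption made for illustration.
    """
    score = base_similarity(a, b)
    a_words, b_words = set(a.lower().split()), set(b.lower().split())
    for term in significant_terms:
        if (term in a_words) != (term in b_words):
            score *= penalty
    return score


def build_candidate_set(template, unclustered, significant_terms, threshold=0.8):
    """Collect unclustered clauses close enough to a template clause."""
    matches = [
        clause
        for clause in unclustered
        if clause != template
        and adjusted_similarity(template, clause, significant_terms) >= threshold
    ]
    return [template] + matches


def create_cluster(clusters, candidate_set, classification):
    """Apply a user-supplied classification to the whole set as a new cluster."""
    clusters.setdefault(classification, []).extend(candidate_set)
    return clusters
```

In this sketch, two clauses that differ only by a word such as "not" score high on the initial measure but fall below the threshold once the penalty is applied, so the second clause is excluded from the set, mirroring the exclusion step in claim 1. The `classification` argument stands in for the user input received through the interface.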
Matching criteria, e.g. proximity measures · CPC title
Document management systems · CPC title
Clustering; Classification · CPC title
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Physics · mapped topic