Legal document search based on legal similarity

US2017132730A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2017132730-A1
Application number	US-201514938041-A
Country	US
Kind code	A1
Filing date	Nov 11, 2015
Priority date	Nov 11, 2015
Publication date	May 11, 2017
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and system are provided for performing a legal document search. The method includes finding, by a processor, for each of a plurality of documents, a respective law clause related thereto, to obtain a plurality of related law clauses. The method further includes constructing, by the processor, a graph having nodes defined by the plurality of documents and the plurality of related law clauses and having edges defined by (1) relations between the plurality of documents and the plurality of related law clauses and (2) relations between the plurality of documents. The method further includes identifying, by the processor, from the plurality of documents, one or more candidate documents that are similar to an input query document by mining the graph using similarity criteria.
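The first two steps of the abstract (pairing each document with a related law clause, then building a graph over documents and clauses) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the whitespace tokenizer, the smoothed IDF formula, the one-clause-per-document choice, and the `doc_threshold` parameter are all assumptions made for the example. TF-IDF vectors and cosine similarity are used because the claims on this page name them as one option for the finding step.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (dict term -> weight) per text."""
    tokenized = [t.lower().split() for t in texts]  # assumption: naive tokenizer
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption) keeps every weight positive.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_graph(documents, clauses, doc_threshold=0.1):
    """Build an undirected weighted graph: node -> {neighbor: weight}.

    documents, clauses: dict mapping an id to its text.
    doc_threshold: hypothetical cutoff for adding document-document edges.
    """
    doc_ids, clause_ids = list(documents), list(clauses)
    texts = [documents[d] for d in doc_ids] + [clauses[c] for c in clause_ids]
    vecs = tfidf_vectors(texts)
    doc_vec = dict(zip(doc_ids, vecs[: len(doc_ids)]))
    clause_vec = dict(zip(clause_ids, vecs[len(doc_ids):]))
    graph = {node: {} for node in doc_ids + clause_ids}

    def add_edge(a, b, weight):
        graph[a][b] = weight
        graph[b][a] = weight

    # (1) Document-clause edges: link each document to its best-scoring clause;
    #     the cosine score plays the role of a confidence score.
    for d in doc_ids:
        best = max(clause_ids, key=lambda c: cosine(doc_vec[d], clause_vec[c]))
        add_edge(d, best, cosine(doc_vec[d], clause_vec[best]))

    # (2) Document-document edges: link sufficiently similar documents.
    for i, a in enumerate(doc_ids):
        for b in doc_ids[i + 1:]:
            score = cosine(doc_vec[a], doc_vec[b])
            if score >= doc_threshold:
                add_edge(a, b, score)
    return graph


# Toy data, invented for illustration only.
documents = {
    "d1": "tenant failed to pay rent under lease",
    "d2": "landlord evicted tenant for unpaid rent",
    "d3": "driver exceeded speed limit on highway",
}
clauses = {
    "c_lease": "lease rent tenant landlord obligations",
    "c_traffic": "speed limit highway driver penalties",
}
graph = build_graph(documents, clauses)
print(sorted(graph["d1"]))
```

In this toy data the two tenancy documents attach to the same clause node and to each other, while the traffic document attaches only to the traffic clause. A real system would likely need proper tokenization and stop-word handling, since shared function words can otherwise create misleading document-document edges.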

First claim

Opening claim text (preview).

What is claimed is:
1. A method for performing a legal document search, the method comprising: finding, by a processor, for each of a plurality of documents, a respective law clause related thereto, to obtain a plurality of related law clauses; constructing, by the processor, a graph having nodes defined by the plurality of documents and the plurality of related law clauses and having edges defined by (1) relations between the plurality of documents and the plurality of related law clauses and (2) relations between the plurality of documents; and identifying, by the processor, from the plurality of documents, one or more candidate documents that are similar to an input query document by mining the graph using similarity criteria.
2. The method of claim 1, wherein said finding step comprises determining a respective confidence score for each pairing of a given one of the plurality of documents and the respective law clause related thereto, the respective confidence score serving as a ranking for the given one of the plurality of documents with respect to the respective law clause related thereto.
3. The method of claim 2, wherein the one or more candidate documents comprise a plurality of candidate documents, and the method further comprises re-ranking the plurality of candidate documents based on a number of the plurality of related law clauses that occur in both the plurality of candidate documents and the input query document.
4. The method of claim 3, wherein said re-ranking step re-ranks the plurality of candidate documents using at least one of a mean reciprocal rank and a mean averaged precision.
5. The method of claim 3, wherein the re-ranking step re-ranks the plurality of documents using an Integer Linear Programming (ILP) solver.
6. The method of claim 1, wherein said finding step comprises: representing law clause and document combinations as respective vectors; and measuring a cosine similarity between the respective vectors to identify the respective law clause for the each of the documents with a respective confidence score.
7. The method of claim 6, wherein the vectors are Term Frequency-Inverse Document Frequency vectors.
8. The method of claim 1, further comprising training a word embedding model using only law clauses and omitting documents, wherein the respective vectors are formed using the word embedding model.
9. The method of claim 1, wherein said identifying step comprises displaying, on a hardware display device, the one or more candidate documents.
10. The method of claim 1, wherein said identifying step comprises transmitting, over one or more networks by a hardware transmission device, the one or more candidate documents to a remote computing device.
11. The method of claim 1, wherein the graph is mined using a random walk path formulation.
12. The method of claim 11, wherein the random walk path formulation includes a restart component.
13. The method of claim 11, wherein the random walk path formulation includes indirect relations between the law clauses and the plurality of documents.
14. The method of claim 1, wherein the graph is mined using an approach that considers only the nodes defined by the plurality of documents while omitting the nodes defined by the plurality of related law clauses.
15. The method of claim 1, wherein the graph is mined using an approach that considers only (1) the relations between the plurality of documents and the plurality of related law clauses while omitting (2) the relations between the plurality of documents.
16. The method of claim 1, wherein the graph is mined using an approach that considers both (1) the relations between the plurality of documents and the plurality of related law clauses and (2) the relations between the plurality of documents.
17. A computer program product for performing a legal document search, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to perform a method comprising: finding, by a processor, for each of a plurality of documents, a respective law clause related thereto, to obtain a plurality of related law clauses; constructing, by the processor, a graph having nodes defined by the plurality of documents and the plurality of related law clauses and having edges defined by relations between (1) the plurality of documents and the plurality of related law clauses and (2) the plurality of documents; and identifying, by the processor, from the plurality of documents, one or more candidate documents that are similar to an input query document by mining the graph using similarity criteria.
18. The system of claim 17, wherein the processor determines a respective confidence score for each pairing of a given one of the plurality of documents and the respective law clause related thereto, the respective confidence score serving as a ranking for the given one of the plurality of documents with respect to the respective law clause related thereto.
19. The system of claim 18, wherein the one or more candidate documents comprise a plurality of candidate documents, and the processor re-ranks the plurality of candidate documents based on a number of the plurality of related law clauses that occur in both the plurality of candidate documents and the input query document.
20. A system for performing a legal document search, the system comprising: a hardware processor and a memory device, configured to: find, for each of a plurality of documents, a respective law clause related thereto, to obtain a plurality of related law clauses; construct a graph having nodes defined by the plurality of documents and the plurality of related law clauses and having edges defined by (1) relations between the plurality of documents and the plurality of related law clauses and (2) relations between the plurality of documents; and identify, from the plurality of documents, one or more candidate documents that are similar to an input query document by mining the graph using similarity criteria; and a transmission server for transmitting, over one or more networks, the one or more candidate documents to a remote computing device.
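
The claims mention mining the graph with a random walk formulation that includes a restart component. The sketch below shows one common way such a walk can be computed by power iteration on a small document-clause graph. It is an illustration under stated assumptions, not the claimed method: the function names, the `restart` probability of 0.15, the fixed iteration count, the handling of isolated nodes, and the toy graph are all hypothetical.

```python
def random_walk_with_restart(graph, query, restart=0.15, iterations=100):
    """Approximate visiting probabilities for a walk that restarts at `query`.

    graph: node -> {neighbor: weight}, undirected with positive weights.
    restart: probability of jumping back to the query at each step (assumed).
    """
    scores = {node: 0.0 for node in graph}
    scores[query] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for node, mass in scores.items():
            total = sum(graph[node].values())
            if total == 0:
                # Isolated node: return its walking mass to the query.
                nxt[query] += (1 - restart) * mass
                continue
            for neighbor, weight in graph[node].items():
                nxt[neighbor] += (1 - restart) * mass * weight / total
        # Restart component: a fixed share of all mass jumps back to the query.
        nxt[query] += restart * sum(scores.values())
        scores = nxt
    return scores


def rank_candidates(graph, query, doc_ids):
    """Rank documents other than the query by their random walk score."""
    scores = random_walk_with_restart(graph, query)
    candidates = [(d, scores[d]) for d in doc_ids if d != query]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)


# Toy graph, invented for illustration: documents q, a, b; law clauses c1, c2.
toy_graph = {
    "q": {"c1": 1.0},
    "a": {"c1": 1.0, "b": 1.0},
    "b": {"c2": 1.0, "a": 1.0},
    "c1": {"q": 1.0, "a": 1.0},
    "c2": {"b": 1.0},
}
for doc, score in rank_candidates(toy_graph, "q", ["q", "a", "b"]):
    print(doc, round(score, 4))
```

Here document a shares clause c1 with the query, while document b is reachable only through a, so a ranks above b. Because the walk passes through clause nodes, documents can score well through an indirect relation (a shared clause) even without a direct document-document edge.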

Assignees

IBM

Inventors

Classifications

G06F16/26
Visual data mining; Browsing structured data · CPC title
G06F16/24578
using ranking · CPC title
G06F16/93
Document management systems · CPC title
G06Q50/18 · Primary
Legal services · CPC title
G06F17/30011
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 58663615

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2017132730A1 cover?: A method and system are provided for performing a legal document search. The method includes finding, by a processor, for each of a plurality of documents, a respective law clause related thereto, to obtain a plurality of related law clauses. The method further includes constructing, by the processor, a graph having nodes defined by the plurality of documents and the plurality of related law cl…
Who is the assignee on this patent?: IBM
What technology area does this patent fall under?: Primary CPC classification G06Q50/18. Mapped technology areas include Physics.
When was this patent published?: Publication date May 11, 2017 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).