Systems and methods for automatic identification of potential material facts in documents

US2016140210A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2016140210-A1
Application number: US-201514944692-A
Country: US
Kind code: A1
Filing date: Nov 18, 2015
Priority date: Nov 19, 2014
Publication date: May 19, 2016
Grant date:

Abstract

Official abstract text for this publication.

Systems and methods to identify potential material fact sentences in electronic legal documents obtained from electronic repositories are disclosed. A system includes a processing device and a storage medium in communication with the processing device. The storage medium includes programming instructions that cause the processing device to obtain a document and parse text within the document to determine whether each paragraph in the document is a fact paragraph, a discussion paragraph, or an outcome paragraph based on at least one of a heading associated with the paragraph and features of the paragraph. The storage medium further includes programming instructions that cause the processing device to extract each sentence in the fact paragraph, direct a trained sentence classifier to determine whether each sentence is a potential material fact sentence or a non-material fact sentence based on features of the sentence, and identify potential material fact sentences.
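The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the heading keywords in `HEADING_LABELS`, the naive sentence splitter, and `toy_classifier` (a stand-in for the trained sentence classifier) are all assumptions made for the example, and paragraph labeling here uses headings only, not paragraph features.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical heading keywords; the abstract does not enumerate them.
HEADING_LABELS = {
    "facts": "fact",
    "background": "fact",
    "discussion": "discussion",
    "analysis": "discussion",
    "conclusion": "outcome",
    "disposition": "outcome",
}


def label_paragraphs(blocks: List[str]) -> List[Tuple[str, str]]:
    """Give each paragraph the label of the most recent recognized heading."""
    current = "unknown"
    labeled = []
    for block in blocks:
        key = block.strip().lower()
        if key in HEADING_LABELS:
            current = HEADING_LABELS[key]  # heading line: update the label, emit nothing
            continue
        labeled.append((current, block.strip()))
    return labeled


def split_sentences(paragraph: str) -> List[str]:
    """Naive splitter: break after '.', '?' or '!' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.?!])\s+", paragraph) if s]


def potential_material_facts(
    blocks: List[str], classify: Callable[[str], bool]
) -> List[str]:
    """Return sentences from fact paragraphs that the classifier flags."""
    results = []
    for label, paragraph in label_paragraphs(blocks):
        if label != "fact":
            continue  # only fact paragraphs are passed to the sentence classifier
        for sentence in split_sentences(paragraph):
            if classify(sentence):
                results.append(sentence)
    return results


def toy_classifier(sentence: str) -> bool:
    """Placeholder for a trained model: flags a four-digit year or a dollar amount."""
    return bool(re.search(r"\b(19|20)\d{2}\b|\$\d", sentence))


document = [
    "FACTS",
    "The plaintiff signed the lease in 2009. The parties dispute little else.",
    "DISCUSSION",
    "In 2011 the court held otherwise.",
    "CONCLUSION",
    "Affirmed.",
]
flagged = potential_material_facts(document, toy_classifier)
```

The splitter would mis-split legal abbreviations such as "v." or "Inc."; a real system would presumably rely on a proper sentence tokenizer and a model trained on labeled data.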

First claim

Opening claim text (preview).

1. A system to identify potential material fact sentences in electronic legal documents obtained from electronic repositories, the system comprising: a processing device; and a non-transitory, processor-readable storage medium in communication with the processing device, the non-transitory, processor-readable storage medium comprising one or more programming instructions that, when executed, cause the processing device to: obtain an electronic legal document from a repository, parse text within the electronic legal document to determine whether each one of one or more paragraphs in the legal document is a fact paragraph, a discussion paragraph, or an outcome paragraph based on at least one of a heading associated with the paragraph and one or more features of the paragraph, and for each one of the one or more paragraphs that is a fact paragraph: extract each one of one or more sentences in the fact paragraph, direct a trained sentence classifier to determine whether each one of the one or more sentences is a potential material fact sentence or a non-material fact sentence based on one or more features of the sentence, and identify one or more potential material fact sentences from the one or more sentences based on the determination.

2. The system of claim 1, wherein the one or more features of the sentence is selected from a group consisting of a number of noun phrases, a number of verb phrases, a number of dates, a number of time stamps, a number of monetary values, a number of lower court actions, a number of present court actions, a number of plaintiff actions, a number of legal phrases, a number of legal concepts, a number of non-material fact words, and a number of non-material fact phrases.

3. The system of claim 1, wherein the trained sentence classifier determines whether each one of the one or more sentences is a potential material fact sentence or a non-material fact sentence by running a natural language parser on each one of the one or more sentences to determine the one or more features of the sentence.

4. The system of claim 1, wherein the trained sentence classifier determines whether each one of the one or more sentences is a potential material fact sentence or a non-material fact sentence by scoring the one or more features based on a trained model generated by a support vector machine algorithm from training data.

5. The system of claim 1, wherein the trained sentence classifier determines whether each one of the one or more sentences is a potential material fact sentence or a non-material fact sentence by scoring the one or more features based on a trained model generated by a decision tree algorithm from training data.

6. The system of claim 1, wherein the trained sentence classifier determines whether each one of the one or more sentences is a potential material fact sentence or a non-material fact sentence by scoring the one or more features based on a trained model generated by a naïve Bayes algorithm from training data.

7. The system of claim 1, wherein the trained sentence classifier determines whether each one of the one or more sentences is a potential material fact sentence or a non-material fact sentence by scoring the one or more features based on a trained model generated from a stacking committee of classifiers algorithm from training data and data outputted from one or more base classifiers.

8. The system of claim 1, wherein the heading is a facts heading, a discussion heading, or a outcome heading.

9. The system of claim 1, wherein the one or more features of the paragraph is selected from a group consisting of a position of the paragraph, a number of cases, a number of statutes, a number of past tense verbs, a number of present court words, a number of lower court words, a number of legal phrases, a number of defendant words, a number of plaintiff words, a number of dates, a number of signal words, and a number of footnotes.

10. A method to identify potential material fact sentences in electronic legal documents obtained from electronic repositories, the method comprising: obtaining, by a processing device, an electronic legal document from a repository; parsing, by the processing device, text within the electronic legal document to determine whether each one of one or more paragraphs in the legal document is a fact paragraph, a discussion paragraph, or an outcome paragraph based on at least one of a heading associated with the paragraph and one or more features of the paragraph; and for each one of the one or more paragraphs that is a fact paragraph: extracting, by the processing device, each one of one or more sentences in the fact paragraph, directing, by the processing device, a trained sentence classifier to determine whether each one of the one or more sentences is a potential material fact sentence or a non-material fact sentence based on one or more features of the sentence, and identifying, by the processing device, one or more potential material fact sentences from the one or more sentences based on the determination.

11. The method of claim 10, wherein the one or more features of the sentence is selected from a group consisting of a number of noun phrases, a number of verb phrases, a number of dates, a number of time stamps, a number of monetary values, a number of lower court actions, a number of present court actions, a number of plaintiff actions, a number of legal phrases, a number of legal concepts, a number of non-material fact words, and a number of non-material fact phrases.

12. The method of claim 10, wherein the trained sentence classifier determines whether each one of the one or more sentences is a potential material fact sentence or a non-material fact sentence by running a natural language parser on each one of the one or more sentences to determine the one or more features of the sentence.

13. The method of claim 10, wherein the trained sentence classifier determines whether each one of the one or more sentences is a potential material fact sentence or a non-material fact sentence by scoring the one or more features based on a trained model generated by a support vector machine algorithm from training data.

14. The method of claim 10, wherein the trained sentence classifier determines whether each one of the one or more sentences is a potential material fact sentence or a non-material fact sentence by scoring the one or more features based on a trained model generated by a decision tree algorithm from training data.

15. The method of claim 10, wherein the trained sentence classifier determines whether each one of the one or more sentences is a potential material fact sentence or a non-material fact sentence by scoring the one or more features based on a trained model generated by a naïve Bayes algorithm from training data.

16. The method of claim 10, wherein the trained sentence classifier determines whether each one of the one or more sentences is a potential material fact sentence or a non-material fact sentence by scoring the one or more features based on a trained model generated from a stacking committee of classifiers algorithm from training data and data outputted from one or more base classifiers.

17. The method of claim 10, wherein the heading is a facts heading, a discussion heading, or a outcome heading.

18. The method of claim 10, wherein the one or more features of the paragraph is selected from a group consisting of a position of the paragraph, a number of cases, a number of statutes, a number of past tense verbs, a number of
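
Claims 2 and 11 name countable sentence features (dates, time stamps, monetary values, legal phrases, and so on) without saying how they are detected. The sketch below shows one way a few of those counts might be computed before being handed to a trained model. The regular expressions and the `LEGAL_PHRASES` list are illustrative assumptions, not taken from the patent, and the claims also mention a natural language parser, which this example does not use.

```python
import re
from typing import Dict

# Illustrative patterns only; the claims name the features, not the detection rules.
MONTHS = (
    r"(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|"
    r"Jul(?:y)?|Aug(?:ust)?|Sep(?:t(?:ember)?)?|Oct(?:ober)?|Nov(?:ember)?|"
    r"Dec(?:ember)?)"
)
DATE_RE = re.compile(rf"\b{MONTHS}\.? \d{{1,2}}, \d{{4}}\b")  # e.g. "March 3, 2012"
TIME_RE = re.compile(r"\b\d{1,2}:\d{2}\b")                    # e.g. "9:30"
MONEY_RE = re.compile(r"\$\d[\d,]*(?:\.\d{2})?")              # e.g. "$1,500.00"

# Hypothetical phrase list standing in for a curated legal-phrase dictionary.
LEGAL_PHRASES = ("summary judgment", "breach of contract", "motion to dismiss")


def sentence_features(sentence: str) -> Dict[str, int]:
    """Count a handful of the sentence features named in claims 2 and 11."""
    lowered = sentence.lower()
    return {
        "dates": len(DATE_RE.findall(sentence)),
        "time_stamps": len(TIME_RE.findall(sentence)),
        "monetary_values": len(MONEY_RE.findall(sentence)),
        "legal_phrases": sum(lowered.count(p) for p in LEGAL_PHRASES),
    }


factual = sentence_features(
    "On March 3, 2012, at 9:30 p.m., the defendant paid $1,500.00 toward the lease."
)
procedural = sentence_features("The court granted summary judgment.")
```

A feature dictionary like this would then be converted to a numeric vector and scored by whichever trained model is in use (the claims mention support vector machine, decision tree, naïve Bayes, and stacking variants).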

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
LexisNexis Division of Reed Elsevier Inc
What technology area does this patent fall under?
Primary CPC classification: G06F40/205 (parsing of natural language data). Mapped technology areas include Physics (CPC section G).
When was this patent published?
Publication date: May 19, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).