Citation and policy based document classification

US11941565B2 · US · B2

Patent metadata
Publication number: US-11941565-B2
Application number: US-202117345907-A
Country: US
Kind code: B2
Filing date: Jun 11, 2021
Priority date: Jun 11, 2020
Publication date: Mar 26, 2024
Grant date: Mar 26, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed herein are system, method, and computer program product embodiments for rapid identification and access to relevant regulatory documents. A data model relating regulatory mandates and requirements to citations appearing within an enforcement document is used to rapidly access specific citations within an enforcement document. In the case of image-based enforcement documents, the originality of these documents is preserved while allowing a user to see where the relevant citations appear in the document images. The relevant citations are further compared to business policies to identify potential impacts of regulatory mandates and requirements.

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: extracting, by at least one processor, a citation from a document; parsing, by the at least one processor, the extracted citation into first one or more sub-citations; comparing, by the at least one processor, the first one or more sub-citations to second one or more sub-citations of a reference citation identified in a business policy document responsive to a business requirement, wherein the comparing identifies a relationship between the citation and the business policy document based on a level of matching between the first one or more sub-citations and the second one or more sub-citations; mapping, by the at least one processor and in response to identifying the relationship, the first one or more sub-citations to one or more policies in the business policy document; and classifying, by the at least one processor, the document as relevant to the business policy document based on the level of matching between the first one or more sub-citations and the second one or more sub-citations.

2. The method of claim 1, wherein extracting the citation from the document comprises: comparing text in the document to a citation key; and in response to identifying a first string of characters from the text that matches the citation key, extracting a second string of characters comprising the first string of characters, a third string of characters located in the text immediately before the first string and a fourth string of characters located in the text immediately after the first string.

3. The method of claim 1, wherein extracting the citation from the document comprises: comparing text in the document to a citation key, wherein the citation key comprises a first string of characters; and in response to identifying a second string of characters from the text that differs from the citation key by at least one character and having a matching metric greater than a threshold, marking the second string of characters as a citation extraction error.

4. The method of claim 1, wherein: the first one or more sub-citations comprises a first root and a first subsection; the second one or more sub-citations comprises a second root and a second subsection; and classifying the document based on the level of matching between the first one or more sub-citations and the second one or more sub-citations comprises: in response to the first one or more sub-citations matching the second one or more sub-citations, classifying the document as an exact match for the business policy document; in response to the first root matching the second root and the first subsection matching the second subsection, classifying the document as a subsection match for the business policy document; in response to the first root matching the second root and the first subsection being different than the second subsection, classifying the document as a root match for the business policy document; and in response to the first root being different than the second root, classifying the document as a non-match for the business policy document.

5. The method of claim 1, further comprising: extracting, by the at least one processor, one or more fine amounts from the document, wherein the one or more fine values correspond to the citation; removing, by the at least one processor, each duplicate fine amount from the one or more fine amounts to form a set of fine amounts; and combining, by the at least one processor, the set of fine amounts into a total fine amount corresponding to the citation.

6. The method of claim 1, further comprising: extracting, by the at least one processor, a fine amount from the document, wherein the fine amount comprises a number prefix and a word suffix; and combining, by the at least one processor, the number prefix and the word suffix to form a number amount for the fine amount.

7. The method of claim 1, further comprising: determining, by the at least one processor, a score describing an impact of the document on a business based on the business policy document, the citation, the classification of the document, and text in the document related to the citation; and in response to the score being greater than a threshold, presenting, by the at least one processor, a summary of the impact of the document on the business, wherein the summary comprises one or more of a link to the business policy document, the citation, a link to a reference document described by the citation, a classification of the document, a link to the document, or the score.

8. The method of claim 7, further comprising: extracting, by the at least one processor, a total fine amount corresponding to the citation, wherein determining the score describing the impact of the document on the business is further based on the total fine amount; and wherein the summary further comprises the total fine amount.

9. A system, comprising: one or more processors; memory communicatively coupled to the one or more processors, the memory storing instructions which, when executed by the one or more processors, cause the one or more processors to: extract a citation from a document; parse the extracted citation into first one or more sub-citations; compare the first one or more sub-citations to second one or more sub-citations of a reference citation identified in a business policy document responsive to a business requirement, wherein the comparing identifies a relationship between the citation and the business policy document based on a level of matching between the first one or more sub-citations and the second one or more sub-citations; map, in response to identifying the relationship, the first one or more sub-citations to one or more policies in the business policy document; and classify the document as relevant to the business policy document based on the first one or more sub-citations and the second one or more sub-citations.

10. The system of claim 9, wherein the instructions are further configured to extract the citation from the document by: comparing text in the document to a citation key; and in response to identifying a first string of characters from the text that matches the citation key, extracting a second string of characters comprising the first string of characters, a third string of characters located in the text immediately before the first string and a fourth string of characters located in the text immediately after the first string.

11. The system of claim 9, wherein the instructions are further configured to extract the citation from the document by: comparing text in the document to a citation key, wherein the citation key comprises a first string of characters; and in response to identifying a second string of characters from the text that differs from the citation key by at least one character and having a matching metric greater than a threshold, marking the second string of characters as a citation extraction error.

12. The system of claim 9, wherein: the first one or more sub-citations comprises a first root and a first subsection; the second one or more sub-citations comprises a second root and a second subsection; and the instructions are further configured to classify the document based on the level of matching between the first one or more sub-citations and the second one or more sub-citations by: in response to the first one or more sub-citations matching the second one or more sub-citations, classifying the document as an exact match for the business policy document; in response to the first root matching the second root and the first subsection matching the second subsection, classifying the document as
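
The parsing and match-level steps recited in claims 1 and 4 can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the CFR-style citation shape (for example "12 CFR 1026.19(e)(1)"), the choice of title, code, and part as the "root" and the section number as the "subsection", and the hypothetical names `parse_citation` and `match_level`.

```python
import re
from typing import List

# Assumed citation shape: "<title> <code> <part>.<section>(<para>)(<para>)..."
# e.g. "12 CFR 1026.19(e)(1)". The patent does not prescribe this format.
_CITATION = re.compile(
    r"^\s*(\d+)\s+([A-Za-z. ]+?)\s+(\d+)\.(\d+)((?:\([A-Za-z0-9]+\))*)\s*$"
)


def parse_citation(citation: str) -> List[str]:
    """Split a citation into sub-citations: [root, subsection, paragraph, ...]."""
    m = _CITATION.match(citation)
    if m is None:
        raise ValueError(f"unrecognized citation: {citation!r}")
    title, code, part, section, paragraphs = m.groups()
    root = f"{title} {code.strip().upper()} {part}"
    return [root, section] + re.findall(r"\(([A-Za-z0-9]+)\)", paragraphs)


def match_level(found: List[str], reference: List[str]) -> str:
    """Grade a document citation against a policy's reference citation.

    Mirrors the four levels named in claim 4; the order of checks is a
    design choice of this sketch.
    """
    if found[0] != reference[0]:
        return "non-match"          # roots differ
    if found == reference:
        return "exact match"        # every sub-citation agrees
    if found[1:2] == reference[1:2]:
        return "subsection match"   # root and subsection agree, deeper levels differ
    return "root match"             # only the root agrees


if __name__ == "__main__":
    reference = parse_citation("12 CFR 1026.19(e)(1)")
    for text in ("12 CFR 1026.19(e)(1)", "12 CFR 1026.19(f)",
                 "12 CFR 1026.37(a)", "12 CFR 1005.11(c)"):
        print(text, "->", match_level(parse_citation(text), reference))
```

In this sketch a document would be treated as relevant to a policy whenever the level is anything other than "non-match"; where that cut-off belongs is a judgment the claims leave to the level of matching.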

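Claims 5 and 6 describe combining a number prefix with a word suffix to form a numeric fine amount, removing duplicate amounts, and summing the rest. The Python sketch below shows one possible reading. The dollar-sign pattern, the multiplier words, the decision to treat two amounts as duplicates when their numeric values are equal, and the names `extract_fines` and `total_fine` are all assumptions for illustration, not details from the patent.

```python
import re
from decimal import Decimal
from typing import List

# Assumed word suffixes and their multipliers (hypothetical, not from the patent).
_MULTIPLIERS = {
    "thousand": Decimal(10) ** 3,
    "million": Decimal(10) ** 6,
    "billion": Decimal(10) ** 9,
}

# Matches "$250,000", "$1.5 million", "$ 2 billion", and similar.
_FINE = re.compile(
    r"\$\s?(\d[\d,]*(?:\.\d+)?)(?:\s+(thousand|million|billion))?",
    re.IGNORECASE,
)


def extract_fines(text: str) -> List[Decimal]:
    """Return every fine amount in the text as a plain number."""
    fines = []
    for number, word in _FINE.findall(text):
        amount = Decimal(number.replace(",", ""))
        if word:  # combine number prefix and word suffix (claim 6)
            amount *= _MULTIPLIERS[word.lower()]
        fines.append(amount)
    return fines


def total_fine(text: str) -> Decimal:
    """Drop duplicate amounts, then sum what remains (claim 5)."""
    unique = set(extract_fines(text))  # equal Decimal values collapse to one entry
    return sum(unique, Decimal(0))


if __name__ == "__main__":
    order = ("Respondent shall pay a civil money penalty of $1.5 million. "
             "The $1,500,000 penalty is due within 30 days, "
             "together with $250,000 in restitution.")
    print(total_fine(order))
```

Deduplicating by value keeps a penalty that is restated in a different form from being counted twice, but it would also merge two distinct fines that happen to share an amount, so a real system would likely need more context than this sketch uses.
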
Assignees

Inventors

Classifications

  • Prediction of business process outcome or impact based on a proposed change

  • Summarisation for human users

  • Indexing; Web crawling techniques

  • Extracting the logical structure, e.g. chapters, sections or page numbers; Identifying elements of the document, e.g. authors

  • Document matching, e.g. of document images

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11941565B2 cover?
Disclosed herein are system, method, and computer program product embodiments for rapid identification and access to relevant regulatory documents. A data model relating regulatory mandates and requirements to citations appearing within an enforcement document is used to rapidly access specific citations within an enforcement document.
Who is the assignee on this patent?
Capital One Services, LLC
What technology area does this patent fall under?
Primary CPC classification G06Q10/06375. Mapped technology areas include Physics.
When was this patent published?
Published Mar 26, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).