US12437572B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12437572-B2 |
| Application number | US-202217991321-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 21, 2022 |
| Priority date | Nov 21, 2022 |
| Publication date | Oct 7, 2025 |
| Grant date | Oct 7, 2025 |
Official abstract text for this publication.
A method and a system for preparing a response to a subpoena by using an automated subpoena processing and handling model is provided. The method includes: receiving a first subpoena; extracting a set of informational requests from the first subpoena; retrieving a data set from a memory; analyzing the first data set with respect to the set of informational requests in order to identify items from within the data set that are responsive to items included within the set of informational requests; and generating a report that includes a result of the analysis. The analysis is performed by providing the first subpoena as an input to a Robustly optimized Bidirectional Encoder Representations from Transformers pre-training approach (RoBERTa) model that uses a sub-word approach to modeling words in a sequential format.
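The analysis step in the abstract identifies data-set items that are responsive to the extracted informational requests, and claims 2 and 10 describe doing this linkage with a fuzzy matching algorithm. The patent text shown here does not name a specific algorithm, so the following is only a minimal sketch that assumes the standard-library `difflib.SequenceMatcher` ratio as the similarity measure. The function names `similarity` and `link_responsive_items`, the 0.85 threshold, and the sample strings are all hypothetical and chosen for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1] (assumed measure, not from the patent)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def link_responsive_items(
    requests: List[str],
    records: List[str],
    threshold: float = 0.85,  # hypothetical cut-off; tune on real data
) -> Dict[str, List[Tuple[str, float]]]:
    """Map each informational request to the records whose similarity meets the threshold."""
    links: Dict[str, List[Tuple[str, float]]] = {}
    for request in requests:
        scored = [(record, similarity(request, record)) for record in records]
        matches = [(record, score) for record, score in scored if score >= threshold]
        # Best match first, so a reviewer sees the strongest linkage at the top.
        links[request] = sorted(matches, key=lambda pair: pair[1], reverse=True)
    return links


# Illustrative data only: one requested subject name and three stored records.
requests = ["Jonathan Q. Smith"]
records = ["Jonathon Q Smith", "Joan Smythe", "Acme Holdings LLC"]
links = link_responsive_items(requests, records)
for request, matches in links.items():
    print(request, "->", [record for record, _ in matches])
```

A character-level ratio tolerates small spelling and punctuation differences between a subpoena and stored records. In practice the threshold would need to be validated against previously answered subpoenas.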
Opening claim text (preview).
What is claimed is:
1. A method for preparing a response to a subpoena, the method being implemented by at least one processor, the method comprising: receiving, by the at least one processor, a first subpoena; extracting, by the at least one processor from the first subpoena, a first plurality of informational requests; retrieving, by the at least one processor from a memory, a first data set; analyzing, by the at least one processor, the first data set with respect to the first set of informational requests in order to identify items from within the first data set that are responsive to items included within the first set of informational requests; generating, by the at least one processor, a report that includes a result of the analyzing; and transmitting, by the at least one processor to a predetermined destination, the report, wherein the analyzing comprises providing the first subpoena as an input to a Robustly optimized Bidirectional Encoder Representations from Transformers pre-training approach (RoBERTa) model that uses a sub-word approach to modeling words in a sequential format, and wherein the RoBERTa model is generated by: retrieving a plurality of second subpoenas for which responses have previously been generated; extracting, from each respective second subpoena of the plurality of second subpoenas, a first set of tokens; determining, for each respective token from within the first set of tokens, a corresponding set of spatial information that relates to a location of the respective token within the respective second subpoena, the corresponding set of spatial information including a pixel offset, a width, a height, a line number, a block association, and an extraction confidence; and assigning, for each respective token from within the first set of tokens, a corresponding tag that indicates a type of information from among a predetermined set of information types.
2. The method of claim 1, wherein the analyzing comprises applying a fuzzy matching algorithm to link items from within the first data set with items from within the first set of informational requests.
3. The method of claim 1, wherein the RoBERTa model uses a sliding window that includes 64 tokens that include 48 tokens included at an end of a previous window and 16 tokens that occur sequentially after the 48 tokens included at the end of the previous window.
4. The method of claim 1, wherein the RoBERTa model generates, as an output, a calibration score that relates to a confidence level of a linkage of the items from within the first data set with items from within the first set of informational requests.
5. The method of claim 1, wherein the first set of informational requests includes first information that relates to an authority associated with the first subpoena, second information that relates to a subject of the first subpoena, and third information that relates to at least one account associated with the subject of the first subpoena.
6. The method of claim 5, wherein the first information includes at least one from among a requestor name and a requestor address.
7. The method of claim 5, wherein the second information includes at least one from among a name of a person, a name of an organization, an address, a social security number, a tax identification number, and a date of birth.
8. The method of claim 5, wherein the third information includes at least one from among information that relates to a credit card account, information that relates to a debit card account, information that relates to a checking account, and a financial identifier.
9. A computing apparatus for preparing a response to a subpoena, the computing apparatus comprising: a processor; a memory; and a communication interface coupled to each of the processor and the memory, wherein the processor is configured to: receive, via the communication interface, a first subpoena; extract, from the first subpoena, a first plurality of informational requests; retrieve, from the memory, a first data set; analyze the first data set with respect to the first set of informational requests in order to identify items from within the first data set that are responsive to items included within the first set of informational requests; generate a report that includes a result of the analysis; and transmit the report to a predetermined destination via the communication interface, wherein the processor is further configured to provide the first subpoena as an input to a Robustly optimized Bidirectional Encoder Representations from Transformers pre-training approach (RoBERTa) model that uses a sub-word approach to modeling words in a sequential format, and wherein the processor is further configured to generate the RoBERTa model by: retrieving a plurality of second subpoenas for which responses have previously been generated; extracting, from each respective second subpoena of the plurality of second subpoenas, a first set of tokens; determining, for each respective token from within the first set of tokens, a corresponding set of spatial information that relates to a location of the respective token within the respective second subpoena, the corresponding set of spatial information including a pixel offset, a width, a height, a line number, a block association, and an extraction confidence; and assigning, for each respective token from within the first set of tokens, a corresponding tag that indicates a type of information from among a predetermined set of information types.
10. The computing apparatus of claim 9, wherein the processor is further configured to apply a fuzzy matching algorithm to link items from within the first data set with items from within the first set of informational requests.
11. The computing apparatus of claim 9, wherein the RoBERTa model uses a sliding window that includes 64 tokens that include 48 tokens included at an end of a previous window and 16 tokens that occur sequentially after the 48 tokens included at the end of the previous window.
12. The computing apparatus of claim 9, wherein the RoBERTa model generates, as an output, a calibration score that relates to a confidence level of a linkage of the items from within the first data set with items from within the first set of informational requests.
13. The computing apparatus of claim 9, wherein the first set of informational requests includes first information that relates to an authority associated with the first subpoena, second information that relates to a subject of the first subpoena, and third information that relates to at least one account associated with the subject of the first subpoena.
14. The computing apparatus of claim 13, wherein the first information includes at least one from among a requestor name and a requestor address.
15. The computing apparatus of claim 13, wherein the second information includes at least one from among a name of a person, a name of an organization, an address, a social security number, a tax identification number, and a date of birth.
16. The computing apparatus of claim 13, wherein the third information includes at least one from among information that relates to a credit card account, information that relates to a debit card account, information that relates to a checking account, and a financial identifier.
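The sliding window recited in claims 3 and 11 (64 tokens per window, of which 48 repeat the end of the previous window and 16 are new) amounts to advancing the window by 16 tokens at each step. The sketch below illustrates only that arithmetic. The function name `sliding_windows`, the placeholder token strings, and the handling of a short final window are assumptions made for illustration and are not taken from the patent.

```python
from typing import List, Sequence

WINDOW_SIZE = 64                 # tokens per window (claims 3 and 11)
OVERLAP = 48                     # tokens repeated from the end of the previous window
STRIDE = WINDOW_SIZE - OVERLAP   # 16 new tokens per step


def sliding_windows(
    tokens: Sequence[str],
    size: int = WINDOW_SIZE,
    stride: int = STRIDE,
) -> List[List[str]]:
    """Split tokens into overlapping windows that advance by `stride` tokens.

    Assumption: when the input length does not line up with the stride,
    the last window is simply shorter than `size`.
    """
    if size <= 0 or not 0 < stride <= size:
        raise ValueError("size must be positive and 0 < stride <= size")
    if not tokens:
        return []
    windows: List[List[str]] = []
    start = 0
    while True:
        windows.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # this window already reaches the end of the input
        start += stride
    return windows


# Placeholder tokens; a real pipeline would use sub-word tokens from a tokenizer.
tokens = [f"tok{i}" for i in range(96)]
windows = sliding_windows(tokens)
print(len(windows), [len(w) for w in windows])
```

With a stride of 16, every token other than those near the start and end of the document appears in four consecutive windows, so the model sees each token with several different amounts of left and right context.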
Lexical analysis, e.g. tokenisation or collocates · CPC title
Document management systems · CPC title
Legal services · CPC title
Extracting the logical structure, e.g. chapters, sections or page numbers; Identifying elements of the document, e.g. authors · CPC title