US11768954B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11768954-B2 |
| Application number | US-202016902535-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 16, 2020 |
| Priority date | Jun 16, 2020 |
| Publication date | Sep 26, 2023 |
| Grant date | Sep 26, 2023 |
Official abstract text for this publication.
The exemplary embodiments provide real-time data capture and processing, which improves data processing performance and speed and facilitates passing the processed data to various analytical sources, while maintaining superior data quality checks, particularly with respect to data elements associated with multiple data types. The proposed system and process can be used to continuously listen to and consume multiple events while mapping the events to appropriate schemas provided in a separate schema stream. The schema stream is provided once and cached to minimize the bandwidth consumed by the transaction stream. The schema information is then further enriched with information from a metadata registry. The event data may then be compressed and aligned in memory tables based on the enriched schema. Once events are decoded and sorted into memory tables in accordance with the identified schema, each memory table can be processed in parallel.
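The schema caching and table sorting described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `fingerprint` function (a truncated SHA-256 over sorted field names), the `SchemaCache` class, and `sort_into_tables` are all hypothetical names chosen for illustration, and the sketch assumes that an event's field names are enough to identify its schema.

```python
import hashlib
from collections import defaultdict


def fingerprint(field_names):
    """Map a set of field names to a short, stable hex string.

    Assumption: the fingerprint is derived from field names only;
    the patent does not prescribe this particular scheme.
    """
    joined = "|".join(sorted(field_names))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]


class SchemaCache:
    """Holds schemas received once on the schema channel, keyed by fingerprint."""

    def __init__(self):
        self._by_fingerprint = {}

    def register(self, schema):
        # Store the schema under the fingerprint of its fields.
        fp = fingerprint(schema["fields"])
        self._by_fingerprint[fp] = schema
        return fp

    def lookup(self, fp):
        # Return the cached schema, or None if it was never received.
        return self._by_fingerprint.get(fp)


def sort_into_tables(events, cache):
    """Group events into in-memory tables, one table per schema fingerprint."""
    tables = defaultdict(list)
    for event in events:
        fp = fingerprint(event.keys())
        if cache.lookup(fp) is None:
            raise KeyError(f"no cached schema for fingerprint {fp}")
        tables[fp].append(event)
    return dict(tables)
```

Because each table holds events of a single schema, the tables are independent of one another and could be handed to separate workers.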
Opening claim text (preview).
What is claimed is:
1. A non-transitory computer-accessible medium having stored thereon computer-executable instructions for optimal batch processing of transaction streams in real-time, wherein, when a computer arrangement executes the instructions, the computer arrangement is configured to perform procedures comprising: decomposing each transaction in a transaction stream into one or more discrete events, wherein the transaction stream is decomposed into a plurality discrete events; determining a fingerprint for each of the plurality of discrete events, by mapping one or more event data items into shorter bit string that uniquely identifies a corresponding event data item; receiving a plurality of schemas in a scheme stream transmitted via a distinct schema channel, wherein the plurality of schemas are transmitted once and stored in a cache memory; determining at least one schema for each of the plurality of discrete events based on a corresponding fingerprint, wherein a change in one or more data values associated with a discrete event results in a new schema; compressing, two or more discrete events, corresponding to a common row, into a single discrete event, the common row being identified based on the at least one schema, wherein the at least one schema is enhanced with one or more supplemental metadata provided by a metadata registry; storing the plurality of discrete events across a parallel arrangement of one or more source tables, based on a schema hash value associated with the at least one schema; scanning, in parallel, one or more source tables for one or more untokenized data elements corresponding to a sensitive information record, wherein each of the one or more source tables is associated with a distinct schema hash value one or more discrete events; and tokenizing the one or more untokenized data in parallel across the one or more source tables; and writing back the one or more tokenized data to one or more corresponding source tables in parallel.
2. The computer-accessible medium of claim 1, wherein the computer arrangement is configured to determine the at least one schema by: sending the fingerprint to a fingerprint module; and receiving the at least one schema from the fingerprint module.
3. The computer-accessible medium of claim 1, wherein each of the one or more source tables includes insert events, update events, and delete events.
4. The computer-accessible medium of claim 1, wherein the computer arrangement is configured to determine the data quality using a data quality module.
5. The computer-accessible medium of claim 1, wherein the computer arrangement is configured to tokenize each of the events by: using a tokenization module.
6. The computer-accessible medium of claim 1, wherein the computer arrangement is configured to, receive one or more new schema, associated with the change in the one or more data values, via the distinct schema channel, and compute a new fingerprint for each of the one or more new schemas, wherein the one or more new schemas are stored in a cache indexed by the new fingerprint.
7. The computer-accessible medium of claim 1, wherein the computer arrangement is further configured to compress the one or more source tables prior to storing it in the at least one database.
8. The computer-accessible medium of claim 1, wherein the at least one database includes at least two databases, and wherein the computer arrangement is further configured to: store the one or more source tables in a first database of the at least two databases; and mirror the first database in a second database of the at least two databases.
9. The computer-accessible medium of claim 1, wherein the computer arrangement is further configured to: split the plurality of discrete transaction events, associated with a transaction stream, into a first set of events and a second set of events; batch process the first set of events using a first processor; and batch process the second set of events using a second processor.
10. The computer-accessible medium of claim 1, wherein the discrete transaction events are related to a loan application for a person.
11. The computer-accessible medium of claim 10, wherein the loan application is for an automobile loan.
12. A method for real-time capturing of data changes in a transaction stream, the method comprising: decomposing each transaction in a transaction stream into one or more discrete events, wherein the transaction stream is decomposed into a plurality discrete events; determining a fingerprint for each of the plurality of discrete events, by mapping one or more event data items into shorter bit string that uniquely identifies a corresponding event data item; receiving a plurality of schemas in a scheme stream transmitted via a distinct schema channel, wherein the plurality of schemas are transmitted once and stored in a cache memory; determining at least one schema for each of the plurality of discrete events based on a corresponding fingerprint, wherein a change in one or more data values associated with a discrete event results in a new schema; compressing, two or more of discrete events, corresponding to a common row, into a single discrete event, the common row being identified based on the at least one schema, wherein the at least one schema is enhanced with one or more supplemental metadata provided by a metadata registry; storing the plurality discrete events across a parallel arrangement of one or more source tables, based on a schema hash value associated with the at least one schema; scanning, in parallel, one or more source tables for one or more untokenized data elements corresponding to a sensitive information record, wherein each of the one or more source tables is associated with a distinct schema hash value one or more discrete events; and tokenizing the one or more untokenized data in parallel across the one or more source tables; and writing back the one or more tokenized data to one or more corresponding source tables in parallel.
13. The method of claim 12, wherein each of the one or more source tables include multiple insert events, update events, and delete events.
14. The method of claim 12, wherein the computer arrangement is configured to determine the at least one schema by: sending the at least one fingerprint to a fingerprint module; and receiving the at least one schema from the fingerprint module.
15. The method of claim 12, wherein the computer arrangement is further configured to identify one or more first data fields for each transaction in the transaction stream, and tokenize each of the one or more first data fields.
16. The method of claim 12, wherein the computer arrangement is further configured to compress each transaction prior to storing them in a database.
17. The method of claim 16, wherein the computer arrangement is configured to receive one or more new schema, associated with the change in the one or more data values, via the distinct schema channel, and compute a new fingerprint for each of the one or more new schemas, wherein the one or more new schemas are stored in a cache indexed by the new fingerprint.
18. The method of claim 12, wherein the fingerprint is attached to each discrete transaction event in the transaction stream at transmission.
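The compression and parallel tokenization steps recited in claims 1 and 12 can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the claimed implementation: `SENSITIVE_FIELDS`, `compress_events`, `tokenize_value`, `tokenize_table`, and `tokenize_all` are hypothetical names, each event is assumed to carry a row `key` and a `data` mapping, delete events are ignored, and a truncated SHA-256 digest stands in for a real tokenization service.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

# Assumption: the set of sensitive field names is known in advance.
SENSITIVE_FIELDS = {"ssn"}


def compress_events(events):
    """Collapse events that share a row key into a single event.

    Later values overwrite earlier ones for the same field.
    """
    merged = {}
    for event in events:
        row = merged.setdefault(event["key"], {"key": event["key"], "data": {}})
        row["data"].update(event["data"])
    return list(merged.values())


def tokenize_value(value):
    # Stand-in for a tokenization module: a prefixed, truncated digest.
    return "tok_" + hashlib.sha256(str(value).encode("utf-8")).hexdigest()[:12]


def tokenize_table(rows):
    """Scan one table and replace untokenized sensitive values with tokens."""
    out = []
    for row in rows:
        data = {}
        for name, value in row["data"].items():
            needs_token = name in SENSITIVE_FIELDS and not str(value).startswith("tok_")
            data[name] = tokenize_value(value) if needs_token else value
        out.append({"key": row["key"], "data": data})
    return out


def tokenize_all(tables):
    """Tokenize every table concurrently; tables maps schema hash to rows."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(tokenize_table, tables.values())
        return dict(zip(tables.keys(), results))
```

Values that already carry the `tok_` prefix are skipped, so running `tokenize_all` a second time leaves the tables unchanged. A thread pool is used here only for brevity; CPU-bound workloads in Python would more plausibly use separate processes.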
Protecting personal data, e.g. for financial or medical purposes · CPC title
Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs · CPC title
Transactional file systems · CPC title
Schema design and management · CPC title
Tablespace storage structures; Management thereof · CPC title