System and method for tracing data streamed across different platforms and identifying data manipulations performed across different platforms

US2025272213A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2025272213-A1
Application number: US-202418584687-A
Country: US
Kind code: A1
Filing date: Feb 22, 2024
Priority date: Feb 22, 2024
Publication date: Aug 28, 2025
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and system for tracing data streamed across differing different system platforms are disclosed. The method includes providing and storing context data corresponding to a data event published to a streaming service, extracting a data classifier block from the stored context data, and extracting a lineage tracer block from the stored context data. The method further includes converting the lineage tracer block into a linked lineage triple, and generating a lineage graph using the linked lineage triple for visualization.
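The first steps of the abstract (storing context data, then extracting the two blocks from it) can be pictured with a minimal Python sketch. The abstract does not define a record layout, so every key name here (`data_classifier`, `lineage_tracer`, `origin`, `transform`, `destination`), the in-memory list standing in for the database, and the helper names are assumptions made for illustration only.

```python
from typing import Any

# Hypothetical context-data record for one published data event.
# All key names and values are illustrative assumptions, not taken from the patent.
context_record: dict[str, Any] = {
    "event_id": "evt-001",
    "data_classifier": {
        "pii": ["customer_name", "email"],
        "sensitive": ["account_balance"],
    },
    "lineage_tracer": {
        "origin": {"entity": "crm.customers"},
        "transform": [{"level": "column", "operation": "mask_email"}],
        "destination": {"entity": "stream.customer_events"},
    },
}

# In-memory stand-in for the database that stores the context data.
context_store: list[dict[str, Any]] = []


def store_context(record: dict[str, Any]) -> None:
    """Persist one context-data record (here: append to a list)."""
    context_store.append(record)


def extract_block(record: dict[str, Any], block_name: str) -> dict[str, Any]:
    """Return the named block from a stored record, or {} if it is absent."""
    block = record.get(block_name, {})
    if not isinstance(block, dict):
        raise TypeError(f"{block_name!r} must be a JSON object")
    return block


store_context(context_record)
classifier_block = extract_block(context_store[0], "data_classifier")
tracer_block = extract_block(context_store[0], "lineage_tracer")
```

In this sketch both blocks travel in the same stored record, so a classifier and a lineage processor could each read the record independently and take only the block they need.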

First claim

Opening claim text (preview).

What is claimed is:

1. A method for tracing data streamed across differing different system platforms, the method comprising: providing, by each of a plurality of devices and to a database, context data corresponding to a data event published to a streaming service including a streaming pipeline; storing, in the database, the context data provided by each of the plurality of devices; reading, by a data classifier, the context data stored in the database and extracting a data classifier block from the stored context data; gathering, by the data classifier and from the extracted data classifier block, personal identification information and sensitive data elements; reading, by a lineage processor, the context data stored in the database and extracting a lineage tracer block from the stored context data; converting, by the lineage processor, the lineage tracer block into a linked lineage triple; processing, by the lineage processor, the linked lineage triple by tokenizing and deduplicating the linked lineage triple; and generating, by the data trace builder, a lineage graph using the tokenized and deduplicated linked lineage triple for visualization.

2. The method according to claim 1, wherein the lineage tracer block includes one or more of an origin data object, a transform data object and a destination data object.

3. The method according to claim 2, wherein the origin data object includes information related to an entity being sourced.

4. The method according to claim 2, wherein the destination data object includes information related to the data event being published.

5. The method according to claim 2, wherein the transform data object includes one or more transformations that occurred.

6. The method according to claim 5, wherein at least one of the one or more transformations is performed offline from the streaming pipeline.

7. The method according to claim 5, wherein the one or more transformations include a transformation at an entity level or a transformation at a column level.

8. The method according to claim 1, wherein the plurality of devices includes a data publisher device that is configured as a dedicated data publisher.

9. The method according to claim 1, wherein the plurality of devices includes a data publisher device that is configured to jointly operate as a data publisher and a data consumer.

10. The method according to claim 1, wherein the plurality of devices includes a data consumer device that is configured as a dedicated data consumer.

11. The method according to claim 1, wherein the lineage tracer block includes a mode type.

12. The method according to claim 11, wherein the mode type includes one of a streaming type and a batch type.

13. The method according to claim 11, wherein the lineage tracer block further includes a mode sub-type.

14. The method according to claim 13, wherein the sub-mode type includes one of a system of record and derived.

15. The method according to claim 1, wherein the linked lineage triple includes at least two nodes and an edge that connects the at least two nodes.

16. The method according to claim 1, further comprising: deriving at least one insight specific to a node by applying a graphic machine learning algorithm on the lineage graph.

17. The method according to claim 1, wherein the lineage tracer block is a JSON object qualified with prove ontology.

18. The method according to claim 5, wherein at least one the one or more transformations is determined based on property attributes on nodes present in the lineage graph.

19. A system for tracing data streamed across differing different system platforms, the system comprising: a memory; and a processor, wherein the system is configured to perform: providing, by each of a plurality of devices and to a database, context data corresponding to a data event published to a streaming service including a streaming pipeline; storing, in the database, the context data provided by each of the plurality of devices; reading, by a data classifier, the context data stored in the database and extracting a data classifier block from the stored context data; gathering, by the data classifier and from the extracted data classifier block, personal identification information and sensitive data elements; reading, by a lineage processor, the context data stored in the database and extracting a lineage tracer block from the stored context data; converting, by the lineage processor, the lineage tracer block into a linked lineage triple; processing, by the lineage processor, the linked lineage triple by tokenizing and deduplicating the linked lineage triple; and generating, by the data trace builder, a lineage graph using the tokenized and deduplicated linked lineage triple for visualization.

20. A non-transitory computer readable storage medium that stores a computer program for tracing data streamed across differing different system platforms, the computer program, when executed by a processor, causing a system to perform a plurality of processes comprising: providing, by each of a plurality of devices and to a database, context data corresponding to a data event published to a streaming service including a streaming pipeline; storing, in the database, the context data provided by each of the plurality of devices; reading, by a data classifier, the context data stored in the database and extracting a data classifier block from the stored context data; gathering, by the data classifier and from the extracted data classifier block, personal identification information and sensitive data elements; reading, by a lineage processor, the context data stored in the database and extracting a lineage tracer block from the stored context data; converting, by the lineage processor, the lineage tracer block into a linked lineage triple; processing, by the lineage processor, the linked lineage triple by tokenizing and deduplicating the linked lineage triple; and generating, by the data trace builder, a lineage graph using the tokenized and deduplicated linked lineage triple for visualization.
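The later steps of claim 1 (converting a lineage tracer block into linked lineage triples, tokenizing and deduplicating them, and generating a lineage graph) can be sketched in Python as follows. This is one possible reading, not the claimed implementation. It assumes a triple is a (source node, edge label, target node) tuple in line with claim 15, that "tokenizing" means normalizing each element to a canonical string, and that the block uses hypothetical `origin`, `transform`, and `destination` keys echoing claim 2. All function names and sample values are invented for illustration.

```python
from collections import defaultdict
from typing import Iterable

Triple = tuple[str, str, str]  # (source node, edge label, target node)


def to_triples(tracer_block: dict) -> list[Triple]:
    """Convert one lineage tracer block into linked lineage triples."""
    origin = tracer_block["origin"]["entity"]
    destination = tracer_block["destination"]["entity"]
    # Assumption: a block with no recorded transformation is a plain "publish" edge.
    transforms = tracer_block.get("transform") or [{"operation": "publish"}]
    return [(origin, t["operation"], destination) for t in transforms]


def tokenize(triple: Triple) -> Triple:
    """Normalize each element to a canonical token (assumed meaning of tokenizing)."""
    source, edge, target = (part.strip().lower() for part in triple)
    return (source, edge, target)


def deduplicate(triples: Iterable[Triple]) -> list[Triple]:
    """Drop repeated triples while keeping first-seen order."""
    return list(dict.fromkeys(triples))


def build_graph(triples: Iterable[Triple]) -> dict[str, list[tuple[str, str]]]:
    """Build an adjacency list: node -> [(edge label, downstream node), ...]."""
    graph: defaultdict[str, list[tuple[str, str]]] = defaultdict(list)
    for source, edge, target in triples:
        graph[source].append((edge, target))
        graph.setdefault(target, [])  # make sure sink nodes also appear
    return dict(graph)


# Hypothetical tracer blocks; the second repeats the first with different casing.
blocks = [
    {"origin": {"entity": "CRM.Customers"},
     "transform": [{"operation": "mask_email"}],
     "destination": {"entity": "stream.customer_events"}},
    {"origin": {"entity": "crm.customers "},
     "transform": [{"operation": "Mask_Email"}],
     "destination": {"entity": "stream.customer_events"}},
    {"origin": {"entity": "stream.customer_events"},
     "transform": [],
     "destination": {"entity": "warehouse.customers"}},
]

triples = deduplicate(tokenize(t) for b in blocks for t in to_triples(b))
lineage_graph = build_graph(triples)
```

Tokenizing before deduplicating matters in this sketch: the first two blocks differ only in casing and whitespace, so they collapse into a single edge only after normalization. The resulting adjacency list could then be handed to any graph-drawing tool for visualization.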

Assignees

JPMorgan Chase Bank, N.A.

Inventors

Classifications

  • Data logging (G06F11/14, G06F11/2205 take precedence) · CPC title

  • Visualisation of programs or trace data · CPC title

  • where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems (multiprogramming arrangements G06F9/46; allocation of resources G06F9/50) · CPC title

  • Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation {; Recording or statistical evaluation of user activity, e.g. usability assessment} · CPC title

  • with visual {or acoustical} indication of the functioning of the machine · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
JPMorgan Chase Bank, N.A.
What technology area does this patent fall under?
Primary CPC classification G06F11/3476. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 28, 2025 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).