Accessing siloed data across disparate locations via a unified metadata graph systems and methods

US12361001B2 · US · B2

Patent metadata
Publication number: US-12361001-B2
Application number: US-202418617305-A
Country: US
Kind code: B2
Filing date: Mar 26, 2024
Priority date: Dec 20, 2023
Publication date: Jul 15, 2025
Grant date: Jul 15, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for reducing usage of computational resources when accessing siloed data across disparate locations via a unified metadata graph are disclosed. The system receives a user-specified query indicating a request to access a set of data objects. The system then performs natural language processing on the user-specified query to determine a set of phrases corresponding to the user-specified query. The system then accesses a metadata graph to determine a node corresponding to the set of phrases. Using a location identifier corresponding to the determined node, the system determines a data silo storing at least one data object of the set of data objects. The system then generates for display, on a graphical user interface, a visual representation of the at least one data object.
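
The flow in the abstract (query, phrase expansion, graph lookup, silo lookup, display payload) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `RELATED_PHRASES` table stands in for the unspecified natural language processing step, node matching is done by simple term overlap, and silos are modeled as in-memory dictionaries.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# Hypothetical phrase table standing in for the NLP step; the abstract does
# not say which technique produces the phrases.
RELATED_PHRASES: Dict[str, Set[str]] = {
    "customers": {"customers", "clients", "account holders"},
    "transfers": {"transfers", "payments", "transactions"},
}


@dataclass
class Node:
    name: str                  # data object the node describes
    metadata: Set[str]         # descriptive terms for the object
    location_id: str           # identifier of the silo holding the object
    upstream: List[str] = field(default_factory=list)  # lineage: parent node names


def expand_query(query: str) -> Set[str]:
    """Turn each word of the query into a set of related phrases."""
    phrases: Set[str] = set()
    for word in query.lower().split():
        phrases |= RELATED_PHRASES.get(word, {word})
    return phrases


def find_node(graph: Dict[str, Node], phrases: Set[str]) -> Optional[Node]:
    """Pick the node whose metadata shares the most terms with the phrases."""
    best = max(graph.values(), key=lambda n: len(n.metadata & phrases), default=None)
    if best is None or not (best.metadata & phrases):
        return None
    return best


def resolve(query: str, graph: Dict[str, Node], silos: Dict[str, dict]) -> Optional[dict]:
    """Resolve a query to one data object, reading only the silo the node points at."""
    node = find_node(graph, expand_query(query))
    if node is None:
        return None
    data = silos[node.location_id][node.name]
    return {"object": node.name, "silo": node.location_id,
            "lineage": node.upstream, "data": data}


# Illustrative data only.
GRAPH = {
    "crm_clients": Node("crm_clients", {"clients", "region"}, "silo-crm"),
    "ledger_payments": Node("ledger_payments", {"payments", "amount"},
                            "silo-ledger", ["crm_clients"]),
}
SILOS = {
    "silo-crm": {"crm_clients": [{"id": 1, "region": "EU"}]},
    "silo-ledger": {"ledger_payments": [{"id": 1, "amount": 250}]},
}
```

Calling `resolve("list transfers", GRAPH, SILOS)` in this sketch touches only `silo-ledger`, which is one plausible reading of how a metadata lookup could avoid scanning every silo.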

First claim

Opening claim text (preview).

We claim:

1. A system for reducing usage of computational resources when accessing siloed data across disparate locations via a unified metadata graph, the system comprising: at least one hardware processor; and at least one non-transitory memory storing instructions, which, when executed by the at least one hardware processor, cause the system to: identifying a set of keywords associated with a request to access a set of data objects; performing natural language processing on the set of keywords to determine a set of semantically similar phrases corresponding to each keyword of the set of keywords; accessing a metadata graph to determine a node corresponding to the set of semantically similar phrases, wherein the metadata graph comprises (i) a set of nodes indicating (a) metadata of internal data objects stored in data silos and (b) location identifiers of the data silos, and (ii) edges indicating a data lineage between a first node and a second node of the set of nodes, wherein the metadata graph is generated using a metadata data structure that is based on file-level and container-level metadata identifiers; determining a data silo storing at least one data object of the set of data objects using the location identifier corresponding to the determined node to obtain the at least one data object of the set of data objects via the data silo; and generating, for display, on a graphical user interface (GUI), a visual representation of the at least one data object, wherein the visual representation of the at least one data object comprises lineage information of the at least one data object.

2. The system of claim 1, wherein the metadata graph is generated by: retrieving (i) a set of file-level metadata identifiers and (ii) a set of container-level metadata identifiers from a second set of data silos, wherein each file-level metadata identifier of the set of file-level metadata identifiers indicates metadata of a given data object stored within a respective data silo, and wherein each container-level metadata identifier of the set of container-level metadata identifiers indicates metadata of the respective data silo of the second set of data silos; generating a set of semantically similar metadata identifiers corresponding to each file-level and container-level metadata identifiers, respectively; generating the metadata data structure to map each semantically similar metadata identifier of the set of semantically similar metadata identifiers to normalized file-level metadata identifiers and normalized container-level metadata identifiers; and generating the metadata graph using the generated metadata data structure.

3. The system of claim 1, further comprising the instructions to: receiving, via a second the GUI, a second user-specified query indicating a request to generate an intended result; providing the second user-specified query to an artificial intelligence model to generate a recommendation, wherein the recommendation comprises (i) a second artificial intelligence model to be used to generate the intended result and (ii) a second set of data objects to be used when training the second artificial intelligence model; in response to receiving a user selection indicating acceptance of the recommendation, (i) accessing a database to obtain the second artificial intelligence model and (ii) obtaining the second set of data objects using the metadata graph; training the second artificial intelligence model using the set of data objects; and applying the second artificial intelligence model to generate the intended result.

4. The system of claim 3, further comprising the instructions to: accessing a governance database to obtain a set of policies indicating usage criteria corresponding to the second set of data objects; determining whether the second set of data objects are approved to be used to train the second artificial intelligence model using the set of policies indicating usage criteria corresponding to the set of second data objects; determining whether an output of the second artificial intelligence model is approved to be provided to one or more computing systems using a second set of policies indicating usage criteria corresponding to artificial intelligence model predictions; and in response to (i) the second set of data objects being approved to be used to train the second artificial intelligence model and (ii) the output of the second artificial intelligence model is approved to be provided to one or more computing systems, applying the second artificial intelligence model to generate the intended result.

5. A method for reducing usage of computational resources when accessing siloed data across disparate locations via a unified metadata graph, the method comprising: identifying a set of keywords associated with a user-specified query to access a set of data objects; performing natural language processing on the user-specified query to determine a set of phrases corresponding to the user-specified query; accessing a metadata graph to determine a node corresponding to the set of phrases, wherein the metadata graph comprises (i) a set of nodes comprising (a) metadata indicating internal data objects stored in data silos and (b) location identifiers of the data silos, and (ii) edges indicating data lineages of the set of nodes, wherein the metadata graph is generated using a metadata data structure that is based on file-level and container-level metadata identifiers; determining a data silo storing at least one data object of the set of data objects using the location identifier corresponding to the determined node to obtain the at least one data object of the set of data objects via the data silo; and generating a representation of the at least one data object.

6. The method of claim 5, wherein the metadata graph is generated by: retrieving (i) a set of file-level metadata identifiers and (ii) a set of container-level metadata identifiers from a second set of data silos, wherein each file-level metadata identifier of the set of file-level metadata identifiers indicates metadata of a given data object stored within a respective data silo, and wherein each container-level metadata identifier of the set of container-level metadata identifiers indicates metadata of the respective data silo of the second set of data silos; generating a set of semantically similar metadata identifiers corresponding to each file-level and container-level metadata identifiers, respectively; generating the metadata data structure to map each semantically similar metadata identifier of the set of semantically similar metadata identifiers to normalized file-level metadata identifiers and normalized container-level metadata identifiers; and generating the metadata graph using the generated metadata data structure.

7. The method of claim 5, further comprising: receiving, via a second GUI, a second user-specified query indicating a request to generate an intended result; providing the second user-specified query to an artificial intelligence model to generate a recommendation, wherein the recommendation comprises (i) a second artificial intelligence model to be used to generate the intended result and (ii) a second set of data objects to be used when training the second artificial intelligence model; in response to receiving a user selection indicating acceptance of the recommendation, (i) accessing a database to obtain the second artificial intelligence model and (ii) obtaining the second set of data objects using the metadata graph; training the second artificial intelligence model using the set of data objects; and applying the second artificial intelligence model to generate the intended result.

8. The method of claim 7, furthe
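
The graph-generation steps recited in claims 2 and 6 (retrieve file-level and container-level metadata identifiers, map variants to normalized identifiers, then build nodes and lineage edges) might look roughly like the sketch below. The `VARIANTS` table, the silo dictionary layout, and the function names are assumptions made for illustration; the claims do not specify how semantically similar identifiers are generated or how lineage is derived, so lineage pairs are simply passed in.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical mapping from variant identifiers to a normalized identifier.
# In the claims this mapping comes from "semantically similar" identifiers;
# a fixed lookup table is used here only to keep the sketch self-contained.
VARIANTS: Dict[str, str] = {
    "cust_id": "customer_id",
    "customerId": "customer_id",
    "client_no": "customer_id",
    "txn_amt": "transaction_amount",
}


def normalize(identifier: str) -> str:
    """Map a raw identifier to its normalized form (lowercase if unknown)."""
    return VARIANTS.get(identifier, identifier.lower())


def build_metadata_structure(silos: Dict[str, dict]) -> List[dict]:
    """Flatten silos into rows of normalized file-level and container-level identifiers."""
    rows = []
    for location, silo in silos.items():
        container_ids = sorted({normalize(i) for i in silo["container"]})
        for file_name, ids in silo["files"].items():
            rows.append({
                "object": file_name,
                "location": location,
                "file_level": sorted({normalize(i) for i in ids}),
                "container_level": container_ids,
            })
    return rows


def build_graph(rows: List[dict], lineage: List[Tuple[str, str]]):
    """Create one node per data object and keep lineage edges between known nodes."""
    nodes = {
        row["object"]: {
            "metadata": set(row["file_level"]) | set(row["container_level"]),
            "location": row["location"],
        }
        for row in rows
    }
    edges = [(src, dst) for src, dst in lineage if src in nodes and dst in nodes]
    return nodes, edges


# Illustrative input only.
RAW_SILOS = {
    "silo-a": {"container": ["CRM"], "files": {"accounts.csv": ["cust_id", "region"]}},
    "silo-b": {"container": ["Ledger"],
               "files": {"payments.parquet": ["customerId", "txn_amt"]}},
}
NODES, EDGES = build_graph(
    build_metadata_structure(RAW_SILOS),
    lineage=[("accounts.csv", "payments.parquet"), ("unknown.csv", "payments.parquet")],
)
```

In this example both files end up tagged with `customer_id` even though the silos spell the identifier differently, which is the effect the normalization step appears intended to achieve.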

Assignees

Inventors

Classifications

  • Graphs; Linked lists (G06F16/9027 takes precedence)

  • Visual data mining; Browsing structured data

  • Selectivity estimation or determination

  • Distributed queries

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12361001B2 cover?
Systems and methods for reducing usage of computational resources when accessing siloed data across disparate locations via a unified metadata graph are disclosed. The system receives a user-specified query indicating a request to access a set of data objects. The system then performs natural language processing on the user-specified query to determine a set of phrases corresponding to the user…
Who is the assignee on this patent?
Citibank Na
What technology area does this patent fall under?
Primary CPC classification G06F16/24545. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 15, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).