Resolving dataset corruption of transferred datasets using programming language-agnostic data modeling platforms

US2025173318A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2025173318-A1
Application number	US-202519038671-A
Country	US
Kind code	A1
Filing date	Jan 27, 2025
Priority date	Jun 22, 2023
Publication date	May 29, 2025
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for a programming language-agnostic data modeling platform that is both less resource intensive and scalable. Additionally, the programming language-agnostic data modeling platform allows for advanced analytics to be run on descriptions of the known logical data models, to generate data offerings describing underlying data, and to easily format data for compatibility with artificial intelligence systems. The systems and methods use a supplemental data structure that comprises logical data modeling metadata, in which the logical data modeling metadata describes the logical data model in a common, standardized language. For example, the logical data modeling metadata may comprise a transformer lineage of the logical data model.
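As a rough illustration of the supplemental data structure described in the abstract, the sketch below models logical data modeling metadata as a small Python dataclass and serializes it to JSON. The abstract does not name a concrete format, so the use of JSON as the "standardized language" is an assumption, and every class, field, and value name here (SupplementalDataStructure, transformer_lineage, customer_account_v1, and so on) is hypothetical.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class SupplementalDataStructure:
    """Hypothetical container for logical data modeling metadata."""

    # Identifier of the logical data model being described (assumed field).
    logical_model_id: str
    # Ordered transformation steps behind the model (assumed shape of a
    # "transformer lineage"; the abstract does not define its layout).
    transformer_lineage: list = field(default_factory=list)
    # Attribute name -> logical type, independent of any physical schema.
    attributes: dict = field(default_factory=dict)

    def to_standardized(self) -> str:
        """Serialize to JSON, standing in for the common standardized language."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_standardized(cls, text: str) -> "SupplementalDataStructure":
        """Rebuild the structure from its JSON form."""
        return cls(**json.loads(text))


example = SupplementalDataStructure(
    logical_model_id="customer_account_v1",
    transformer_lineage=["normalize_currency", "join_accounts"],
    attributes={"account_id": "string", "balance": "decimal"},
)
encoded = example.to_standardized()
decoded = SupplementalDataStructure.from_standardized(encoded)
```

Because the encoded form is plain text, a consumer written in any programming language could parse it, which is one way to read the "language-agnostic" framing of the abstract.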

First claim

Opening claim text (preview).

What is claimed is:

1. A system for resolving corrupted datasets transferred to data repositories having differing physical data models using programming language-agnostic data modeling platforms, the system comprising: one or more processors; and a non-transitory, computer-readable medium comprising instructions recorded thereon that, when executed by the one or more processors, cause operations comprising: receiving a data transfer request to perform a transfer of a dataset from a first data repository to a second data repository, wherein the first data repository is associated with a first physical data model of a first entity, and wherein the second data repository is associated with a second physical data model that is different from the first physical data model; in response to receiving the data transfer request, identifying, based on a dataset description of the dataset, a first logical data model to be used in connection with performing the transfer of the dataset from the first data repository to the second data repository; determining a first supplemental data structure for the first logical data model, wherein the first supplemental data structure is expressed in a standardized language; performing the transfer of the dataset from the first data repository to the second data repository using the first supplemental data structure; in connection with performing the transfer of the dataset, receiving a data transfer error message from a second entity that is associated with the second data repository; and in response to receiving the data transfer error message, transmitting executable code to the second entity that is associated with the second data repository, wherein the executable code corresponds to a data analytic operation to be performed on the dataset to resolve a dataset error.

2. A method for resolving corrupted datasets using programming language-agnostic data modeling platforms, the method comprising: receiving a request to perform a first data operation on a first dataset from a first data source of a first entity; in response to receiving the request, identifying, based on a first dataset description of the first dataset, a first logical data model to be used in connection with performing the first data operation on the first dataset; determining a first supplemental data structure for the first logical data model, wherein the first supplemental data structure is expressed in a standardized language; performing the first data operation on the first dataset using the first supplemental data structure; receiving a first data operation error message that indicates an error that occurred during performance of the first data operation on the first dataset; and transmitting, to a second entity, executable code corresponding to a second data operation to be performed on the first dataset to resolve the error.

3. The method of claim 2, wherein identifying the first logical data model further comprises: providing the first dataset description of the first dataset as input to a first artificial intelligence model trained to identify logical data models to perform data operations on datasets; receiving, from the first artificial intelligence model, a ranked set of logical data models, wherein each ranked logical data model of the ranked set of logical data models are ranked based on a confidence value indicating a likelihood that the first dataset uses a respective ranked logical data model of the ranked set of logical data models; and identifying the first logical data model based on a selection of a respective logical data model that satisfies a threshold confidence value from the ranked set of logical data models.

4. The method of claim 3, wherein the first artificial intelligence model comprises a Large Language Model (LLM), and wherein the LLM is trained, the training of the LLM comprising: obtaining a set of training datasets and a set of training logical data model descriptions, wherein each training dataset of the set of training datasets corresponds to a training logical data model description of the set of training logical data model descriptions, and wherein each training logical data model description of the set of training logical data model descriptions is associated with a metadata schema; providing the set of training datasets and the set of training logical data model descriptions to the LLM during a training routine, the LLM being communicatively coupled to a retrieval component configured to retrieve (i) similar logical data models historically used in connection with a dataset and (ii) metadata schemas associated with the similar logical data models to be provided to the LLM; receiving, from the LLM during the training routine, a set of candidate logical data models and corresponding metadata schemas based on (i) the similar logical data models historically used in connection with a dataset and (ii) the metadata schemas associated with the similar logical data models; and in response to receiving the set of candidate logical data models and the corresponding metadata schemas, providing a message, during the training routine, to the LLM comprising an accuracy value corresponding to each candidate logical data model and corresponding metadata schemas.

5. The method of claim 2, further comprising: extracting, from the request to perform the first data operation on the first dataset from the first data source of the first entity, an identifier associated with the first dataset; obtaining, based on the identifier, the first dataset from a data repository storing datasets; and determining the first dataset description of the first dataset based on metadata associated with the first dataset.

6. The method of claim 2, wherein determining the first supplemental data structure for the first logical data model further comprises: providing an identifier associated with the first logical data model as input to a second artificial intelligence model configured to determine supplemental data structures for logical data models; and receiving, from the second artificial intelligence model, the first supplemental data structure for the first logical data model, wherein the first supplemental data structure comprises a first attribute, and wherein the first attribute comprises a first transformer lineage of the first logical data model.

7. The method of claim 6, wherein the second artificial intelligence model comprises a transformer model, and wherein the transformer model is trained, the training of the transformer model comprising: obtaining (i) a set of training logical data model descriptions and (ii) a training set of supplemental data structures expressed in a standardized language each comprising a second attribute, wherein each training logical data model description of the set of training logical data model descriptions is associated with a metadata schema, and wherein the second attribute of each supplemental data structure of the training set of supplemental data structures comprises a second transformer lineage of a training logical data model corresponding to a respective training logical data model description of the set of training logical data model descriptions; providing the set of training logical data model descriptions and the training set of supplemental data structures as input to the transformer model during a self-supervised training routine; and generating, during the self-supervised training routine, a set of candidate supplemental data structures expressed in a standardized language each comprising a third attribute, wherein the third attribute of each candidate supplemental data structure of the set of candidate supplemental data structures comprises a third transformer lineage of
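The selection step recited in claim 3 (choosing a logical data model from a ranked set by confidence value) can be sketched as follows. This is only one possible reading: the claim does not say how "satisfies a threshold" is evaluated or how ties among qualifying models are broken, so the greater-than-or-equal comparison, the choice of the highest-confidence candidate, and all names (RankedModel, select_logical_model, the model identifiers) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RankedModel:
    """Hypothetical entry in the ranked set returned by the first AI model."""

    model_id: str
    confidence: float  # assumed to be a likelihood in [0, 1]


def select_logical_model(
    ranked: List[RankedModel], threshold: float
) -> Optional[RankedModel]:
    """Return the highest-confidence model meeting the threshold, else None."""
    # Assumption: "satisfies" means confidence >= threshold.
    qualifying = [m for m in ranked if m.confidence >= threshold]
    # Assumption: among qualifying models, prefer the most confident one.
    return max(qualifying, key=lambda m: m.confidence, default=None)


ranked_set = [
    RankedModel("ledger_model", 0.62),
    RankedModel("customer_account_model", 0.91),
    RankedModel("trade_model", 0.40),
]
chosen = select_logical_model(ranked_set, threshold=0.8)
none_chosen = select_logical_model(ranked_set, threshold=0.95)
```

Returning None when no candidate qualifies leaves the caller to decide what happens next; the claim text does not describe that case.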

Assignees

Citibank, N.A.

Classifications

G06F16/22 · Primary
Indexing; Data structures therefor; Storage structures · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 94175382

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025173318A1 cover? Systems and methods for a programming language-agnostic data modeling platform that is both less resource intensive and scalable. Additionally, the programming language-agnostic data modeling platform allows for advanced analytics to be run on descriptions of the known logical data models, to generate data offerings describing underlying data, and to easily format data for compatibility with ar…
Who is the assignee on this patent? Citibank, N.A.
What technology area does this patent fall under? The primary CPC classification is G06F16/22. Mapped technology areas include Physics.
When was this patent published? It was published on May 29, 2025 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb? This page lists 3 related publications (citations in our corpus or others sharing the same primary CPC).

Related patents

Method for dynamic data blocking in a database system

Systems and methods for managing data

Managing multiple data models over data storage system
