Who is the assignee on this patent?

Microsoft Technology Licensing LLC

What technology area does this patent fall under?

Primary CPC classification G06F16/212. Mapped technology areas include Physics.

When was this patent published?

Publication date: Oct 21, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Systems and methods for authoring workflows from a large-scale dataset using a metadata schema

US12450207B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12450207-B2
Application number	US-202217865945-A
Country	US
Kind code	B2
Filing date	Jul 15, 2022
Priority date	Jul 15, 2022
Publication date	Oct 21, 2025
Grant date	Oct 21, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for authoring workflows for processing data from a large-scale dataset include defining a metadata schema for the large-scale dataset, and receiving user input defining a workflow as a plurality of operations to be performed on the data. Each of the operations includes input metadata formatted according to the metadata schema. The input metadata describes input data to be processed by the operation and identifying a location for the input data in the data storage system, programmed instructions for performing an atomic operation on the input data to generate output data; and output metadata formatted according to the metadata schema. The output metadata describes the output data and identifying a location for the output data in the data storage system.
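As a reading aid, the structure the abstract describes (each operation carrying input metadata, programmed instructions, and output metadata, all following one schema) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patent's implementation: the schema fields `description` and `location`, the `Operation` class, and the `validate_workflow` helper are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

# Hypothetical schema: the field names every metadata record must carry.
SCHEMA_FIELDS = ("description", "location")


@dataclass
class Operation:
    name: str
    input_metadata: dict[str, Optional[str]]
    run: Callable[[Any], Any]  # the "programmed instructions" for one atomic step
    output_metadata: dict[str, Optional[str]]


def conforms(metadata: dict) -> bool:
    """True when the record has exactly the fields the schema names."""
    return set(metadata) == set(SCHEMA_FIELDS)


def validate_workflow(operations: list[Operation]) -> list[str]:
    """Return a list of problems; an empty list means the definition passed."""
    problems = []
    for op in operations:
        if not conforms(op.input_metadata):
            problems.append(f"{op.name}: input metadata does not match schema")
        if not conforms(op.output_metadata):
            problems.append(f"{op.name}: output metadata does not match schema")
    # The first operation must already know where its input data lives.
    if operations and not operations[0].input_metadata.get("location"):
        problems.append("first operation has no input location")
    return problems
```

In this sketch the output location starts empty (`None`) because, per the claims, the storage location for output data is determined while the workflow runs.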

First claim

Opening claim text (preview).

What is claimed is:

1. A data processing system for processing large-scale datasets, the data processing system comprising: at least one processor; and a machine-readable medium storing executable instructions that, when executed, cause the processor to perform operations comprising: receiving user input defining a workflow definition at a front-end of a workflow authoring and execution system, the workflow definition defining a plurality of operations to be performed on data from a large-scale dataset stored in a data storage system as a workflow, wherein each of the operations includes: input metadata for each of the operations formatted according to a predefined metadata schema, the input metadata including first input metadata for a first operation of the workflow, the first input metadata identifying a location of input data in the data storage system for the first operation of the workflow; programmed instructions for performing an atomic operation on the input data to generate output data; and output metadata formatted according to the metadata schema, the output metadata describing the output data and identifying a location of the output data in the data storage system; and validating a configuration of the workflow definition; sending the workflow definition to a backend of the workflow authoring and execution system in response to the validation; and executing the workflow with the backend of the workflow authoring and execution system by: retrieving the input data for each of the respective operations from the location of the input data identified by the input metadata as each of the respective operations is executed; determining a storage location for storing the output data for each of the respective operations in the data storage system as each of the respective operations is executed; adding the determined storage location used to store the output data for each of the respective operations in the data storage system to the output metadata for each of the respective operations as the workflow is being executed; using the output metadata from each of the respective operations as the input metadata for a next operation after each of the respective operations is executed until a last operation is reached; and when the last operation is executed, providing the output metadata to the front-end of the workflow authoring and execution system.

2. The data processing system of claim 1, further comprising: retrieving the output data for the last operation from the location identified by the output metadata for the last operation; and displaying the output data on a display device.

3. The data processing system of claim 1, wherein each of the operations is selected from a library of predefined operations for use in authoring workflows for processing the data in the large-scale dataset.

4. The data processing system of claim 1, further comprising: validating the workflow definition before the workflow definition is received by the backend of the workflow authoring and execution system.

5. The data processing system of claim 1, wherein the backend of the workflow authoring and execution system is implemented on a server of a cloud-based service, and wherein the workflow definition is received from a client device.

6. A method of executing a workflow for processing data from a large-scale dataset stored in a data storage system, the method comprising: receiving user input defining a workflow definition at a front-end of a workflow authoring and execution system, the workflow definition defining a plurality of operations to be performed on the data from the large-scale dataset as the workflow, wherein each of the operations includes: input metadata for each of the operations formatted according to a predefined metadata schema, the input metadata, the input metadata including first input metadata for a first operation of the workflow, the first input metadata identifying a location of input data in the data storage system for the first operation of the workflow; programmed instructions for performing an atomic operation on the input data to generate output data; and output metadata formatted according to the metadata schema, the output metadata describing the output data and identifying a location of the output data in the data storage system; and validating a configuration of the workflow definition; sending the workflow definition to a backend of the workflow authoring and execution system in response to the validation; and executing the workflow with the backend of the workflow authoring and execution system by: retrieving the input data for each of the respective operations from the location of the input data identified by the input metadata as each of the respective operations is executed; determining a storage location for storing the output data for each of the respective operations in the data storage system as each of the respective operations is executed; adding the determined storage location used to store the output data for each of the respective operations in the data storage system to the output metadata for each of the respective operations as the workflow is being executed; using the output metadata from each of the respective operations as the input metadata for a next operation after each of the respective operations is executed until a last operation is reached; and when the last operation is executed, providing the output metadata to the front-end of the workflow authoring and execution system.

7. The method of claim 6, further comprising: retrieving the output data for the last operation from the location identified by the output metadata for the last operation; and displaying the output data on a display device.

8. The method of claim 6, wherein each of the operations is selected from a library of predefined operations for use in authoring workflows for processing the data in the large-scale dataset.

9. The method of claim 6, further comprising: validating the workflow definition before the workflow definition is received by the backend of the workflow authoring and execution system.

10. The method of claim 6, wherein the backend of the workflow authoring and execution system is implemented on a server of a cloud-based service, and wherein the workflow definition is received from a client device.

11. A method of authoring a workflow for processing data from a large-scale dataset stored in a data storage system, the method comprising: defining a metadata schema for the large-scale dataset; receiving user input at a front-end of a workflow authoring and execution system, the user input defining a workflow definition, the workflow definition including a plurality of operations to be performed as a workflow, wherein each of the operations includes: input metadata formatted according to the metadata schema, the input metadata describing input data to be processed by the operation and identifying a location for the input data in the data storage system; programmed instructions for performing an atomic operation on the input data to generate output data; and output metadata formatted according to the metadata schema, the output metadata describing the output data and identifying a location for the output data in the data storage system; receiving user input defining input parameters for the operations; validating a configuration of the workflow definition and the input parameters; and sending the workflow definition to a backend of the workflow authoring and execution system in response to the validation, wherein: the input metadata for a first operation is defined by the user input, the backend of the workflo
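The execution steps recited in claims 1 and 6 (retrieve input by location, run the operation, choose a storage location, record it in the output metadata, then hand that metadata to the next operation) can be illustrated with a small loop. This is a minimal sketch under assumptions, not the claimed system: an in-memory dictionary stands in for the data storage system, and `execute_workflow`, the `outputs/...` naming scheme, and the metadata keys are hypothetical.

```python
from typing import Any, Callable


def execute_workflow(
    operations: list[tuple[str, Callable[[Any], Any]]],
    first_input_metadata: dict[str, str],
    storage: dict[str, Any],
) -> dict[str, str]:
    """Run operations in order, passing metadata (not data) between them."""
    metadata = first_input_metadata
    for index, (name, run) in enumerate(operations):
        # Retrieve the input from the location named in the input metadata.
        data = storage[metadata["location"]]
        result = run(data)
        # Determine where this output is stored (hypothetical naming scheme).
        location = f"outputs/{index}-{name}"
        storage[location] = result
        # Record the location in the output metadata; it becomes the
        # input metadata of the next operation.
        metadata = {"description": f"output of {name}", "location": location}
    # After the last operation, the final output metadata goes back to the caller.
    return metadata


# Usage: a dict plays the role of the data storage system.
storage = {"raw/numbers": [1, 2, 3]}
final = execute_workflow(
    [("double", lambda xs: [2 * x for x in xs]), ("total", sum)],
    {"description": "raw numbers", "location": "raw/numbers"},
    storage,
)
print(storage[final["location"]])  # prints 12
```

Only metadata moves between steps here; each operation reads its data from storage by location, which mirrors the claim language about using output metadata as the next operation's input metadata.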

Assignees

Microsoft Technology Licensing LLC

Inventors

Classifications

G06F16/258: Data format conversion from or to a database
G06F9/544: Buffers; Shared memory; Pipes
G06F16/212 (Primary): with details for data modelling support
G06F9/5038 (Primary): considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration (scheduling strategies G06F9/4881 and subgroups)

Patent family

Related publications grouped by family.

View patent family 87060108

External sources

Related patents

Scalable geospatial platform for an integrated data synthesis and artificial intelligence based exploration

Automated systems and methods for generating executable workflows

Metadata-driven workflows and integration with genomic data processing systems and techniques

Method of processing big data, apparatus performing the same and storage media storing the same