Extensible data platform with database domain extensions

US11768849B2 · US · B2

Patent metadata

Field               Value
Publication number  US-11768849-B2
Application number  US-202117351969-A
Country             US
Kind code           B2
Filing date         Jun 18, 2021
Priority date       Mar 15, 2021
Publication date    Sep 26, 2023
Grant date          Sep 26, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computing system that includes one or more server computing devices including one or more processors configured to execute instructions for a domain extensibility module that provides software development tools for building domain extensions for a database platform, and a data ingestion module that provides software development tools for defining a metadata schema for extracting metadata from data files. The one or more processors are configured to receive a set of data from a user computing device, define a target metadata schema that includes one or more metadata fields that will be populated during a data ingestion process, define a target domain extension that defines one or more data types for storing the received set of data after performing the data ingestion process, and ingest the received set of data using a metadata extraction pipeline to generate metadata files based on the target metadata schema.
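The flow described in the abstract (define a target metadata schema, define a target domain extension, then ingest data through a metadata extraction pipeline) can be illustrated with a minimal sketch. This is not the patented implementation. Every name here (`MetadataSchema`, `DomainExtension`, `ingest`, the `sensor_log` data type, the sample file) is a hypothetical stand-in, and an in-memory dictionary is assumed in place of the database platform.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MetadataSchema:
    """Hypothetical target metadata schema: field name -> extractor function."""
    fields: Dict[str, Callable[[bytes], object]]


@dataclass
class DomainExtension:
    """Hypothetical target domain extension: names the data type used for storage.

    The `store` dict is an assumed stand-in for the database platform.
    """
    data_type: str
    store: Dict[str, dict] = field(default_factory=dict)


def ingest(files: Dict[str, bytes], schema: MetadataSchema,
           ext: DomainExtension) -> List[str]:
    """Extract metadata for each file, then store data and metadata together."""
    stored = []
    for name, payload in files.items():
        # Populate every field declared in the target metadata schema.
        metadata = {key: extract(payload) for key, extract in schema.fields.items()}
        ext.store[name] = {
            "data_type": ext.data_type,
            "data": payload,
            # The "metadata file" is modeled here as a JSON string.
            "metadata_file": json.dumps(metadata, sort_keys=True),
        }
        stored.append(name)
    return stored


schema = MetadataSchema(fields={
    "size_bytes": len,
    "line_count": lambda b: b.count(b"\n"),
})
extension = DomainExtension(data_type="sensor_log")
received = {"well_01.log": b"t=0 p=101\nt=1 p=102\n"}
ingested = ingest(received, schema, extension)
```

Keeping the schema (what to extract) separate from the extension (how to store) mirrors the split between the data ingestion module and the domain extensibility module in the abstract; the network accessible endpoint is left out of this sketch.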

First claim

Opening claim text (preview).

The invention claimed is:

1. A computing system comprising: one or more server computing devices including one or more processors configured to execute instructions for: a domain extensibility module that provides software development tools for building domain extensions for a database platform of the computing system, wherein the domain extensions define a data type for data to be stored on the database platform, and storage and infrastructure components for the database platform for storing that defined data type; and a data ingestion module that provides software development tools for defining a metadata schema for extracting metadata from data files stored on the database platform, and generating a metadata extraction pipeline to extract metadata based on the defined metadata schema; wherein the one or more processors are configured to: receive a set of data from a domain-specific data platform, the domain-specific data platform being configured to aggregate data detected by one or more sensors operating in a domain associated with the domain-specific data platform; define a target metadata schema that includes one or more metadata fields that will be populated during a data ingestion process; define a target domain extension that defines one or more data types for storing the received set of data after performing the data ingestion process; ingest the received set of data using a metadata extraction pipeline to generate metadata files based on the target metadata schema; store the ingested set of data and the generated metadata files based on the target domain extension; and provide a network accessible endpoint for accessing the ingested set of data and the metadata file.

2. The computing system of claim 1, wherein to define the target metadata schema, the one or more processors are configured to: classify the received set of data to determine a file format for the received set of data; and define the target metadata schema based on the determined file format for the received set of data.

3. The computing system of claim 2, wherein to define the target metadata schema, the one or more processors are further configured to: identify a plurality of types of metadata that can be extracted from the received set of data; present a list of the plurality of types of metadata to a user; receive user input of one or more user selected types of metadata; and define the target metadata schema based on the one or more user selected types of metadata.

4. The computing system of claim 1, wherein to define the target metadata schema, the one or more processors are configured to receive a new target metadata schema from a user.

5. The computing system of claim 1, wherein the one or more processors are configured to execute instructions for a client application module that provides software development tools for integrating other application programs executed on client computing devices with the computing system.

6. The computing system of claim 5, wherein the one or more processors are configured to: receive requests from an integrated application program to retrieve target data stored on the database platform; retrieve the target data from the database platform; and provide the integrated application program with a network accessible endpoint to retrieve the target data.

7. The computing system of claim 6, wherein to retrieve the target data from the database platform, the one or more processors are configured to: receive a search parameter for the target data with the received request from the integrated application program; and search the ingested set of data and the stored metadata files based on the received search parameter to identify the target data.

8. The computing system of claim 6, wherein the requests received from the integrated application program further include a target file system for receiving the target data, and wherein the one or more processors are further configured to: retrieve the target data from the database platform; mount the target data to the target file system; and provide the integrated application program with the network accessible endpoint to retrieve the target data mounted to the target file system.

9. The computing system of claim 8, wherein to mount the target data to the target file system, the one or more processors are further configured to: emulate a file architecture of the target file system at the network accessible endpoint, the emulated file architecture including a target file path; and provide the target data to the integrated application program using the emulated file architecture.

10. The computing system of claim 1, wherein the one or more processors are configured to execute instructions for a machine learning model module that provides software development tools for integrating one or more third party machine learning models executed by other computing devices with the computing system.

11. The computing system of claim 10, wherein the received set of data is one of a plurality of sets of data, each set of data having a legacy file format, wherein each set of data of the plurality of sets of data are received from different respective domain-specific data platforms, each domain-specific data platform being configured to aggregate data detected by sensors operating in a domain associated with that domain-specific data platform, and wherein the one or more processors are further configured to: ingest the plurality of sets of data using the metadata extraction pipeline; store the ingested plurality of sets of data in a new file format that is different than the legacy file format and requires different storage and infrastructure components for the database platform for storing the new file format, the ingested plurality of sets of data being indexed for search; provide a network accessible endpoint for accessing the ingested plurality of sets of data; and provide the ingested plurality of sets of data to the one or more machine learning models using the network accessible endpoint.

12. The computing system of claim 11, wherein the plurality of sets of data include data collected by sensors selected from the group consisting of wellhead sensors, seismic sensors, tank sensors, rolling stock sensors, and pipeline flow sensors.

13. A method comprising: at one or more processors of a computing system: providing software development tools for building domain extensions for a database platform of the computing system, wherein the domain extensions include defining a data type for data to be stored on the database platform, and storage and infrastructure components for the database platform for storing that defined data type; providing software development tools for defining a metadata schema for extracting metadata from data files stored on the database platform, and generating a metadata extraction pipeline to extract metadata based on the defined metadata schema; receiving a set of data from a domain-specific data platform, the domain-specific data platform being configured to aggregate data detected by one or more sensors operating in a domain associated with the domain-specific data platform; defining a target metadata schema that includes one or more metadata fields that will be populated during a data ingestion process; defining a target domain extension that defines one or more data types for storing the received set of data after performing the data ingestion process; ingesting the received set of data using a metadata extraction pipeline to generate metadata files based on the target metadata schema; storing the ingested set of data and the gene…
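Two of the dependent claims lend themselves to a short illustration: deriving the target metadata schema from the classified file format, optionally narrowed by user-selected metadata types (claims 2 and 3), and searching stored metadata by a search parameter (claim 7). The sketch below is only one way this could look. The format-to-field table, the extension-based classifier, and all function names are assumptions made for illustration and are not taken from the patent.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical mapping from a file format to the metadata types extractable from it.
SCHEMAS_BY_FORMAT: Dict[str, List[str]] = {
    "csv": ["column_names", "row_count"],
    "json": ["top_level_keys"],
}


def classify_format(filename: str) -> str:
    """Classify a file by its extension (assumed; a real system might inspect content)."""
    return filename.rsplit(".", 1)[-1].lower() if "." in filename else "unknown"


def define_schema(filename: str,
                  selected: Optional[Iterable[str]] = None) -> List[str]:
    """Build the target schema from the file format, narrowed by user selection."""
    available = SCHEMAS_BY_FORMAT.get(classify_format(filename), [])
    if selected is None:
        return list(available)
    chosen = set(selected)
    return [name for name in available if name in chosen]


def search(metadata_index: Dict[str, Dict[str, object]],
           key: str, value: object) -> List[str]:
    """Return names of stored items whose metadata field `key` equals `value`."""
    return sorted(name for name, meta in metadata_index.items()
                  if meta.get(key) == value)


# Hypothetical stored metadata, keyed by file name.
index = {
    "a.csv": {"row_count": 3},
    "b.csv": {"row_count": 7},
    "c.json": {"top_level_keys": ["id"]},
}
full_schema = define_schema("flow.CSV")
narrowed_schema = define_schema("flow.csv", selected={"row_count"})
matches = search(index, "row_count", 3)
```

An exact-match search over one field is the simplest reading of "search parameter"; range queries or full-text indexing would need a different index structure.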

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

  • G06F16/258 · Primary

    Data format conversion from or to a database · CPC title

  • File meta data generation · CPC title

  • Specific adaptations of the file system to access devices and non-file objects via standard file system access operations, e.g. pseudo file systems (dedicated interfaces to storage systems G06F3/0601) · CPC title

  • G06F16/211 · Primary

    Schema design and management · CPC title

  • Ensuring data consistency and integrity · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11768849B2 cover?
A computing system that includes one or more server computing devices including one or more processors configured to execute instructions for a domain extensibility module that provides software development tools for building domain extensions for a database platform, and a data ingestion module that provides software development tools for defining a metadata schema for extracting metadata from…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/258. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 26, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).