Batch data ingestion in database systems
US-10896172-B2 · Jan 19, 2021 · US
US12067029B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12067029-B2 |
| Application number | US-202117507779-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 21, 2021 |
| Priority date | Dec 9, 2020 |
| Publication date | Aug 20, 2024 |
| Grant date | Aug 20, 2024 |
Official abstract text for this publication.
Disclosed is an apparatus for metadata management and collection, which includes a settings managing unit that generates setting information of data obtained from a data source, a source managing unit that generates source information associated with the data source, a job managing unit that starts or stops a data collection job based on the source information, an object collecting unit that requests an external system for a list of metadata based on the setting information and the source information, a metadata importing unit that imports metadata from the list of the metadata based on the setting information and the source information, a data downloading unit that downloads target metadata of the imported metadata based on the setting information and the source information, and a queue managing unit that generates a data queue depending on a request of the job managing unit.
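As a rough illustration of how the job managing unit and the queue managing unit described in the abstract might interact, here is a minimal Python sketch. It is not the patented implementation: the class names `QueueManager` and `JobManager`, the queue names `"running"` and `"interrupted"`, and the shape of `source_info` are all assumptions made for this example. Moving a stopped job to an "interrupted" queue follows the queue layout recited in the claims.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class QueueManager:
    """Hypothetical stand-in for the queue managing unit: creates queues on request."""
    queues: dict = field(default_factory=dict)

    def get_or_create(self, name: str) -> deque:
        # Create the named queue the first time it is requested.
        return self.queues.setdefault(name, deque())


@dataclass
class JobManager:
    """Hypothetical stand-in for the job managing unit: starts and stops jobs."""
    queue_manager: QueueManager

    def start(self, job_id: str, source_info: dict) -> None:
        # Starting a job asks the queue manager for the queue of running jobs.
        running = self.queue_manager.get_or_create("running")
        running.append((job_id, source_info))

    def stop(self, job_id: str) -> bool:
        # Stopping moves the job from the running queue to the interrupted queue.
        running = self.queue_manager.get_or_create("running")
        interrupted = self.queue_manager.get_or_create("interrupted")
        for entry in list(running):
            if entry[0] == job_id:
                running.remove(entry)
                interrupted.append(entry)
                return True
        return False  # no running job with that id
```

In this sketch the job manager never creates queues itself; it always goes through the queue manager, mirroring the abstract's statement that the data queue is generated depending on a request of the job managing unit.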
Opening claim text (preview).
What is claimed is:

1. An apparatus for metadata management and collection, comprising at least one processor configured to execute program code to implement the following units, the units including the program code and being configured to respond to a user request for a data collection job:

a settings managing unit configured to generate setting information of data obtained from a data source;

a source managing unit configured to generate source information associated with the data source;

a job managing unit configured to start the data collection job based on the source information and in response to the user request and selection of the data source by the user, and to stop the data collection job in response to a user request to stop the data collection job;

an object collecting unit configured to request an external system for a list of metadata based on the setting information and the source information;

a metadata importing unit configured to import metadata from the list of the metadata based on the setting information and the source information;

a data downloading unit configured to download target metadata of the imported metadata based on the setting information and the source information;

a data storage device configured to store the downloaded target metadata; and

a queue managing unit configured to generate a data queue to store data collection jobs being executed, depending on a request of the job managing unit, the request corresponding to the data collection job;

wherein the metadata importing unit is further configured to store the imported metadata in a database after performing mapping on the imported metadata so as to coincide with a system standard, and the data downloading unit is further configured to listen a job metadata queue, obtain and download the target metadata based on a listening result, the setting information and the source information, and store the downloaded target metadata in a specified storage system of the data storage device to complete the data collection job requested by the user.

2. The apparatus of claim 1, wherein the setting information includes a name of the data, a type of the data, and a length of the data.

3. The apparatus of claim 1, wherein the data downloading unit stores the target metadata in the data storage device.

4. The apparatus of claim 1, wherein the data queue is a plurality of queues including:

a first queue configured to store a job being executed by the job managing unit;

a second queue configured to store jobs whose executions are interrupted by the job managing unit;

a third queue configured to store the metadata; and

a fourth queue configured to store the target metadata.

5. A method for metadata management and collection, comprising:

in response to a request by a user for a data collection job:

requesting to obtain a list of data sources;

starting the data collection job based on source information associated with one data source among the data sources;

requesting an external system for a list of metadata based on setting information and the source information of data obtained from the one data source;

importing metadata from the list of the metadata based on the setting information and the source information and storing the imported metadata in a first queue;

monitoring the first queue and storing target metadata in a second queue based on a result of the monitoring; and

listening the second queue and downloading the target metadata based on the setting information, the source information, and a result of the listening;

wherein the storing of the target metadata in the second queue includes: performing mapping on the metadata based on a system standard;

wherein the downloading the target metadata includes storing the downloaded target metadata in a specified storage system of a data storage device to complete the data collection job requested by the user.
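The flow recited in claim 5 (import into a first queue, map and move targets into a second queue, then download from the second queue into storage) can be sketched in Python as follows. This is an illustrative reading under stated assumptions, not the claimed implementation: `run_collection`, `map_to_standard`, the `SYSTEM_FIELDS` renaming table, the `wanted_types` setting, and the callables `list_metadata` and `download` are all hypothetical. In particular, the claims do not say how target metadata is selected, so filtering by type is an assumption.

```python
from collections import deque

# Assumed renaming table from source field names to "system standard" names.
SYSTEM_FIELDS = {"nm": "name", "tp": "type", "len": "length"}


def map_to_standard(raw: dict) -> dict:
    """Rename source-specific keys to the assumed system-standard keys."""
    return {SYSTEM_FIELDS.get(key, key): value for key, value in raw.items()}


def run_collection(list_metadata, download, settings: dict, source: dict) -> dict:
    """Run one collection job; list_metadata and download are caller-supplied callables."""
    first_queue = deque()   # imported metadata
    second_queue = deque()  # mapped target metadata
    storage = {}            # stand-in for the specified storage system

    # Import: request the metadata list from the external system and enqueue it.
    for item in list_metadata(settings, source):
        first_queue.append(item)

    # Monitor the first queue: map each record, then keep only assumed targets.
    while first_queue:
        mapped = map_to_standard(first_queue.popleft())
        if mapped.get("type") in settings["wanted_types"]:
            second_queue.append(mapped)

    # Listen on the second queue: download each target and store the result.
    while second_queue:
        target = second_queue.popleft()
        storage[target["name"]] = download(target, settings, source)

    return storage
```

A real system would likely run the three stages concurrently against a message broker; the sequential loops here only show the order in which data moves between the queues.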
Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title
Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title
Task life-cycle, e.g. stopping, restarting, resuming execution (G06F9/4881 takes precedence) · CPC title
Data format conversion from or to a database · CPC title