Systems and Methods for Efficient Data Preprocessing of Machine Learning Workloads
US-2024403138-A1 · Dec 5, 2024 · US
US8996469B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-8996469-B2 |
| Application number | US-87170110-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 30, 2010 |
| Priority date | Aug 30, 2010 |
| Publication date | Mar 31, 2015 |
| Grant date | Mar 31, 2015 |
Official abstract text for this publication.
Embodiments of a state tracking technique may enable real-time tracking of jobs in a computer cluster. A state object is provided that allows a job to be implemented as a distributable database. The job may be tracked while the job is processing via the state tracking technique. Using the state tracking technique, the cluster may track the location of the state objects for jobs in a database. However, only location information for the state object, and not the job metadata itself, is stored in the central database. This reduces the amount of data stored in the central database, distributing the metadata across the cluster, thus improving database performance and reducing bandwidth requirements on the network. Information about a job may be acquired via a query to the central database to find the location of the respective state object, and then a query to the state object (or to a proxy).
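The two-step lookup described in the abstract can be illustrated with a minimal sketch. It assumes SQLite as the relational database file format and uses a local file path to stand in for a state object held on a remote compute node. All table names, function names, and the `job-42` / `node-07` identifiers are hypothetical and are not taken from the patent. The central database stores only the location of each state object, while the status and history live in the state object itself.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def create_state_object(path, job_id):
    """Create a per-job state object: a SQLite file holding the job metadata."""
    with closing(sqlite3.connect(path)) as db, db:
        db.execute("CREATE TABLE status (job_id TEXT PRIMARY KEY, state TEXT)")
        db.execute("CREATE TABLE history (seq INTEGER PRIMARY KEY, event TEXT)")
        db.execute("INSERT INTO status VALUES (?, ?)", (job_id, "queued"))
        db.execute("INSERT INTO history (event) VALUES (?)", ("queued",))


def update_state(path, job_id, state):
    """Record a status change inside the state object, not in the central DB."""
    with closing(sqlite3.connect(path)) as db, db:
        db.execute("UPDATE status SET state = ? WHERE job_id = ?", (state, job_id))
        db.execute("INSERT INTO history (event) VALUES (?)", (state,))


def open_central_db():
    """Central database (hypothetical schema): location information only."""
    central = sqlite3.connect(":memory:")
    central.execute(
        "CREATE TABLE job_locations (job_id TEXT PRIMARY KEY, node TEXT, path TEXT)"
    )
    return central


def register_location(central, job_id, node, path):
    """Tell the central database where the job's state object currently lives."""
    central.execute(
        "INSERT OR REPLACE INTO job_locations VALUES (?, ?, ?)", (job_id, node, path)
    )
    central.commit()


def query_job(central, job_id):
    """Step 1: find the state object's location. Step 2: query the state object."""
    row = central.execute(
        "SELECT node, path FROM job_locations WHERE job_id = ?", (job_id,)
    ).fetchone()
    if row is None:
        return None
    node, path = row
    # In a real cluster this second query would go to the node (or a proxy);
    # here the file is opened locally as a stand-in.
    with closing(sqlite3.connect(path)) as state:
        (status,) = state.execute(
            "SELECT state FROM status WHERE job_id = ?", (job_id,)
        ).fetchone()
        history = [e for (e,) in state.execute("SELECT event FROM history ORDER BY seq")]
    return node, status, history


def demo():
    """Create one job, advance it, and look it up through the central database."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "job-42.state.db")
        create_state_object(path, "job-42")
        update_state(path, "job-42", "running")
        central = open_central_db()
        register_location(central, "job-42", "node-07", path)
        result = query_job(central, "job-42")
        central.close()
        return result
```

In this sketch the central table holds one small row per job regardless of how much history the job accumulates, which is the property the abstract credits with reducing central database size and network traffic.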
Opening claim text (preview).
What is claimed is: 1. A method, comprising: receiving, by a first compute node of a plurality of compute nodes on a cluster computing system, a state object for a respective job to be executed by a distributed application on the cluster computing system, wherein the state object is a relational database file that includes a job metadata database that stores status and history information for the respective job as executed by the distributed application on the plurality of compute…
Physics · mapped topic