Deep learning job scheduling method and system and related device
US-11954521-B2 · Apr 9, 2024 · US
US12438943B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12438943-B2 |
| Application number | US-202217981077-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 4, 2022 |
| Priority date | Nov 17, 2021 |
| Publication date | Oct 7, 2025 |
| Grant date | Oct 7, 2025 |
Official abstract text for this publication.
An illustrative embodiment disclosed herein is an apparatus including a processor having programmed instructions to place a first compute resource in a storage node of an object storage platform and to place a second compute resource in a compute node in a client coupled to the object storage platform via a public network. In some embodiments, unstructured data is stored in the storage node. In some embodiments, the first compute resource of the storage node preprocesses the unstructured data. In some embodiments, the preprocessed unstructured data is sent to the compute node. In some embodiments, the second compute resource trains a machine learning (ML) model using the preprocessed unstructured data.
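The split described in the abstract can be illustrated with a minimal sketch: preprocessing runs where the unstructured data is stored, and only the reduced result is handed to the training side. Everything named here is a hypothetical stand-in and not taken from the patent: the JSON record format, the `x`/`y` fields, the function names, and the one-parameter least-squares fit used in place of ML model training. The transfer over the public network is reduced to a plain function call.

```python
import json


def preprocess_at_storage_node(raw_objects):
    """Hypothetical storage-side step: parse each stored object, then filter.

    Objects that are not valid JSON, or that lack the assumed "x" and "y"
    fields, are dropped so they never leave the storage node.
    """
    records = []
    for blob in raw_objects:
        try:
            rec = json.loads(blob)          # parse
        except json.JSONDecodeError:
            continue                        # filter: unparseable object
        if isinstance(rec, dict) and "x" in rec and "y" in rec:
            records.append((float(rec["x"]), float(rec["y"])))
    return records


def train_at_compute_node(records):
    """Hypothetical client-side step: fit y = w * x by least squares.

    This is only a placeholder for "trains an ML model".
    """
    sxx = sum(x * x for x, _ in records)
    sxy = sum(x * y for x, y in records)
    return sxy / sxx if sxx else 0.0


# Unstructured data as it might sit in the storage node (assumed format).
raw = ['{"x": 1, "y": 2}', "not json", '{"x": 2, "y": 4}', '{"x": 3}']

prepared = preprocess_at_storage_node(raw)   # runs next to the data
weight = train_at_compute_node(prepared)     # runs on the client side
```

In this toy input, two of the four stored objects are discarded before the transfer step, which is the kind of reduction that makes storage-side preprocessing attractive when the link to the client is a public network.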
Opening claim text (preview).
What is claimed:

1. An apparatus comprising a processor and a memory, wherein the memory includes programmed instructions that, when executed by the processor, cause the apparatus to: assign, by a resource scheduler, a first virtualized compute resource to a storage node of an object store on a first cloud, the storage node including a virtualized storage resource, wherein unstructured data is stored in the storage node; preprocess, by the first virtualized compute resource of the storage node, at the storage node on the first cloud, the unstructured data stored in the storage node to generate preprocessed data; transfer, via a public network, the preprocessed data generated by the first virtualized compute resource of the storage node to a compute node on a client system separate from the first cloud; and assign, by the resource scheduler, a second virtualized compute resource to the compute node of the client system, wherein the second compute resource trains a machine learning (ML) model using the preprocessed data.

2. The apparatus of claim 1, wherein the storage node comprises a hyper-converged infrastructure (HCI) node.

3. The apparatus of claim 1, wherein the preprocessing the unstructured data comprises at least two preprocessing steps, wherein a first preprocessing step of the at least two preprocessing steps includes parsing the unstructured data, and wherein a second preprocessing step of the at least two preprocessing steps includes filtering the unstructured data.

4. The apparatus of claim 1, wherein the client system is on a second cloud.

5. The apparatus of claim 1, wherein the unstructured data is partitioned into a first chunk on the storage node and a second chunk on a second storage node of the object storage platform, wherein a third virtualized compute resource preprocesses the second chunk.

6. The apparatus of claim 1, wherein the storage node is an accelerator-enabled node.

7. The apparatus of claim 1, wherein the memory includes the programmed instructions that, when executed by the processor, further cause the apparatus to generate a template in which the first virtualized compute resource of the storage node preprocesses the unstructured data.

8. The apparatus of claim 1, wherein the memory includes the programmed instructions that, when executed by the processor, further cause the apparatus to determine, based on a mapping operation of the preprocessing the unstructured data resulting in an increase of data volume, to execute the mapping operation at the compute node of the client system.

9. A non-transitory computer readable storage medium comprising instructions stored thereon that, when executed by a processor, cause the processor to: assign, by a resource scheduler, a first virtualized compute resource to a storage node of an object storage platform on a first cloud, the storage node including a virtualized storage resource, wherein unstructured data is stored in the storage node; preprocess, by the first virtualized compute resource of the storage node, at the storage node on the first cloud, the unstructured data stored in the storage node to generate preprocessed data; transfer, via a public network, the preprocessed data generated by the first virtualized compute resource of the storage node to a compute node on a client system separate from the first cloud; and assign, by the resource scheduler, a second virtualized compute resource to the compute node on the client system, wherein the second compute resource trains a machine learning (ML) model using the preprocessed data.

10. The medium of claim 9, wherein the storage node comprises a hyper-converged infrastructure (HCI) node.

11. The medium of claim 9, wherein preprocessing includes at least two preprocessing steps, wherein a first preprocessing step of the at least two preprocessing steps includes parsing the unstructured data, and wherein a second preprocessing step of the at least two preprocessing steps includes filtering the unstructured data.

12. The medium of claim 9, wherein the second virtualized compute resource further preprocesses the preprocessed unstructured data before using the preprocessed unstructured data to train the ML model.

13. The medium of claim 9, wherein the unstructured data is partitioned into a first chunk on the storage node and a second chunk on a second storage node of the object storage platform, wherein a third virtualized compute resource preprocesses the second chunk.

14. The medium of claim 9, wherein the storage node is an accelerator-enabled node.

15. The medium of claim 9, comprising the instructions stored thereon that, when executed by the processor, further cause the processor to generate a template in which the first virtualized compute resource of the storage node preprocesses the unstructured data.

16. The medium of claim 9, comprising the instructions stored thereon that, when executed by the processor, further cause the processor to determine, based on a mapping operation of the preprocessing the unstructured data resulting in an increase of data volume, to execute the mapping operation at the compute node of the client system.

17. A computer-implemented method, comprising: assigning, by a processor associated with a resource scheduler, a first virtualized compute resource to a storage node of an object store on a first cloud, the storage node including a virtualized storage resource, wherein unstructured data is stored in the storage node; preprocessing, by the first virtualized compute resource of the storage node, at the storage node on the first cloud, the unstructured data stored in the storage node to generate preprocessed data; transferring, via a public network, the preprocessed data generated by the first virtualized compute resource of the storage node to a compute node on a client system separate from the first cloud; and assigning, by the processor associated with the resource scheduler, a second virtualized compute resource to the compute node on the client system, wherein the second virtualized compute resource trains a machine learning (ML) model using the preprocessed data.

18. The method of claim 17, wherein a first preprocessing step of the preprocessing includes parsing the unstructured data, and wherein a second preprocessing step of the preprocessing includes filtering the unstructured data.

19. The method of claim 17, wherein the second virtualized compute resource further preprocesses the preprocessed unstructured data before using the preprocessed unstructured data to train the ML model.

20. The method of claim 17, wherein the unstructured data is partitioned into a first chunk on the storage node and a second chunk on a second storage node of the object storage platform, wherein a third virtualized compute resource preprocesses the second chunk.

21. The method of claim 17, wherein the storage node is an accelerator-enabled node.

22. The method of claim 17, further comprising generating a template in which the first virtualized compute resource of the storage node preprocesses the unstructured data.

23. The computer-implemented method of claim 17, further comprising determining, based on a mapping operation of the preprocessing the unstructured data resulting in an increase of data volume, to execute the mapping operation at the compute node of the client system.
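Claims 8, 16, and 23 recite deciding to run a mapping operation at the client's compute node when that operation increases data volume. One way such a decision could look is sketched below. The claims do not say how the increase is detected, so the per-operation byte estimates, the `MapOp` type, and the operation names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapOp:
    """Hypothetical description of one preprocessing (mapping) operation."""
    name: str
    bytes_in: int    # assumed estimate of input size
    bytes_out: int   # assumed estimate of output size


def place(op: MapOp) -> str:
    """Choose where an operation runs.

    An operation that grows the data is placed at the client's compute node,
    so the larger output is never sent over the public network. Anything
    else stays at the storage node, next to the stored data.
    """
    return "compute_node" if op.bytes_out > op.bytes_in else "storage_node"


def plan(ops):
    """Map each operation name to its chosen location."""
    return {op.name: place(op) for op in ops}


# Illustrative pipeline: parsing and filtering shrink the data,
# while a hypothetical augmentation step expands it.
ops = [
    MapOp("parse", bytes_in=1000, bytes_out=400),
    MapOp("filter", bytes_in=400, bytes_out=100),
    MapOp("augment", bytes_in=100, bytes_out=800),
]
placement = plan(ops)
```

With these assumed sizes, parsing and filtering stay on the storage node and only the expanding step moves to the client, so the transfer carries the smallest intermediate result.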
Hypervisors; Virtual machine monitors · CPC title
Machine learning · CPC title
for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS] · CPC title