File system provisioning for workload
US-2024037067-A1 · Feb 1, 2024 · US
US12222899B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12222899-B2 |
| Application number | US-202217950330-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 22, 2022 |
| Priority date | Sep 22, 2022 |
| Publication date | Feb 11, 2025 |
| Grant date | Feb 11, 2025 |
Official abstract text for this publication.
A system and a method are disclosed for automatic content upload and processing. The system retrieves a set of files from a source location based on instructions received from a client device of a user. The system then classifies the set of files into a plurality of categories corresponding to a sequence of one or more services configured to process or store files. The system then generates a data structure storing key values, where the key values are derived based on respective processing of subsets of files. Responsive to receiving an input to execute logic relating to the set of files, the system determines that the input is associated with one or more of the key values, retrieves the one or more of the key values, and executes the logic using the one or more retrieved key values.
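The flow described in the abstract (classify files, process each subset, store key values, then run logic against the stored values) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the extension-to-channel mapping, the channel names `ai` and `non_ai`, the `process` stand-in, and the key naming scheme are assumptions chosen only to make the example runnable.

```python
from collections import defaultdict
from pathlib import PurePath

# Hypothetical mapping from file extension to channel name (not from the patent).
CHANNEL_BY_EXTENSION = {".pdf": "ai", ".png": "ai", ".csv": "non_ai"}


def classify(files):
    """Group file names into channels by extension; unknown types go to 'non_ai'."""
    channels = defaultdict(list)
    for name in files:
        ext = PurePath(name).suffix.lower()
        channels[CHANNEL_BY_EXTENSION.get(ext, "non_ai")].append(name)
    return dict(channels)


def process(channel, name):
    """Stand-in for a channel's services; returns key values derived from one file."""
    # A real channel might run OCR or parsing; this only records trivial facts.
    return {f"{name}:channel": channel, f"{name}:stem": PurePath(name).stem}


def build_key_values(files):
    """Run each subset of files through its channel and merge the key values."""
    store = {}
    for channel, names in classify(files).items():
        for name in names:
            store.update(process(channel, name))
    return store


def execute(logic, required_keys, store):
    """Retrieve the key values an input depends on, then run the logic on them."""
    values = {key: store[key] for key in required_keys}
    return logic(values)


store = build_key_values(["invoice.pdf", "ledger.csv"])
result = execute(
    lambda kv: sorted(kv.values()),
    ["invoice.pdf:channel", "ledger.csv:channel"],
    store,
)
print(result)
```

A flat dictionary is used for the key-value data structure purely for brevity; the abstract does not say how the structure is organized.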
Opening claim text (preview).
What is claimed is:

1. A method, comprising: retrieving, using at least one processor, a set of files from a source location based on instructions received from a client device; classifying, using the at least one processor, the set of files into a plurality of categories corresponding to a plurality of channels, each of the plurality of channels comprising a sequence of one or more services configured to process or store files; generating, using the at least one processor, a data structure storing key values, the key values derived, using one or more machine learning models, based on respective processing of subsets of files, wherein the subsets of files are processed by different ones of the channels in the plurality of channels, wherein processing of the subset of files includes receiving one or more input features associated with the subset of files; predicting, using the one or more machine learning models, based on the one or more input features associated with the subset of files, a scaling associated with at least one of: at least one channel in the plurality of channels and at least one service in the one or more services, the scaling including at least one of a processing workload and a processing speed associated with at least one of: the at least one channel and the at least one service; processing, based on the predicting, at least one file in the subset of files; receiving, using the at least one processor, an input to execute logic relating to the set of files; determining, using the at least one processor, that the input is associated with one or more of the key values; retrieving, using the at least one processor, the one or more of the key values; and executing, using the at least one processor, the logic using the one or more retrieved key values.

2. The method of claim 1, further comprising: monitoring a workload or a processing speed of each of the plurality of channels; and automatically scaling up a particular channel in the plurality of channels based in part on the workload or the processing speed of the particular channel.

3. The method of claim 2, wherein the scaling up the particular channel includes adding a new channel that performs same processing as the particular channel.

4. The method of claim 2, wherein the scaling up the particular channel includes allocating additional hardware resources to the particular channel.

5. The method of claim 1, wherein the classifying the set of files into the plurality of categories includes: monitoring the set of files for a particular extension; and responsive to identifying the particular extension in a given file of the set of files, separating the given file into a plurality of files.

6. The method of claim 1, wherein the classifying the set of files into the plurality of categories includes: monitoring the set of files for a threshold file size; and responsive to identifying a given file having at least the threshold file size, preventing the given file from being further processed until a given condition is met or delaying further processing of the given file to a later time.

7. The method of claim 1, wherein the classifying the set of files into the plurality of categories includes: identifying that a file in the set of files is password protected or encrypted; and responsive to identifying that the file is password protected or encrypted, obtaining a password or decryption key; and removing the password or encryption from the file or decrypting the file based on the password or decryption key.

8. The method of claim 1, wherein the plurality of channels includes an artificial intelligence (AI) channel and a non-AI channel, and the one or more services of the AI channel includes an OCR processor configured to recognize text from a PDF file or an image file.

9. The method of claim 8, wherein the AI channel further includes a data extractor configured to: parse the text recognized by the OCR processor to generate cognitive data having a set of key values; and generating a data structure storing the set of key values.

10. The method of claim 9, wherein the AI channel further includes a table extractor configured to: parse the text recognized by the OCR processor to identify a table having a set of key values; and generate a data structure storing the set of key values.

11. A non-transitory computer-readable medium comprising memory with instructions encoded thereon, the instructions, when executed, causing one or more processors to: retrieve a set of files from a source location based on instructions received from a client device; classify the set of files into a plurality of categories corresponding to a plurality of channels, each of the plurality of channels comprising a sequence of one or more services configured to process or store files; generate a data structure storing key values, the key values derived, using one or more machine learning models, based on respective processing of subsets of files, wherein the subsets of files are processed by different ones of the channels in the plurality of channels, wherein processing of the subset of files includes receiving one or more input features associated with the subset of files; predicting, using the one or more machine learning models, based on the one or more input features associated with the subset of files, a scaling associated with at least one of: at least one channel in the plurality of channels and at least one service in the one or more services, the scaling including at least one of a processing workload and a processing speed associated with at least one of: the at least one channel and the at least one service; processing, based on the predicting, at least one file in the subset of files; receive an input to execute logic relating to the set of files; determine that the input is associated with one or more of the key values; retrieve the one or more of the key values; and execute the logic using the one or more retrieved key values.

12. The non-transitory computer readable medium of claim 11, wherein the one or more processors are further configured to monitor a workload of each of the plurality of channels; and automatically scale up or down a particular channel in the plurality of channels based in part on the workload of the particular channel.

13. The non-transitory computer readable medium of claim 12, wherein scaling up the particular channel includes adding a new channel that performs same processing as the particular channel.

14. The non-transitory computer readable medium of claim 12, wherein scaling up or down the particular channel includes allocating more hardware resources to the particular channel, and allowing the particular channel to have a greater processing power.

15. The non-transitory computer readable medium of claim 11, wherein classifying the set of files into the plurality of categories includes: identifying an extension of a file in the set of files; and responsive to identifying a particular extension, separating the file into a plurality of files.

16. The non-transitory computer readable medium of claim 11, wherein classifying the set of files into the plurality of categories includes: identifying a size of a file in the set of files; and responsive to identifying that the size is greater than a threshold, setting the file aside.

17. The non-transitory computer readable medium of claim 11, wherein classifying the set of files into the plurality of categories includes: identifying that a file in the set of files is password protected or e
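As a rough illustration of the intake rules in claims 5 to 7 and the channel scaling in claims 2 and 3, the following Python sketch may help. It is not the patented implementation. The size threshold, the `.zip` extension, the order in which the checks run, and the names `FileInfo`, `triage`, and `replicas_needed` are all assumptions made for the example. The scaling rule is a plain backlog calculation, swapped in for the machine-learning prediction that claim 1 recites.

```python
from dataclasses import dataclass

# Illustrative values only; the claims do not specify any of them.
SIZE_THRESHOLD_BYTES = 50 * 1024 * 1024
SPLITTABLE_EXTENSIONS = {".zip"}


@dataclass
class FileInfo:
    name: str
    size: int
    encrypted: bool = False


def triage(f: FileInfo) -> str:
    """Return an intake action for one file (check order is an assumption)."""
    if f.encrypted:
        return "decrypt"  # claim 7: obtain a password or key, then decrypt
    if any(f.name.lower().endswith(ext) for ext in SPLITTABLE_EXTENSIONS):
        return "split"    # claim 5: separate the file into a plurality of files
    if f.size >= SIZE_THRESHOLD_BYTES:
        return "defer"    # claim 6: hold the file or delay its processing
    return "process"


def replicas_needed(queued_files: int, files_per_replica: int, current: int) -> int:
    """Scale a channel up by adding identical replicas when the backlog
    exceeds capacity (claims 2 and 3). This sketch never scales down."""
    needed = -(-queued_files // files_per_replica)  # ceiling division
    return max(current, needed, 1)
```

For example, `triage(FileInfo("scan.pdf", 1024))` returns `"process"`, and `replicas_needed(25, 10, 1)` returns `3` because 25 queued files at 10 files per replica need three replicas.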
Knowledge engineering; Knowledge acquisition · CPC title
Extracting the logical structure, e.g. chapters, sections or page numbers; Identifying elements of the document, e.g. authors · CPC title
Parsing · CPC title
Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation · CPC title
Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables · CPC title