US11544502B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11544502-B2 |
| Application number | US-201916721652-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 19, 2019 |
| Priority date | Dec 19, 2019 |
| Publication date | Jan 3, 2023 |
| Grant date | Jan 3, 2023 |
Official abstract text for this publication.
The present disclosure relates to processing operations configured to uniquely utilize indexing of content to improve content retrieval processing, particularly when working with large data sets. The techniques described herein enable efficient content retrieval when working with large data sets such as those that may be associated with a plurality of tenants of a data storage application/service. Among other technical advantages, the present disclosure is applicable to train a classifier using relevant samples based on text search in tenant-specific scenarios, where accurate searching can be executed for content associated with one or more tenant accounts of an application/service concurrently in milliseconds even in instances where there may be millions of documents to be searched. As an example, exemplary data shards may be generated and managed for efficient and scalable content retrieval processing including training of a classifier (e.g., artificial intelligence classifier) and real-time (or near real-time) query processing.
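The shard generation mentioned in the abstract can be illustrated with a minimal sketch. It assumes each shard holds a fixed number of index entries drawn uniformly at random from one tenant's indexed files, and that shards are sampled independently of one another. The names `Shard`, `generate_shards`, `files_per_shard`, `num_shards`, and `seed` are hypothetical and do not come from the patent.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    """A group of index entries sampled from one tenant's file content."""
    tenant_id: str
    index_ids: tuple


def generate_shards(tenant_id, index_ids, files_per_shard, num_shards, seed=None):
    """Build num_shards shards, each a random sample of files_per_shard entries.

    Assumption: sampling is uniform and without replacement inside a shard;
    the same entry may appear in more than one shard.
    """
    rng = random.Random(seed)
    pool = list(index_ids)  # random.sample needs a sequence
    size = min(files_per_shard, len(pool))
    return [
        Shard(tenant_id, tuple(rng.sample(pool, size)))
        for _ in range(num_shards)
    ]


# Usage: three shards of ten index entries each for a hypothetical tenant.
shards = generate_shards("tenant-a", [f"doc-{i}" for i in range(100)], 10, 3, seed=0)
```

Fixing the shard size up front keeps memory use per shard predictable, which matters if shards are later loaded into memory as a group.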
Opening claim text (preview).
What is claimed is:
1. A method comprising: retrieving indexing of file content associated with a tenant of an application or service; generating a plurality of data shards usable for training of an artificial intelligence classifier, wherein each of the plurality of data shards comprises a plurality of indexes from the indexing of the file content that are representative of a randomized sampling of the file content associated with the tenant; generating a processing queue that groups generated data shards for processing during execution of rounds of training of the artificial intelligence classifier, wherein the processing queue prioritizes the plurality of data shards as a first grouping that is processed during a round of training of the artificial intelligence classifier; pre-loading, prior to executing of the training of the artificial intelligence classifier, the plurality of data shards into a memory of a computing device that is configured to execute the training of the artificial intelligence classifier, wherein the pre-loading propagates the plurality of data shards together as the first grouping for training of the artificial intelligence classifier; and reading, from the memory, the plurality of data shards during executing of the training of the artificial intelligence classifier.
2. A system comprising: at least one processor; and a memory, operatively connected with the at least one processor, storing computer-executable instructions that, when executed by the at least one processor, causes the at least one processor to execute a method that comprises: retrieving indexing of file content associated with a tenant of an application or service; generating a plurality of data shards usable for training of an artificial intelligence classifier, wherein each of the plurality of data shards comprises a plurality of indexes from the indexing of the file content that are representative of a randomized sampling of the file content associated with the tenant; generating a processing queue that groups generated data shards for processing during execution of rounds of training of the artificial intelligence classifier, wherein the processing queue prioritizes the plurality of data shards as a first grouping that is processed during a round of training of the artificial intelligence classifier; pre-loading, prior to executing of the training of the artificial intelligence classifier, the plurality of data shards into a memory of a computing device that is configured to execute the training of the artificial intelligence classifier, wherein the pre-loading propagates the plurality of data shards together as the first grouping for training of the artificial intelligence classifier; and reading, from the memory, the plurality of data shards during executing of the training of the artificial intelligence classifier.
3. A computer-readable memory device having stored thereon instructions that, upon execution by one or more processors, cause the one or more processors to: retrieve indexing of file content for each of a plurality of tenant accounts of an application or service; generate a plurality of data shards usable for training an artificial intelligence classifier, wherein each of the plurality of data shards comprises a plurality of indexes from the indexing of the file content for a specific tenant account of the plurality of tenant accounts, the plurality of indexes being representative of a randomized sampling of the file content of the specific tenant account; generate processing queue that groups generated data shards for processing during execution of rounds of training of the artificial intelligence classifier, wherein the processing queue prioritizes the plurality of data shards as a first grouping that is processed during a round of training of the artificial intelligence classifier; pre-loading, prior to executing of the training of the artificial intelligence classifier, the plurality of data shards into a memory of a computing device that is configured to execute the training of the artificial intelligence classifier, wherein the pre-loading propagates the plurality of data shards together as the first grouping for training of the artificial intelligence classifier; and reading, from the memory, the plurality of data shards during executing of the training of the artificial intelligence classifier.
4. The method of claim 1, wherein the generating of the plurality of data shards further comprises identifying a predetermined number of files for a size of each of the plurality of data shards, and randomly selecting, as the randomized sampling, indexes associated with the predetermined number of files from the file content.
5. The method of claim 1, wherein the generating of the plurality of data shards further comprises applying preset rules to create the randomized sampling of the file content, and wherein the preset rules comprise a first rule that identifies a predetermined number of files for a size of a data shard, and a second rule that randomizes file types of the file content represented in the randomized sampling of file content.
6. The method of claim 1, wherein the processing queue further comprises a second grouping of a plurality of data shards specific to a second tenant of the application or service, and wherein the pre-loading further comprises preloading the second grouping of the plurality of data shards into the memory for reading that occurs during a second round of the training of the artificial intelligence classifier.
7. The method of claim 1, wherein each of the plurality of data shards is specific to a single tenant of the application or service, and wherein the method further comprising: executing a round of training of the artificial intelligence classifier using the plurality of data shards that are specific to the single tenant.
8. The method of claim 1, wherein the pre-loading automatically occurs based on a detection of user access to an artificial intelligence processing application or service that is configured for the training of the artificial intelligence classifier.
9. The method of claim 1, wherein the pre-loading automatically occurs based on a detection of a search query, entered into a user interface of an artificial intelligence processing application or service, that is used for the training of the artificial intelligence classifier, and wherein the method further comprising: executing a round of training of the artificial intelligence classifier based on the search query and the plurality of data shards.
10. The system of claim 2, wherein the generating of the plurality of data shards further comprises identifying a predetermined number of files for a size of each of the plurality of data shards, and randomly selecting, as the randomized sampling, indexes associated with the predetermined number of files from the file content.
11. The system of claim 2, wherein the generating of the plurality of data shards further comprises applying preset rules to create the randomized sampling of the file content, and wherein the preset rules comprise a first rule that identifies a predetermined number of files for a size of a data shard, and a second rule that randomizes file types of the file content represented in the randomized sampling of file content.
12. The system of claim 5, wherein the processing queue further comprises a second grouping of a plurality of data shards specific to a second tenant of the application or service, and wherein the pre-loading further comprises preloading the second grouping of the plurality of data shards into the memory for reading that occurs during a second round of the training of the
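The queueing and pre-loading steps recited in the claims (a processing queue that groups shards, pre-loading a grouping into memory before training, then reading from memory during a round) might look roughly like the sketch below. It is an illustration under stated assumptions, not the patented implementation: groupings are assumed to be per tenant and prioritized first-in, first-out, and a plain dictionary stands in for the training machine's memory. `ShardQueue`, `enqueue_group`, `preload_next`, `read`, and `run_round` are hypothetical names.

```python
from collections import deque


class ShardQueue:
    """Groups shards per tenant and pre-loads one grouping at a time."""

    def __init__(self):
        self._groups = deque()  # pending (tenant_id, shards) groupings, FIFO order
        self._memory = {}       # tenant_id -> pre-loaded shards (stand-in for RAM)

    def enqueue_group(self, tenant_id, shards):
        """Add one grouping of shards to the back of the queue."""
        self._groups.append((tenant_id, list(shards)))

    def preload_next(self):
        """Load the front grouping into memory before a training round starts.

        Returns the tenant id of the loaded grouping, or None if the queue is empty.
        """
        if not self._groups:
            return None
        tenant_id, shards = self._groups.popleft()
        self._memory[tenant_id] = shards
        return tenant_id

    def read(self, tenant_id):
        """Return the shards already held in memory for a tenant."""
        return self._memory[tenant_id]


def run_round(queue, tenant_id, train_step):
    """Run one training round over a pre-loaded grouping.

    train_step is a caller-supplied callable applied to each shard.
    """
    for shard in queue.read(tenant_id):
        train_step(shard)


# Usage: two tenant groupings, each pre-loaded before its own round.
queue = ShardQueue()
queue.enqueue_group("tenant-a", [["doc-1", "doc-2"], ["doc-3"]])
queue.enqueue_group("tenant-b", [["doc-9"]])
processed = []
current = queue.preload_next()
run_round(queue, current, processed.append)
```

Separating `preload_next` from `run_round` mirrors the claim language, where loading happens before training executes and training only reads from memory.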
Indexing structures · CPC title
Machine learning · CPC title
Management thereof · CPC title
Run-time optimisation · CPC title
Probabilistic graphical models, e.g. probabilistic networks · CPC title