Search filtered file system using secondary storage, including multi-dimensional indexing and searching of archived files
US-9367548-B2 · Jun 14, 2016 · US
US10846266B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10846266-B2 |
| Application number | US-201816130873-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 13, 2018 |
| Priority date | Sep 14, 2017 |
| Publication date | Nov 24, 2020 |
| Grant date | Nov 24, 2020 |
Official abstract text for this publication.
An improved content indexing (CI) system is disclosed herein. For example, the improved CI system may include a distributed architecture of client computing devices, media agents, a single backup and CI database, and a pool of servers. After a file backup occurs, the backup and CI database may include file metadata indices and other information associated with backed up files. Servers in the pool of servers may, in parallel, query the backup and CI database for a list of files assigned to the respective server that have not been content indexed. The servers may then request a media agent to restore the assigned files from secondary storage and provide the restored files to the servers. The servers may then content index the received restored files. Once the content indexing is complete, the servers can send the content index information to the backup and CI database for storage.
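The workflow in the abstract can be illustrated with a minimal sketch. It assumes an in-memory stand-in for the backup and CI database, a dictionary standing in for secondary storage, and a naive whitespace tokenizer as the "content indexing" step. All class, function, and file names below are hypothetical and do not come from the patent.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical secondary storage: file name -> backed-up content.
SECONDARY_STORAGE = {
    "a.txt": "quarterly backup report",
    "b.txt": "restore media agent logs",
    "c.txt": "content indexing notes",
}


class BackupAndCIDatabase:
    """Illustrative stand-in for the single backup and CI database."""

    def __init__(self, assignments):
        self.assignments = dict(assignments)  # file name -> assigned server id
        self.content_index = {}               # file name -> sorted keywords
        self._lock = threading.Lock()

    def unindexed_files_for(self, server_id):
        # Files assigned to this server that have not been content indexed.
        with self._lock:
            return [name for name, sid in self.assignments.items()
                    if sid == server_id and name not in self.content_index]

    def store_index(self, name, keywords):
        with self._lock:
            self.content_index[name] = keywords


def media_agent_restore(names):
    """Pretend to restore the named files from secondary storage."""
    return {name: SECONDARY_STORAGE[name] for name in names}


def run_server(server_id, db):
    """One server in the pool: query, restore, index, store."""
    todo = db.unindexed_files_for(server_id)
    restored = media_agent_restore(todo)
    for name, text in restored.items():
        db.store_index(name, sorted(set(text.split())))
    return len(todo)


db = BackupAndCIDatabase({"a.txt": 0, "b.txt": 1, "c.txt": 0})
with ThreadPoolExecutor(max_workers=2) as pool:
    # Servers query the database and index their assigned files in parallel.
    counts = list(pool.map(lambda sid: run_server(sid, db), [0, 1]))
```

After the run, each server has indexed only the files assigned to it, and a second query for either server returns an empty list because nothing is left unindexed.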
Opening claim text (preview).
What is claimed is:

1. A networked information management system for content indexing emails, the networked information management system comprising: a content indexing proxy having one or more first hardware processors, wherein the content indexing proxy is configured with first computer-executable instructions that, when executed, cause the content indexing proxy to: receive, by a first thread executing on the content indexing proxy, identification of emails assigned to the content indexing proxy by a master content indexing proxy, wherein the identified emails are each associated with an email page in a plurality of email pages, and wherein an email page in the plurality of email pages comprises multiple emails; and for each email page in the plurality of email pages, transmit, by the first thread to an indexing storage system, a query for secondary copy location data corresponding to the emails associated with the respective email page, receive, by the first thread, the secondary copy location data, transmit, by a second thread executing on the content indexing proxy, an instruction to a first computing device that executes a media agent to restore secondary copies stored at locations indicated by the secondary copy location data, receive, by a third thread executing on the content indexing proxy, an acknowledgment from the first computing device that a restoration of the secondary copies is complete, and transmit, by a fourth thread executing on the content indexing proxy, a request to content index the restored secondary copies; and one or more computing devices in communication with the content indexing proxy, wherein the one or more computing devices each have one or more second hardware processors, wherein the one or more computing devices are configured with second computer-executable instructions that, when executed, cause the one or more computing devices to content index the restored secondary copies.

2. The networked information management system of claim 1, wherein the first computer-executable instructions, when executed, further cause the content indexing proxy to simultaneously transmit an instruction to the first computing device to restore secondary copies of emails associated with a first email page in the plurality of email pages and transmit a query for secondary copy location data corresponding to emails associated with a second email page in the plurality of email pages.

3. The networked information management system of claim 1, wherein the first computer-executable instructions, when executed, further cause the content indexing proxy to: for an attachment file associated with a first email in a first email page in the plurality of email pages, transmit, by the first thread to the indexing storage system, a query for secondary copy location data corresponding to the attachment file; receive, by the first thread, the secondary copy location data corresponding to the attachment file; transmit, by the second thread, an instruction to the first computing device to restore a secondary copy of the attachment file stored at a location indicated by the secondary copy location data corresponding to the attachment file; receive, by the third thread, an acknowledgment from the first computing device that a restoration of the secondary copy of the attachment file is complete; and transmit, by the fourth thread, a request to content index the restored secondary copy of the attachment file.

4. The networked information management system of claim 3, wherein the secondary copy of the attachment file is stored separately from a secondary copy of the first email in a secondary storage device.

5. The networked information management system of claim 1, wherein the secondary copy location data comprises at least one of logical paths to secondary copies stored in a secondary storage device or offsets indicating where the secondary copies are stored in the secondary storage device.

6. The networked information management system of claim 1, wherein the emails assigned to the content indexing proxy are emails that have not yet been content indexed.

7. A networked information management system for content indexing emails, the networked information management system comprising: a content indexing proxy having one or more first hardware processors, wherein the content indexing proxy is configured with first computer-executable instructions that, when executed, cause the content indexing proxy to: receive, by a first thread executing on the content indexing proxy, identification of emails assigned to the content indexing proxy by a master content indexing proxy, wherein the identified emails are each associated with an email page in a plurality of email pages; and for each email page in the plurality of email pages, transmit, by the first thread to an indexing storage system, a query for secondary copy location data corresponding to the emails associated with the respective email page, receive, by the first thread, the secondary copy location data, transmit, by a second thread executing on the content indexing proxy, an instruction to a first computing device that executes a media agent to restore secondary copies stored at locations indicated by the secondary copy location data, receive, by a third thread executing on the content indexing proxy, an acknowledgment from the first computing device that a restoration of the secondary copies is complete, and transmit, by a fourth thread executing on the content indexing proxy, a request to content index the restored secondary copies; and one or more computing devices in communication with the content indexing proxy, wherein the one or more computing devices each have one or more second hardware processors, wherein the one or more computing devices are configured with second computer-executable instructions that, when executed: cause the one or more computing devices to content index the restored secondary copies; and extract one or more keywords and generate one or more previews using the restored secondary copies.

8. The networked information management system of claim 7, wherein the second computer-executable instructions, when executed, further cause the one or more computing devices to store the one or more keywords and the one or more previews in different databases.

9. The networked information management system of claim 7, wherein the second computer-executable instructions, when executed, further cause the one or more computing devices to store the one or more keywords and a path to a storage location of the one or more previews in a backup and content indexing database.

10. The networked information management system of claim 1, wherein the restored secondary copies are in a markup language format.

11. A computer-implemented method for content indexing emails, the computer-implemented method comprising: receiving, by a first thread executing on a content indexing proxy, identification of emails assigned to the content indexing proxy by a master content indexing proxy, wherein the identified emails are each associated with an email page in a plurality of email pages, and wherein an email page in the plurality of email pages comprises multiple emails; and for each email page in the plurality of email pages, transmitting, by the first thread to an indexing storage system, a query for secondary copy location data corresponding to the emails associated with the respective email page, receiving, by the first thread, the secondary copy location data, transmitting, by a second thread executing on the content indexing proxy, an instruction to a first computing device that executes a media agent to restore secondary
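The four-thread arrangement recited in claim 1 can be pictured as a staged pipeline. The sketch below is one possible reading, not the patented implementation. It assumes queues as the hand-off between threads, a dictionary standing in for the indexing storage system, and a function standing in for the media agent. Every name, path, and offset is hypothetical.

```python
import queue
import threading

# Hypothetical email pages, each comprising multiple email ids.
PAGES = [["e1", "e2"], ["e3", "e4"]]
# Hypothetical indexing storage system: email id -> (logical path, offset).
LOCATIONS = {e: (f"/secondary/{e}", i * 512)
             for i, e in enumerate(["e1", "e2", "e3", "e4"])}

DONE = object()  # sentinel marking the end of the page stream
to_restore, acks, to_index = queue.Queue(), queue.Queue(), queue.Queue()
index_requests = []  # requests "transmitted" by the fourth thread


def media_agent(page_no, locations):
    # Stand-in for the media agent: "restores" the copies, then acknowledges.
    acks.put((page_no, [path for path, _offset in locations]))


def locate():  # first thread: query location data for each page
    for page_no, page in enumerate(PAGES):
        to_restore.put((page_no, [LOCATIONS[e] for e in page]))
    to_restore.put(DONE)


def instruct_restore():  # second thread: send restore instructions
    while (item := to_restore.get()) is not DONE:
        media_agent(*item)
    acks.put(DONE)


def receive_acks():  # third thread: receive restore acknowledgments
    while (item := acks.get()) is not DONE:
        to_index.put(item)
    to_index.put(DONE)


def request_indexing():  # fourth thread: request content indexing
    while (item := to_index.get()) is not DONE:
        index_requests.append(item)


threads = [threading.Thread(target=fn)
           for fn in (locate, instruct_restore, receive_acks, request_indexing)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because each stage only blocks on its own queue, the first thread is free to query locations for a second page while the second thread is still issuing the restore instruction for the first page, which is the kind of overlap claim 2 describes.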
Computer-aided management of electronic mailing [e-mailing] · CPC title
Handling conversation history, e.g. grouping of messages in sessions or threads · CPC title
Storing data temporarily at an intermediate stage, e.g. caching · CPC title
using de-duplication of the data · CPC title
Backup restoration techniques · CPC title