Archiving indexed data

US10152480B2 · US · B2

Patent metadata
Publication number: US-10152480-B2
Application number: US-201514611225-A
Country: US
Kind code: B2
Filing date: Jan 31, 2015
Priority date: Jan 31, 2015
Publication date: Dec 11, 2018
Grant date: Dec 11, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Raw data in distributed servers is divided into groups of data called buckets containing raw data that have timestamps that fall within a specific time range. When a bucket becomes inactive a server can archive the bucket to an external storage system. The external storage system containing archived data may be specified in a search query. Archived data from the external storage system is obtained, processed, and a search performed on the processed archived data using the search query.
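The bucketing and archiving behavior described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the one-hour `BUCKET_SPAN`, the `Bucket` class, the idle-time test for "inactive", and the plain dict standing in for the external storage system are all assumptions made for the example.

```python
from dataclasses import dataclass, field

BUCKET_SPAN = 3600  # hypothetical: each bucket covers one hour of event time


@dataclass
class Bucket:
    start: int                                   # start of the time range (epoch seconds)
    events: list = field(default_factory=list)   # (timestamp, raw_text) pairs
    last_write: float = 0.0                      # when an event was last added


def bucket_events(events, now):
    """Group (timestamp, raw_text) pairs into buckets keyed by time range."""
    buckets = {}
    for ts, raw in events:
        start = ts - ts % BUCKET_SPAN            # floor the timestamp to its range
        bucket = buckets.setdefault(start, Bucket(start))
        bucket.events.append((ts, raw))
        bucket.last_write = now
    return buckets


def archive_inactive(buckets, external_store, now, idle_after):
    """Move buckets with no writes for `idle_after` seconds to external_store.

    `external_store` is a dict standing in for an external storage system.
    Returns the start times of the buckets that were archived.
    """
    archived = []
    for start in sorted(buckets):                # sorted() copies the keys, so deletion is safe
        bucket = buckets[start]
        if now - bucket.last_write >= idle_after:
            external_store[start] = [raw for _, raw in bucket.events]
            del buckets[start]
            archived.append(start)
    return archived
```

A real deployment would likely combine the idle test with other triggers mentioned in the claims, such as how long a group has been stored or the total space consumed by all groups.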

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: organizing a plurality of timestamped events into groups of events, each timestamped event in the plurality of timestamped events including a portion of raw machine data associated with a timestamp, wherein the portion of raw machine data reflects activity of a data source and is produced by the data source, wherein timestamped events in a group of events have associated timestamps that fall within a specific time frame; storing the groups of events to a field-searchable data store; and archiving a first group of events and a second group of events by sending data associated with the first group of events and the second group of events to an external storage system that is external to the data store, wherein the first group of events comprises timestamped events having associated timestamps that fall within a first time frame, and the second group of events comprises timestamped events having associated timestamps that fall within a second time frame, wherein the first group of events is archived to the external storage system based, at least in part, on a time span that the first group of events has been stored in the data store.

2. The method of claim 1, wherein the sending the data associated with the first group of events sends raw data associated with the first group of events to the external storage system.

3. The method of claim 1, wherein the sending the data associated with the first group of events sends timestamped raw data associated with the first group of events to the external storage system.

4. The method of claim 1, wherein the sending the data associated with the first group of events sends the timestamped events in the first group of events to the external storage system.

5. The method of claim 1, wherein the sending the data associated with the first group of events sends event data for each timestamped event in the first group of events to the external storage system.

6. The method of claim 1, wherein the external storage system is a Hadoop distributed file system.

7. The method of claim 1, wherein the external storage system is a Hadoop compatible file system.

8. The method of claim 1, wherein the archiving step further comprises: sending the data associated with the first group of events to the external storage system after a total memory space consumed by all groups of events exceeds a limit.

9. The method of claim 1, further comprising: receiving a search query, the search query specifying an external storage system storing the data associated with the group of events; sending a request to the external storage system for the data associated with the group of events; receiving the data associated with the group of events from the external storage system; parsing the received data into a plurality of second timestamped events; applying a late binding schema to the plurality of second timestamped events; searching the plurality of second timestamped events using parameters from the search query; causing results from the searching the plurality of second timestamped events to be displayed.

10. The method of claim 1, further comprising: wherein the sending the data associated with the group of events sends raw data associated with the group of events to the external storage system; receiving a search query, the search query specifying an external storage system storing the raw data associated with the group of events; sending a request to the external storage system for the raw data associated with the group of events; receiving the raw data associated with the group of events from the external storage system; parsing the received raw data associated with the group of events into a plurality of second timestamped events, each timestamped event in the plurality of second timestamped events comprising at least a portion of the parsed received raw data associated with the group of events; applying a late binding schema to the plurality of second timestamped events; searching the plurality of second timestamped events using parameters from the search query; causing results from the searching the plurality of second timestamped events to be displayed.

11. The method of claim 1, further comprising: receiving a path name for the external storage system via a graphical user interface.

12. The method of claim 1, further comprising: limiting bandwidth used to send the data associated with the first group of events to the external storage system by lowering a rate of data being sent to the external storage system.

13. The method of claim 1, further comprising, prior to archiving the first group of events, determining that the first group of events is to be archived.

14. The method of claim 13, wherein determining that the first group of events is to be archived comprises determining that the first group of events is inactive.

15. The method of claim 13, wherein determining that the first group of events is to be archived comprises determining that the first group of events is to be archived based on a user setting.

16. A non-transitory computer readable storage medium, storing software instructions, which when executed by one or more processors cause performance of: organizing a plurality of timestamped events into groups of events, each timestamped event in the plurality of timestamped events including a portion of raw machine data associated with a timestamp, wherein the portion of raw machine data reflects activity of a data source and is produced by the data source, wherein timestamped events in a group of events have associated timestamps that fall within a specific time frame; storing the groups of events to a field-searchable data store; and archiving a first group of events and a second group of events by sending data associated with the first group of events and the second group of events to an external storage system that is external to the data store, wherein the first group of events comprises timestamped events having associated timestamps that fall within a first time frame, and the second group of events comprises timestamped events having associated timestamps that fall within a second time frame, wherein the first group of events is archived to the external storage system based, at least in part, on a time span that the first group of events has been stored in the data store.

17. The non-transitory computer readable storage medium of claim 16, wherein the sending the data associated with the first group of events sends raw data associated with the first group of events to the external storage system.

18. The non-transitory computer readable storage medium of claim 16, wherein the sending the data associated with the first group of events sends event data for each timestamped event in the first group of events to the external storage system.

19. The non-transitory computer readable storage medium of claim 16, wherein the external storage system is a Hadoop distributed file system.

20. The non-transitory computer readable storage medium of claim 16, wherein the external storage system is a Hadoop compatible file system.

21. The non-transitory computer readable storage medium of claim 16, further comprising: receiving a search query, the search query specifying an external storage system storing the data associated with the group of events; sending a request to the external storage system for the data associated with the group of events; receiving the data associated with the grou
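The search flow recited in claims 9 and 10 (fetch archived raw data, parse it into timestamped events, apply a late binding schema, then search) might look something like the sketch below. It is only an illustration under stated assumptions: the one-event-per-line format with a leading epoch timestamp, the regex-based schema, the function names, and the dict standing in for the external storage system are all hypothetical and do not come from the patent.

```python
import re


def parse_events(raw_blob):
    """Split archived raw text into (timestamp, raw_line) events.

    Assumes one event per line, each starting with an integer epoch timestamp.
    """
    events = []
    for line in raw_blob.splitlines():
        if not line.strip():
            continue                              # skip blank lines
        ts, _, _rest = line.partition(" ")
        events.append((int(ts), line))
    return events


def apply_schema(event, schema):
    """Extract fields from the raw text at search time (late binding)."""
    ts, raw = event
    fields = {"_time": ts, "_raw": raw}
    for name, pattern in schema.items():
        match = re.search(pattern, raw)
        if match:
            fields[name] = match.group(1)         # first capture group is the field value
    return fields


def search_archive(external_store, path, schema, **criteria):
    """Fetch archived raw data at `path`, bind the schema, and filter by criteria."""
    raw_blob = external_store[path]               # stands in for the request to external storage
    rows = (apply_schema(event, schema) for event in parse_events(raw_blob))
    return [row for row in rows
            if all(row.get(key) == value for key, value in criteria.items())]
```

Because the schema is applied only when the query runs, the archived data can stay as raw text, and a different set of field patterns can be used for a later search without rewriting the archive.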

Assignees

Splunk Inc

Inventors

Classifications

  • Physics · mapped topic

  • G06F16/113 · Primary

    Details of archiving (lifecycle management in storage systems G06F3/0649; point-in-time backing up or restoration of persistent data G06F11/1446) · CPC title

  • Management thereof · CPC title

  • Query translation · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10152480B2 cover?
Raw data in distributed servers is divided into groups of data called buckets containing raw data that have timestamps that fall within a specific time range. When a bucket becomes inactive a server can archive the bucket to an external storage system. The external storage system containing archived data may be specified in a search query. Archived data from the external storage system is obtai…
Who is the assignee on this patent?
Splunk Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/30073. Mapped technology areas include Physics.
When was this patent published?
Published Dec 11, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).