Site-based search affinity

US9130971B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9130971-B2
Application number: US-201414266812-A
Country: US
Kind code: B2
Filing date: Apr 30, 2014
Priority date: May 15, 2012
Publication date: Sep 8, 2015
Grant date: Sep 8, 2015

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

According to various embodiments, techniques are described for managing data within a multi-site clustered data intake and query system. A data intake and query system as described herein generally refers to a system for collecting, retrieving, and analyzing data. In this context, a clustered data intake and query system generally refers to a system environment that is configured to provide data redundancy and other features that improve the availability of data stored by the system. For example, a clustered data intake and query system may be configured to store multiple copies of data stored by the system across multiple components such that recovery from a failure of one or more of the components is possible by using copies of the data stored elsewhere in the cluster.
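The redundancy idea in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the round-robin placement, the function names `place_copies` and `surviving_copies`, and the indexer names are all assumptions chosen for illustration. It only shows that when every subset of data is stored on more than one component, the loss of a single component still leaves a copy to recover from.

```python
def place_copies(subset_ids, indexers, copies):
    """Assign each data subset to `copies` distinct indexers (round-robin).

    Hypothetical placement policy, used only to illustrate redundancy.
    """
    if copies > len(indexers):
        raise ValueError("cannot place more copies than there are indexers")
    placement = {}
    for i, subset in enumerate(subset_ids):
        placement[subset] = [indexers[(i + k) % len(indexers)] for k in range(copies)]
    return placement


def surviving_copies(placement, failed):
    """Return, per subset, the indexers that still hold a copy after failures."""
    return {
        subset: [idx for idx in holders if idx not in failed]
        for subset, holders in placement.items()
    }


# Three subsets, three indexers, two copies of each subset.
placement = place_copies(["s1", "s2", "s3"], ["idx-a", "idx-b", "idx-c"], copies=2)

# Simulate the failure of one indexer and check that every subset survives.
after_failure = surviving_copies(placement, failed={"idx-b"})
recoverable = all(after_failure.values())
```

With two copies per subset, any single indexer failure leaves at least one holder for each subset, so `recoverable` stays true; with `copies=1` the same failure would lose data.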

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving, at an indexer, a set of search affinity identifiers; wherein each search affinity identifier of the set of search affinity identifiers indicates, for a particular subset of data accessible to the indexer and for a particular site of a plurality of sites from which a query may originate, whether the indexer has primary responsibility for responding to queries originating from the particular site based on data from the particular subset of data; wherein each site of the plurality of sites represents a user-specified grouping of one or more computing resources corresponding to a particular geographic location; receiving, from a first search head located at a first site, (i) a first query to search a subset of data accessible to the indexer, and (ii) a first site identifier identifying the first site at which the first search head is located; determining, based on both the first site identifier and a particular search affinity identifier of the set of search affinity identifiers, that the indexer is to respond to the first query with a result from searching the subset of data; in response to determining that the indexer is to respond to the first query, sending, to the first search head, the result from searching the subset of data.

2. The method of claim 1, wherein determining that the indexer is to respond to the first query with a result from searching the subset of data includes determining that the indexer has primary responsibility for responding to queries for the subset of data for the first site.

3. The method of claim 1, further comprising: receiving, at the indexer, a second query from a second search head to search the subset of data, the second query including a second site identifier identifying a second site at which the second search head is located; determining, based on both the second site identifier and the particular search affinity identifier of the set of search affinity identifiers, that the indexer is not to respond to the second query with a result from searching the subset of data.

4. The method of claim 1, wherein the particular search affinity identifier is a bitmask, and wherein each digit of the bitmask represents a particular site of the plurality of sites from which a query may originate.

5. The method of claim 1, further comprising: receiving, at the indexer, a second query from a second search head to search the subset of data, the second query including a second site identifier identifying a second site at which the second search head is located; determining, based on both the second site identifier and the particular search affinity identifier of the set of search affinity identifiers, that the indexer is not to respond to the second query with a result from searching the subset of data; wherein the first query and the second query are identical.

6. The method of claim 1, further comprising: receiving, at the indexer, a second query from a second search head to search the subset of data, the second query including a second site identifier identifying a second site at which the second search head is located; determining, based on both the second site identifier and the particular search affinity identifier of the set of search affinity identifiers, that the indexer is not to respond to the second query with a result from searching the subset of data; wherein the first query and the second query are different.

7. The method of claim 1, further comprising: receiving, at the indexer, raw data; separating the raw data into a plurality of events included in the subset of data; determining, for each event in the plurality of events, a time stamp; and storing the subset of data in a data store.

8. The method of claim 1, further comprising: receiving, at the indexer, raw data; separating the raw data into a plurality of events included in the subset of data; storing the subset of data in a data store; identifying a replication factor that indicates a number of times that the subset of data is to be replicated; and sending the subset of data to a number of other indexers, wherein the number corresponds to the replication factor.

9. The method of claim 1, further comprising: receiving, at the indexer, raw data; separating the raw data into a plurality of events included in the subset of data; storing the subset of data in a data store; identifying a site replication factor that indicates a number of sites at which the subset of data is to be replicated; and sending the subset of data to second indexers located at the number of sites.

10. The method of claim 1, further comprising: wherein the set of search affinity identifiers is associated with a first generation identifier; receiving, at the indexer, a second set of search affinity identifiers associated with a second generation identifier.

11. The method of claim 1, further comprising: wherein the indexer stores a plurality of sets of search affinity identifiers, and wherein each set of search affinity identifiers of the plurality of sets of search affinity identifiers is associated with a generation identifier; receiving, from the first search head, a particular generation identifier identifying a particular set of search affinity identifiers of the plurality of sets of search affinity identifiers.

12. One or more non-transitory computer-readable storage media, storing software instructions, which when executed by one or more processors cause performance of steps of: receiving, at an indexer, a set of search affinity identifiers; wherein each search affinity identifier of the set of search affinity identifiers indicates, for a particular subset of data accessible to the indexer and for a particular site of a plurality of sites from which a query may originate, whether the indexer has primary responsibility for responding to queries originating from the particular site based on data from the particular subset of data; wherein each site of the plurality of sites represents a user-specified grouping of one or more computing resources corresponding to a particular geographic location; receiving, from a first search head located at a first site, (i) a first query to search a subset of data accessible to the indexer, and (ii) a first site identifier identifying the first site at which the first search head is located; determining, based on both the first site identifier and a particular search affinity identifier of the set of search affinity identifiers, that the indexer is to respond to the first query with a result from searching the subset of data; in response to determining that the indexer is to respond to the first query, sending, to the first search head, the result from searching the subset of data.

13. The one or more non-transitory computer-readable storage media of claim 12, wherein determining that the indexer is to respond to the first query with a result from searching the subset of data includes determining that the indexer has primary responsibility for responding to queries for the subset of data for the first site.

14. The one or more non-transitory computer-readable storage media of claim 12, wherein the instructions, when executed by the one or more computing devices, further cause performance of: receiving, at the indexer, a second query from a second search head to search the subset of data, the second query including a second site identifier identifying a second site at which the second search head is located; determining, based on both the second site identifier and the particular search affinity identifier of the set of
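
The decision described in claims 1 through 4, 10 and 11 can be sketched in a few lines. This is an illustrative reading of the claim language, not the patented implementation: the `Indexer` class, the `SITES` list, the helper `make_mask`, and all identifiers below are hypothetical. Each search affinity identifier is modeled as a bitmask with one bit per site, and each set of identifiers is stored under a generation identifier.

```python
SITES = ["site1", "site2", "site3"]  # hypothetical site identifiers


def make_mask(primary_for_sites):
    """Build a bitmask with one bit per site; bit i set means
    'this indexer is primary for queries originating from SITES[i]'."""
    mask = 0
    for site in primary_for_sites:
        mask |= 1 << SITES.index(site)
    return mask


class Indexer:
    def __init__(self, data):
        self.data = data      # subset id -> list of event strings
        self.affinity = {}    # generation id -> {subset id -> bitmask}

    def receive_affinity(self, generation, masks):
        """Store a set of search affinity identifiers under a generation id."""
        self.affinity[generation] = dict(masks)

    def should_respond(self, subset, site, generation):
        """Decide from the site identifier and the affinity bitmask."""
        mask = self.affinity[generation].get(subset, 0)
        return bool(mask & (1 << SITES.index(site)))

    def search(self, subset, term, site, generation):
        """Return matching events, or None when another indexer is primary."""
        if not self.should_respond(subset, site, generation):
            return None
        return [event for event in self.data[subset] if term in event]


indexer = Indexer({"subset-1": ["error disk full", "login ok"]})
indexer.receive_affinity(1, {"subset-1": make_mask(["site1"])})

# The same query gets a result or no result depending on the originating site.
from_site1 = indexer.search("subset-1", "error", site="site1", generation=1)
from_site2 = indexer.search("subset-1", "error", site="site2", generation=1)
```

In this sketch the query from `site1` is answered and the identical query from `site2` is not, which mirrors the situation in claim 5. Keeping masks per generation lets a search head name the set of identifiers it expects, as claim 11 describes.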

Assignees

Splunk Inc

Inventors

Classifications

  • Redundant storage or storage space (G06F11/2056 takes precedence) · CPC title

  • in relation to availability · CPC title

  • Geographical information databases · CPC title

  • Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title

  • Synchronous replication · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9130971B2 cover?
According to various embodiments, techniques are described for managing data within a multi-site clustered data intake and query system. A data intake and query system as described herein generally refers to a system for collecting, retrieving, and analyzing data. In this context, a clustered data intake and query system generally refers to a system environment that is configured to provide dat…
Who is the assignee on this patent?
Splunk Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/24575. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 8, 2015 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).