Indexing and relaying data to hot storage

US11669507B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11669507-B2
Application number  US-202117518407-A
Country             US
Kind code           B2
Filing date         Nov 3, 2021
Priority date       May 9, 2018
Publication date    Jun 6, 2023
Grant date          Jun 6, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, performed by one or more processors, is disclosed, the method comprising receiving a stream of log data from one or more applications and indexing a plurality of different portions of the received stream to respective locations of a cold storage system. The method may also comprise storing, in an index catalog, pointers to the respective locations of the indexed portions in the cold storage system. One or more requests for log data may be received, and the method may also comprise subsequently identifying from the index catalog one or more pointers to respective indexed portions appropriate to at least part of the one or more requests, and sending the identified one or more indexed portions to one or more hot storage systems each associated with a respective search node for processing of one or more search requests.
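The flow described in the abstract can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the patented implementation: the names `IndexedPortion`, `ColdStore`, `SearchNode`, the `cold://` location scheme, the token-based index, and the idea that a request names a portion ID directly are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass
class IndexedPortion:
    portion_id: str
    lines: list
    # Toy inverted index (assumption): token -> positions of matching lines.
    token_index: dict


def index_portion(portion_id, lines):
    """Index one bounded portion of the log stream."""
    token_index = {}
    for pos, line in enumerate(lines):
        for token in set(line.lower().split()):
            token_index.setdefault(token, []).append(pos)
    return IndexedPortion(portion_id, lines, token_index)


class ColdStore:
    """Stand-in for a cold storage system (hypothetical in-memory dict)."""

    def __init__(self):
        self._objects = {}

    def put(self, portion):
        location = f"cold://{portion.portion_id}"  # made-up location scheme
        self._objects[location] = portion
        return location

    def get(self, location):
        return self._objects[location]


class SearchNode:
    """Search node with its own hot storage, filled on demand."""

    def __init__(self, cold, catalog):
        self.cold = cold
        self.catalog = catalog  # index catalog: portion_id -> cold location
        self.hot = {}

    def search(self, portion_id, token):
        if portion_id not in self.hot:
            # Hot storage cannot serve the request: follow the catalog
            # pointer and relay the indexed portion into hot storage.
            location = self.catalog[portion_id]
            self.hot[portion_id] = self.cold.get(location)
        portion = self.hot[portion_id]
        return [portion.lines[i] for i in portion.token_index.get(token.lower(), [])]


cold = ColdStore()
catalog = {}
portion = index_portion("portion-0", ["ERROR disk full", "INFO started", "ERROR timeout"])
catalog[portion.portion_id] = cold.put(portion)  # store pointer in the catalog

node = SearchNode(cold, catalog)
hits = node.search("portion-0", "error")
```

The indexing side (`index_portion`, `cold.put`) never touches the search node, which mirrors the separation between indexing and searching that the abstract and claim 1 describe. The first call to `node.search` loads the portion into hot storage; later calls for the same portion are served from `node.hot`.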

First claim

Opening claim text (preview).

The invention claimed is:

1. A method comprising: indexing, by one or more indexing nodes, a plurality of different portions of a stream of log data to obtain a plurality of indexed portions; storing the plurality of indexed portions in one or more cold storage systems; storing, in an index catalog, a pointer to a location of each of the plurality of indexed portions stored in the one or more cold storage systems; receiving, by one or more search nodes, one or more requests for log data, wherein the indexing is performed by the one or more indexing nodes independently from the receiving by the one or more search nodes; in response to determining that a particular search node of the one or more search nodes cannot process a particular request of the one or more requests based on data stored in one or more hot storage systems associated with the particular search node: identifying, from the index catalog, a pointer to a location of an indexed portion based on at least part of the particular request; and sending the indexed portion to the one or more hot storage systems associated with the particular search node; wherein the method is performed using one or more processors.

2. The method of claim 1, further comprising: monitoring the stream of log data with respect to a predetermined quantity and allocating the stream of log data to the plurality of different portions for indexing based on the predetermined quantity being reached; and wherein each portion of the plurality of different portions represents a discretely identifiable section of the stream of log data.

3. The method of claim 2, wherein the predetermined quantity is one or both of an amount of log data and a time period over which the log data is received to provide a plurality of time and/or space bounded portions.

4. The method of claim 2, further comprising, prior to the predetermined quantity being reached, temporarily indexing the stream of log data into an indexing hot storage system and storing in the index catalog a pointer to the temporarily indexed stream of log data in the indexing hot storage system.

5. The method of claim 4, wherein the temporarily indexed stream of log data is overwritten by additional temporarily indexed stream of log data subsequent to the predetermined quantity being reached.

6. The method of claim 1, wherein a number of indexing nodes in the one or more indexing nodes increases or decreases in dependence on an amount or a rate of log data in the stream of log data.

7. The method of claim 1, wherein the plurality of different portions of the stream of log data is immutable.

8. The method of claim 1, wherein the plurality of different portions of the stream of log data is time ordered.

9. The method of claim 8, wherein one or more of the plurality of indexed portions are automatically deleted from, or overwritten in, the one or more cold storage systems after a predetermined period of time.

10. The method of claim 1, further comprising storing, in the index catalog, metadata associated with each pointer, wherein the metadata is indicative of the log data stored in the plurality of indexed portions.

11. The method of claim 10, wherein the plurality of indexed portions comprises discrete lines of log data, each line conforming to a known schema, and wherein the metadata comprises a portion of the log data from one or more fields of each line defined by the known schema.

12. The method of claim 1, wherein the one or more requests for log data are received through one or more Application Programming Interfaces (API).

13. The method of claim 1, further comprising determining which of the one or more search nodes to send one or more of the plurality of indexed portions to, based on available capacity of the one or more hot storage systems associated with each of the one or more search nodes.

14. The method of claim 1, further comprising adjusting a number of allocated search nodes, of the one or more search nodes, for receiving and processing the one or more requests for log data based on a variable parameter, wherein the variable parameter is based on one or more of a number of requests for log data received over a predetermined period of time and a time for which the sent indexed portions have been stored at the one or more search nodes.

15. A non-transitory storage media storing instructions which, when executed using one or more processors, cause: indexing, by one or more indexing nodes, a plurality of different portions of a stream of log data to obtain a plurality of indexed portions; storing the plurality of indexed portions in one or more cold storage systems; storing, in an index catalog, a pointer to a location of each of the plurality of indexed portions stored in the one or more cold storage systems; receiving, by one or more search nodes, one or more requests for log data, wherein the indexing is performed by the one or more indexing nodes independently from the receiving by the one or more search nodes; in response to determining that a particular search node of the one or more search nodes cannot process a particular request of the one or more requests based on data stored in one or more hot storage systems associated with the particular search node: identifying, from the index catalog, a pointer to a location of an indexed portion based on at least part of the particular request; and sending the indexed portion to the one or more hot storage systems associated with the particular search node.

16. The non-transitory storage media of claim 15, further storing instructions which, when executed using the one or more processors, cause: monitoring the stream of log data with respect to a predetermined quantity and allocating the stream of log data to the plurality of different portions for indexing based on the predetermined quantity being reached; wherein each portion of the plurality of different portions represents a discretely identifiable section of the stream of log data; and wherein the predetermined quantity is one or both of an amount of log data and a time period over which the log data is received to provide a plurality of time and/or space bounded portions.

17. The non-transitory storage media of claim 15, further storing instructions which, when executed using the one or more processors, cause: adjusting a number of allocated search nodes, of the one or more search nodes, for receiving and processing the one or more requests for log data based on a variable parameter; and wherein the variable parameter is based on one or more of a number of requests for log data received over a predetermined period of time and a time for which the sent indexed portions have been stored at the one or more search nodes.

18. A computing system comprising: one or more processors; storage media; and instructions stored in the storage media and which, when executed using the one or more processors, cause: indexing, by one or more indexing nodes, a plurality of different portions of a stream of log data to obtain a plurality of indexed portions; storing the plurality of indexed portions in one or more cold storage systems; storing, in an index catalog, a pointer to a location of each of the plurality of indexed portions stored in the one or more cold storage systems; receiving, by one or more search nodes, one or more requests for log data, wherein the indexing is performed by the one or more indexing nodes independently from the receiving by the one or more search nodes; in response to determining that a particular search n
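Two of the dependent claims lend themselves to a short illustration: splitting the stream into portions once a predetermined quantity is reached (claims 2 and 3), and choosing a search node by available hot storage capacity (claim 13). The sketch below is a hedged example only. The function names `split_stream` and `pick_search_node`, the `(timestamp, line)` record shape, the use of a line count as the "amount of log data", and the "most free space wins" rule are assumptions made for illustration; the claims do not prescribe any of them.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

Record = Tuple[float, str]  # (timestamp in seconds, log line): assumed shape


def split_stream(
    records: Iterable[Record], max_lines: int, max_seconds: float
) -> Iterator[List[Record]]:
    """Yield portions bounded by a line count and/or a time window."""
    portion: List[Record] = []
    window_start = 0.0
    for timestamp, line in records:
        # Close the current portion once either predetermined quantity is hit.
        if portion and (
            len(portion) >= max_lines or timestamp - window_start >= max_seconds
        ):
            yield portion
            portion = []
        if not portion:
            window_start = timestamp
        portion.append((timestamp, line))
    if portion:
        yield portion  # flush the final, possibly short, portion


def pick_search_node(free_bytes: Dict[str, int], portion_size: int) -> Optional[str]:
    """Pick the node with the most free hot storage that can fit the portion.

    The selection rule is an assumption; the claim only requires that the
    choice be based on available capacity.
    """
    candidates = {name: free for name, free in free_bytes.items() if free >= portion_size}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


stream = [(0.0, "a"), (1.0, "b"), (2.0, "c"), (10.0, "d"), (11.0, "e")]
portions = [[line for _, line in p] for p in split_stream(stream, max_lines=2, max_seconds=5.0)]
chosen = pick_search_node({"node-a": 100, "node-b": 500, "node-c": 50}, portion_size=80)
```

With the sample stream, the first portion closes on the line limit and the second on the time limit, so each portion is a discretely identifiable, time-ordered section of the stream.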

Assignees

Palantir Technologies Inc

Inventors

Classifications

  • Updates performed during online database operations; commit processing

  • Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs

  • Data stream processing; Continuous queries

  • Indexing structures

  • with details for data modelling support

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11669507B2 cover?
A method, performed by one or more processors, is disclosed, the method comprising receiving a stream of log data from one or more applications and indexing a plurality of different portions of the received stream to respective locations of a cold storage system. The method may also comprise storing, in an index catalog, pointers to the respective locations of the indexed portions in the cold s…
Who is the assignee on this patent?
Palantir Technologies Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/1734. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 6, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).