What technology area does this patent fall under?

Primary CPC classification G06F16/2425. Mapped technology areas include Physics.

When was this patent published?

Publication date: Dec 8, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Dynamically converting search-time fields to ingest-time fields

US2016357809A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2016357809-A1
Application number	US-201514728292-A
Country	US
Kind code	A1
Filing date	Jun 2, 2015
Priority date	Jun 2, 2015
Publication date	Dec 8, 2016
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Large amounts of unstructured or semi-structured log data generated by software and infrastructure components of a computing system are processed to identify anomalies and potential problems within the computing system. Stored log messages may be queried and analyzed according to dynamic fields constructed from the content of the log messages. As time goes on, the dynamic fields may be converted into static fields which are extracted and indexed at the time of ingestion of the log messages.
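The difference between a dynamic (search-time) field and a static (ingest-time) field can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the class names `SearchTimeStore` and `IngestTimeStore`, the `status` field, and the use of a regular expression as the parsing rule.

```python
import re

# Hypothetical parsing rule for a "status" field (illustrative only).
STATUS_RULE = re.compile(r"status=(\d{3})")


def extract(rule, message):
    """Apply a parsing rule to one raw log message; return None if no match."""
    match = rule.search(message)
    return match.group(1) if match else None


class SearchTimeStore:
    """Stores raw messages only; the field is extracted when a query is issued."""

    def __init__(self):
        self.messages = []

    def ingest(self, message):
        self.messages.append(message)

    def query(self, rule, value):
        # Every stored message is parsed again for each query.
        return [m for m in self.messages if extract(rule, m) == value]


class IngestTimeStore:
    """Extracts the field while storing and keeps an index: value -> positions."""

    def __init__(self, rule):
        self.rule = rule
        self.messages = []
        self.index = {}

    def ingest(self, message):
        position = len(self.messages)
        self.messages.append(message)
        value = extract(self.rule, message)
        if value is not None:
            self.index.setdefault(value, []).append(position)

    def query(self, value):
        # No parsing at query time; the index is consulted directly.
        return [self.messages[i] for i in self.index.get(value, [])]


logs = ["GET /a status=200", "GET /b status=500", "POST /c status=200"]
dynamic_store = SearchTimeStore()
static_store = IngestTimeStore(STATUS_RULE)
for line in logs:
    dynamic_store.ingest(line)
    static_store.ingest(line)
```

Both stores return the same messages for the same field value. The sketch only shows where the extraction cost is paid: on every query for the dynamic field, once per message at ingestion for the static field.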

First claim

Opening claim text (preview).

What is claimed is:

1. A method for processing semi-structured data comprising: storing a first plurality of log messages in a first data store during a first time period; responsive to receiving a first query having a field, extracting field values for the field from the first plurality of log messages at the time of issuance of the first query; storing a second plurality of log messages in a second data store during a second time period subsequent to the first time period; updating an index of the second data store for the field extracted from the second plurality of log messages at the time of storing the second plurality of log messages; and responsive to receiving a second query having the field during the second time period, retrieving field values for the field from the index of the second data store.

2. The method of claim 1, further comprising: generating a definition of the field comprising a field name and one or more parsing rules; and wherein the field values for the field are extracted from the first and second plurality of log messages according to the definition of the field.

3. The method of claim 2, wherein updating the index of the second data store for the field extracted from the second plurality of log messages at the time of storing the second plurality of log messages further comprises: storing the field values extracted from the second plurality of log messages under an internal column for the field in the index of the second data store, wherein a name of the internal column is based on a transformation of the field name and the parsing rule of the definition of the field.

4. The method of claim 2, further comprising: modifying, in response to user input, the parsing rule associated with the field; receiving a third query for the field during the second time period; responsive to determining that the index of the second data store does not contain the field, extracting field values for the field from the second plurality of log messages at the time of issuance of the third query.

5. The method of claim 4, wherein determining that the index of the second data store does not contain the field further comprises: determining that an internal column for the field does not exist in the index of the second data store having a name based on a transformation of the field name and the modified parsing rule.

6. The method of claim 1, further comprising: responsive to determining the first data store has reached a threshold size: modifying the first data store to be read-only; instantiating the second data store for storing the second plurality of log messages; selecting the field from a plurality of saved fields for conversion from a search-time field to an ingestion-time field based on usage of the field in prior time periods; and generating the index of the second data store having the field as a column in the index.

7. The method of claim 1, further comprising: selecting an ingestion-time field from a plurality of saved fields for conversion to a search-time field based on usage of the ingestion-time field in the first time period, wherein the ingestion-time field is a column in an index of the first data store; and generating the index of the second data store not having the ingestion-time field as a column in the index of the second data store.

8. The method of claim 1, wherein the second query specifies a time range including the first and second time periods, and wherein responsive to receiving the second query having the field during the second time period, retrieving field values for the field from the index of the second data store further comprises: splitting the second query into a first sub-query of the first data store and a second sub-query of the second data store; and combining a return set comprised of a first set of field values extracted from the first plurality of log messages at the time of issuance of the first sub-query and a second set of field values retrieved from the index of the second data store.

9. A non-transitory computer-readable storage medium comprising instructions that, when executed in a computing device, process semi-structured data, by performing the steps of: storing a first plurality of log messages in a first data store during a first time period; responsive to receiving a first query having a field, extracting field values for the field from the first plurality of log messages at the time of issuance of the first query; storing a second plurality of log messages in a second data store during a second time period subsequent to the first time period; updating an index of the second data store for the field extracted from the second plurality of log messages at the time of storing the second plurality of log messages; and responsive to receiving a second query having the field during the second time period, retrieving field values for the field from the index of the second data store.

10. The non-transitory computer-readable storage medium of claim 9, wherein the steps further comprise: generating a definition of the field comprising a field name and one or more parsing rules; wherein the field values for the field are extracted from the first and second plurality of log messages according to the definition of the field.

11. The non-transitory computer-readable storage medium of claim 10, wherein updating the index of the second data store for the field extracted from the second plurality of log messages at the time of storing the second plurality of log messages further comprises: storing the field values extracted from the second plurality of log messages under an internal column for the field in the index of the second data store, wherein a name of the internal column is based on a transformation of the field name and the parsing rule of the definition of the field.

12. The non-transitory computer-readable storage medium of claim 10, wherein the steps further comprise: modifying, in response to user input, the parsing rule associated with the field; receiving a third query for the field during the second time period; responsive to determining that an internal column for the field does not exist in the index of the second data store having a name based on a transformation of the field name and the modified parsing rule, extracting field values for the field from the second plurality of log messages at the time of issuance of the third query.

13. The non-transitory computer-readable storage medium of claim 9, wherein the steps further comprise, responsive to determining the first data store has reached a threshold size: modifying the first data store to be read-only; instantiating the second data store for storing the second plurality of log messages; selecting the field from a plurality of saved fields for conversion from a search-time field to an ingestion-time field based on usage of the field in prior time periods; and generating the index of the second data store having the field as a column in the index.

14. The non-transitory computer-readable storage medium of claim 9, wherein the steps further comprise: selecting an ingestion-time field from a plurality of saved fields for conversion to a search-time field based on usage of the ingestion-time field in the first time period, wherein the ingestion-time field is a column in an index of the first data store; and generating the index of the second data store not having the ingestion-time field as a column in the index of the second data store.

15. The non-transitory computer-readable storag
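The interplay of the internal column name, the fallback after a parsing rule is modified, and the splitting of a query across both data stores (claims 3 to 5 and 8) might be sketched as follows. This is an illustrative sketch under assumptions, not the patented implementation: the claims only say the column name is "based on a transformation of the field name and the parsing rule", so the SHA-256 based `column_name` helper, the `DataStore` class, the regex parsing rules, and `query_across` are all hypothetical.

```python
import hashlib
import re


def column_name(field_name, parsing_rule):
    """Hypothetical transformation: field name plus a short hash of the rule."""
    digest = hashlib.sha256(parsing_rule.encode("utf-8")).hexdigest()[:8]
    return f"{field_name}__{digest}"


def parse(parsing_rule, message):
    """Extract the first capture group of the rule, or None if it does not match."""
    match = re.search(parsing_rule, message)
    return match.group(1) if match else None


class DataStore:
    """Holds raw messages plus optional ingest-time columns (illustrative)."""

    def __init__(self, indexed_fields=()):
        # indexed_fields: (field_name, parsing_rule) pairs extracted at ingestion.
        self.messages = []
        self.indexed_fields = list(indexed_fields)
        self.columns = {column_name(n, r): [] for n, r in self.indexed_fields}

    def ingest(self, message):
        self.messages.append(message)
        for name, rule in self.indexed_fields:
            self.columns[column_name(name, rule)].append(parse(rule, message))

    def field_values(self, name, rule):
        """Return (values, source); source tells whether the index was used."""
        column = column_name(name, rule)
        if column in self.columns:
            return list(self.columns[column]), "index"
        # No column under this name (never indexed, or the rule was modified):
        # fall back to extracting from the raw messages at query time.
        return [parse(rule, m) for m in self.messages], "search-time"


def query_across(stores, name, rule):
    """Run one sub-query per data store and combine the returned values."""
    combined = []
    for store in stores:
        values, _source = store.field_values(name, rule)
        combined.extend(values)
    return combined


user_rule = r"user=(\w+)"
first_store = DataStore()                        # field is search-time here
second_store = DataStore([("user", user_rule)])  # field is ingest-time here
first_store.ingest("login user=alice")
second_store.ingest("login user=bob")
second_store.ingest("logout user=carol id=7")
```

Because the rule is part of the column name, a modified rule produces a different name, no matching column is found, and the store falls back to search-time extraction without serving stale indexed values. Hashing is only one possible transformation with that property.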

Assignees

VMware Inc

Inventors

Classifications

G06F16/2425 (Primary)
Iterative querying; Query formulation based on the results of a preceding query · CPC title
G06F16/2471
Distributed queries · CPC title
G06F16/2272
Management thereof · CPC title
G06F17/30395 (Primary)
Physics · mapped topic
G06F17/30545
Physics · mapped topic

Patent family

Related publications grouped by family.

Patent family ID: 57451508

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016357809A1 cover?: Large amounts of unstructured or semi-structured log data generated by software and infrastructure components of a computing system are processed to identify anomalies and potential problems within the computing system. Stored log messages may be queried and analyzed according to dynamic fields constructed from the content of the log messages. As time goes on, the dynamic fields may be converte…
Who is the assignee on this patent?: VMware Inc

Related patents

Monitoring overall service-level performance using an aggregate key performance indicator derived from machine data

Configuration replication in a search head cluster

Thresholds for key performance indicators derived from machine data
