Automated data-generation for event-based system

US2018032861A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2018032861-A1
Application number: US-201615224489-A
Country: US
Kind code: A1
Filing date: Jul 29, 2016
Priority date: Jul 29, 2016
Publication date: Feb 1, 2018
Grant date:

Abstract

Official abstract text for this publication.

Described herein is a technology that facilitates the production of and the use of automated datagens for event-based systems. A datagen (i.e., data-generator or data generation system) is a component, module, or subsystem of computer systems that searches, monitors, and analyzes machine data. A datagen produces events that are further processed in various ways for subsequent use (such as searching, monitoring, and analysis).

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: obtaining a training corpus that contains example structured events, each structured event having at least one data field configured to contain data derived from machine-generated data; training a datagen that is configured to generate new structured events in accordance with the example structured events of the training corpus, the training includes employment of deep-learning technique to train the datagen using the training corpus.

2. A method as recited in claim 1, wherein the deep-learning technique includes a character-based recurrent neural network.

3. A method as recited in claim 1 further comprising producing new events by the trained datagen.

4. A method as recited in claim 1, wherein the example structured events of the training corpus is a sequence of textual characters and the training of the datagen includes calculating statistical predictions based upon the sequence of the textual characters of the example structured events of the training corpus.

5. A method as recited in claim 1, wherein the example structured events of the training corpus is a sequence of textual characters and the training of the datagen includes calculating statistical predictions based upon the sequence of the textual characters of the example structured events of the training corpus, the method further comprising: producing new events by the trained datagen, the producing includes generating a sequence of characters of the new events based upon the calculated statistical predictions of the sequence of the textual characters of the example structured events of the training corpus.

6. A method as recited in claim 1, wherein the example structured events of the training corpus is a sequence of textual characters and the training of the datagen includes calculating statistical predictions based upon the sequence of the textual characters of the example structured events of the training corpus, the method further comprising: producing new events by the trained datagen, the producing includes generating a sequence of characters of the new events based upon the calculated statistical predictions of the sequence of the textual characters of the example structured events of the training corpus, wherein the generating includes selecting a next character of a new event based upon greatest likelihood of the selected character appearing next in the generated sequence as determined by the calculated statistical predictions of the sequence of the textual characters of the example structured events of the training corpus.

7. A method as recited in claim 1 further comprising: obtaining a dataset of machine-generated data; generating, by the trained datagen and in accordance with the example structured events of the training corpus, multiple events from the obtained dataset of machine-generated data.

8. A method as recited in claim 1 further comprising: obtaining a dataset of machine-generated data, wherein the dataset of machine-generated data includes a sequence of textual characters; processing the obtained dataset of machine-generated data by the trained datagen in order of the sequence of textual characters; during the processing of the obtained dataset, calculating a statistical prediction of a next yet-to-be-processed group of one or more textual characters of the sequence of the textual characters; generating, by the trained datagen, an event from the obtained dataset of machine-generated data one textual character at a time, wherein each generated character is generated based on the calculated statistical prediction of the next yet-to-be-processed group of one or more textual characters.

9. One or more computer-readable media storing instructions thereon that, when executed by one or more processors, direct the one or more processors to perform operations comprising: obtaining a training corpus that contains example structured events, each structured event having at least one data field configured to contain data derived from machine-generated data; training a datagen that is configured to generate new structured events in accordance with the example structured events of the training corpus, the training includes employment of deep-learning technique to train the datagen using the training corpus.

10. One or more computer-readable media as recited in claim 9, wherein the deep-learning technique includes a character-based recurrent neural network.

11. One or more computer-readable media as recited in claim 9, wherein the operations further comprise producing new events by the trained datagen.

12. One or more computer-readable media as recited in claim 9, wherein the example structured events of the training corpus is a sequence of textual characters and the training of the datagen includes calculating statistical predictions based upon the sequence of the textual characters of the example structured events of the training corpus, the operations further comprise: producing new events by the trained datagen, the producing includes generating a sequence of characters of the new events based upon the calculated statistical predictions of the sequence of the textual characters of the example structured events of the training corpus.

13. One or more computer-readable media as recited in claim 9, wherein the example structured events of the training corpus is a sequence of textual characters and the training of the datagen includes calculating statistical predictions based upon the sequence of the textual characters of the example structured events of the training corpus, the operations further comprise: producing new events by the trained datagen, the producing includes generating a sequence of characters of the new events based upon the calculated statistical predictions of the sequence of the textual characters of the example structured events of the training corpus, wherein the generating includes selecting a next character of a new event based upon greatest likelihood of the selected character appearing next in the generated sequence as determined by the calculated statistical predictions of the sequence of the textual characters of the example structured events of the training corpus.

14. One or more computer-readable media as recited in claim 9, the operations further comprise: obtaining a dataset of machine-generated data; generating, by the trained datagen and in accordance with the example structured events of the training corpus, multiple events from the obtained dataset of machine-generated data.

15. One or more computer-readable media as recited in claim 9, the operations further comprise: obtaining a dataset of machine-generated data, wherein the dataset of machine-generated data includes a sequence of textual characters; processing the obtained dataset of machine-generated data by the trained datagen in order of the sequence of textual characters; during the processing of the obtained dataset, calculating a statistical prediction of a next yet-to-be-processed group of one or more textual characters of the sequence of the textual characters; generating, by the trained datagen, an event from the obtained dataset of machine-generated data one textual character at a time, wherein each generated character is generated based on the calculated statistical prediction of the next yet-to-be-processed group of one or more textual characters.

16. An automated data generation system comprising: a training data handler configured to obtain a training corpus that contains exampl
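The character-by-character generation recited in the claims (statistical predictions over a sequence of textual characters, then selection of the most likely next character) can be illustrated with a small sketch. This is not the patent's implementation: it swaps in a plain character n-gram count table for the character-based recurrent neural network named in claims 2 and 10, purely to keep the example short and dependency-free. The names `train`, `generate`, `ORDER`, `START`, and `END`, as well as the sample events, are hypothetical.

```python
from collections import Counter, defaultdict

ORDER = 4        # assumed context length, in characters
START = "\x02"   # assumed padding symbol marking the start of an event
END = "\n"       # assumed symbol marking the end of an event


def train(corpus, order=ORDER):
    """Count which character follows each `order`-character context.

    Stand-in for the deep-learning training step: the counts act as the
    "statistical predictions" over the character sequence.
    """
    counts = defaultdict(Counter)
    for event in corpus:
        text = START * order + event + END
        for i in range(order, len(text)):
            counts[text[i - order:i]][text[i]] += 1
    return counts


def generate(counts, prefix="", order=ORDER, max_len=200):
    """Emit one event a character at a time, always taking the most
    frequent follower of the current context (greedy selection)."""
    out = prefix
    context = (START * order + prefix)[-order:]
    while len(out) < max_len:
        followers = counts.get(context)
        if not followers:          # unseen context: stop rather than guess
            break
        next_char = followers.most_common(1)[0][0]
        if next_char == END:       # the model predicts the event is complete
            break
        out += next_char
        context = (context + next_char)[-order:]
    return out


if __name__ == "__main__":
    # Hypothetical example events; real training corpora would be far larger.
    corpus = ["code=200", "code=200", "code=404"]
    model = train(corpus)
    print(generate(model))
    print(generate(model, prefix="code="))
```

Greedy selection always reproduces the most common pattern in the training corpus; sampling from the follower counts instead would yield more varied events. The `prefix` argument loosely mirrors the case where generation continues from already-processed characters of a dataset.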

Assignees

Splunk Inc

Inventors

Classifications

  • Recurrent networks, e.g. Hopfield networks · CPC title

  • Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals · CPC title

  • G06F40/274 (Primary)

    Converting codes to words; Guess-ahead of partial word inputs · CPC title

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title

  • Supervised learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2018032861A1 cover?
Described herein is a technology that facilitates the production of and the use of automated datagens for event-based systems. A datagen (i.e., data-generator or data generation system) is a component, module, or subsystem of computer systems that searches, monitors, and analyzes machine data. A datagen produces events that are further processed in various ways for subsequent use (such as searching, mo…
Who is the assignee on this patent?
Splunk Inc
What technology area does this patent fall under?
Primary CPC classification G06Q10/0637. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 1, 2018 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).