Processing event log data
US-10275496-B2 · Apr 30, 2019 · US
US10929763B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10929763-B2 |
| Application number | US-201715684293-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 23, 2017 |
| Priority date | Aug 26, 2016 |
| Publication date | Feb 23, 2021 |
| Grant date | Feb 23, 2021 |
Official abstract text for this publication.
A heterogeneous log pattern editing recommendation system and computer-implemented method are provided. The system has a processor configured to identify, from heterogeneous logs, patterns including variable fields and constant fields. The processor is also configured to extract a category feature, a cardinality feature, and a before-after n-gram feature by tokenizing the variable fields in the identified patterns. The processor is additionally configured to generate target similarity scores between target fields to be potentially edited and other fields from among the variable fields in the heterogeneous logs using pattern editing operations based on the extracted category feature, the extracted cardinality feature, and the extracted before-after n-gram feature. The processor is further configured to recommend, to a user, log pattern edits for at least one of the target fields based on the target similarity scores between the target fields in the heterogeneous logs.
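The three features named in the abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization, a hypothetical set of category labels and regular expressions, and a window of `n` tokens on each side of the field; none of these specifics come from the patent text, and the function names are invented for illustration.

```python
import re
from typing import Iterable, List


def category(value: str) -> str:
    """Assign a coarse category to a field value.

    The labels and regexes are illustrative assumptions, loosely following
    the category list in the claims (numbers, letters, IP address,
    date/time, non-space characters).
    """
    if re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", value):
        return "ip"
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2})?", value):
        return "datetime"
    if re.fullmatch(r"\d+", value):
        return "number"
    if re.fullmatch(r"[A-Za-z]+", value):
        return "letters"
    return "notspace"  # fallback: any other run of non-space characters


def cardinality(values: Iterable[str]) -> int:
    """Number of unique values observed for one variable field."""
    return len(set(values))


def before_after_ngram(tokens: List[str], index: int, n: int = 2) -> str:
    """Concatenate up to n tokens before and n tokens after tokens[index]."""
    before = tokens[max(0, index - n):index]
    after = tokens[index + 1:index + 1 + n]
    return " ".join(before + after)


# Hypothetical log line, tokenized on whitespace.
tokens = "Connection from 10.0.0.7 port 5122 closed".split()
ip_context = before_after_ngram(tokens, 2)  # context around the IP field
```

Here `ip_context` holds the two tokens on either side of the IP address field joined into one string, which is the kind of string a later edit-distance comparison could consume.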
Opening claim text (preview).
What is claimed is:

1. A computer-implemented heterogeneous log pattern editing recommendation method performed in a network having network devices that generate heterogeneous logs, the method comprising: identifying, by a processor from the heterogeneous logs, patterns comprising variable fields and constant fields; extracting, by the processor, a category feature, a cardinality feature, and a before-after n-gram feature by tokenizing the variable fields in the identified patterns; generating, by the processor, target similarity scores between target fields to be potentially edited and other fields from among the variable fields in the heterogeneous logs using pattern editing operations based on the extracted category feature, the extracted cardinality feature, and the extracted before-after n-gram feature using a combined field similarity matrix $\Theta_{\text{comb}}$ generated by fusing a plurality of similarity matrices by:

$$\Theta_{\text{comb}} = \Theta_{\text{category}} \odot \left(\alpha \, \Theta_{\text{cardinality}} + (1-\alpha) \, \Theta_{\text{before-after-n-grams}}\right),$$

where $\Theta_{\text{category}}$ is a category similarity matrix, $\Theta_{\text{cardinality}}$ is a cardinality similarity matrix, and $\Theta_{\text{before-after-n-grams}}$ is a before-after n-grams similarity matrix for fields in the patterns, $\alpha$ is a contribution parameter using to balance the weights of similarity matrices generated from cardinality and before-after n-grams, and $\odot$ is the element-wise matrix multiplication; wherein the category similarity matrix includes a category similarity score determined for groupings of the target fields, and wherein the category similarity score has a first value responsive to the target fields in a particular one of the groupings belonging to a same category and a second value responsive to the target fields in the particular one of the groupings belonging to different categories, and wherein the category similarity score is used in the category similarity matrix to generate the combined field similarity matrix; and recommending, by the processor to a user, log pattern edits for at least one of the target fields based on the target similarity scores between the target fields in the heterogeneous logs; and auto-implementing recommended log pattern edits; and controlling one or more systems, machines, or devices in the network using the auto-implemented recommended log pattern edits.

2. The computer-implemented method of claim 1, wherein the tokenized variable fields are based on a delimiter.

3. The computer-implemented method of claim 1, wherein the category feature is selected from the group consisting of only numbers, only non-space characters, an internet protocol address, only letters, and date and time information.

4. The computer-implemented method of claim 1, wherein the cardinality feature is a total number of unique values in one of the variable fields across the heterogeneous logs.

5. The computer-implemented method of claim 1, wherein the before-after n-gram feature is determined by: locating one of the target fields; extracting before n-grams tokens and after n-grams tokens for fields adjacent to the one of the target fields; and concatenating the extracted before n-grams tokens and the extracted after n-grams tokens into a string.

6. The computer-implemented method of claim 1, further comprises performing, by the processor, the recommended log pattern edits on the identified patterns after confirmation by the user.

7. The computer-implemented method of claim 1, wherein the pattern editing operations include variable-level operations, wherein the variable-level operations generate the combined field similarity matrix by fusing the category similarity matrix, the cardinality similarity matrix, and the before-after n-gram similarity matrix, wherein the combined field similarity matrix is used to generate the target similarity scores.

8. The computer-implemented method of claim 7, wherein the cardinality similarity matrix includes a cardinality similarity score determined for groupings of the target fields, wherein the cardinality similarity score for a particular on of the groupings is determined by a quantity subtracted from one, wherein the quantity is a normalized difference of cardinalities of the target fields in the particular one of the groupings, and wherein the cardinality similarity score is used in the cardinality similarity matrix to generate the combined field similarity matrix.

9. The computer-implemented method of claim 7, wherein the before-after n-gram similarity matrix includes a respective before-after similarity score determined for each of groupings of the target fields, wherein the before-after similarity score for a given one of the groupings of the target fields is determined by a quantity subtracted from one, wherein the quantity is an edit difference between the before-after n-gram features of the target fields in the given one of the groupings, and wherein the respective before-after similarity score is used in the before-after n-gram similarity matrix to generate the combined field similarity matrix.

10. The computer-implemented method of claim 1, wherein the pattern editing operations include constant-level operations, wherein the constant-level operations include a merge operation, and wherein the merge operation calculates a merge similarity score between various ones of the constant fields the user designates to merge and various other ones of the constant fields in the patterns, and wherein the merge similarity score is determined by an edit distance between the various ones of the constant fields the user designates to merge and the various other ones of the constant fields in the patterns, and wherein the merge similarity score is used to generate the target similarity scores.

11. The computer-implemented method of claim 1, wherein the pattern editing operations include constant-level operations, wherein the constant-level operations include a generalization operation, and wherein the generalization operation calculates a generalization similarity score between various ones of the constant fields the user designates and various other ones of the constant fields in the patterns, and wherein the generalization similarity score is determined by a quantity subtracted from one, wherein the quantity is an edit distance normalized by dividing the before-after n-gram features of the various other ones of the constant fields in the patterns by a maximum number of characters between the before-after n-gram features of the various ones of the constant fields the user designates, wherein the generalization similarity score is used to generate the target similarity scores.

12. The computer-implemented method of claim 1, wherein the pattern editing operations include pattern-level operations, wherein the pattern-level operations calculate a pattern similarity matrix between patterns, wherein the pattern similarity matrix being a total number of pairs of tokens with a same type in a same position in each of the patterns, and wherein the pattern similarity matrix is used to generate the target similarity scores.

13. A non-transitory article of manufacture tangibly embodying a computer readable program for heterogeneous log pattern editing recommendation performed in a network having network devices that generate heterogeneous logs, which when executed causes a computer to: identify, by a processor from the heterogeneous logs, patterns comprising variable fields and constant fields; extract, by the processor, a category feature, a cardinality feature, and a before-after n-gram feature by tokenizing the variable fields in the identified patterns; generate, by the processor, target similarity scores between target fields to be potenti
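The fusion formula in claim 1, together with the score definitions in claims 8 and 9, can be sketched numerically as follows. This is an illustration under stated assumptions, not the patented implementation: the category score uses 1 for "same category" and 0 for "different" (the claim only says "a first value" and "a second value"), the cardinality difference is normalized by the larger of the two cardinalities, and the edit distance is normalized by the longer string. All function names are hypothetical.

```python
import numpy as np


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def category_matrix(categories):
    """1.0 where two fields share a category, else 0.0 (assumed values)."""
    return np.array([[1.0 if a == b else 0.0 for b in categories]
                     for a in categories])


def cardinality_matrix(cards):
    """One minus the difference of cardinalities, normalized by the larger
    of the two (the normalization choice is an assumption)."""
    c = np.asarray(cards, dtype=float)
    diff = np.abs(c[:, None] - c[None, :])
    scale = np.maximum(np.maximum(c[:, None], c[None, :]), 1.0)
    return 1.0 - diff / scale


def ngram_matrix(features):
    """One minus the edit distance between before-after n-gram strings,
    normalized by the longer string (the normalization is an assumption)."""
    n = len(features)
    m = np.ones((n, n))
    for i in range(n):
        for j in range(n):
            longest = max(len(features[i]), len(features[j]), 1)
            m[i, j] = 1.0 - edit_distance(features[i], features[j]) / longest
    return m


def combined_matrix(cat, card, ngram, alpha=0.5):
    """Element-wise fusion: cat * (alpha * card + (1 - alpha) * ngram)."""
    return cat * (alpha * card + (1.0 - alpha) * ngram)


# Three hypothetical fields: two numeric fields in identical contexts
# and one IP address field in a different context.
theta = combined_matrix(
    category_matrix(["number", "number", "ip"]),
    cardinality_matrix([10, 5, 10]),
    ngram_matrix(["port closed", "port closed", "from port"]),
    alpha=0.5,
)
# theta[0, 1] is 0.75; theta[0, 2] is 0.0 because the categories differ.
```

Because the category matrix multiplies the weighted sum element-wise, a category mismatch zeroes out a pair regardless of how similar its cardinalities or contexts are, while `alpha` only trades off the two remaining signals.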
Automatic learning of transformation rules, e.g. from examples · CPC title
Handling natural language data (speech analysis or synthesis, speech recognition G10L) · CPC title
Lexical analysis, e.g. tokenisation or collocates · CPC title
Pattern matching networks; Rete networks · CPC title
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title