US9607059B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9607059-B2 |
| Application number | US-201414169389-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 31, 2014 |
| Priority date | Jan 31, 2014 |
| Publication date | Mar 28, 2017 |
| Grant date | Mar 28, 2017 |
Official abstract text for this publication.
According to some embodiments, a method and an apparatus of analyzing log files comprises sampling a log and determining a structure associated with the log file based on the sampling and a pattern within the structure. If the structure and the pattern are stored in a repository, data from the log file will be exported into a database based on the determined pattern.
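The flow in the abstract (sample the log, derive a structure and pattern, check a repository, export on a match) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and not the patent's implementation: the in-memory `REPOSITORY` dictionary, the `(format, field count)` signature, the comma delimiter, and the use of SQLite as the target database.

```python
import sqlite3

# Hypothetical repository: maps a (structure, pattern) signature to column names.
REPOSITORY = {
    ("csv", 3): ["timestamp", "level", "message"],
}

def sample(lines, n=5):
    """Return the first n non-empty lines as the sample."""
    return [ln for ln in lines if ln.strip()][:n]

def signature(sampled, delimiter=","):
    """Derive a crude signature: a format tag plus a consistent field count."""
    counts = {len(ln.split(delimiter)) for ln in sampled}
    if len(counts) != 1:
        return None  # sampled lines disagree on field count
    return ("csv", counts.pop())

def export(lines, conn, table="log_entries", delimiter=","):
    """Export rows only if the sampled signature is stored in the repository."""
    columns = REPOSITORY.get(signature(sample(lines), delimiter))
    if columns is None:
        return 0  # unknown structure: nothing is exported
    cols_sql = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({cols_sql})')
    placeholders = ", ".join("?" for _ in columns)
    rows = [ln.split(delimiter) for ln in lines if ln.strip()]
    rows = [r for r in rows if len(r) == len(columns)]
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)
```

Keying the repository on a signature derived from the sample keeps the lookup cheap: only a few lines are inspected before deciding whether the whole file can be exported.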
Opening claim text (preview).
What is claimed is:
1. A method of analyzing log files, the method comprising: sampling a log file comprising a plurality of structures; determining, via a processor, one of the plurality of structures associated with the log file based on the sampling and a pattern within the one of the plurality of structures; determining a type of delimiter associated with the log file; determining if the one of the plurality of structures and the pattern are stored in a repository; parsing the log fields based on the type of delimiter; discovering log field content types based on the log file's data, patterns, distinct values and regular expressions; assigning log field content types to the parsed log fields; determining that a variety of field names are possible based on content from previously stored log file patterns within the repository; presenting field name options to a user to select a field name based on the determined variety of field names; standardizing the parsed log fields based on a selected field name from the previously stored log file patterns within the repository; and exporting data from the log file into a database.
2. The method of claim 1, wherein sampling comprises analyzing the log file line by line and exporting is based on the schema embedded within a start of the log file.
3. The method of claim 1, further comprising determining a format of the log file based on a location of the log file.
4. The method of claim 1, wherein the method further comprises: standardizing log fields further based on receiving a selection of field names from a variety of possible field names for the log fields that are stored within the repository; and saving the pattern in the repository.
5. The method of claim 4, wherein the method further comprises: proposing enhancements.
6. The method of claim 5, wherein proposing enhancements comprises: presenting related fields to a user, the related fields associated with data contained within the log file.
7. The method of claim 4, wherein the method further comprises: presenting related fields to a user, the related fields associated with one or more other log files.
8. A non-transitory computer-readable medium comprising instructions that when executed by a processor perform a method of analyzing log files, the method comprising: sampling a log file; determining, via a processor, a structure associated with the log file based on the sampling and a pattern within the structure; determining if the structure and the pattern are stored in a repository; determining a type of delimiter associated with the log file; parsing the log fields based on the type of delimiter; discovering log field content types based on the log file's data, patterns, distinct values and regular expressions; assigning log field content types to the parsed log fields; determining that a variety of field names are possible based on content from previously stored log file patterns within the repository; presenting field name options to a user to select a field name based on the determined variety of field names; standardizing the parsed log fields based on a selected field name from the previously stored log file patterns within the repository; and exporting data from the log file into a database.
9. The medium of claim 8, wherein sampling comprises analyzing the log file line by line and exporting is based on the schema embedded within a start of the log file.
10. The medium of claim 8 further comprising determining a format of the log file based on a location of the log file.
11. The medium of claim 8, wherein the method further comprises: standardizing log fields further based on receiving a selection of field names from a variety of possible field names for the log fields that are stored within the repository; and saving the pattern in the repository.
12. The medium of claim 11, wherein the method further comprises: proposing enhancements.
13. The medium of claim 12, wherein proposing enhancements comprises: presenting related fields to a user, the related fields associated with data contained within the log file.
14. The medium of claim 11, wherein the method further comprises: presenting related fields to a user, the related fields associated with one or more other log files.
15. A system comprising: a processor; and a non-transitory computer-readable medium comprising instructions that when executed by a processor perform a method of analyzing log files, the method comprising: sampling a log file; determining a structure associated with the log file based on the sampling and a pattern within the structure; determining if the structure and the pattern are stored in a repository; determining a type of delimiter associated with the log file; parsing the log fields based on the type of delimiter; discovering log field content types based on the log file's data, patterns, distinct values and regular expressions; assigning log field content types to the parsed log fields; determining that a variety of field names are possible based on content from previously stored log file patterns within the repository; presenting field name options to a user to select a field name based on the determined variety of field names; standardizing the parsed log fields based on a selected field name from the previously stored log file patterns within the repository; and exporting data from the log file into a database.
16. The system of claim 15, wherein sampling comprises analyzing the log file line by line and exporting is based on the schema embedded within a start of the log file.
17. The system of claim 15 further comprising determining a format of the log file based on a location of the log file.
18. The system of claim 15, wherein the method further comprises: standardizing log fields further based on receiving a selection of field names from a variety of possible field names for the log fields that are stored within the repository; and saving the pattern in the repository.
19. The system of claim 15, wherein the schema is determined by analyzing a nested structure within the log file.
20. The system of claim 15, wherein the method further comprises presenting related fields to a user, the related fields associated with data contained within the log file.
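Three steps recited in the independent claims (determining the delimiter type, discovering field content types from regular expressions, and offering field name options drawn from previously stored patterns) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the candidate delimiter list, the regular expressions, the majority-vote rule, and the `KNOWN_FIELD_NAMES` repository are all hypothetical.

```python
import re
from collections import Counter

# Assumed candidate delimiters, tried in this order.
CANDIDATE_DELIMITERS = [",", "\t", "|", ";"]

# Assumed content-type patterns; the first match wins.
TYPE_PATTERNS = [
    ("integer", re.compile(r"^-?\d+$")),
    ("ipv4", re.compile(r"^(\d{1,3}\.){3}\d{1,3}$")),
    ("date", re.compile(r"^\d{4}-\d{2}-\d{2}$")),
]

# Hypothetical repository of field names used by previously stored patterns.
KNOWN_FIELD_NAMES = {
    "ipv4": ["client_ip", "src_ip"],
    "date": ["event_date"],
}

def detect_delimiter(sampled):
    """Return the first candidate that occurs the same non-zero number of times on every line."""
    for d in CANDIDATE_DELIMITERS:
        counts = {ln.count(d) for ln in sampled}
        if len(counts) == 1 and counts != {0}:
            return d
    return None

def classify(value):
    """Map a single field value to a content type, defaulting to 'text'."""
    for name, pattern in TYPE_PATTERNS:
        if pattern.match(value):
            return name
    return "text"

def discover_types(sampled, delimiter):
    """Assign each column the content type that most of its sampled values match."""
    columns = zip(*(ln.split(delimiter) for ln in sampled))
    return [Counter(classify(v) for v in col).most_common(1)[0][0] for col in columns]

def field_name_options(content_type):
    """List names a user could choose from for a field of this content type."""
    return KNOWN_FIELD_NAMES.get(content_type, [])
```

For example, given the two sampled lines `2014-01-31|10.0.0.1|200|GET /index` and `2014-02-01|10.0.0.2|404|GET /missing`, `detect_delimiter` selects `|`, and `discover_types` labels the columns as date, ipv4, integer, and text. The user would then pick a name such as `client_ip` from `field_name_options("ipv4")` to standardize the second field.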
Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title
Data acquisition and logging (for input to computer G06F3/00) · CPC title
Physics · mapped topic