Analyzing tracking requests generated by client devices interacting with a website
US-2019007506-A1 · Jan 3, 2019 · US
US11501112B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-11501112-B1 |
| Application number | US-201815967435-A |
| Country | US |
| Kind code | B1 |
| Filing date | Apr 30, 2018 |
| Priority date | Apr 30, 2018 |
| Publication date | Nov 15, 2022 |
| Grant date | Nov 15, 2022 |
Official abstract text for this publication.
A computerized method of diagnosing a mislabeling of a source type of a received event. The method comprising operations of receiving an event by a source type analysis logic with a data index and query system, wherein the event includes a portion of raw machine data and is associated with a specific point in time, obtaining an original source type assigned to the event and one or more predicted source types. The one or more predicted source types are determined by analysis of a data representation of the event in view of training data and the training data includes a plurality of data representations corresponding to known source types. Additionally, the computerized method also includes an operation of, determining whether the event has been mislabeled and in response to determining the event has been mislabeled, diagnosing a source of the mislabeling.
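The mislabeling check described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Prediction` and `check_label`, the string return values, and the choice to compare against the single most probable prediction are all hypothetical. How the predictions are produced from training data, and how the source of a mislabeling is diagnosed, are outside the sketch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Prediction:
    """One predicted source type with its probability (hypothetical type)."""
    source_type: str
    probability: float


def check_label(original: Optional[str],
                predictions: Sequence[Prediction]) -> str:
    """Classify the original label as 'missing', 'incorrect', or 'ok'.

    Assumption: an empty or whitespace-only label counts as missing, and
    'incorrect' means the label differs from the most probable prediction.
    """
    if original is None or not original.strip():
        return "missing"
    if not predictions:
        # Nothing to compare against, so no discrepancy can be shown.
        return "ok"
    top = max(predictions, key=lambda p: p.probability)
    return "ok" if original == top.source_type else "incorrect"


if __name__ == "__main__":
    preds = [Prediction("syslog", 0.9), Prediction("access_log", 0.1)]
    print(check_label("access_log", preds))
```

A result of `"missing"` or `"incorrect"` corresponds to the event being treated as mislabeled; a real system would then go on to diagnose why the label was wrong.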
Opening claim text (preview).
What is claimed is:

1. A computerized method of diagnosing a labeling of a source type of an event using machine learning techniques, the method comprising: receiving the event by a source type analysis logic with a data index and query system, wherein the event includes a portion of raw machine data and is associated with a specific point in time; obtaining one or more predicted source types of the event, the one or more predicted source types being determined by analyzing a data representation of the event in view of training data, wherein the training data includes a plurality of data representations corresponding to known source types; determining whether the event has been mislabeled by determining whether an original source type of the event is one or more of empty, missing, or incorrect; and responsive to determining the event has been mislabeled based on a discrepancy between the original source type and the predicted source type, diagnosing a source of the mislabeling.

2. The computerized method of claim 1, wherein the original source type is assigned according to one of a configuration file, one or more predefined rules, or a predetermined signature.

3. The computerized method of claim 1, wherein each of the one or more predicted source types includes a probability, wherein a first probability of a first predicted source type indicates a likelihood that the first predicted source type is a correct source type of the event.

4. The computerized method of claim 1, wherein the determining of whether the event has been mislabeled comprises comparing the original source type to a first predefined source type of the one or more predicted source types to determine whether a match exists.

5. The computerized method of claim 1, wherein the determining of whether the event has been mislabeled comprises determining that the original source type was not assigned to the event, and selecting a predicted source type to assign to the event.

6. The computerized method of claim 1, wherein the determining of whether the event has been mislabeled comprises: (i) determining whether a first probability of a first predicted source type of the one or more source types is greater than or equal to a first threshold, and (ii) responsive to determining the first probability is greater than or equal to the first threshold, comparing the first predicted source type with the original source type to determine whether a match exists, wherein the first probability of the first predicted source type indicates a likelihood that the event corresponds to the first predicted source.

7. The computerized method of claim 1, wherein the determining of whether the event has been mislabeled comprises: (i) determining whether a first probability of a first predicted source type of the one or more source types is greater than or equal to a first threshold, and (ii) responsive to determining the first probability is greater than or equal the first threshold, comparing the first predicted source type with the original source type to determine whether a match exists, wherein the first probability of the first predicted source type indicates a likelihood that the event corresponds to the first predicted source, and wherein the first threshold is determined by a source type of at least one of the one or more predicted source types.

8. The computerized method of claim 1, wherein the determining of whether at least two predicted source types of the one or more predicted source types each correspond to probabilities that are greater than or equal to a first threshold; and responsive to determining the at least two predicted source types of the one or more predicted source types each correspond to probabilities that are greater than or equal to the first threshold, generating and providing an alert to an analyst indicating the at least two predicted source types of the one or more predicted source types each correspond to probabilities that are greater than or equal to the first threshold.

9. The computerized method of claim 1, wherein the determining of whether at least two predicted source types of the one or more predicted source types each correspond to probabilities that are greater than or equal to a first threshold; and responsive to determining the at least two predicted source types of the one or more predicted source types each correspond to probabilities that are greater than or equal to the first threshold, determining the event has been mislabeled when the original source type does not match a first source type of the at least two predicted source types having a highest probability.

10. The computerized method of claim 1, wherein the obtaining of the one or more predicted source types of the event comprises: generating the data representation of the event includes, wherein the data representation of the event includes content of the event other than personally identifiable information, and wherein the computerized method further comprises: determining, from the data representation including the content of the event other than the personally identifiable information, that the original source type is an indicator other than a known source type, the indicator representing that the original source type is not one of a plurality of known source types.

11. The computerized method of claim 1, wherein the obtaining of the one or more predicted source types of the event comprises: generating the data representation of the event, wherein the data representation of the event includes content of the event other than personally identifiable information, and wherein the computerized method further comprises: assigning, based on the one or more predicted source types of the event that are determined from the data representation including the content of the event other than the personally identifiable information, a predicted source type of the event when the original source type for the event is blank or missing.

12. The computerized method of claim 1, further comprising: determining the original source type is an indicator other than a known source type, the indicator representing that the original source type is not one of a plurality of known source types; and responsive to determining the original source is the indicator other than the known source type, generating and providing an alert to an analyst indicating that the original source type is not one of a plurality of known source types.

13. The computerized method of claim 1, further comprising: responsive to determining the event has been mislabeled, generating and providing an alert to an analyst, the alert including at least (i) the event, (ii) the original source type, and (iii) a first predicted source type.

14. The computerized method of claim 1, further comprising: responsive to determining the event has been mislabeled, generating and providing an alert to an analyst, the alert including at least (i) the event, (ii) the original source type, (iii) the one or more predicted source types, and (iv) probabilities corresponding to each of the one or more predicted source types, wherein a first probability of a first predicted source type indicates a likelihood that the event corresponds to the first predicted source.

15. The computerized method of claim 1, wherein diagnosing the source of the mislabeling includes determining the source of the mislabeling, and generating and providing an alert to an analyst, the alert including at least (i) the event, (ii) the original source type, and (iii) the one or more predicted source types, and (iv) the source of the mislabeling.
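The threshold comparison in claims 6 and 9, the multiple-candidate condition in claim 8, and the alert contents in claim 14 can be illustrated with a short Python sketch. Everything named here is an assumption made for illustration: the function `evaluate_event`, the dictionary layout of the result and alert, and the default threshold of 0.8 do not come from the patent text, which leaves the threshold value and data structures unspecified.

```python
from typing import Dict, Optional


def evaluate_event(event: str,
                   original: Optional[str],
                   predicted: Dict[str, float],
                   threshold: float = 0.8) -> dict:
    """Apply a probability threshold before comparing labels.

    `predicted` maps each predicted source type to its probability.
    The 0.8 default is an arbitrary illustrative value.
    """
    # Keep only predictions at or above the threshold, most probable first.
    confident = sorted(
        ((st, p) for st, p in predicted.items() if p >= threshold),
        key=lambda item: item[1],
        reverse=True,
    )
    result = {
        "mislabeled": False,
        # Two or more confident candidates is itself worth flagging.
        "multiple_candidates": len(confident) >= 2,
        "alert": None,
    }
    if not confident:
        # No prediction is confident enough to contradict the label.
        return result

    best_type, _ = confident[0]
    if original != best_type:
        result["mislabeled"] = True
        # Alert fields follow the items enumerated in claim 14.
        result["alert"] = {
            "event": event,
            "original_source_type": original,
            "predicted_source_types": dict(predicted),
        }
    return result


if __name__ == "__main__":
    outcome = evaluate_event(
        event="127.0.0.1 - - [10/Oct/2000:13:55:36] GET /index.html",
        original="syslog",
        predicted={"access_log": 0.93, "syslog": 0.05},
    )
    print(outcome["mislabeled"], outcome["multiple_candidates"])
```

Claim 7 additionally lets the threshold depend on the predicted source type; in this sketch that would mean replacing the single `threshold` argument with a per-type lookup.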
Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually · CPC title
Query processing · CPC title
characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling · CPC title
Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36) · CPC title
for evaluating statistical data {, e.g. average values, frequency distributions, probability functions, regression analysis (forecasting specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title