Active learning on statistical server name extraction from information technology (IT) service tickets

US9299031B2 · US · B2

Patent metadata
Publication number: US-9299031-B2
Application number: US-201313901872-A
Country: US
Kind code: B2
Filing date: May 24, 2013
Priority date: May 24, 2013
Publication date: Mar 29, 2016
Grant date: Mar 29, 2016

Abstract

Official abstract text for this publication.

Access is obtained to a plurality of information technology services problem tickets. At least a first subset of the tickets include free text tickets with server names embedded in unstructured text fields. The server names are extracted from the first subset of the tickets via a statistical machine learning technique. Using the extracted server names, those of the first subset of the tickets from which the server names have been extracted are linked to corresponding server entries in a configuration information database to facilitate resolution of problems associated with the first subset of the tickets from which the server names have been extracted; and/or at least one of the extracted server names is identified as missing from a list of known server names.
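
The linking and missing-name steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes the extraction step has already produced a list of server names for each ticket, and it models the configuration information database as a plain dictionary keyed by lowercase server name. The function `link_tickets`, the `cmdb` structure, and the sample ticket IDs and server names are all hypothetical.

```python
from collections import defaultdict


def link_tickets(extracted, cmdb):
    """Link tickets to configuration records and collect unknown server names.

    extracted: dict mapping ticket id -> list of server names pulled from free text
    cmdb: dict mapping lowercase server name -> configuration record (assumed shape)
    """
    links = defaultdict(list)  # ticket id -> matching configuration records
    missing = set()            # extracted names with no entry in the database
    for ticket_id, names in extracted.items():
        for name in names:
            key = name.strip().lower()
            record = cmdb.get(key)
            if record is not None:
                links[ticket_id].append(record)
            else:
                missing.add(key)
    return dict(links), missing


# Hypothetical sample data, for illustration only.
cmdb = {
    "dbsrv01": {"name": "dbsrv01", "os": "AIX"},
    "websrv07": {"name": "websrv07", "os": "Linux"},
}
extracted = {
    "T-1": ["DBSRV01"],
    "T-2": ["websrv07", "appsrv99"],
}

links, missing = link_tickets(extracted, cmdb)
print(links["T-1"][0]["os"])
print(sorted(missing))
```

Exact lowercase lookup is the simplest possible matching rule; a real deployment would likely need to handle aliases and fully qualified names, which this page does not describe.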

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: obtaining access to a plurality of information technology services problem tickets, at least a first subset of said tickets comprising free text tickets with server names embedded in unstructured text fields; extracting said server names from said first subset of said tickets via a statistical machine learning technique; and using said extracted server names, carrying out at least one of: linking those of said first subset of said tickets from which said server names have been extracted to corresponding server entries in a configuration information database to facilitate resolution of problems associated with said first subset of said tickets from which said server names have been extracted; and identifying at least one of said extracted server names as missing from a list of known server names.

2. The method of claim 1, wherein said extracting step comprises applying a conditional random field approach.

3. The method of claim 1, wherein said extracting step comprises applying a maximum entropy approach.

4. The method of claim 1, further comprising grouping first subset of said tickets to at least one of: said host names; and parameters corresponding to said host names; to identify at least one of trouble-prone hosts and trouble-prone parameters.

5. The method of claim 4, further comprising identifying trouble-prone host-parameter combinations.

6. The method of claim 1, wherein, in said extracting step, said statistical machine learning technique comprises a non-heuristic technique without keyword matching.

7. The method of claim 1, further comprising building a model for use by said statistical machine learning technique by: obtaining access to said configuration information database; extracting a server dictionary from said configuration information database by carrying out a fuzzy match algorithm to obtain semi-truthful training data; and training said model on said semi-truthful training data.

8. The method of claim 1, wherein said using of said extracted server names comprises at least linking those of said first subset of said tickets from which said server names have been extracted to corresponding server entries in said configuration information database to facilitate resolution of problems associated with said first subset of said tickets from which said server names have been extracted.

9. The method of claim 1, wherein said using of said extracted server names comprises at least identifying at least one of said extracted server names as missing from said list of known server names.

10. The method of claim 1, further comprising providing a system, wherein the system comprises distinct software modules, each of the distinct software modules being embodied on a computer-readable storage medium, and wherein the distinct software modules comprise a data extraction module, a statistical model decode engine module, and at least one of a business analyzer module and a comparator module; wherein: said obtaining is carried out by said data extraction module executing on at least one hardware processor; said extraction is carried out by said statistical model decode engine module executing on said at least one hardware processor; and said at least one of linking and identifying is carried out by a corresponding one of said at least one of said business analyzer module and said comparator module executing on said at least one hardware processor.

11. A non-transitory computer readable medium comprising computer executable instructions which when executed by a computer cause the computer to perform the method of: obtaining access to a plurality of information technology services problem tickets, at least a first subset of said tickets comprising free text tickets with server names embedded in unstructured text fields; extracting said server names from said first subset of said tickets via a statistical machine learning technique; and using said extracted server names, carrying out at least one of: linking those of said first subset of said tickets from which said server names have been extracted to corresponding server entries in a configuration information database to facilitate resolution of problems associated with said first subset of said tickets from which said server names have been extracted; and identifying at least one of said extracted server names as missing from a list of known server names.

12. The non-transitory computer readable medium of claim 11, wherein said extracting comprises applying a conditional random field approach.

13. The non-transitory computer readable medium of claim 11, wherein said extracting comprises applying a maximum entropy approach.

14. The non-transitory computer readable medium of claim 11, wherein, in said extracting, said statistical machine learning technique comprises a non-heuristic technique without keyword matching.

15. The non-transitory computer readable medium of claim 11, wherein said computer-executable instructions comprise distinct software modules, each of the distinct software modules being embodied on said non-transitory computer readable medium, and wherein the distinct software modules comprise a data extraction module, a statistical model decode engine module, and at least one of a business analyzer module and a comparator module; wherein: said data extraction module obtains said access; said statistical model decode engine module extracts said server names; and said at least one of a business analyzer module and a comparator carries out said at least one of linking and identifying.

16. An apparatus comprising: a memory; and at least one processor, coupled to said memory, and operative to: obtain access to a plurality of information technology services problem tickets, at least a first subset of said tickets comprising free text tickets with server names embedded in unstructured text fields; extract said server names from said first subset of said tickets via a statistical machine learning technique; and using said extracted server names, carry out at least one of: linking those of said first subset of said tickets from which said server names have been extracted to corresponding server entries in a configuration information database to facilitate resolution of problems associated with said first subset of said tickets from which said server names have been extracted; and identifying at least one of said extracted server names as missing from a list of known server names.

17. The apparatus of claim 16, wherein said at least one processor is operative to extract by applying a conditional random field approach.

18. The apparatus of claim 16, wherein said at least one processor is operative to extract by applying a maximum entropy approach.

19. The apparatus of claim 16, wherein said statistical machine learning technique comprises a non-heuristic technique without keyword matching.

20. The apparatus of claim 16, further comprising a plurality of distinct software modules, each of the distinct software modules being embodied on a computer-readable storage medium, and wherein the distinct software modules comprise a data extraction module, a statistical model decode engine module, and at least one of a business analyzer module and a comparator module; wherein: said at least one processor is operative to obtain by executing said data extraction module; said at least one processor is operative to extract by executing said statistical model decod
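
Claim 7 describes using a fuzzy match against a server dictionary to obtain semi-truthful training data. One possible reading of that step is sketched below. The claim text on this page does not say which fuzzy match algorithm is used, so Python's standard `difflib.get_close_matches` is swapped in as a stand-in. The function `label_tokens`, the `SERVER`/`O` tag names, the 0.85 cutoff, and the sample ticket text are all assumptions made for illustration.

```python
import difflib
import re


def label_tokens(text, server_dictionary, cutoff=0.85):
    """Tag each token as SERVER or O by fuzzy-matching against known server names.

    The labels are noisy ("semi-truthful"): a close match is treated as a
    server mention without any manual verification.
    """
    known = [name.lower() for name in server_dictionary]
    labeled = []
    # Tokens are runs of letters, digits, underscores, dots, and hyphens.
    for token in re.findall(r"[A-Za-z0-9_.\-]+", text):
        match = difflib.get_close_matches(token.lower(), known, n=1, cutoff=cutoff)
        labeled.append((token, "SERVER" if match else "O"))
    return labeled


# Hypothetical ticket text and dictionary, for illustration only.
labels = label_tokens(
    "Disk full on dbsrv-01, please check",
    ["dbsrv01", "websrv07"],
)
for token, tag in labels:
    print(token, tag)
```

Token and tag pairs of this kind are the usual input format for sequence labelers such as the conditional random field and maximum entropy approaches named in claims 2 and 3, although the training procedure itself is not shown on this page.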

Assignees

IBM

Inventors

Classifications

  • Probabilistic graphical models, e.g. probabilistic networks · CPC title

  • Physics · mapped topic

  • G06N5/048 · Primary

    Fuzzy inferencing · CPC title

  • Physics · mapped topic

  • Machine learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9299031B2 cover?
Access is obtained to a plurality of information technology services problem tickets. At least a first subset of the tickets include free text tickets with server names embedded in unstructured text fields. The server names are extracted from the first subset of the tickets via a statistical machine learning technique. Using the extracted server names, those of the first subset of the tickets f…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N5/048. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 29, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).