Security alert-incident grouping based on investigation history
US-2021326744-A1 · Oct 21, 2021 · US
US11494418B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11494418-B2 |
| Application number | US-202117160712-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 28, 2021 |
| Priority date | Jan 28, 2021 |
| Publication date | Nov 8, 2022 |
| Grant date | Nov 8, 2022 |
Official abstract text for this publication.
Systems and methods for discovering and/or determining section types for a given document class in a data-driven manner are provided. A modified Bayesian model merging algorithm can be used, along with extending an Analogical Story Merging (ASM) algorithm. The systems and methods can learn the section structure of documents without a pre-existing ontology of sections or time-intensive annotation efforts.
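As a rough illustration of the starting point for model merging, the sketch below builds one linear chain of states per document (one state per section, in document order) and folds one state into another. It is a minimal sketch, not the patented implementation: the `Model` class, `build_initial_model`, `merge_states`, the `START`/`END` markers, and the toy corpus are all hypothetical names and data chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical sentinel states marking the beginning and end of a document.
START, END = "START", "END"


@dataclass
class Model:
    # state id -> list of section texts merged into that state
    states: dict = field(default_factory=dict)
    # (from_state, to_state) -> number of times that transition was observed
    transitions: Counter = field(default_factory=Counter)


def build_initial_model(corpus):
    """Build one linear chain per document, one state per section, in order."""
    model = Model()
    next_id = 0
    for doc in corpus:
        prev = START
        for section in doc:
            sid = next_id
            next_id += 1
            model.states[sid] = [section]
            model.transitions[(prev, sid)] += 1
            prev = sid
        model.transitions[(prev, END)] += 1
    return model


def merge_states(model, keep, drop):
    """Return a new model in which state `drop` is folded into state `keep`."""
    merged = Model()
    for sid, contents in model.states.items():
        if sid != drop:
            merged.states[sid] = list(contents)
    merged.states[keep] = model.states[keep] + model.states[drop]
    # Redirect every transition that touched `drop` so it touches `keep`.
    for (src, dst), count in model.transitions.items():
        src = keep if src == drop else src
        dst = keep if dst == drop else dst
        merged.transitions[(src, dst)] += count
    return merged


# Toy corpus (assumed data): each document is a list of section texts.
corpus = [
    ["Background text A", "Claims text A"],
    ["Background text B", "Summary text B", "Claims text B"],
]
initial = build_initial_model(corpus)
merged = merge_states(initial, keep=0, drop=2)  # merge the two "Background" states
```

Returning a new model from `merge_states` rather than mutating in place keeps each candidate merge independent, which is convenient when several candidate merges are scored against one another during a search.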
Opening claim text (preview).
What is claimed is:

1. A system for determining section types of a given document class, the system comprising: a processor; a memory in operable communication with the processor; and a machine-readable medium in operable communication with the processor and the memory, the machine-readable medium having instructions stored thereon that, when executed by the processor, perform the following steps: receiving a corpus of documents of the given document class; using a modified Bayesian model merging algorithm on the corpus to determine the section types of the given document class; and storing the determined section types on the memory to be used for labeling a document of the given document class, the using of the modified Bayesian model merging algorithm on the corpus comprising: creating an initial Hidden Markov Model (HMM)-like model, where each document of the corpus is represented as a linear chain of states, with each state of the linear chain of states corresponding to a section of unknown type in a same order as found in the respective document of the corpus; performing a merge operation on the initial HMM-like model to merge states and generate an updated model; defining a prior probability distribution over the updated model; computing a posterior probability distribution based on the prior probability distribution; and searching a merge space of the updated model based on the posterior probability distribution to determine the section types of the given document class.

2. The system according to claim 1, the using of the modified Bayesian model merging algorithm on the corpus comprising extending an analogical story merging (ASM) approach with a Bayesian model merging algorithm.

3. The system according to claim 1, the searching of the merge space of the updated model comprising maximizing the posterior probability distribution to give a generalizable model that fits the corpus.

4. The system according to claim 1, the computing of the posterior probability distribution comprising computing P(M)P(D|M), which is proportional to P(M|D), where P(M) is the prior probability distribution, P(M|D) is the posterior probability distribution, M represents the updated model, and D represents a document of the corpus.

5. The system according to claim 1, the defining of the prior probability distribution comprising using the following equations

$$P(M) = N(\mu, \sigma^2) \prod_i G(S_i)$$

$$G(S_i) = \begin{cases} 1 & \forall s_j, s_k \in S_i,\ \mathrm{Sim}(s_j, s_k) > T \\ 0 & \text{otherwise,} \end{cases}$$

where $P(M)$ is the prior probability distribution, $M$ represents the updated model, $N(\mu, \sigma^2)$ is a normal distribution of the updated model, $S_i$ is the $i$-th state in the updated model, $s_j$ and $s_k$ are section contents that have been merged into state $S_i$, $\mathrm{Sim}$ is a similarity function that takes content of $s_j$ and $s_k$ and computes a cosine similarity of vector representations of $s_j$ and $s_k$, and $T$ is a similarity threshold.

6. The system according to claim 5, T being set as 1.5 standard deviations from a mean similarity of the similarity function.

7. The system according to claim 5, where, if headers of all sections in the updated model are exactly the same, $G(S_i)$ is set to 1.

8. The system according to claim 1, the corpus of documents comprising at least 100 documents.

9. The system according to claim 1, the given document class being a psychiatric evaluation, a discharge summary, a radiology report, or a United States patent document.

10. A method for determining section types of a given document class, the method comprising: receiving, by a processor, a corpus of documents of the given document class; using, by the processor, a modified Bayesian model merging algorithm on the corpus to determine the section types of the given document class; and storing, by the processor, the determined section types on a memory in operable communication with the processor to be used for labeling a document of the given document class, the using of the modified Bayesian model merg
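The prior described in claims 5 and 6 can be sketched as follows. This is a hedged illustration under several assumptions that the claim text does not settle: section contents are vectorized as simple bag-of-words counts (the claims only say "vector representations"), the normal term is evaluated at the number of states in the model, and the direction of the 1.5-standard-deviation offset is left as a parameter. All function names here (`bow`, `sim`, `gate`, `threshold_from_corpus`, `prior`) are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def bow(text):
    """Assumed vector representation: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def sim(s_j, s_k):
    """Sim(s_j, s_k): cosine similarity of the two section contents."""
    return cosine(bow(s_j), bow(s_k))


def gate(state_contents, threshold):
    """G(S_i): 1 if every pair merged into the state exceeds T, else 0."""
    pairs = combinations(state_contents, 2)
    return int(all(sim(a, b) > threshold for a, b in pairs))


def threshold_from_corpus(sections, k=1.5, direction=-1):
    """T as k standard deviations from the mean pairwise similarity.

    The sign (`direction`) is an assumption; the claim gives only the distance.
    """
    sims = [sim(a, b) for a, b in combinations(sections, 2)]
    return mean(sims) + direction * k * pstdev(sims)


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (
        sigma * math.sqrt(2 * math.pi)
    )


def prior(states, threshold, mu, sigma):
    """P(M) = N(mu, sigma^2) * prod_i G(S_i), with N taken at the state count."""
    p = normal_pdf(len(states), mu, sigma)
    for contents in states:
        p *= gate(contents, threshold)
    return p


# Toy states (assumed data): each inner list holds the sections merged into one state.
states_ok = [
    ["the patient reports anxiety", "the patient reports low mood"],
    ["claims one and two"],
]
states_bad = [["the patient reports anxiety", "claims one and two"]]
```

Because `gate` returns 0 as soon as one pair of merged sections falls at or below the threshold, any model containing such a merge gets a prior of exactly zero, so a posterior-maximizing search would never select it. The header-equality override in claim 7 is not modeled in this sketch.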
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Thesauruses; Synonyms · CPC title
using statistical methods · CPC title
Fragmentation of text files, e.g. creating reusable text-blocks; Linking to fragments, e.g. using XInclude; Namespaces · CPC title
Semantic analysis · CPC title