Machine learning model to identify and predict health and safety risks in electronic communications

US11803797B2 · US · B2

Patent metadata
Publication number: US-11803797-B2
Application number: US-202117512150-A
Country: US
Kind code: B2
Filing date: Oct 27, 2021
Priority date: Sep 11, 2020
Publication date: Oct 31, 2023
Grant date: Oct 31, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, and other embodiments associated with a machine learning system that monitors and detects health and safety risks in electronic correspondence related to a target field are described. In one embodiment, a method includes monitoring email communications over a network to identify an email associated with a target field. A machine learning classifier is initiated that is configured to classify text from the email with a risk as being related to a safety risk or a non-risk. The machine learning classifier generates a probability risk value that the email is related to a safety risk and labels the email as safety risk or non-risk based at least in part on the probability risk value indicating that the email is a safety risk. An electronic notice is generated and transmitted to a remote device in response to the email being labeled as being safety risk to provide an alert.
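The flow the abstract describes (probability risk value, then label, then notice) can be sketched roughly as below. This is an illustrative sketch only, not the patented implementation: the keyword weights, the bias, the 0.5 threshold, and the names `RISK_WEIGHTS`, `risk_probability`, `label_email`, `process_email`, and `Notice` are all assumptions made for the example, and actual transmission of the notice to a remote device is left out.

```python
import math
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword weights standing in for a trained classifier.
# None of these values come from the patent.
RISK_WEIGHTS = {"injury": 2.0, "hazard": 1.5, "unsafe": 1.5, "fall": 1.0}
BIAS = -1.5  # assumed offset so text with no risk words scores below 0.5


@dataclass
class Notice:
    """Minimal stand-in for the electronic notice sent as an alert."""
    email_id: str
    label: str
    probability: float


def risk_probability(text: str) -> float:
    """Toy probability risk value: logistic function of summed keyword weights."""
    score = BIAS + sum(RISK_WEIGHTS.get(word, 0.0) for word in text.lower().split())
    return 1.0 / (1.0 + math.exp(-score))


def label_email(probability: float, threshold: float = 0.5) -> str:
    """Label the email from its probability; the threshold is an assumption."""
    return "safety risk" if probability >= threshold else "non-risk"


def process_email(email_id: str, text: str) -> Optional[Notice]:
    """Return a Notice only when the email is labeled as a safety risk."""
    probability = risk_probability(text)
    label = label_email(probability)
    if label == "safety risk":
        return Notice(email_id, label, round(probability, 3))
    return None
```

For example, `process_email("m1", "worker injury near unsafe scaffold")` yields a notice, while `process_email("m2", "please send the invoice")` returns `None` because no alert is needed for a non-risk email.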

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method performed by at least one computing device, the method comprising: training a machine learning classifier (i) to identify safety risk text with a first dataset of correspondences having known safety risk vocabulary that refers to health issues or safety issues, and (ii) to identify non-risk text with a second dataset of correspondences having known non-risk vocabulary that does not refer to health issues or safety issues; monitoring email communications over a network to identify an email; in response to receiving the email over the network, detecting and identifying the email as being associated with a construction project; tokenizing text from the email into a plurality of words; vectorizing each of the plurality of words into a numeric vector that maps each word to a numeric value; initiating the machine learning classifier to classify text with a risk as being a safety risk or a non-risk, and inputting the numeric vectors generated from the email into the machine learning classifier; wherein the machine learning classifier processes the numeric vectors from the email by at least corresponding the numeric vectors to a set of numeric vectors associated with the known safety risk vocabulary and a set of numeric vectors associated with the known non-risk vocabulary; generating a probability risk value, by the machine learning classifier, that the email includes text that is referring to or discussing a safety risk; labeling the email as a safety risk or non-risk based at least in part on the probability risk value that the email is a safety risk; and generating and transmitting an electronic notice to a remote device in response to the email being labeled as being a safety risk to provide an alert.

2. The method of claim 1, wherein the machine learning classifier includes an ensemble classifier comprising a plurality of independent machine learning classifiers; and wherein each of the independent machine learning classifiers are configured to identify construction terminology; the method further comprising: generating an output by each of the independent machine learning classifiers that classifies the email as being a safety risk or non-risk; and combining the output from each of the independent machine learning classifiers based at least in part on a majority vote to generate the label for the email as being a safety risk or non-risk.

3. The method of claim 1 further comprising: initiating a second machine learning classifier and a third machine learning classifier both configured to identify construction terminology and to classify text with a prediction as being a safety risk or a non-risk, where each machine learning classifier is implemented with a different theoretical background from each other to avoid bias and redundancy during classification; generating an individual prediction by each of the machine learning classifiers indicating whether the email is a safety risk or a non-risk to produce at least three individual predictions; and labelling the email as safety risk or non-risk based on a majority vote of the three individual predictions.

4. The method of claim 1, further comprising: generating the electronic notice to include identification of the email and the label that indicates the email as (i) the safety risk when the email is identified as referring to or discussing health or safety issues or (ii) the non-risk when the email is identified as not referring to or discussing health or safety issues; providing a user interface to allow input to validate the label and change the label; and in response to the label being changed via the user interface, feeding back the changed label and corresponding email to the machine learning classifier to retrain the machine learning classifier.

5. The method of claim 1 further comprising: inputting the construction terminology to the machine learning classifier from a glossary or database of construction project terms.

6. The method of claim 1 further comprising: training the machine learning classifier to identify safety risk text based at least in part on a first dataset of correspondences having known text associated with a safety risk and a second dataset of correspondences having known non-risk text.

7. The method of claim 1, wherein: detecting and identify the email as being associated with the construction project by at least evaluating the text from the email in relation to a trained dataset of construction terminology implemented by the machine learning classifier.

8. A computing system, comprising: at least one processor configured to execute instructions; at least one memory operably connected to the at least one processor; a machine learning classifier configured to identify construction terminology and to classify text with a risk as being safety risk or non-risk; wherein the machine learning classifier is trained (i) to identify safety risk text with a first dataset of correspondences having known safety risk vocabulary that refers to health issues or safety issues, and (ii) to identify non-risk text with a second dataset of correspondences having known non-risk vocabulary that does not refer to health issues or safety issues; a non-transitory computer-readable medium that includes stored thereon computer-executable instructions that when executed by the at least one processor of cause the computing device to: monitor email communications over a network to identify an email transmitted; in response to receiving the email over the network, detect and identify the email as being associated with a construction project; tokenize text from the email into a plurality of words; input the plurality of words generated from the email into the machine learning classifier; wherein the machine learning classifier is configured to evaluate the plurality of words from the email by at least corresponding the plurality of words to a set of the known safety risk vocabulary and a set of the known non-risk vocabulary; generate, by the machine learning classifier, a probability risk value that the email is safety risk based at least on evaluating the plurality of words from the email; label the email as a safety risk or non-risk based at least in part on the probability risk value that the email is safety risk; and generate and transmit an electronic notice to a remote device in response to the email being labeled as being safety risk to provide an alert in near-real time in relation to receiving the email over the network.

9. The computing system of claim 8, wherein the machine learning classifier includes an ensemble classifier comprising a plurality of independent machine learning classifiers; wherein each of the independent machine learning classifiers is configured to identify construction terminology; wherein each of the independent machine learning classifiers is configured to generate an output that classifies the email as being safety risk or non-risk; and wherein ensemble classifier is configured to combine the output from each of the independent machine learning classifiers based at least in part on a majority vote to generate the label for the email as being safety risk or non-risk.

10. The computing system of claim 8, wherein machine learning classifier includes at least a first machine learning classifier, a second machine learning classifier and a third machine learning classifier; wherein each of the machine learning classifiers are configured to identify construction terminology and to classify text with a prediction as being safety risk or non-risk; where each machine learning classifier is implemented with a different theoretical backg
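The tokenizing, vectorizing, and majority-vote steps recited in claims 1 to 3 can be illustrated with a small sketch. Everything here is an assumption made for illustration: the two vocabularies, the word-to-number mapping, and the three simple voting rules (which merely stand in for classifiers with "a different theoretical background") are hypothetical and are not taken from the patent.

```python
import re
from collections import Counter

# Hypothetical vocabularies; the claims only say such datasets exist.
RISK_VOCAB = ["injury", "hazard", "unsafe", "collapse", "fall"]
NON_RISK_VOCAB = ["invoice", "schedule", "meeting", "delivery", "budget"]

# Map each known word to a numeric value; 0 is reserved for unknown words.
WORD_TO_ID = {word: i + 1 for i, word in enumerate(RISK_VOCAB + NON_RISK_VOCAB)}
RISK_IDS = {WORD_TO_ID[word] for word in RISK_VOCAB}
NON_RISK_IDS = {WORD_TO_ID[word] for word in NON_RISK_VOCAB}


def tokenize(text):
    """Split email text into lowercase words."""
    return re.findall(r"[a-z]+", text.lower())


def vectorize(tokens):
    """Replace each word with its numeric value (0 if unknown)."""
    return [WORD_TO_ID.get(token, 0) for token in tokens]


def risk_probability(ids):
    """Smoothed share of recognized ids that match the risk vocabulary."""
    risk = sum(1 for i in ids if i in RISK_IDS)
    safe = sum(1 for i in ids if i in NON_RISK_IDS)
    return (risk + 1) / (risk + safe + 2)


# Three toy voters standing in for independent classifiers.
def vote_by_probability(ids):
    return "safety risk" if risk_probability(ids) > 0.5 else "non-risk"


def vote_by_any_risk_word(ids):
    return "safety risk" if any(i in RISK_IDS for i in ids) else "non-risk"


def vote_by_risk_count(ids):
    risk = sum(1 for i in ids if i in RISK_IDS)
    return "safety risk" if risk >= 2 else "non-risk"


def majority_label(text):
    """Label the email by majority vote of the three toy voters."""
    ids = vectorize(tokenize(text))
    voters = (vote_by_probability, vote_by_any_risk_word, vote_by_risk_count)
    votes = Counter(voter(ids) for voter in voters)
    return votes.most_common(1)[0][0]
```

With three voters and two possible labels, one label always has at least two votes, so no tie-breaking rule is needed. An email such as "fall schedule meeting budget" shows the vote at work: one voter flags the single risk word, but the other two outvote it and the email is labeled non-risk.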

Assignees

Oracle Int Corp

Classifications (CPC titles)

  • Risk analysis of enterprise or organisation activities

  • Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection

  • Lexical analysis, e.g. tokenisation or collocates

  • Ensemble learning

  • Computer-aided management of electronic mailing [e-mailing]

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11803797B2 cover?
Systems, methods, and other embodiments associated with a machine learning system that monitors and detects health and safety risks in electronic correspondence related to a target field are described. In one embodiment, a method includes monitoring email communications over a network to identify an email associated with a target field. A machine learning classifier is initiated that is configu…
Who is the assignee on this patent?
Oracle Int Corp
What technology area does this patent fall under?
Primary CPC classification G06Q10/0635. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 31, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).