System and method for training and refining machine learning models

US2023072171A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2023072171-A1
Application number  US-202217837624-A
Country             US
Kind code           A1
Filing date         Jun 10, 2022
Priority date       Aug 30, 2021
Publication date    Mar 9, 2023
Grant date          (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system and method for training and refining a machine learning model is disclosed. The disclosed system and method can further improve the accuracy of trained machine learning models by calculating which threshold values for predictions (e.g., probabilities output by the machine learning model) provide the most accurate results. The system and method may include applying an optimization technique (e.g., multi-objective optimization) to calculate which threshold values result in the best combination of precision and recall. In other words, the system and method adjust threshold values for prediction scores to optimize the objects of precision and recall. A machine learning model trained with these adjusted threshold values can determine when an input belongs to an unknown class because the unknown input has prediction scores below the threshold values for every known class. Embodiments may include refining an intent classifier to better classify unknown intents.
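
The thresholding idea in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names (`classify`, `precision_recall`), the `UNKNOWN` label, and the example intent names are hypothetical, and the precision/recall helper assumes a small evaluation set whose true class membership is known.

```python
from typing import Dict, List, Tuple

UNKNOWN = "unknown"  # hypothetical label for inputs rejected by every threshold


def classify(scores: Dict[str, float], thresholds: Dict[str, float]) -> str:
    """Return the best-scoring class that clears its own threshold, else UNKNOWN."""
    accepted = {c: s for c, s in scores.items() if s >= thresholds[c]}
    if not accepted:
        # Every prediction score is below its class threshold.
        return UNKNOWN
    return max(accepted, key=accepted.get)


def precision_recall(
    class_scores: List[float], is_member: List[bool], threshold: float
) -> Tuple[float, float]:
    """Precision and recall for one class when scores >= threshold count as positive."""
    pairs = list(zip(class_scores, is_member))
    tp = sum(1 for s, m in pairs if s >= threshold and m)
    fp = sum(1 for s, m in pairs if s >= threshold and not m)
    fn = sum(1 for s, m in pairs if s < threshold and m)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical per-class thresholds for two known intents.
thresholds = {"book_flight": 0.7, "cancel_booking": 0.6}
print(classify({"book_flight": 0.90, "cancel_booking": 0.05}, thresholds))
print(classify({"book_flight": 0.55, "cancel_booking": 0.30}, thresholds))
print(precision_recall([0.9, 0.8, 0.4, 0.3], [True, False, True, False], 0.5))
```

Raising a class threshold tends to trade recall for precision, which is why the abstract treats the two as competing objectives rather than folding them into a single score.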

First claim

Opening claim text (preview).

We claim:

1. A computer-implemented method of training and refining a machine learning model, comprising: generating an initially trained machine learning model by training a machine learning model with a labeled dataset including labels defining which classes apply to each piece of data; inputting an unlabeled dataset with unknown classes into the initially trained machine learning model to generate prediction scores representing the probability that each piece of data belongs in each known class, respectively; randomly selecting multiple threshold values from a range of threshold values for prediction scores for each known class; for each known class, initializing a population with the randomly selected threshold values; calculating objective values of recall and precision for each of the randomly selected threshold values; performing multi-objective optimization to determine which threshold values of the randomly selected threshold values optimize both recall and precision for each known class; and generating a finally trained machine learning model by training the initially trained machine learning model to classify input using the determined optimal threshold values.

2. The method of claim 1, wherein performing multi-objective optimization comprises performing Non-dominated Sorting Genetic Algorithm II (NSGA-II) to determine which threshold values of the randomly selected threshold values optimize both recall and precision for each class the initially trained machine learning model is trained for.

3. The method of claim 1, wherein performing multi-objective optimization comprises sorting the initialized population into fronts ranked by ascending level of non-domination, wherein the fronts each contain members including threshold values of the randomly selected threshold values.

4. The method of claim 3, wherein performing multi-objective optimization comprises, for each front, calculating a crowding distance for each member of the respective front.

5. The method of claim 4, wherein performing multi-objective optimization comprises generating an offspring population by applying crowded tournament selection to the members of the fronts.

6. The method of claim 5, wherein crowded tournament selection includes comparing the rank between two members and, if one member has a higher rank than the other member, selecting the member with the higher rank, and, if two members have the same rank, selecting the member with the highest crowding distance.

7. The method of claim 5, wherein generating the offspring population includes applying crossover and mutation operators to the initialized population.

8. The method of claim 1, wherein the initially trained machine learning model is an intent classification deep neural network.

9. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to: generate an initially trained machine learning model by training a machine learning model with a labeled dataset including labels defining which classes apply to each piece of data; input an unlabeled dataset with unknown classes into the initially trained machine learning model to generate prediction scores representing the probability that each piece of data belongs in each known class, respectively; randomly select multiple threshold values from a range of threshold values for prediction scores for each known class; for each known class, initialize a population with the randomly selected threshold values; calculate objective values of recall and precision for each of the randomly selected threshold values; perform multi-objective optimization to determine which threshold values of the randomly selected threshold values optimize both recall and precision for each known class; and generate a finally trained machine learning model by training the initially trained machine learning model to classify input using the determined optimal threshold values.

10. The non-transitory computer-readable medium storing software of claim 9, performing multi-objective optimization comprises performing Non-dominated Sorting Genetic Algorithm II (NSGA-II) to determine which threshold values of the randomly selected threshold values optimize both recall and precision for each class the initially trained machine learning model is trained for.

11. The non-transitory computer-readable medium storing software of claim 9, wherein performing multi-objective optimization comprises sorting the initialized population into fronts ranked by ascending level of non-domination, wherein the fronts each contain members including threshold values of the randomly selected threshold values.

12. The non-transitory computer-readable medium storing software of claim 11, wherein performing multi-objective optimization comprises, for each front, calculating a crowding distance for each member of the respective front.

13. The non-transitory computer-readable medium storing software of claim 12, wherein performing multi-objective optimization comprises generating an offspring population by applying crowded tournament selection to the members of the fronts.

14. The non-transitory computer-readable medium storing software of claim 13, wherein crowded tournament selection includes comparing the rank between two members and, if one member has a higher rank than the other member, selecting the member with the higher rank, and, if two members have the same rank, selecting the member with the highest crowding distance.

15. The non-transitory computer-readable medium storing software of claim 14, wherein generating the offspring population includes applying crossover and mutation operators to the initialized population.

16. The non-transitory computer-readable medium storing software of claim 9, wherein the initially trained machine learning model is an intent classification deep neural network.

17. A system for training and refining a machine learning model, the system comprising: a device processor; and a non-transitory computer readable medium storing instructions that are executable by the device processor to: generate an initially trained machine learning model by training a machine learning model with a labeled dataset including labels defining which classes apply to each piece of data; input an unlabeled dataset with unknown classes into the initially trained machine learning model to generate prediction scores representing the probability that each piece of data belongs in each known class, respectively; randomly select multiple threshold values from a range of threshold values for prediction scores for each known class; for each known class, initialize a population with the randomly selected threshold values; calculate objective values of recall and precision for each of the randomly selected threshold values; perform multi-objective optimization to determine which threshold values of the randomly selected threshold values optimize both recall and precision for each known class; and generate a finally trained machine learning model by training the initially trained machine learning model to classify input using the determined optimal threshold values.

18. The system of claim 17, performing multi-objective optimization comprises performing Non-dominated Sorting Genetic Algorithm II (NSGA-II) to determine which threshold values of the randomly selected threshold values optimize both recall and precision for each class
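
The selection steps named in the dependent claims (sorting into non-dominated fronts, crowding distance, and crowded tournament selection) can be illustrated with a short Python sketch. It is a simplified illustration, not the claimed method or a full NSGA-II: the front sorting below repeatedly peels off the non-dominated set rather than using NSGA-II's fast non-dominated sort, crossover, mutation, and the generational loop are omitted, and all function names and the sample (precision, recall) pairs are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Objectives = Tuple[float, float]  # (precision, recall), both to be maximized


def dominates(a: Objectives, b: Objectives) -> bool:
    """a dominates b if it is no worse on both objectives and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def sort_into_fronts(objs: Sequence[Objectives]) -> List[List[int]]:
    """Group member indices into fronts; front 0 is the non-dominated set."""
    remaining = set(range(len(objs)))
    fronts: List[List[int]] = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(objs[j], objs[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(front: List[int], objs: Sequence[Objectives]) -> Dict[int, float]:
    """Sum over objectives of the normalized gap between each member's neighbors."""
    dist = {i: 0.0 for i in front}
    for k in range(2):
        ordered = sorted(front, key=lambda i: objs[i][k])
        lo, hi = objs[ordered[0]][k], objs[ordered[-1]][k]
        # Boundary members are always kept, so they get infinite distance.
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if hi == lo:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (objs[nxt][k] - objs[prev][k]) / (hi - lo)
    return dist


def crowded_tournament(a: int, b: int, rank: Dict[int, int], dist: Dict[int, float]) -> int:
    """Prefer the better front (lower index); break ties by larger crowding distance."""
    if rank[a] != rank[b]:
        return a if rank[a] < rank[b] else b
    return a if dist[a] >= dist[b] else b


# Hypothetical (precision, recall) values for five candidate thresholds of one class.
objs = [(0.9, 0.4), (0.6, 0.8), (0.5, 0.5), (0.7, 0.7), (0.4, 0.3)]
fronts = sort_into_fronts(objs)
rank = {i: r for r, front in enumerate(fronts) for i in front}
dist: Dict[int, float] = {}
for front in fronts:
    dist.update(crowding_distance(front, objs))
print(fronts)
print(crowded_tournament(3, 2, rank, dist))
```

In this sketch a lower front index corresponds to what the claims call a higher rank, so the tournament picks the member from the less dominated front and falls back to crowding distance only on ties.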

Assignees

  • Accenture Global Solutions Ltd

Inventors

Classifications

  • Training · CPC title

  • using artificial neural networks · CPC title

  • G10L15/22 · Primary

    Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation · CPC title

  • G06N3/086 · Primary

    using evolutionary algorithms, e.g. genetic algorithms or genetic programming · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2023072171A1 cover?
A system and method for training and refining a machine learning model is disclosed. The disclosed system and method can further improve the accuracy of trained machine learning models by calculating which threshold values for predictions (e.g., probabilities output by the machine learning model) provide the most accurate results. The system and method may include applying an optimization techn…
Who is the assignee on this patent?
Accenture Global Solutions Ltd
What technology area does this patent fall under?
Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 9, 2023 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).