Optimizing gradient boosting feature selection
US-2021334667-A1 · Oct 28, 2021 · US
US11481810B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11481810-B2 |
| Application number | US-202117152419-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 19, 2021 |
| Priority date | Jan 19, 2021 |
| Publication date | Oct 25, 2022 |
| Grant date | Oct 25, 2022 |
Official abstract text for this publication.
This disclosure describes one or more implementations of a model segmentation system that generates accurate audience segments for client devices/individuals utilizing multi-class decision tree machine-learning models. For example, in various implementations, the model segmentation system generates a customized loss penalty matrix from multiple loss penalty matrices. In particular, the model segmentation system can generate regression mappings of model evaluation metrics for a plurality of decision tree models and combine loss penalty matrices based on the regression mappings to generate a customized loss penalty matrix that best fits an administrator's customized needs of segment accuracy and reach. The model segmentation system then utilizes the customized loss penalty matrix to train a multi-class decision tree machine-learning model to classify client devices into non-overlapping audience segments. Further, in one or more implementations, the model segmentation system refines the multi-class decision tree machine-learning model based on adjusting the tree depth.
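The evaluation and combination steps in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions not stated in this excerpt: "accuracy" for a segment is taken to be the share of devices predicted into the segment that truly belong to it, "reach" is taken to be the share of devices truly in the segment that the model captures, and the combination of loss penalty matrices is shown as a simple linear blend controlled by a hypothetical weight `t`. The function names `segment_metrics` and `blend_loss_matrices` and all sample numbers are illustrative only and are not taken from the patent.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def segment_metrics(confusion: Matrix, segment: int) -> Tuple[float, float]:
    """Return (accuracy, reach) for one audience segment.

    Assumes confusion[actual][predicted] holds device counts.
    """
    predicted_total = sum(row[segment] for row in confusion)  # column sum
    actual_total = sum(confusion[segment])                    # row sum
    hits = confusion[segment][segment]
    accuracy = hits / predicted_total if predicted_total else 0.0
    reach = hits / actual_total if actual_total else 0.0
    return accuracy, reach


def blend_loss_matrices(low: Matrix, high: Matrix, t: float) -> Matrix:
    """Linearly interpolate two loss matrices (hypothetical combination rule).

    t = 0 returns `low`, t = 1 returns `high`.
    """
    return [
        [(1 - t) * a + t * b for a, b in zip(row_low, row_high)]
        for row_low, row_high in zip(low, high)
    ]


# Illustrative three-segment confusion matrix (rows: actual, columns: predicted).
confusion = [
    [30, 20, 0],
    [5, 40, 5],
    [5, 0, 5],
]
accuracy, reach = segment_metrics(confusion, 0)

# Two illustrative loss matrices: one row and one column per segment,
# zero penalty on the diagonal (correct classifications).
low = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
high = [[0, 3, 1], [5, 0, 1], [1, 1, 0]]
custom = blend_loss_matrices(low, high, 0.5)
```

In this sketch a higher off-diagonal penalty makes a particular misclassification more costly during training, which is one way a blended matrix could trade segment accuracy against reach.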
Opening claim text (preview).
What is claimed is:
1. A non-transitory computer-readable medium storing instructions that, when executed by at least one processor, cause a computer device to: generate a plurality of loss matrices comprising penalty values for audience segment misclassifications corresponding to a plurality of audience segments; generate confusion matrices that indicate predicted classifications and actual classifications for the plurality of audience segments; determine, based on the confusion matrices, model evaluation metrics for a plurality of multi-class decision tree machine-learning models generated utilizing the plurality of loss matrices; generate a customized loss matrix for the plurality of audience segments utilizing a regression mapping of the model evaluation metrics and the plurality of loss matrices by: generating linear regression lines for audience segments of the plurality of audience segments; receiving user input of location selections on the linear regression lines; and generating the customized loss matrix based on the location selections on the linear regression lines for the audience segments of the plurality of audience segments; generate a set of multi-class decision tree machine-learning models based on a plurality of tree depth values; determine overfitting scores for the set of multi-class decision tree machine-learning models by comparing accuracy scores between the set of multi-class decision tree machine-learning models and a multi-class decision tree machine-learning model generated utilizing the customized loss matrix; select a target tree depth based on the overfitting scores for the set of multi-class decision tree machine-learning models; generate a finalized multi-class decision tree machine-learning model utilizing the customized loss matrix and the target tree depth; and in response to determining one or more traits of a client device, utilize the finalized multi-class decision tree machine-learning model to classify the client device to a target audience segment of the plurality of audience segments.
2. The non-transitory computer-readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computer device to determine the model evaluation metrics for the target audience segment of the plurality of audience segments by determining one or more accuracy scores and one or more reach scores for the target audience segment based on the confusion matrices.
3. The non-transitory computer-readable medium of claim 2, further comprising instructions that, when executed by the at least one processor, cause the computer device to generate the linear regression lines based on the model evaluation metrics.
4. The non-transitory computer-readable medium of claim 2, further comprising instructions that, when executed by the at least one processor, cause the computer device to: generate the target audience segment based on the one or more accuracy scores and the one or more reach scores for the target audience segment.
5. The non-transitory computer-readable medium of claim 1, wherein receiving user input of the location selections on the linear regression lines comprises: providing the linear regression lines for display via a user interface of a client device; and receiving the location selections based on user interaction via the user interface.
6. The non-transitory computer-readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computer device to classify the client device to a target audience segment by determining a recency rule and a frequency rule of the target audience segment from the multi-class decision tree machine-learning model.
7. The non-transitory computer-readable medium of claim 6, further comprising instructions that, when executed by the at least one processor, cause the computer device to classify the client device to a target audience segment by comparing the one or more traits of the client device to the recency rule and the frequency rule of the target audience segment.
8. The non-transitory computer-readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computer device to generate a set of accuracy scores for the set of multi-class decision tree machine-learning models.
9. The non-transitory computer-readable medium of claim 8, further comprising instructions that, when executed by the at least one processor, cause the computer device to select the target tree depth based on the set of accuracy scores.
10. The non-transitory computer-readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computer device to generate the customized loss matrix by associating each audience segment of the plurality of audience segments with a separate row and a separate column of the customized loss matrix.
11. A system for generating multi-class decision tree machine-learning models, the system comprising: one or more memory devices comprising learning data and a plurality of loss matrices for a target audience segment of a plurality of audience segments; and at least one server device configured to cause the system to: generate confusion matrices that indicate predicted classifications and actual classifications for the plurality of audience segments; determine, based on the confusion matrices, model evaluation metrics for a plurality of multi-class decision tree machine-learning models generated utilizing the plurality of loss matrices and the learning data; generate a customized loss matrix for the plurality of audience segments utilizing a regression mapping of the model evaluation metrics and the plurality of loss matrices by: generating linear regression lines for audience segments of the plurality of audience segments; receiving user input of location selections on the linear regression lines; and generating the customized loss matrix based on the location selections on the linear regression lines for the audience segments of the plurality of audience segments; generate a set of multi-class decision tree machine-learning models based on a plurality of tree depth values; determine overfitting scores for the set of multi-class decision tree machine-learning models by comparing accuracy scores between the set of multi-class decision tree machine-learning models and a multi-class decision tree machine-learning model generated utilizing the customized loss matrix; select a target tree depth based on the overfitting scores for the set of multi-class decision tree machine-learning models; generate a finalized multi-class decision tree machine-learning model utilizing the customized loss matrix and the target tree depth; and in response to determining one or more traits of a client device, utilize the finalized multi-class decision tree machine-learning model to classify the client device to a target audience segment of the plurality of audience segments.
12. The system of claim 11, wherein the at least one server device is further configured to cause the system to generate the customized loss matrix by associating each audience segment of the plurality of audience segments with a separate row and a separate column of the customized loss matrix.
13. The system of claim 11, wherein the at least one server device is further configured to cause the system to generate the customized loss matrix from the plurality of loss matrices by: detecting an upper-boundary multi-class decision tree machine-learning model and a lower-boundary multi-class dec
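The tree depth selection recited in the independent claims can be sketched as follows. The claim preview does not say how an overfitting score is computed or how the target depth is chosen from it, so this Python sketch makes two loud assumptions: the score for a depth is the accuracy of the reference model (the one trained with the customized loss matrix) minus the accuracy of the depth-limited model, and the target depth is the shallowest depth whose score stays within a hypothetical `tolerance`. The function names and all numbers are illustrative, not taken from the patent.

```python
from typing import Dict


def overfitting_scores(
    depth_accuracy: Dict[int, float], reference_accuracy: float
) -> Dict[int, float]:
    """Assumed score: accuracy gap between the reference model and each depth."""
    return {
        depth: reference_accuracy - accuracy
        for depth, accuracy in depth_accuracy.items()
    }


def select_target_depth(scores: Dict[int, float], tolerance: float) -> int:
    """Pick the shallowest depth whose score is within the tolerance.

    Falls back to the depth with the smallest score if none qualifies.
    """
    eligible = [depth for depth, score in scores.items() if score <= tolerance]
    if eligible:
        return min(eligible)
    return min(scores, key=lambda depth: (scores[depth], depth))


# Illustrative accuracy scores for models trained at different tree depths.
depth_accuracy = {2: 0.61, 4: 0.70, 6: 0.74, 8: 0.75}
reference_accuracy = 0.76  # model trained with the customized loss matrix

scores = overfitting_scores(depth_accuracy, reference_accuracy)
target_depth = select_target_depth(scores, tolerance=0.03)
```

Preferring the shallowest qualifying depth is a design choice of this sketch: a shallower tree yields fewer and simpler segment rules while giving up little accuracy relative to the reference model.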
based on specific statistical tests · CPC title
Tree-organised classifiers · CPC title
Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title
characterised by the process organisation or structure, e.g. boosting cascade · CPC title
Ensemble learning · CPC title