Evaluating feature vectors across disjoint subsets of decision trees

US10217052B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10217052-B2
Application number  US-201514699657-A
Country             US
Kind code           B2
Filing date         Apr 29, 2015
Priority date       Apr 29, 2015
Publication date    Feb 26, 2019
Grant date          Feb 26, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosure is directed to evaluating feature vectors using decision trees. Typically, the number of feature vectors and the number of decision trees are very high, which prevents loading them into a processor cache. The feature vectors are evaluated by processing the feature vectors across a disjoint subset of trees repeatedly. After loading the feature vectors into the cache, they are evaluated across a first subset of trees, then across a second subset of trees and so on. If the values based on the first and second subsets satisfy a specified criterion, further evaluation of the feature vectors across the remaining of the decision trees is terminated, thereby minimizing the number of trees evaluated and therefore, consumption of computing resources.
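The loop described in the abstract can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the names `chunk` and `evaluate_in_subsets`, the subset size `k`, and the definition of "variation" as the largest absolute per-vector difference between two consecutive subset results are all hypothetical choices. Python also gives no control over the processor cache, so the sketch mirrors only the order of evaluation (all vectors against one subset, then the next subset).

```python
from typing import Callable, List, Sequence, Tuple

FeatureVector = Sequence[float]
Tree = Callable[[FeatureVector], float]  # a tree maps a feature vector to a value


def chunk(trees: List[Tree], k: int) -> List[List[Tree]]:
    """Split the trees into disjoint subsets of at most k trees each."""
    return [trees[i:i + k] for i in range(0, len(trees), k)]


def evaluate_in_subsets(
    vectors: List[FeatureVector],
    trees: List[Tree],
    k: int,
    threshold: float,
) -> Tuple[List[float], int]:
    """Evaluate every vector against one subset of trees at a time.

    Returns the accumulated value per vector and the number of subsets used.
    Evaluation stops early once the variation between two consecutive subset
    results no longer exceeds `threshold` (assumed criterion).
    """
    totals = [0.0] * len(vectors)
    previous = None
    used = 0
    for subset in chunk(trees, k):
        # Only this subset is needed while all vectors are evaluated against it.
        result = [sum(tree(v) for tree in subset) for v in vectors]
        totals = [t + r for t, r in zip(totals, result)]
        used += 1
        if previous is not None:
            # Assumed measure of variation: largest per-vector difference.
            variation = max(
                (abs(a - b) for a, b in zip(result, previous)), default=0.0
            )
            if variation <= threshold:
                break  # remaining subsets are skipped
        previous = result
    return totals, used


# Six identical stumps split into three subsets of two trees each.
stump: Tree = lambda v: 1.0 if v[0] > 0.5 else 0.0
totals, used = evaluate_in_subsets([[0.2], [0.9]], [stump] * 6, k=2, threshold=0.0)
```

In this toy run the first and second subsets produce identical results, so the third subset is never evaluated. A negative threshold would force every subset to be used.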

First claim

Opening claim text (preview).

We claim:

1. A method performed by a computing system, comprising: receiving multiple feature vectors, wherein at least some of the feature vectors include a plurality of features; receiving multiple decision trees by which the feature vectors are to be evaluated; loading the feature vectors into a memory of the computing system; and loading disjoint subsets of the decision trees into the memory of the computing system successively for evaluating the feature vectors, the loading the disjoint subsets successively including: loading a first subset of the disjoint subsets into the memory, evaluating the feature vectors using the first subset to generate a first result, evicting the first subset from the memory, loading a second subset of the disjoint subsets into the memory after evicting the first subset, and evaluating the feature vectors using the second subset to generate a second result; determining whether to evaluate the feature vectors using a third subset of the disjoint subsets as a function of a variation between the first result and the second result; and responsive to a determination that the variation exceeds a specified threshold, loading the third subset into the memory after evicting the second subset, and evaluating the feature vectors using the third subset to generate a third result.

2. The method of claim 1, wherein loading the second subset into the memory after evicting the first subset includes loading the second subset while retaining the feature vectors in the memory.

3. The method of claim 1, wherein evaluating the feature vectors using the first subset includes: evaluating the feature vectors using a first decision tree of the first subset, and evaluating the feature vectors using a second decision tree of the first subset after evaluating using the first decision tree of the first subset.

4. The method of claim 1, wherein evaluating the feature vectors using the first subset includes: evaluating a first feature vector of the feature vectors using the first subset, and evaluating a second feature vector of the feature vectors using the first subset after evaluating the first feature vector.

5. The method of claim 1, wherein a first decision tree of the decision trees is expressed using a ternary expression.

6. The method of claim 5, wherein the ternary expression includes a categorical operator.

7. The method of claim 1, wherein the evaluating the feature vectors includes: evaluating a first feature vector of the feature vectors using a first decision tree of the first subset to generate a first value, evaluating the first feature vector using a second decision tree of the first subset to generate a second value, and computing a total value of the first feature vector as a function of the first value and the second value.

8. A non-transitory computer-readable storage medium storing computer-readable instructions, the instructions comprising: instructions for receiving multiple feature vectors and a number (m) of decision trees by which the feature vectors are to be evaluated; and instructions for evaluating the feature vectors using the decision trees by loading multiple subsets of the m decision trees into a memory of a computing system successively, the evaluating including: loading the feature vectors and a first subset of the subsets into the memory, wherein at least some of the subsets include k number of m decision trees, evaluating the feature vectors using the first subset to generate a first result, loading a second subset of the subsets into the memory after evicting the first subset, and evaluating the feature vectors using the second subset to generate a second result; instructions for determining whether to evaluate the feature vectors using a third subset of the subsets as a function of a variation between the first result and the second result; and instructions for loading the third subset into the memory after evicting the second subset in an event that the variation exceeds a specified threshold.

9. The non-transitory computer-readable storage medium of claim 8, wherein the instructions for evaluating the feature vectors using the decision trees include: instructions for determining a common feature between at least some of the feature vectors; instructions for determining, in evaluating a first feature vector of the feature vectors having the common feature using a first decision tree of the first subset, a path taken by the common feature in the first decision tree; and instructions for excluding an evaluation of the common feature using the first decision tree for the at least some of the feature vectors.

10. The non-transitory computer-readable storage medium of claim 9, wherein the instructions for excluding the evaluation of the common feature include: instructions for identifying at least some of the decision trees in the first subset that contain the path taken by the common feature, and instructions for regenerating the at least some of the decision trees by excluding the path as a specified set of decision trees.

11. The non-transitory computer-readable storage medium of claim 10, wherein the instructions for loading the first subset into the memory include instructions for loading into the memory the specified set of decision trees.

12. The non-transitory computer-readable storage medium of claim 8, wherein the instructions for evaluating the feature vectors using the decision trees include: instructions for evaluating the feature vectors by cascading the decision trees, the evaluating using cascading including: evaluating the feature vectors using the first subset of the decision trees, wherein the first subset of the decision trees are generated using a first model, pruning the feature vectors as a function of a result of the evaluating, to generate a pruned set of feature vectors, and evaluating the pruned set of feature vectors using the second subset of the decision trees, wherein the second subset of the decision trees are generated using a second model.

13. A system, comprising: a processor; and a storage device storing instructions that, when executed by the processor, cause the system to perform operations comprising: receiving multiple feature vectors, wherein at least some of the feature vectors include a plurality of features; receiving multiple decision trees by which the feature vectors are to be evaluated; loading the feature vectors into a memory of the system; loading disjoint subsets of the decision trees into the memory of the system successively for evaluating the feature vectors, wherein the loading the disjoint subsets successively comprises: loading a first subset of the disjoint subsets into the memory, evicting the first subset from the memory after the feature vectors are evaluated using the first subset, and loading a second subset of the disjoint subsets into the memory after evicting the first subset, for the feature vectors to be evaluated using the second subset; and evaluating the feature vectors after a specified subset of the disjoint subsets is loaded into the memory by cascading the decision trees, wherein the cascading the decision trees comprises: evaluating the feature vectors using the first subset of the decision trees, wherein the first subset of the decision trees are generated using a first model, pruning the feature vectors as a function of a result of the evaluating to generate a pruned set of the feature vectors, and evaluating the pruned set of the feature vectors using the second subset of the decision trees, wherein the second subset of the decision trees are generated using
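Two ideas from the claims above, a decision tree written as a ternary expression (claims 5 and 6) and cascaded evaluation with pruning (claims 12 and 13), can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the feature names (`clicks`, `age_days`, `kind`), the leaf values, the use of a set-membership test as the "categorical operator", and the pruning rule of keeping the `keep` highest-scoring vectors. The claims do not specify any of these.

```python
from typing import Callable, Dict, List, Tuple

FeatureVector = Dict[str, object]
Tree = Callable[[FeatureVector], float]


def tree_a(v: FeatureVector) -> float:
    # A tree as a nested conditional (ternary) expression:
    # each test selects one of two branches, and leaves are plain values.
    return 0.8 if v["clicks"] > 10 else (0.3 if v["age_days"] < 7 else 0.1)


def tree_b(v: FeatureVector) -> float:
    # A categorical test, modelled here as set membership (assumption).
    return 0.5 if v["kind"] in {"photo", "video"} else 0.0


def score(v: FeatureVector, subset: List[Tree]) -> float:
    """Total value of one vector across a subset of trees."""
    return sum(tree(v) for tree in subset)


def cascade(
    vectors: List[FeatureVector],
    first_subset: List[Tree],
    second_subset: List[Tree],
    keep: int,
) -> List[Tuple[FeatureVector, float]]:
    """Score all vectors with the first subset, prune, then score the rest.

    Pruning rule (assumed): keep the `keep` vectors with the highest
    first-stage score. Only those are evaluated by the second subset.
    """
    ranked = sorted(vectors, key=lambda v: score(v, first_subset), reverse=True)
    pruned = ranked[:keep]
    return [(v, score(v, second_subset)) for v in pruned]


v1 = {"clicks": 20, "age_days": 1, "kind": "photo"}
v2 = {"clicks": 2, "age_days": 3, "kind": "link"}
v3 = {"clicks": 1, "age_days": 30, "kind": "video"}
survivors = cascade([v1, v2, v3], [tree_a], [tree_b], keep=2)
```

Here `v3` is dropped after the first stage, so the second-stage trees are evaluated for two vectors instead of three.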

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10217052B2 cover?
The disclosure is directed to evaluating feature vectors using decision trees. Typically, the number of feature vectors and the number of decision trees are very high, which prevents loading them into a processor cache. The feature vectors are evaluated by processing the feature vectors across a disjoint subset of trees repeatedly. After loading the feature vectors into the cache, they are eval…
Who is the assignee on this patent?
Facebook Inc
What technology area does this patent fall under?
Primary CPC classification G06N5/025. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 26, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).