Real-time multiclass driver action recognition using random forests

US9501693B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9501693-B2
Application number: US-201314050259-A
Country: US
Kind code: B2
Filing date: Oct 9, 2013
Priority date: Oct 9, 2013
Publication date: Nov 22, 2016
Grant date: Nov 22, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An action recognition system recognizes driver actions by using a random forest model to classify images of the driver. A plurality of predictions is generated using the random forest model. Each prediction is generated by one of the plurality of decision trees and each prediction comprises a predicted driver action and a confidence score. The plurality of predictions is regrouped into a plurality of groups with each of the plurality of groups associated with one of the driver actions. The confidence scores are combined within each group to determine a combined score associated with each group. The driver action associated with the highest combined score is selected.
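The voting step in the abstract can be illustrated with a short Python sketch. It is not the patent's implementation: the function name `select_action`, the sample predictions, and the use of summation to combine scores (the option named in claim 2) are assumptions chosen for illustration.

```python
from collections import defaultdict


def select_action(predictions):
    """Pick a driver action from per-tree (action, confidence) predictions.

    `predictions` holds one (action, confidence) pair per decision tree.
    Confidences are combined by addition, which is only one possible
    way to combine them.
    """
    combined = defaultdict(float)
    for action, confidence in predictions:
        # Group by predicted action and accumulate that group's score.
        combined[action] += confidence
    # Select the action whose group has the highest combined score.
    best_action = max(combined, key=combined.get)
    return best_action, dict(combined)


# Hypothetical outputs from a four-tree forest.
tree_predictions = [
    ("talking on a phone", 0.5),
    ("adjusting a radio", 0.75),
    ("talking on a phone", 0.5),
    ("normal driving", 0.25),
]

action, scores = select_action(tree_predictions)
print(action, scores)
```

In this example no single tree is most confident about "talking on a phone", but two trees agree on it, so its group wins once the scores are added.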

First claim

Opening claim text (preview).

What is claimed is:

1. A method for performing action recognition on an image of a driver in a vehicle, the method comprising: receiving, by a computing system, an image of the driver in the vehicle; accessing a random forest model comprising a plurality of decision trees; generating a plurality of predictions of the action being performed by the driver in the image through the random forest model, each prediction generated by one of the plurality of decision trees, each of the plurality of predictions comprising a predicted driver action and a confidence score comprising a ratio or percentage; grouping the plurality of generated predictions into a plurality of groups by the predicted driver action, such that each group of the plurality of groups is associated with a single predicted driver action; combining the confidence scores of the generated predictions for each group to determine a single combined score for each group relating to the predicted driver action associated with each group; and selecting the driver action associated with a highest combined confidence score from the plurality of groups.

2. The method of claim 1, wherein combining the confidence scores comprises adding the confidence scores.

3. The method of claim 1, wherein generating the plurality of predictions through the random forest model comprises: for a first decision tree in the plurality of decision trees, applying a first test to the image based on first test parameters of a parent branch node of the first decision tree in the random forest, the parent branch node having a plurality of child nodes; selecting one of the child nodes of the parent branch node based on a result of the test; responsive to the selected one of the child nodes being a branch node, applying a second test to the image based on second test parameters associated with the selected one of the child nodes; responsive to the selected one of the child nodes being a leaf node, generating the prediction, the generated prediction comprising the driver action and the confidence score associated with the leaf node.

4. The method of claim 3, wherein applying the first test based on the first test parameters comprises: selecting a plurality of spatial regions of the image; selecting a feature channel representing features of the image; and selecting a threshold value.

5. The method of claim 4, wherein the image comprises a three-dimensional image and wherein the selected feature channel comprises three-dimensional depth data.

6. The method of claim 4, wherein applying the first test comprises: determining a difference between average values of the selected feature channels of at least two of the selected plurality of spatial regions; and comparing the difference to the threshold value.

7. The method of claim 1, comprising: delaying a notification provided from an in-vehicle system based on the predicted driver action.

8. The method of claim 1, wherein the predicted driver action comprises at least one of: normal driving, reaching for the center compartment, reaching for a glove compartment, reaching for an overhead compartment, adjusting a radio, talking on a phone, and adjusting a mirror.

9. The method of claim 1, wherein the random forest model is learned based on a set of labeled training images.

10. A non-transitory computer-readable storage medium storing instructions for performing action recognition on an image of a driver in a vehicle, the instructions when executed by a processor causing the processor to perform steps including: receiving, by a computing system, an image of the driver in the vehicle; accessing a random forest model comprising a plurality of decision trees; generating a plurality of predictions through the random forest model, each prediction generated by one of the plurality of decision trees, each of the plurality of predictions comprising a predicted driver action and a confidence score comprising a ratio or percentage; grouping the plurality of generated predictions into a plurality of groups by the predicted driver action, such that each group of the plurality of groups is associated with a single predicted driver action; combining the confidence scores of the generated predictions for each group to determine a single combined score for each group relating to the predicted driver action associated with each group; and selecting the driver action associated with a highest combined confidence score from the plurality of groups.

11. The non-transitory computer-readable storage medium of claim 10, wherein combining the confidence scores comprises adding the confidence scores.

12. The non-transitory computer-readable storage medium of claim 10, wherein generating the plurality of predictions through the random forest model comprises: for a first decision tree in the plurality of decision trees, applying a first test to the image based on first test parameters of a parent branch node of the first decision tree in the random forest, the parent branch node having a plurality of child nodes; selecting one of the child nodes of the parent branch node based on a result of the test; responsive to the selected one of the child nodes being a branch node, applying a second test to the image based on second test parameters associated with the selected one of the child nodes; responsive to the selected one of the child nodes being a leaf node, generating the prediction, the generated prediction comprising the driver action and the confidence score associated with the leaf node.

13. The non-transitory computer-readable storage medium of claim 12, wherein applying the first test based on the first test parameters comprises: selecting a plurality of spatial regions of the image; selecting a feature channel representing features of the image; and selecting a threshold value.

14. The non-transitory computer-readable storage medium of claim 13, wherein the image comprises a three-dimensional image and wherein the selected feature channel comprises three-dimensional depth data.

15. The non-transitory computer-readable storage medium of claim 13, wherein applying the first test comprises: determining a difference between average values of the selected feature channels of at least two of the selected plurality of spatial regions; and comparing the difference to the threshold value.

16. A method for learning a random forest model for action recognition, the random forest model comprising a plurality of decision trees, the method comprising: receiving, by a computing system, a plurality of training images, each training image depicting a driver action being performed inside a vehicle and each training image having a label identifying the driver action being performed; generating a test corresponding to a parent node of one of the plurality of decision trees, the test comprising one or more test parameters; applying the test to each training image to classify each training image into a plurality of image groups including at least a first image group and a second image group; determining if an entropy value of the first image group is below a threshold value; responsive to a determination that the entropy value of the first image group is below the threshold value, generating a prediction based on the labels associated with the first image group, the prediction comprising a driver action and a confidence score comprising a ratio or percentage, and generating a leaf node associated with the prediction as a child node of the parent node; and responsive to determining that the entropy value of the
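The per-tree steps in claims 3 to 6 and the entropy check in the visible part of claim 16 can be sketched in Python as follows. This is an illustrative reading, not the patented implementation. All names here (`Leaf`, `Branch`, `region_mean`, `predict`, `leaf_if_pure`), the image layout (a dict of 2D lists per feature channel), the two-child tree, the choice of which child a passing test selects, and the use of the majority label's share as the confidence ratio are assumptions.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    action: str
    confidence: float  # a ratio in [0, 1]


@dataclass
class Branch:
    region_a: tuple  # (top, left, bottom, right), bottom/right exclusive
    region_b: tuple
    channel: str     # feature channel name, e.g. "depth"
    threshold: float
    below: "Node"    # child taken when the difference is under the threshold
    above: "Node"    # child taken otherwise


Node = Union[Leaf, Branch]


def region_mean(image, channel, region):
    """Average value of one feature channel over a rectangular region."""
    top, left, bottom, right = region
    values = [v for row in image[channel][top:bottom] for v in row[left:right]]
    return sum(values) / len(values)


def predict(node, image):
    """Walk one tree from the root to a leaf and return its prediction."""
    while isinstance(node, Branch):
        # Test: difference of two region averages compared to a threshold.
        diff = (region_mean(image, node.channel, node.region_a)
                - region_mean(image, node.channel, node.region_b))
        node = node.above if diff >= node.threshold else node.below
    return node.action, node.confidence


def entropy(labels):
    """Shannon entropy (in bits) of a group of training labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def leaf_if_pure(labels, max_entropy):
    """Return a Leaf when the group's entropy is below the threshold.

    The majority label and its share of the group are used as the
    prediction and confidence; that pairing is an assumption.
    """
    if entropy(labels) < max_entropy:
        action, count = Counter(labels).most_common(1)[0]
        return Leaf(action, count / len(labels))
    return None  # the group would be split further during training


# Hypothetical 2x4 depth image and a one-branch tree.
image = {"depth": [[1.0, 1.0, 5.0, 5.0],
                   [1.0, 1.0, 5.0, 5.0]]}
tree = Branch(region_a=(0, 2, 2, 4), region_b=(0, 0, 2, 2),
              channel="depth", threshold=2.0,
              below=Leaf("normal driving", 0.9),
              above=Leaf("adjusting a radio", 0.8))

print(predict(tree, image))
print(leaf_if_pure(["talking on a phone"] * 3 + ["normal driving"], 1.0))
```

Running `predict` once per tree yields the per-tree (action, confidence) pairs that the method of claim 1 then groups and combines.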

Assignees

Honda Motor Co Ltd

Inventors

Classifications

  • using classification, e.g. of video objects · CPC title

  • G06V20/597 · Primary

    Recognising the driver's state or behaviour, e.g. attention or drowsiness · CPC title

  • Static body considered as a whole, e.g. static pedestrian or occupant recognition · CPC title

  • Tree-organised classifiers · CPC title

  • Clustering techniques · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9501693B2 cover?
An action recognition system recognizes driver actions by using a random forest model to classify images of the driver. A plurality of predictions is generated using the random forest model. Each prediction is generated by one of the plurality of decision trees and each prediction comprises a predicted driver action and a confidence score. The plurality of predictions is regrouped into a plural…
Who is the assignee on this patent?
Honda Motor Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06V20/597. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 22, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).