Driving decision-making method and apparatus and chip

US12444244B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12444244-B2
Application number	US-202218145557-A
Country	US
Kind code	B2
Filing date	Dec 22, 2022
Priority date	Jun 23, 2020
Publication date	Oct 14, 2025
Grant date	Oct 14, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure relates to driving decision-making methods, apparatuses, and chips. One example method includes building a Monte Carlo tree based on a current driving environment state, where the Monte Carlo tree includes a root node and N−1 non-root nodes, each node represents one driving environment state, and a driving environment state represented by any non-root node is predicted by a stochastic model of driving environments. Based on at least one of an access count or a value function of each node in the Monte Carlo tree, a node sequence that starts from the root node and ends at a leaf node is determined, and a driving action sequence is determined based on a driving action corresponding to each node in the node sequence.
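
The tree construction and node-sequence selection summarized in the abstract can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the one-dimensional state, the `ACTIONS` set, `toy_model`, `initial_value`, and the UCB-style `select` step are all hypothetical stand-ins. The sketch also assumes that a node's access count equals 1 plus the sum of its children's counts, and that its value function equals its initial value plus the sum of its children's values, which is one possible reading of the first claim.

```python
import math
import random

ACTIONS = ("accelerate", "hold", "brake")  # hypothetical action set


def initial_value(state):
    # Hypothetical initial value function: states closer to 0 are better.
    return -abs(state)


def toy_model(state, action, rng):
    # Hypothetical stand-in for the stochastic model of driving environments.
    drift = {"accelerate": -1.0, "hold": 0.0, "brake": 1.0}[action]
    return state + drift + rng.gauss(0.0, 0.3)


class Node:
    def __init__(self, state, action=None, parent=None):
        self.state = state      # driving environment state (toy: one float)
        self.action = action    # action chosen at the parent to reach this node
        self.parent = parent
        self.children = []
        self.visits = 1         # initial access count is 1
        self.value = initial_value(state)


def expand(parent, action, rng):
    # Predict the child state with the stochastic model, then propagate the
    # new node's access count and value to every ancestor (assumed rule).
    child = Node(toy_model(parent.state, action, rng), action, parent)
    parent.children.append(child)
    node = parent
    while node is not None:
        node.visits += child.visits
        node.value += child.value
        node = node.parent
    return child


def select(root, c=1.4):
    # Descend with a UCB-style score until a node has an untried action.
    node = root
    while len(node.children) == len(ACTIONS):
        node = max(
            node.children,
            key=lambda ch: ch.value / ch.visits
            + c * math.sqrt(math.log(node.visits) / ch.visits),
        )
    return node


def build_tree(root_state, n_nodes, rng):
    # Root plus n_nodes - 1 non-root nodes.
    root = Node(root_state)
    for _ in range(n_nodes - 1):
        node = select(root)
        tried = {ch.action for ch in node.children}
        action = next(a for a in ACTIONS if a not in tried)
        expand(node, action, rng)
    return root


def best_sequence(root):
    # "Maximum access count first, maximum value function next" rule.
    seq = [root]
    while seq[-1].children:
        seq.append(max(seq[-1].children, key=lambda ch: (ch.visits, ch.value)))
    return seq


rng = random.Random(0)
root = build_tree(root_state=2.0, n_nodes=30, rng=rng)
sequence = best_sequence(root)
actions = [n.action for n in sequence[1:]]  # driving action sequence
print(actions)
```

Only the first action in `actions` would be executed before the tree is rebuilt from the newly observed state, which matches the execute-then-observe loop described in the first claim.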

First claim

Opening claim text (preview).

What is claimed is:

1. A driving decision-making method, comprising: obtaining, by an autonomous driving vehicle, information of a current driving environment state of the autonomous driving vehicle, wherein the autonomous driving vehicle includes one or more sensors; constructing, by the autonomous driving vehicle, a Monte Carlo tree based on the current driving environment state, wherein the Monte Carlo tree comprises N nodes, each node represents a corresponding driving environment state, the N nodes comprise a root node and N−1 non-root nodes, the root node represents the current driving environment state, a first driving environment state represented by a first node is predicted by using a stochastic model of driving environments based on a second driving environment state represented by a parent node of the first node and based on a driving action, the driving action is determined by the parent node of the first node in a process of obtaining the first node through expansion, the first node is any node of the N−1 non-root nodes, and N is a positive integer greater than or equal to 2; determining, by the autonomous driving vehicle, in the Monte Carlo tree based on at least one of an access count or a value function of each node in the Monte Carlo tree, a node sequence, wherein the node sequence comprises a plurality of nodes that starts from the root node and ends at a leaf node; in response to determining the node sequence, determining, by the autonomous driving vehicle, a driving action sequence of a plurality of future driving steps, wherein each future driving step in the driving action sequence comprises a driving action corresponding to each node comprised in the node sequence, and wherein the driving action sequence is used by the autonomous driving vehicle for driving decision-making; autonomously driving, by the autonomous driving vehicle, the autonomous driving vehicle based on a first driving action in the driving action sequence; obtaining, by the autonomous driving vehicle, an actual driving environment state after the first driving action is executed; and updating, by the autonomous driving vehicle, the stochastic model of driving environments based on the current driving environment state, the first driving action, and the actual driving environment state, wherein the access count of each node is determined based on access counts of subnodes of the each node and an initial access count of the each node, the value function of the each node is determined based on value functions of subnodes of the each node and an initial value function of the each node, the initial access count of the each node is 1, and the initial value function of the each node is determined based on a value function that matches the corresponding driving environment state represented by the each node.

2. The method according to claim 1, wherein that the first driving environment state represented by the first node is predicted by using the stochastic model of driving environments based on the second driving environment state represented by the parent node of the first node and based on the driving action comprises: predicting, through dropout-based forward propagation by using the stochastic model of driving environments, a probability distribution of a driving environment state after the driving action is executed based on the second driving environment state represented by the parent node of the first node; and obtaining the first driving environment state represented by the first node through sampling from the probability distribution.

3. The method according to claim 1, wherein that the initial value function of the node is determined based on the value function that matches the driving environment state represented by the node comprises: selecting, from an episodic memory, a first quantity of target driving environment states that have a highest matching degree with the driving environment state represented by the node; and determining the initial value function of the node based on value functions respectively corresponding to the first quantity of target driving environment states.

4. The method according to claim 3, wherein the method further comprises: when a driving episode ends, determining a cumulative reward return value corresponding to an actual driving environment state after each driving action in the driving episode is executed; and updating the episodic memory by using, as a value function corresponding to the actual driving environment state, the cumulative reward return value corresponding to the actual driving environment state after each driving action is executed.

5. The method according to claim 1, wherein the node sequence is determined: based on the access count of the each node in the Monte Carlo tree according to a maximum access count rule; based on the value function of the each node in the Monte Carlo tree according to a maximum value function rule; or based on the access count and the value function of the each node in the Monte Carlo tree according to a “maximum access count first, maximum value function next” rule.

6. The method according to claim 1, wherein obtaining, by the autonomous driving vehicle, information of the current driving environment state of the autonomous driving vehicle comprises: receiving, from a vehicle velocity sensor of the autonomous driving vehicle, a velocity of the autonomous driving vehicle.

7. The method according to claim 1, obtaining, by the autonomous driving vehicle, information of the current driving environment state of the autonomous driving vehicle comprises: receiving, from an acceleration sensor of the autonomous driving vehicle, an acceleration of the autonomous driving vehicle.

8. The method according to claim 1, wherein obtaining, by the autonomous driving vehicle, information of the current driving environment state of the autonomous driving vehicle comprises: receiving, from a distance sensor of the autonomous driving vehicle, a relative distance between the autonomous driving vehicle and another vehicle.

9. A driving decision-making apparatus in an autonomous driving vehicle, comprising: at least one processor; and one or more memories coupled to the at least one processor and storing programming instructions for execution by the at least one processor to: obtain information of a current driving environment state of the autonomous driving vehicle, wherein the autonomous driving vehicle includes one or more sensors; construct a Monte Carlo tree based on the current driving environment state, wherein the Monte Carlo tree comprises N nodes, each node represents a corresponding driving environment state, the N nodes comprise a root node and N−1 non-root nodes, the root node represents the current driving environment state, a first driving environment state represented by a first node is predicted by using a stochastic model of driving environments based on a second driving environment state represented by a parent node of the first node and based on a driving action, the driving action is determined by the parent node of the first node in a process of obtaining the first node through expansion, the first node is any node of the N−1 non-root nodes, and N is a positive integer greater than or equal to 2; determine, in the Monte Carlo tree based on at least one of an access count or a value function of each node in the Monte Carlo tree, a node sequence, wherein the node sequence comprises a plurality of nodes that starts from the root node and ends at a leaf node; in response to determining the node sequence, determine a driving action sequence of a plurality of future driving steps, wherein each driving future step in the driving action sequenc…
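
The episodic memory described in claims 3 and 4 could look roughly like the sketch below. It is an illustration under stated assumptions, not the claimed implementation: the `EpisodicMemory` class, its method names, the use of Euclidean distance as the "matching degree", the averaging of the nearest values, and the discount factor `gamma` are all hypothetical choices that the claim text does not specify.

```python
import math


class EpisodicMemory:
    """Hypothetical store of (state, value) pairs from past driving episodes."""

    def __init__(self):
        self.entries = []  # list of (state_tuple, value)

    def initial_value(self, state, k=3):
        # Pick the k stored states that best match `state` (assumed here:
        # smallest Euclidean distance) and average their values (assumed).
        if not self.entries:
            return 0.0
        nearest = sorted(self.entries, key=lambda e: math.dist(e[0], state))[:k]
        return sum(value for _, value in nearest) / len(nearest)

    def update(self, states, rewards, gamma=0.9):
        # At the end of an episode, walk backwards to compute the cumulative
        # (assumed: discounted) return for each actual state, and store that
        # return as the value for the state.
        ret = 0.0
        for state, reward in zip(reversed(states), reversed(rewards)):
            ret = reward + gamma * ret
            self.entries.append((tuple(state), ret))


memory = EpisodicMemory()
# Toy episode: each state is (relative distance, velocity), both hypothetical.
memory.update(
    states=[(10.0, 2.0), (8.0, 1.0), (6.0, 0.0)],
    rewards=[0.0, 0.0, 1.0],
    gamma=0.5,
)
v0 = memory.initial_value((8.0, 1.0), k=1)  # 0.5, the return stored for (8.0, 1.0)
```

A value obtained this way would serve as the initial value function of a newly expanded tree node, so that states resembling well-rewarded past experience start out with a higher estimate.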

Assignees

Huawei Tech Co Ltd

Inventors

Classifications

B60W2050/0018
Method for the design of a control system · CPC title
B60W2556/10
Historical data · CPC title
B60W2050/0028
Mathematical models, e.g. for simulation · CPC title
B60W2050/0016
State machine analysis · CPC title
G06N3/092
Reinforcement learning · CPC title

Patent family

Related publications grouped by family.

View patent family 78964328

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12444244B2 cover?
The present disclosure relates to driving decision-making methods, apparatuses, and chips. One example method includes building a Monte Carlo tree based on a current driving environment state, where the Monte Carlo tree includes a root node and N−1 non-root nodes, each node represents one driving environment state, and a driving environment state represented by any non-root node is predicted by…
Who is the assignee on this patent?
Huawei Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F18/24323. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 14, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).