Information prediction method, method of training autonomous driving model, device, medium, and autonomous driving vehicle

US2025206330A1 · US · A1

Patent metadata
Field | Value
Publication number | US-2025206330-A1
Application number | US-202519080405-A
Country | US
Kind code | A1
Filing date | Mar 14, 2025
Priority date | Jun 19, 2024
Publication date | Jun 26, 2025
Grant date | (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An information prediction method, a method of training an autonomous driving model, a device, a medium, and an autonomous driving vehicle, which relate to a field of artificial intelligence technology, and in particular, to fields of computer vision technology and deep learning technology, which may be applied to scenarios such as autonomous driving. Specific implementation scheme of the information prediction method is: acquiring perception data including image data acquired by a sensor in a vehicle and driving data of the vehicle; encoding the image data to obtain an image token sequence corresponding to the image data; encoding the driving data to obtain a driving feature corresponding to the driving data; and generating, using a generative model, a predicted token sequence corresponding to the image token sequence and a control information for the vehicle based on the driving feature and the image token sequence.
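
The abstract's step of turning image data into an image token sequence can be illustrated with a nearest-codebook lookup, which is one common way to obtain discrete tokens (the claims mention an encoder followed by a quantizer). The following is a minimal sketch only: the codebook, the feature values, and the function names are hypothetical and are not taken from the patent.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(features: Sequence[Vector], codebook: Sequence[Vector]) -> List[int]:
    """Map each encoded feature to the index of its nearest codebook entry.

    The returned indices play the role of the "image token sequence".
    """
    tokens = []
    for feature in features:
        distances = [squared_distance(feature, code) for code in codebook]
        tokens.append(distances.index(min(distances)))
    return tokens


# Hypothetical 2-D codebook with four entries (a real one would be learned).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Hypothetical encoder output for three image patches.
encoded = [(0.1, -0.2), (0.9, 0.8), (0.2, 1.1)]

image_tokens = quantize(encoded, codebook)
print(image_tokens)  # [0, 3, 2]
```

In this sketch each patch feature is replaced by a single integer, so the generative model can treat the image as a sequence of discrete symbols.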

First claim

Opening claim text (preview).

What is claimed is:

1. An information prediction method, comprising: acquiring perception data comprising image data acquired by a sensor in a vehicle and driving data of the vehicle; encoding the image data to obtain an image token sequence corresponding to the image data; encoding the driving data to obtain a driving feature corresponding to the driving data; and generating, using a generative model, a predicted token sequence corresponding to the image token sequence and a control information for the vehicle based on the driving feature and the image token sequence.

2. The method of claim 1, wherein the image token sequence is a discrete feature of the image data, and the method further comprises: extracting an image feature of the image data using a first convolutional network to obtain a first feature vector representing the image feature, and wherein the generating, using a generative model, a predicted token sequence corresponding to the image token sequence and a control information for the vehicle based on the driving feature and the image token sequence, comprises: generating, using the generative model, the predicted token sequence and the control information based on the driving feature, the first feature vector and the image token sequence.

3. The method of claim 1, wherein the driving data comprises a driving parameter of the vehicle and navigation data of the vehicle; and the encoding the driving data to obtain a driving feature corresponding to the driving data comprises: encoding the navigation data using a second convolutional network to obtain a second feature vector representing the navigation data; and encoding the driving parameter using a multi-layer perceptron to obtain a third feature vector representing the driving parameter, wherein the driving feature comprises the second feature vector and the third feature vector.

4. The method of claim 3, wherein the navigation data comprises positions of at least two target points on a navigation path; and the encoding the navigation data using a second convolutional network to obtain a second feature vector representing the navigation data comprises: generating a mask image representing a path formed by the at least two target points based on the positions of the at least two target points; and encoding the mask image using the second convolutional network to obtain the second feature vector.

5. The method of claim 1, wherein the encoding the image data to obtain an image token sequence corresponding to the image data comprises: encoding the image data using an encoder to obtain an encoded feature sequence; and quantizing the encoded feature sequence using a quantizer to obtain the image token sequence.

6. The method of claim 1, wherein the generating, using a generative model, a predicted token sequence corresponding to the image token sequence and a control information for the vehicle based on the driving feature and the image token sequence, comprises: adding a start tag at a head position of the image token sequence and adding a query tag at a tail position of the image token sequence, so as to obtain a tagged token sequence; obtaining an input sequence of the generative model based on the driving feature and the tagged token sequence; and inputting the input sequence into the generative model to obtain the predicted token sequence and the control information generated by the generative model.

7. A method of training an autonomous driving model, wherein the autonomous driving model comprises an encoding layer and a generative model; the encoding layer comprises a sequence encoding network and a driving data encoding network; and the method comprises: encoding, using the sequence encoding network, image data in sample perception data to obtain an image token sequence corresponding to the image data; encoding, using the driving data encoding network, driving data in the sample perception data to obtain a driving feature corresponding to the driving data; generating, using the generative model, a predicted token sequence corresponding to the image token sequence and a predicted control information for a vehicle based on the driving feature and the image token sequence; and training the autonomous driving model according to the predicted token sequence and the image token sequence.

8. The method of claim 7, wherein the sample perception data further comprises a real control information; and the method further comprises: training the autonomous driving model according to a difference between the real control information and the predicted control information.

9. The method of claim 7, wherein the training the autonomous driving model comprises: training other model structures in the autonomous driving model other than the sequence encoding network.

10. The method of claim 7, wherein the encoding layer further comprises a first convolutional network, the method further comprises: extracting an image feature of the image data using the first convolutional network to obtain a first feature vector representing the image feature, and wherein the generating, using the generative model, a predicted token sequence corresponding to the image token sequence and a predicted control information for a vehicle based on the driving feature and the image token sequence, comprises: generating, using the generative model, the predicted token sequence and the predicted control information based on the driving feature, the first feature vector and the image token sequence.

11. The method of claim 7, wherein the driving data comprises a historical driving parameter of the vehicle and historical navigation data of the vehicle; the driving data encoding network comprises a second convolutional network and a multi-layer perceptron; and the encoding, using the driving data encoding network, driving data in the sample perception data to obtain a driving feature corresponding to the driving data comprises: encoding, using the second convolutional network, the historical navigation data to obtain a second feature vector representing the historical navigation data; and encoding, using the multi-layer perceptron, the historical driving parameter to obtain a third feature vector representing the historical driving parameter, wherein the driving feature comprises the second feature vector and the third feature vector.

12. The method of claim 7, wherein the sequence encoding network comprises an encoder and a quantizer; and the encoding, using the sequence encoding network, image data in sample perception data to obtain an image token sequence corresponding to the image data, comprises: encoding, using the encoder, the image data to obtain an encoded feature sequence; and quantizing, using the quantizer, the encoded feature sequence to obtain the image token sequence, wherein the sequence encoding network is configured to process the image data based on a vector quantization compression technology.

13. The method of claim 7, wherein the generating, using the generative model, a predicted token sequence corresponding to the image token sequence and a predicted control information for a vehicle based on the driving feature and the image token sequence comprises: adding a start tag at a head position of the image token sequence and adding a query tag at a tail position of the image token sequence, so as to obtain a tagged token sequence; obtaining an input sequence of the generative model based on the driving feature and the tagged token sequence; and inputting the input sequence into the generative model to obtain the predicte
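
Two of the claimed steps lend themselves to a small illustration: generating a mask image from the target points on a navigation path (claim 4), and wrapping the image token sequence with a start tag and a query tag before combining it with the driving feature (claims 6 and 13). The sketch below is illustrative only and rests on assumptions the claims do not state: target points are taken to be integer (row, column) grid cells, the tag values `<start>` and `<query>` are made up, and the driving feature is simply placed in front of the tagged tokens.

```python
from typing import List, Sequence, Tuple

START_TAG = "<start>"  # hypothetical tag value
QUERY_TAG = "<query>"  # hypothetical tag value


def rasterize_path_mask(
    points: Sequence[Tuple[int, int]], height: int, width: int
) -> List[List[int]]:
    """Draw straight segments between consecutive (row, col) target points.

    Returns a height x width grid of 0/1 values; cells on the path are 1.
    """
    mask = [[0] * width for _ in range(height)]
    for (r0, c0), (r1, c1) in zip(points, points[1:]):
        steps = max(abs(r1 - r0), abs(c1 - c0), 1)
        for i in range(steps + 1):
            r = round(r0 + (r1 - r0) * i / steps)
            c = round(c0 + (c1 - c0) * i / steps)
            if 0 <= r < height and 0 <= c < width:
                mask[r][c] = 1
    return mask


def build_tagged_sequence(image_tokens: Sequence[int]) -> list:
    """Add the start tag at the head and the query tag at the tail."""
    return [START_TAG, *image_tokens, QUERY_TAG]


def build_input_sequence(driving_feature: Sequence, tagged: Sequence) -> list:
    """Assumption: the driving feature is prepended to the tagged tokens."""
    return [*driving_feature, *tagged]


# Three hypothetical target points forming an L-shaped path on a 3 x 4 grid.
mask = rasterize_path_mask([(0, 0), (0, 3), (2, 3)], height=3, width=4)
for row in mask:
    print(row)
# [1, 1, 1, 1]
# [0, 0, 0, 1]
# [0, 0, 0, 1]

tagged = build_tagged_sequence([5, 7])
print(build_input_sequence(["nav", "param"], tagged))
# ['nav', 'param', '<start>', 5, 7, '<query>']
```

In the claimed method the mask image would be passed through the second convolutional network rather than used directly; here the strings "nav" and "param" merely stand in for the resulting feature vectors.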

Assignees

Beijing Baidu Netcom Sci & Tech Co Ltd

Inventors

Classifications

  • G06N3/0464 · Primary

    Convolutional networks [CNN, ConvNet] · CPC title

  • Generative networks · CPC title

  • Auto-encoder networks; Encoder-decoder networks · CPC title

  • using neural networks only · CPC title

  • Image sensing, e.g. optical camera · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025206330A1 cover?
An information prediction method, a method of training an autonomous driving model, a device, a medium, and an autonomous driving vehicle, which relate to a field of artificial intelligence technology, and in particular, to fields of computer vision technology and deep learning technology, which may be applied to scenarios such as autonomous driving. Specific implementation scheme of the inform…
Who is the assignee on this patent?
Beijing Baidu Netcom Sci & Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/0464. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 26, 2025 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).