Systems and Methods for Object Detection and Motion Prediction by Fusing Multiple Sensor Sweeps into a Range View Representation
US-2021278539-A1 · Sep 9, 2021 · US
US12568244B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12568244-B2 |
| Application number | US-202318113748-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 24, 2023 |
| Priority date | Aug 25, 2020 |
| Publication date | Mar 3, 2026 |
| Grant date | Mar 3, 2026 |
Official abstract text for this publication.
A method performed by at least one processor for improving video quality includes: obtaining, from a first frame of a video comprising a plurality of pixels, first motion information regarding a user control object displayed in a display interface, the user control object comprising a first set of pixels included in the plurality of pixels; obtaining, from the first frame of the video by using a first neural network, second motion information from a second set of pixels included in the plurality of pixels, the second set of pixels excluding the first set of pixels; and generating, by using a second neural network with the first motion information and the second motion information as inputs for the second neural network, an interpolation frame between the first frame and a second frame included in the video.
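The split between the two pixel sets in the abstract can be pictured as merging two motion fields under a mask. The following is a minimal sketch under stated assumptions, not the patented implementation: `merge_motion` and `placeholder_rest_motion` are hypothetical names, the user control object is assumed to move with a single (dx, dy) vector, and the placeholder simply returns zero motion where the publication uses a first neural network. The second neural network that produces the interpolation frame is outside the scope of this sketch.

```python
from typing import List, Tuple

Vector = Tuple[int, int]            # (dx, dy) displacement of one pixel
MotionField = List[List[Vector]]    # motion[y][x]
Mask = List[List[bool]]             # True where a pixel belongs to the user control object


def placeholder_rest_motion(height: int, width: int) -> MotionField:
    """Hypothetical stand-in for the first neural network: reports zero motion."""
    return [[(0, 0) for _ in range(width)] for _ in range(height)]


def merge_motion(mask: Mask, control_motion: Vector, rest_motion: MotionField) -> MotionField:
    """Build one motion field from the two sources.

    Pixels inside the mask (first set) take the user control object's motion.
    All other pixels (second set) keep the separately estimated motion.
    """
    return [
        [control_motion if mask[y][x] else rest_motion[y][x] for x in range(len(mask[0]))]
        for y in range(len(mask))
    ]


# Usage: a 2x2 frame whose top-left pixel belongs to the user control object.
mask = [[True, False], [False, False]]
rest = placeholder_rest_motion(2, 2)
merged = merge_motion(mask, (2, 0), rest)
```

In this sketch the merged field is what would be handed, together with the two frames, to whatever interpolation model is used.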
Opening claim text (preview).
What is claimed is:
1. A method performed by an apparatus for improving video quality, the method comprising: obtaining first motion information for a user control object displayed in a first frame of a video, from a mapping table pre-generated and stored in the apparatus; obtaining by using a first neural network, second motion information that is a prediction of motion information of a first set of pixels included in the first frame of the video excluding the user control object; and generating, by using a second neural network, an interpolation frame between the first frame and a second frame included in the video, based on the first frame, the second frame, the first motion information and the second motion information, wherein the mapping table is generated based on a correlation between motion information of pixels included in the user control object and a user input.
2. The method of claim 1, further comprising post processing the first motion information and the second motion information, wherein the post processing comprises at least one of: modifying motion information of a certain pixel included in a plurality of pixels by using motion information of at least one adjacent pixel adjacent to the certain pixel; or obtaining motion information per object by grouping the pixels included in a frame per object.
3. The method of claim 1, further comprising: receiving a third frame before the obtaining the first motion information; and generating the mapping table based on the user input regarding the user control object included in the third frame.
4. The method of claim 3, wherein the generating the mapping table comprises: obtaining motion information regarding all pixels included in the third frame by using the first neural network; detecting at least one object from the third frame by using a third neural network; identifying the user control object controlled according to the user input from the detected at least one object; and generating the mapping table based on a correlation between the motion information of the pixels included in the user control object included in the third frame and the user input regarding the user control object included in the third frame.
5. The method of claim 4, further comprising modifying motion information of pixels included in the detected at least one object, based on an event regarding the detected at least one object, wherein the event comprises at least one of zoom-in, zoom-out, or rotation.
6. The method of claim 4, wherein the detecting the at least one object comprises detecting whether the at least one object is a foreground object or a background object.
7. The method of claim 4, wherein the generating the mapping table comprises: receiving the user input during a predetermined period of time; and updating the mapping table based on the user input during the predetermined period of time.
8. The method of claim 4, wherein the generating of the mapping table comprises: obtaining a parameter change value of a parameter of a controller according to the user input; and mapping the motion information of the pixels included in the user control object according to the parameter change value of the controller, wherein the parameter of the controller comprises at least one of a moving direction, a moving distance, a moving time, a moving speed, a moving acceleration, a moving strength, or a moving amplitude.
9. The method of claim 8, wherein the generating of the mapping table comprises generating the mapping table for each type of the controller that receives the user input.
10. The method of claim 1, further comprising obtaining the first motion information from metadata included in the video.
11. An apparatus for improving video quality, the apparatus comprising: a memory storing one or more instructions; and a processor configured to execute the one or more instructions stored in the memory to: obtain first motion information for a user control object displayed in a first frame of a video from a mapping table pre-generated and stored in the apparatus; obtain by using a first neural network, second motion information that is a prediction of motion information of a first set of pixels included in the first frame of the video excluding the user control object; and generate, by using a second neural network, an interpolation frame between the first frame and a second frame included in the video, based on the first frame, the second frame, the first motion information and the second motion information, wherein the mapping table is generated based on a correlation between motion information of pixels included in the user control object and a user input.
12. The apparatus of claim 11, wherein the processor is further configured to execute the one or more instructions to post process the first motion information and the second motion information, wherein the post process comprises at least one of modifying motion information of a certain pixel included in the plurality of pixels by using motion information of at least one adjacent pixel adjacent to the certain pixel, or obtaining motion information per object by grouping pixels included in a frame per object.
13. The apparatus of claim 11, wherein the processor is further configured to execute the one or more instructions to: receive a third frame before obtaining the first motion information; and generate the mapping table based on the user input regarding the user control object included in the third frame.
14. The apparatus of claim 13, wherein the processor is further configured to execute the one or more instructions to: obtain motion information regarding all pixels included in the third frame by using the first neural network; detect at least one object from the third frame by using a third neural network; identify the user control object controlled according to a user input from the detected at least one object; and generate the mapping table based on a correlation between the motion information of the pixels included in the user control object included in the third frame and the user input regarding the user control object included in the third frame.
15. The apparatus of claim 14, wherein the processor is further configured to execute the one or more instructions to: modify motion information of pixels included in the detected at least one object, based on an event regarding the detected at least one object, wherein the event comprises at least one of zoom-in, zoom-out, or rotation.
16. The apparatus of claim 14, wherein the processor is further configured to execute the one or more instructions to: detect whether the at least one object is a foreground object or a background object.
17. The apparatus of claim 14, wherein the processor is further configured to execute the one or more instructions to: receive the user input during a predetermined period of time; and update the mapping table based on the user input during the predetermined period of time.
18. The apparatus of claim 14, wherein the processor is further configured to execute the one or more instructions to: obtain a parameter change value of a parameter of a controller according to the user input; and map the motion information of the pixels included in the user control object according to the parameter change value of the controller, wherein the parameter of the controller comprises at least one of a moving direction, a moving distance, a moving time, a moving spe
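The mapping table recited in the claims (built from the correlation between user input and the motion of the user control object, updated over time, and kept per controller type) could be sketched as a lookup of running averages. This is an illustration under assumptions, not the claimed implementation: the class name `MappingTable`, the quantization step, the two-component input vector, and the use of a simple mean in place of a correlation analysis are all hypothetical choices made for this example.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple

Vector = Tuple[float, float]
Key = Tuple[str, int, int]  # (controller type, quantized input x, quantized input y)


class MappingTable:
    """Running average of observed object motion per quantized controller input."""

    def __init__(self, step: float = 0.25) -> None:
        self.step = step  # assumed quantization step for the controller input
        self._sum: Dict[Key, Vector] = defaultdict(lambda: (0.0, 0.0))
        self._count: Dict[Key, int] = defaultdict(int)

    def _key(self, controller: str, user_input: Vector) -> Key:
        # Nearby inputs share one table entry after quantization.
        return (
            controller,
            round(user_input[0] / self.step),
            round(user_input[1] / self.step),
        )

    def update(self, controller: str, user_input: Vector, observed_motion: Vector) -> None:
        """Record one (input, motion) observation, e.g. from an earlier frame."""
        key = self._key(controller, user_input)
        sx, sy = self._sum[key]
        self._sum[key] = (sx + observed_motion[0], sy + observed_motion[1])
        self._count[key] += 1

    def lookup(self, controller: str, user_input: Vector) -> Optional[Vector]:
        """Return the mean motion seen for this input, or None if never observed."""
        key = self._key(controller, user_input)
        n = self._count.get(key, 0)
        if n == 0:
            return None
        sx, sy = self._sum[key]
        return (sx / n, sy / n)


# Usage: two observations for the same gamepad input are averaged.
table = MappingTable()
table.update("gamepad", (1.0, 0.0), (4.0, 0.0))
table.update("gamepad", (1.0, 0.0), (6.0, 0.0))
predicted = table.lookup("gamepad", (1.0, 0.0))
```

Keying on the controller type keeps entries for different devices separate, and a lookup that returns `None` signals that some other motion source would be needed for that input.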
Analysis of motion (motion estimation for coding, decoding, compressing or decompressing digital video signals H04N19/43, H04N19/51) · CPC title
using pattern recognition or machine learning (optical pattern recognition or electronic computations therefor G06V10/88) · CPC title
Artificial neural networks [ANN] · CPC title
based on super-resolution, i.e. the output image resolution being higher than the sensor resolution · CPC title
using neural networks · CPC title