Robot motion planning
US-2022147058-A1 · May 12, 2022 · US
US12539604B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12539604-B2 |
| Application number | US-202117561132-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 23, 2021 |
| Priority date | Dec 23, 2021 |
| Publication date | Feb 3, 2026 |
| Grant date | Feb 3, 2026 |
Official abstract text for this publication.
Techniques are disclosed for the exploration of environments for the estimation and detection of hazards or near hazards within the environment and the mitigation of hazards therein. The exploration of the environment and mitigation of hazards therein may use one or more autonomous agents, including a hazard response robot. The estimation of the hazards may use a policy learning engine, and the hazards may be detected, and the associated risks therefrom, may be determined using a hazard estimation system.
Opening claim text (preview).
The invention claimed is:

1. A controller for a hazard exploration system, comprising: a communication interface configured to communicate with one or more autonomous agents; and processing circuitry configured to: control the one or more autonomous agents to explore an environment; determine one or more hazards within the environment based on feedback data from the one or more autonomous agents; generate an exploration policy for the one or more autonomous agent based on the feedback data; interpret and fuse the feedback data from the one or more autonomous agents; calculate a global reward based on the fused feedback data; update a global policy map based on the global reward; segment the global policy map to generate an individual policy map for each of the one or more autonomous agents; and generate a policy update for the exploration policy based on the individual policy maps.
2. The controller of claim 1, wherein feedback data from an autonomous agent of the one or more autonomous agents corresponds to an individual reward for the autonomous agent.
3. The controller of claim 1, wherein the processing circuitry comprises a policy learning engine that is a reinforcement learning engine configured to perform reinforced learning.
4. The controller of claim 1, wherein the one or more hazards are determined based on the global reward exceeding a reward threshold.
5. The controller of claim 1, wherein the processing circuitry is configured to reset the exploration policy in response to the one or more hazards being determined.
6. The controller of claim 1, wherein corresponding states of the one or more autonomous agents are reset in response to the one or more hazards being determined.
7. The controller of claim 1, wherein the one or more autonomous agents are configured to perform actions within the environment and detect corresponding impacts of the performed actions, the feedback data generated by the one or more autonomous agents including the performed actions and the detected impacts.
8. The controller of claim 7, wherein the performed actions comprise random actions.
9. The controller of claim 7, wherein the one or more hazards are determined based on the performed actions and detected impacts.
10. The controller of claim 1, wherein the feedback data includes physical integrity information for the one or more autonomous agents, whether movement of the one or more autonomous agents is restricted or impaired, and/or a perceived potential accident or threat.
11. The controller of claim 1, wherein the controller is an Edge computer.
12. A non-transitory computer-readable storage medium with an executable program stored thereon, that when executed, instructs a processor to: control the one or more autonomous agents to explore an environment; determine one or more hazards within the environment based on feedback data from the one or more autonomous agents; generate an exploration policy for the one or more autonomous agent based on the feedback data; interpret and fuse the feedback data from the one or more autonomous agents; calculate a global reward based on the fused feedback data; update a global policy map based on the global reward; segment the global policy map to generate an individual policy map for each of the one or more autonomous agents; and generate a policy update for the exploration policy based on the individual policy maps.
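The processing steps recited in claims 1, 2, and 4 can be read as a loop: fuse feedback, compute a global reward, update a global policy map, segment it per agent, and derive a policy update. The Python sketch below is only one possible reading, not the patented implementation. Every name in it (`Feedback`, `fuse`, `controller_step`, the grid-cell keys, the averaging rule, the `rate` and `reward_threshold` values, the round-robin segmentation) is a hypothetical choice made for illustration; the claims do not specify any of them.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    agent_id: str
    cell: tuple               # hypothetical: grid cell where the agent acted
    individual_reward: float  # claim 2: feedback corresponds to an individual reward


def fuse(feedback):
    """Fuse feedback by averaging the rewards reported for each cell (assumed rule)."""
    by_cell = {}
    for fb in feedback:
        by_cell.setdefault(fb.cell, []).append(fb.individual_reward)
    return {cell: sum(r) / len(r) for cell, r in by_cell.items()}


def global_reward(fused):
    """Assumed aggregate: the largest fused reward over all reported cells."""
    return max(fused.values(), default=0.0)


def update_global_policy_map(policy_map, fused, g, rate=0.5):
    """Move the value of every reported cell toward the global reward g."""
    updated = dict(policy_map)
    for cell in fused:
        old = updated.get(cell, 0.0)
        updated[cell] = old + rate * (g - old)
    return updated


def segment(policy_map, agent_ids):
    """Split the global map into one map per agent, round-robin over sorted cells."""
    parts = {a: {} for a in agent_ids}
    for i, cell in enumerate(sorted(policy_map)):
        parts[agent_ids[i % len(agent_ids)]][cell] = policy_map[cell]
    return parts


def policy_update(individual_maps):
    """Assumed update: each agent targets the highest-valued cell in its own map."""
    return {a: (max(m, key=m.get) if m else None) for a, m in individual_maps.items()}


def controller_step(policy_map, feedback, agent_ids, reward_threshold=1.0):
    """One pass over the claim 1 steps; agent_ids must hold at least one agent."""
    fused = fuse(feedback)
    g = global_reward(fused)
    policy_map = update_global_policy_map(policy_map, fused, g)
    individual_maps = segment(policy_map, agent_ids)
    targets = policy_update(individual_maps)
    hazard = g > reward_threshold  # claim 4: hazard when the reward exceeds a threshold
    return policy_map, individual_maps, targets, hazard


if __name__ == "__main__":
    reports = [
        Feedback("a1", (0, 0), 0.2),
        Feedback("a2", (0, 0), 0.4),
        Feedback("a2", (1, 0), 2.0),
    ]
    print(controller_step({}, reports, ["a1", "a2"]))
```

A caller that wanted to follow claims 5 and 6 could, under the same assumptions, discard the exploration policy and reset agent state whenever `hazard` is true; that step is left out here because the claims do not say what a reset involves.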
Dual arm manipulator; Coordination of several manipulators · CPC title
Hardware, e.g. neural networks, fuzzy logic, interfaces, processor · CPC title
Fire-fighting land vehicles · CPC title
learning, adaptive, model based, rule based expert control · CPC title
based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour · CPC title