Configurable Power Saving Signal with Multiple Functionalities in 5G NR
US-2024414647-A1 · Dec 12, 2024 · US
US2024406861A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2024406861-A1 |
| Application number | US-202418609797-A |
| Country | US |
| Kind code | A1 |
| Filing date | Mar 19, 2024 |
| Priority date | May 31, 2023 |
| Publication date | Dec 5, 2024 |
| Grant date | — |
Official abstract text for this publication.
The present disclosure provides methods, apparatuses, systems, and computer-readable mediums for operating a target base station by an apparatus. A method includes collecting a plurality of trajectories corresponding to the target base station and a plurality of source base stations, clustering, using an unsupervised reinforcement learning model, the plurality of trajectories into a plurality of clusters including a target cluster, selecting, as a target trajectory, a selected trajectory from the target cluster that maximizes an energy-saving parameter of the target base station, and applying, to the target base station, an energy-saving control policy corresponding to the target trajectory. The target cluster corresponds to the target base station and at least one source base station from among the plurality of source base stations.
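The cluster-then-select flow in the abstract can be illustrated with a small sketch. This is not the disclosed method: plain k-means stands in for the "unsupervised reinforcement learning model", and the `Trajectory` fields (`features`, `energy_saving`), the station names, and all numbers are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Trajectory:
    station: str          # base station the trajectory was collected from
    features: tuple       # assumed summary, e.g. (mean load ratio, mean throughput)
    energy_saving: float  # assumed energy-saving score observed for this trajectory


def kmeans(points, k, iters=20):
    """Plain k-means stand-in; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def select_target_trajectory(trajectories, target_station, k=2):
    labels = kmeans([t.features for t in trajectories], k)
    # The target cluster is the one that contains the target station.
    target_label = next(l for t, l in zip(trajectories, labels)
                        if t.station == target_station)
    cluster = [t for t, l in zip(trajectories, labels) if l == target_label]
    # Pick the trajectory in the target cluster with the best energy-saving score.
    return max(cluster, key=lambda t: t.energy_saving)


trajectories = [
    Trajectory("target", (0.20, 0.30), 0.10),
    Trajectory("source_a", (0.25, 0.35), 0.40),
    Trajectory("source_b", (0.90, 0.80), 0.90),
    Trajectory("source_c", (0.85, 0.75), 0.70),
]
best = select_target_trajectory(trajectories, "target")
print(best.station)  # source_a
```

In this toy data, `source_b` has the highest score overall but sits in a different cluster, so the policy borrowed for the target comes from `source_a`, the best trajectory among stations that behave like the target.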
Opening claim text (preview).
What is claimed is:

1. A method for operating a target base station, by an apparatus, the method comprising: collecting a plurality of trajectories corresponding to the target base station and a plurality of source base stations; clustering, using an unsupervised reinforcement learning model, the plurality of trajectories into a plurality of clusters comprising a target cluster, the target cluster corresponding to the target base station and at least one source base station from among the plurality of source base stations; selecting, as a target trajectory, a selected trajectory from the target cluster that maximizes an energy-saving parameter of the target base station; and applying, to the target base station, an energy-saving control policy corresponding to the target trajectory.

2. The method of claim 1, further comprising: monitoring one or more energy-saving parameters of the target base station; and adjusting the energy-saving control policy applied to the target base station based on the one or more energy-saving parameters.

3. The method of claim 2, wherein the adjusting of the energy-saving control policy comprises: determining, based on the monitoring, that at least one of the one or more energy-saving parameters of the target base station is outside of a predetermined range of values; and adjusting the energy-saving control policy to cause the at least one of the one or more energy-saving parameters to be within the predetermined range of values.

4. The method of claim 1, further comprising: generating, using a base reinforcement learning model, a plurality of source control policies corresponding to the plurality of source base stations.

5. The method of claim 4, wherein the collecting of the plurality of trajectories comprises collecting a plurality of source base station trajectories corresponding to the plurality of source base stations, based on the plurality of source control policies, and wherein the applying of the energy-saving control policy comprises selecting the energy-saving control policy from among a control policy of the target base station and the plurality of source control policies.

6. The method of claim 1, further comprising: formulating the plurality of trajectories based on a Markov Decision Process (MDP), wherein each trajectory of the plurality of trajectories comprises a state space, an action space, a reward function, and a state transition probability function.

7. The method of claim 6, wherein the state space indicates at least one of a number of connected active devices per cell, a cell load ratio, and a throughput per cell, wherein the action space comprises at least one of activation thresholds and deactivation thresholds, wherein the reward function indicates a reward based on at least one of a power consumption and a minimum throughput, and wherein the state transition probability function indicates a probability of an action from the action space at a state of the state space.

8. The method of claim 1, wherein the selecting of the target trajectory comprises: performing iterative testing of respective control policies of each trajectory of the target cluster; determining, for each trajectory of the target cluster, an accumulated reward; and selecting, as the target trajectory, a trajectory of the target cluster that maximizes the accumulated reward.

9. The method of claim 8, wherein the performing of the iterative testing comprises performing testing of the respective control policies of each trajectory of the target cluster for a predetermined number of iterations.

10. An apparatus for operating a target base station, the apparatus comprising: a memory storing instructions; and one or more processors communicatively coupled to the memory; wherein the one or more processors are configured to execute the instructions to: collect a plurality of trajectories corresponding to the target base station and a plurality of source base stations; cluster, using an unsupervised reinforcement learning model, the plurality of trajectories into a plurality of clusters comprising a target cluster, the target cluster corresponding to the target base station and at least one source base station from among the plurality of source base stations; select, as a target trajectory, a selected trajectory from the target cluster that maximizes an energy-saving parameter of the target base station; and apply, to the target base station, an energy-saving control policy corresponding to the target trajectory.

11. The apparatus of claim 10, wherein the one or more processors are further configured to execute further instructions to: monitor one or more energy-saving parameters of the target base station; and adjust the energy-saving control policy applied to the target base station based on the one or more energy-saving parameters.

12. The apparatus of claim 11, wherein the one or more processors are further configured to execute further instructions to: determine, based on the monitoring, that at least one of the one or more energy-saving parameters of the target base station is outside of a predetermined range of values; and adjust the energy-saving control policy to cause the at least one of the one or more energy-saving parameters to be within the predetermined range of values.

13. The apparatus of claim 10, wherein the one or more processors are further configured to execute further instructions to: generate, using a base reinforcement learning model, a plurality of source control policies corresponding to the plurality of source base stations.

14. The apparatus of claim 13, wherein the one or more processors are further configured to execute further instructions to: collect a plurality of source base station trajectories corresponding to the plurality of source base stations, based on the plurality of source control policies; and select the energy-saving control policy from among a control policy of the target base station and the plurality of source control policies.

15. The apparatus of claim 10, wherein the one or more processors are further configured to execute further instructions to: formulate the plurality of trajectories based on a Markov Decision Process (MDP), wherein each trajectory of the plurality of trajectories comprises a state space, an action space, a reward function, and a state transition probability function.

16. The apparatus of claim 15, wherein the state space indicates at least one of a number of connected active devices per cell, a cell load ratio, and a throughput per cell, wherein the action space comprises at least one of activation thresholds and deactivation thresholds, wherein the reward function indicates a reward based on at least one of a power consumption and a minimum throughput, and wherein the state transition probability function indicates a probability of an action from the action space at a state of the state space.

17. The apparatus of claim 10, wherein the one or more processors are further configured to execute further instructions to: perform iterative testing of respective control policies of each trajectory of the target cluster; determine, for each trajectory of the target cluster, an accumulated reward; and select, as the target trajectory, the selected trajectory from the target cluster that maximizes the accumulated reward.

18. The apparatus of claim 17, wherein the one or more processors are further configured to execute further instructions to: pe
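The MDP elements of claims 6 to 9 (load-ratio state, activation and deactivation thresholds as the action, a reward built from power consumption and a minimum throughput, and iterative testing that accumulates reward) can be sketched on a toy model. Everything numeric here is an assumption for illustration, not taken from the publication: the power figures, the reduced capacity when a cell is off, the penalty weight, the load trace, and the three `ThresholdPolicy` candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdPolicy:
    name: str
    activate_above: float    # switch the cell on when load ratio exceeds this
    deactivate_below: float  # switch the cell off when load ratio drops below this


def reward(power, throughput, min_throughput, penalty=100.0):
    """Assumed reward: lower power is better; missing the throughput floor is penalized."""
    r = -power
    if throughput < min_throughput:
        r -= penalty
    return r


def accumulated_reward(policy, loads, iterations, min_throughput=0.5):
    """Replay a load trace for a predetermined number of iterations."""
    cell_on = True
    total = 0.0
    for i in range(iterations):
        load = loads[i % len(loads)]  # state: cell load ratio
        if load > policy.activate_above:
            cell_on = True
        elif load < policy.deactivate_below:
            cell_on = False
        power = 100.0 if cell_on else 40.0           # toy power model (assumed)
        served = load if cell_on else min(load, 0.4)  # toy capacity cap (assumed)
        required = min(load, min_throughput)          # floor cannot exceed demand
        total += reward(power, served, required)
    return total


def select_policy(policies, loads, iterations):
    """Test every candidate policy and keep the one with the highest accumulated reward."""
    scores = {p.name: accumulated_reward(p, loads, iterations) for p in policies}
    return max(policies, key=lambda p: scores[p.name]), scores


loads = [0.2, 0.3, 0.7, 0.9, 0.3, 0.1]
policies = [
    ThresholdPolicy("always_on", 0.0, 0.0),
    ThresholdPolicy("balanced", 0.5, 0.35),
    ThresholdPolicy("aggressive", 0.8, 0.75),
]
best, scores = select_policy(policies, loads, iterations=6)
print(best.name, scores)
```

With these assumed numbers, the "balanced" thresholds win: "always_on" never saves power, while "aggressive" keeps the cell off at a load of 0.7 and pays the minimum-throughput penalty.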
using machine learning or artificial intelligence · CPC title
in access points, e.g. base stations · CPC title
in wireless communication networks · CPC title