Method of performing beam training based on reinforcement learning and wireless communication device performing the same

US11546033B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11546033-B2
Application number: US-202117539759-A
Country: US
Kind code: B2
Filing date: Dec 1, 2021
Priority date: Apr 30, 2021
Publication date: Jan 3, 2023
Grant date: Jan 3, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of performing beam training including obtaining at least one of a probability distribution and a value function for selecting one of a plurality of beams that are used to perform beamforming, selecting a candidate beam from among the plurality of beams based on the at least one of the probability distribution and the value function, the candidate beam being expected to be a best beam among the plurality of beams, performing a present training operation based on the candidate beam and a previous beam selected by at least one previous training operation, and selecting a better one of the candidate beam and the previous beam as a present beam based on a result of the present training operation may be provided.
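
The round described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the probability-distribution variant, and the names `training_round`, `probs`, and `measure` are hypothetical. `measure` stands in for whatever link-quality measurement the device makes (for example, received signal power).

```python
import random


def training_round(probs, previous_beam, measure, rng=random):
    """Run one beam-training round and return (candidate, present_beam).

    probs         -- selection probability for each beam index (assumed input)
    previous_beam -- beam index kept from the previous round
    measure       -- hypothetical callable: beam index -> link quality
    rng           -- random module or random.Random instance
    """
    beams = range(len(probs))
    # Sample the candidate beam from the probability distribution.
    candidate = rng.choices(beams, weights=probs, k=1)[0]
    # Only two beams are measured: the candidate and the previous beam.
    candidate_quality = measure(candidate)
    previous_quality = measure(previous_beam)
    # Keep whichever of the two beams performed better.
    if candidate_quality > previous_quality:
        present_beam = candidate
    else:
        present_beam = previous_beam
    return candidate, present_beam
```

Each round measures two beams rather than sweeping all of them, which is why the quality of the candidate selection matters.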

First claim

Opening claim text (preview).

What is claimed is:

1. A method of performing beam training, the method comprising: obtaining at least one of a probability distribution and a value function for selecting one of a plurality of beams that are used to perform beamforming; selecting a candidate beam from among the plurality of beams based on the at least one of the probability distribution and the value function, the candidate beam being expected to be a best beam among the plurality of beams; performing a present training operation based on the candidate beam and a previous beam selected by at least one previous training operation; and selecting a better one of the candidate beam and the previous beam as a present beam based on a result of the present training operation.

2. The method of claim 1, further comprising: determining a policy of selecting the candidate beam in the present training operation based on an action of selecting at least one previous candidate beam in the at least one previous training operation and a reward corresponding to a result of the at least one previous training operation.

3. The method of claim 1, further comprising: determining a policy of selecting the candidate beam using an adversarial bandit model based on an exponential-weight algorithm for exploration and exploitation (EXP3), wherein the selecting selects the candidate beam based on the probability distribution.

4. The method of claim 3, wherein the probability distribution is defined by Equation 1 as follows:

$$p_k(t) = (1-\gamma)\,\frac{\exp\left(\rho\,\hat{S}_k(t)\right)}{\sum_{j=1}^{K}\exp\left(\rho\,\hat{S}_j(t)\right)} + \frac{\gamma}{K} \qquad [\text{Equation 1}]$$

where $p_k(t)$ denotes a probability distribution of a k-th beam among the plurality of beams, k denotes an integer greater than or equal to one and less than or equal to K, K denotes a number of the plurality of beams, $\hat{S}_k(t) = \sum_{t=1}^{T}\hat{X}_k(t)$ denotes an estimated value of a cumulative reward of the k-th beam up to t rounds, $\gamma$ denotes a parameter used to adjust a ratio between the exploration and the exploitation, and $\rho > 0$ denotes a training rate.

5. The method of claim 3, further comprising: updating the probability distribution.

6. The method of claim 5, wherein the updating includes: updating a first reward of the present beam; updating second rewards of neighboring beams adjacent to the present beam; and updating a cumulative reward based on the updated first reward and the updated second rewards.

7. The method of claim 6, wherein the first reward and the second rewards are obtained based on Equation 2 and Equation 3, respectively, as follows: X̂_k(t) = { α p_k(t) …
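
As a rough illustration of Equation 1 in claim 4, the sketch below computes the EXP3 selection probabilities from the estimated cumulative rewards. The function name `exp3_probabilities` and its argument names are assumptions made for this example. Subtracting the maximum reward before exponentiating is a numerical-stability step added here; it cancels in the ratio and does not change the result. The reward updates of claims 5 to 7 are not covered, because Equations 2 and 3 are cut off in this preview.

```python
import math


def exp3_probabilities(cum_rewards, gamma, rho):
    """Return p_k(t) for every beam k according to Equation 1.

    cum_rewards -- estimated cumulative reward S_hat_k(t) per beam
    gamma       -- exploration/exploitation ratio parameter
    rho         -- training rate, must be > 0
    """
    num_beams = len(cum_rewards)
    # Shift by the maximum so exp() cannot overflow; the shift cancels
    # between numerator and denominator (added for stability only).
    shift = max(cum_rewards)
    weights = [math.exp(rho * (s - shift)) for s in cum_rewards]
    total = sum(weights)
    # Exploitation term (softmax over rewards) plus uniform exploration term.
    return [(1 - gamma) * w / total + gamma / num_beams for w in weights]
```

With all cumulative rewards equal, every beam gets probability 1/K. As one beam accumulates more reward its probability grows, while the γ/K term keeps every beam's probability at or above γ/K so exploration never stops.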

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

  • H04B7/0617 (Primary)

    for beam forming · CPC title

  • using beam selection · CPC title

  • Antenna weights or vector/matrix coefficients · CPC title

  • using beam selection · CPC title

  • Selecting one or more beams from a plurality of beams, e.g. beam training, management or sweeping · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11546033B2 cover?
A method of performing beam training including obtaining at least one of a probability distribution and a value function for selecting one of a plurality of beams that are used to perform beamforming, selecting a candidate beam from among the plurality of beams based on the at least one of the probability distribution and the value function, the candidate beam being expected to be a best beam a…
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification H04B7/0617. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Jan 3, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).