What technology area does this patent fall under?

Primary CPC classification H04L41/147. Mapped technology areas include Electricity.

When was this patent published?

Publication date: Mar 3, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Intelligent dynamic network traffic management for global network access terminal

Patent metadata
Field	Value
Publication number	US-12568037-B2
Application number	US-202117529751-A
Country	US
Kind code	B2
Filing date	Nov 18, 2021
Priority date	Nov 18, 2021
Publication date	Mar 3, 2026
Grant date	Mar 3, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure provides a deep reinforcement learning (DRL) based dynamic network traffic management system including a LAN router, a plurality of WAN routers, a network switch, and a GNAT controller configured to measure one or more traffic states of a plurality of data flows, obtain an expected reward at the current time point, obtain the one or more traffic states to input to a DRL model to provide an expected reward of each data flow estimated for a next time point, obtain a target reward at the current time point, adjust parameters of the DRL model, predict a plurality of long-term rewards using the trained DRL model, select one of the plurality of long-term rewards, and adjust the bandwidth assigned to each data flow based on the selected long-term reward.
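The final steps of the abstract (predict several long-term rewards, select one, adjust bandwidth accordingly) can be sketched as a search over candidate bandwidth assignments. This is a minimal illustration under stated assumptions, not the patented implementation: `candidate_assignments`, `select_assignment`, and `toy_predictor` are hypothetical names, bandwidth is split in whole units, and the toy predictor is a hand-written stand-in for a trained DRL model.

```python
from typing import Callable, List, Sequence, Tuple

Assignment = Tuple[int, ...]  # bandwidth units given to each data flow


def candidate_assignments(total_units: int, n_flows: int) -> List[Assignment]:
    """Enumerate every way to split total_units across n_flows (assumed coarse grid)."""
    if n_flows == 1:
        return [(total_units,)]
    out: List[Assignment] = []
    for first in range(total_units + 1):
        for rest in candidate_assignments(total_units - first, n_flows - 1):
            out.append((first,) + rest)
    return out


def select_assignment(
    state: Sequence[float],
    candidates: Sequence[Assignment],
    predict_long_term_reward: Callable[[Sequence[float], Assignment], float],
) -> Assignment:
    """Return the candidate whose predicted long-term reward is largest."""
    return max(candidates, key=lambda a: predict_long_term_reward(state, a))


def toy_predictor(state: Sequence[float], assignment: Assignment) -> float:
    """Hypothetical stand-in for the trained model.

    Here the state is simply each flow's demand in bandwidth units, and the
    predicted reward is the negative squared shortfall summed over flows.
    """
    return -sum(max(d - a, 0) ** 2 for d, a in zip(state, assignment))


# Two flows demanding 3 and 1 units, with 4 units available in total.
best = select_assignment((3, 1), candidate_assignments(4, 2), toy_predictor)
print(best)  # (3, 1)
```

Exhaustive enumeration is only practical for a handful of flows and coarse units; it is used here because it makes the select-the-maximum step explicit.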

First claim

Opening claim text (preview).

What is claimed is:

1. A deep reinforcement learning (DRL) based dynamic network traffic management (DNTM) system comprising:
    a local area network (LAN) router;
    a plurality of wireless area network (WAN) routers;
    a network switch; and
    a global network access terminal (GNAT) controller, configured to:
        measure one or more traffic states of a plurality of data flows at a current time point;
        obtain an expected reward at the current time point;
        input the one or more traffic states to a DRL model to provide an expected reward of each data flow estimated for a next time point;
        obtain a target reward at the current time point using the expected reward at the next time point;
        adjust parameters of the DRL model by minimizing a difference between the expected reward at the current time point and the target reward at the current time point to obtain a trained DRL model;
        predict a plurality of long-term rewards using the trained DRL model with different bandwidth assignments, a long-term reward representing a total contribution of bandwidth assigned to each data flow in the one or more traffic states in a future;
        select a maximum long-term reward from the plurality of long-term rewards; and
        adjust the bandwidth assigned to each data flow based on the selected long-term reward.

2. The system according to claim 1, wherein the GNAT controller is further configured to measure the one or more traffic states of the plurality of data flows periodically.

3. The system according to claim 1, wherein the DRL model includes a deep neural network (DNN) to provide an expected reward of each data flow estimated for the next time point.

4. The system according to claim 3, wherein parameters of the DNN are adjusted by minimizing the difference between the expected reward at the current time point and the target reward at the current time point.

5. The system according to claim 1, wherein the traffic state of each data flow includes traffic delay and data rate information.

6. The system according to claim 5, wherein the expected reward of each data flow is defined as:

$$R_t^j = -\xi\left(\max_{1 \le i \le N}\{S_t[i,j,1]\} - D[j]\right) + D[j] + (1-\xi)\left(\sum_{i=1}^{N} S_t[i,j,2] - C[j]\right) - C[j]$$

where $R_t^j$ represents the expected reward evaluated based on the traffic state $S_t$, $S_t[i,j,1]$ represents an average traffic delay of data flow $j$ on soft flow $i$ from time point $t-1$ to $t$, $S_t[i,j,2]$ represents an average data rate of data flow $j$ on soft flow $i$ from time point $t-1$ to $t$, $D[j]$ represents a packet delay required by data flow $j$, $C[j]$ represents a data rate required by data flow $j$, $\xi \in (0,1)$ indicates a relative importance between the packet delay required by data flow $j$ and the data rate required data flow $j$.

7. The system according to claim 1, wherein the GNAT controller is further configured to update the target reward at the current time point by:

$$\hat{Q}(S_t, A_t) \leftarrow R_{t+1} + \gamma\,\hat{Q}(S_{t+1}, A_{t+1})$$

where $\hat{Q}(S_t, A_t)$ represents the target reward at time point $t$, $\hat{Q}(S_{t+1}, A_{t+1})$ represents the target reward at time point $t+1$, $R_{t+1}$ represent the expected reward at time point $t+1$, $\gamma$ is a coefficient.

8. The system according to claim 1, wherein the GNAT controller is configured to adjust the bandwidth assigned to each data flow by controlling a transmission rate.

9. A deep reinforcement learning (DRL) based dynamic network traffic management (DNTM) method for communication between a local area network (LAN) router and a plurality of wireless area router (WAN) routers, comprising: measuring one or more traffic states of a plurality of data flows at a current time; obtaining an expected reward at the current time point; obtaining the one or more traffic states from a global network access terminal (GNAT) router to input to a DRL model to provide an expected reward of each
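The target update in claim 7 and the parameter adjustment in claims 1 and 4 can be illustrated with a small sketch. It makes several assumptions that are not in the patent text: a linear model (the hypothetical `LinearQ`) stands in for the deep neural network, the "difference" being minimized is taken to be a squared error, and the feature vector, learning rate, and reward values are invented for illustration.

```python
from typing import List, Sequence


def td_target(reward_next: float, q_next: float, gamma: float) -> float:
    """Target reward per claim 7: R_{t+1} + gamma * Q(S_{t+1}, A_{t+1})."""
    return reward_next + gamma * q_next


class LinearQ:
    """Hypothetical linear stand-in for the DNN that estimates Q(S_t, A_t)."""

    def __init__(self, n_features: int, lr: float = 0.1) -> None:
        self.w: List[float] = [0.0] * n_features
        self.lr = lr

    def predict(self, features: Sequence[float]) -> float:
        return sum(w * x for w, x in zip(self.w, features))

    def update(self, features: Sequence[float], target: float) -> float:
        """One gradient step on 0.5 * (prediction - target)^2; returns the error before the step."""
        error = self.predict(features) - target
        # d/dw of 0.5 * error^2 is error * features
        self.w = [w - self.lr * error * x for w, x in zip(self.w, features)]
        return error


# Made-up encoding of one (state, action) pair and one observed transition.
features = [1.0, 0.5]
model = LinearQ(n_features=2)
target = td_target(reward_next=-0.2, q_next=0.0, gamma=0.9)

for _ in range(100):
    model.update(features, target)

print(abs(model.predict(features) - target) < 1e-3)  # True
```

In a full training loop the target would be recomputed at each step from fresh measurements, since `q_next` itself depends on the model being trained; it is held fixed here only to keep the example short.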

Assignees

Intelligent Fusion Tech Inc

Classifications

H04L41/147 (Primary): for predicting network behaviour
H04L41/16: using machine learning or artificial intelligence
H04L45/08 (Primary): Learning-based routing, e.g. using neural networks or artificial intelligence

Patent family

Related publications grouped by family.

View patent family 98370906

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12568037B2 cover?: The present disclosure provides a deep reinforcement learning (DRL) based dynamic network traffic management system including a LAN router, a plurality of WAN routers, a network switch, and a GNAT controller configured to measure one or more traffic states of a plurality of data flows, obtain an expected reward at the current time point, obtain the one or more traffic states to input to a DRL m…
Who is the assignee on this patent?: Intelligent Fusion Tech Inc