Mohammed SEID, Stephen ANOKYE, and SUN Guolin
(1. University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China; 2. Dilla University, Dilla, Ethiopia; 3. University of Mines and Technology, Tarkwa 237, Ghana)
Abstract: The emerging unmanned aerial vehicle (UAV) technology and its applications have become part of the massive Internet of Things (mIoT) ecosystem for future cellular networks. Internet of Things (IoT) devices have limited computation capacity and battery life, and the cloud is not suitable for offloading IoT tasks because of the distance, latency, and high energy consumption involved. Mobile edge computing (MEC) and the fog radio access network (F-RAN), together with machine learning algorithms, are an emerging approach to solving such complex network problems. In this paper, we suggest a new orientation with a UAV-enabled F-RAN architecture. This architecture adopts a decentralized deep reinforcement learning (DRL) algorithm for edge IoT devices, which make independent decisions on computation offloading, resource allocation, and association in the aerial-to-ground (A2G) network. Additionally, we summarize the works on machine learning approaches for UAV networks and MEC networks that are related to the suggested architecture, and discuss some technical challenges in smart UAV-IoT, F-RAN 5G, and beyond 5G (6G).
Keywords: unmanned aerial vehicle; machine learning; F-RAN; edge computing
In the recent past, cellular technologies have become more dynamic and have improved the network infrastructure to the satisfaction of end users. A large number of ultra-dense heterogeneous devices belonging to individuals and organizations continuously generate and store huge amounts of data via sensors (edge Internet of Things (IoT) devices) and applications [1]. As massive Internet of Things (mIoT) devices emerge, the data generated by various sensors will increase exponentially. Because of the huge volume of data produced and the different forms of conventional databases (with structured and unstructured data), big data analysis has attracted much attention in recent years, and many organizations have focused on analyzing collected data to extract useful information for making appropriate decisions [2]. The data generated by billions of heterogeneous IoT sensors are sent to the cloud for processing, at a high cost in processing delay and energy consumption. However, some IoT sensor data need to be processed faster than the current processing capability of clouds allows. To solve this problem, fog and edge computing (FEC) has been proposed to enable computing tasks to be processed at the network edge of the IoT [3]–[5]. Edge computing is an emerging paradigm for solving IoT computation and resource allocation problems in a localized manner [5]. Fog computing is a decentralized computing paradigm in which a number of smart devices with computational capacity are utilized [6], [7]. For this paradigm, key issues concerning the requirements and deployment of a fog connectivity environment have been discussed, owing to the existence of ultra-dense heterogeneous devices. Several technical issues in fog computing, such as deployment, simulation, resource management, fault tolerance, and services, have been studied in [6], [8]–[13]. Even though fog computing and edge computing both move computation and storage to the edge of the network, closer to end nodes, their paradigms are not identical [14].
The rapid development of diverse mIoT devices such as wireless sensors, smart machines, and mobile users' applications enables users to enjoy a high quality of experience (QoE) and a high quality of service (QoS) [5], [15], [16]. However, most of these applications are delay-sensitive or real-time applications that need high computational capacity. Edge devices cannot compute every task because of their limited battery and low computation capability, so it is difficult for them to run these applications [17]. FEC can compute the tasks of IoT devices and interwork with the cloud server to provide better QoS and QoE to end users. Some works have addressed computation offloading to mobile edge computing (MEC) servers and resource allocation for IoT devices to maximize network performance in ultra-dense heterogeneous networks [18]–[20]. For ultra-dense IoT network systems, a game-theoretic computation offloading framework was designed in [21] and [22] to minimize the overall computation overhead of tasks on edge IoT devices.
A radio access network (RAN) provides connectivity to wireless terminals through wireless access points (base stations) and may use one or more radio access technologies (RATs). The fog radio access network (F-RAN) is composed of F-RAN nodes connected through a single RAT or multiple RATs. Compared with the cloud radio access network (CRAN) and the heterogeneous cloud radio access network (H-CRAN), the F-RAN has the distinctive advantage of maximizing the use of the network's edge IoT devices and improving network management and optimization mechanisms [5], [23], [24].
According to a report from the Federal Aviation Administration (FAA) [25], the fleet of drones will more than double from an estimated 1.1 million vehicles in 2017 to 2.4 million units by 2022. Because connecting unmanned aerial vehicles (UAVs) to cellular networks gives better control and communications, the growth of the UAV market is expected to bring promising new business opportunities for cellular operators. Millions of UAVs have been used to perform various services such as public protection, disaster relief operations, surveillance applications, traffic management, commercial services, extending cellular network coverage to remote areas, and acting as flying base stations [26], [27]. The Third Generation Partnership Project (3GPP) is exploring the challenges and opportunities of serving UAVs as a new type of user equipment (UE), called aerial UE. UAVs can facilitate the development of IoT ecosystems for mIoT applications [28]. UAVs are expected to be the future of IoT because, to begin with, they can efficiently replace stationary connected sensors with one device that is deployable to different locations, capable of carrying flexible payloads, re-programmable in mission, able to measure anything from anywhere, easily deployed, and cost effective. In recent years, a number of works have addressed either UAV networks or their integration with cellular networks. Those works focused on computation offloading, maximization of energy efficiency, optimization of UAV trajectory and path planning, throughput maximization of UEs in UAV networks, and terrestrial heterogeneous devices.
The authors of [29] summarized the journey of machine learning over the last thirty years and the roles machine learning plays in the next-generation wireless network (NGWN), both as a road toward achieving the ambitious goals of NGWN and as a tool for managing network complexity. The authors of [30] emphasized the role of diverse machine learning algorithms in key networking issues across different network technologies. Machine learning techniques are applied to fundamental problems in networking, including routing and classification, traffic prediction, congestion control, QoS and QoE management, resource and fault management, and network security. In [31], the authors studied advanced machine learning applications in wireless communication for mobility management in the network layer, resource management in the MAC layer, and networking and localization in the application layer. The paper [32] discussed future cellular or wireless networks that support ultra-reliable and low-latency communications, as well as intelligent management of mIoT devices in a dynamic environment. Deep reinforcement learning (DRL) approaches for cellular networks, next-generation wireless networks, and self-organizing cellular networks were reviewed in [29]–[34]. Recently, DRL has become one of the most popular machine learning algorithms for edge computing resource management and a suitable optimization technique for radio access networks. DRL has also been used as an emerging tool for effectively solving various problems and challenges in modern networks that are more decentralized, ad hoc, and autonomous in nature, such as heterogeneous networks (HetNets), IoT, vehicle-to-vehicle (V2V) systems, machine-to-machine (M2M) systems, vehicle-to-everything (V2X) systems, self-organizing cellular networks, and UAV networks [31].
Different non-deterministic polynomial-time hard (NP-hard) problems of UAV networks and UAV-connected cellular networks have been optimized with traditional optimization techniques [35]–[38], [40]–[43]. However, traditional optimization techniques are difficult to apply to complex network infrastructure and are not suitable for current and future intelligent wireless networks. Recently, machine learning algorithms have been used to optimize different problems in UAV networks and UAV-connected cellular networks [63]–[66], [68], [69]. However, there are still challenges in using machine learning algorithms for UAV networks that assist the mIoT, public safety communication (PSC), and edge computing.
The main contributions of this work are summarized as follows:
· We suggest a new orientation with a UAV-enabled F-RAN architecture. This architecture adopts a decentralized DRL algorithm for edge IoT devices, which enables them to make decisions independently on offloading, resource allocation, and association in the A2G network.
· We summarize the works on machine learning approaches for UAV networks and MEC networks that are related to the suggested architecture.
· We discuss some technical challenges in smart UAV-IoT, F-RAN 5G, and B5G (6G).
The rest of the paper is organized as follows. We provide a brief overview of UAVs in wireless cellular networks and of the use of UAVs for emergency situations and computation offloading in Section 2. In Section 3, we review machine learning and its classification. In Section 4, we present our orientation with UAV-enabled F-RAN in MEC, which adopts the machine learning algorithm. In Section 5, we present the works on computation offloading and resource allocation using DRL in MEC and UAV networks. In Section 6, we discuss technical challenges and future research directions of intelligent UAV-enabled F-RAN at the edge level. We conclude the paper in Section 7.
Currently, the use of flying UAV platforms is popular; this rapidly growing technology has attractive attributes such as mobility, flexibility, and adaptive altitude, and has key potential applications in wireless systems. UAVs can be used as aerial base stations (ABSs) to enhance the coverage, capacity, reliability, and energy efficiency of wireless networks, as well as flying mobile terminals in cellular network infrastructure. UAVs can be connected to cellular networks as new user equipment and help increase the revenues of network operators.
The authors of [35] summarized the current state of UAVs in cellular communication systems from different points of view. Different types of UAVs with different characteristics are available. A number of industry-led initiatives depend on cellular communication standards that support low-altitude UAVs to enable control beyond line of sight (LoS) and to establish reliable communication. Deploying flying UAV base stations is better than deploying ground base stations in terms of reducing cost and minimizing base station electronics. The deployment of ABSs faces practical challenges such as placement and mobility, but UAV flying base stations can be easily deployed at optimum locations in 3D space; compared with existing terrestrial solutions, they can potentially provide much better performance in parameters such as coverage, load balancing, spectral efficiency, and user experience. UAVs can act as flying base stations in the heterogeneous 5G environment and also support millimeter wave (mmW) communications; they are collectively viewed as the nexus of next-generation 5G cellular systems. UAV-enabled mmW communication is a promising application of UAVs, which can establish LoS communication links to users [27]. UAVs can also assist various terrestrial network infrastructures such as mIoT, cellular, and vehicular networks (V2V, V2X, V2I) in different ways; for example, UAVs can improve the reliability and scalability of wireless connections, replace destroyed base stations, compute different tasks of edge IoT devices, and relay data or signals to the central network controller. Table 1 compares terrestrial networks with base stations and UAV networks with base stations.
UAVs at the edge level in cellular networks have a major impact on 5G and beyond. A single UAV or multiple UAVs can compute the tasks of edge IoT devices. The use of UAVs as relays and ABSs that connect terrestrial smart mobile users with edge servers in MEC has been studied in [36]. To minimize the average weighted energy consumption of the smart mobile devices and the UAV, the authors of [37] studied a multi-cell edge formed by three adjacent cells served by three base stations; at the multi-cell edge, some users outside the radius of the base stations are connected to the UAV. The problems are how to maximize the sum rate of edge users while avoiding interference, and how to improve QoS and optimize the UAV trajectory for users who are out of network coverage and served by the UAV.
The recent literature on UAV networks and UAV-assisted cellular users or IoT has focused on computation offloading, resource allocation, path planning, and trajectory optimization for either a single UAV or a multi-UAV network. In all cases, the UAV assists terrestrial users or IoT devices in offloading tasks and in requesting resources such as power, computational resources, and bandwidth. LIU et al. [38] designed a hybrid UAV-Edge-Cloud computing architecture to jointly optimize the computation offloading and routing problem for swarms of multiple UAVs connected in device-to-device (D2D) form. The architecture in [38] aims to minimize the transmission delay and increase the computing capability between UAVs and mobile users. TI et al. [39] designed UAV-based Fog-Cloud-Computing (FCC) to minimize the computation and power consumption of all users by jointly optimizing the computation offloading, user-cloud/cloudlet association, transmission power allocation, and path planning of mobile users. The UAV acts as a small distributed cloud and the local BS as a micro cloud server; both the users and the UAV are mobile.
When the terrestrial network infrastructure is hit by a natural disaster such as an earthquake, volcanic eruption, landslide, or avalanche, UAVs can act as a network life saver, especially in emergency situations. One of the popular communication technologies is PSC, which plays a critical role in saving lives, property, and national infrastructure during natural or man-made emergencies [40]. This technology was developed to deliver critical real-time streams (video, voice) over predefined spectrum. The UAV base station (UAVBS) or ABS, with LTE-Advanced capabilities, can be utilized for emergency restoration and temporary expansion of public safety networks for disaster recovery [41]. ZHAO et al. [42] proposed a UAV-assisted emergency network that replaces the destroyed base station by establishing multi-hop D2D links among users in different cells and relays the signal for emergency vehicular communication, which is a promising method for establishing emergency networks. The authors of [43] studied how to replace destroyed base stations with UAV base stations after creating multi-hop D2D communications. They also designed a UAV transceiver for managing the UAV uplink and downlink, extending the wireless coverage, and guaranteeing the QoS of UAV communications for IoT in disasters.
Machine learning is an application of artificial intelligence (AI) that gives systems the ability to learn and improve automatically from experience without being explicitly programmed. It is essentially based on the premise that machines should be furnished with AI that enables them to learn from previous computations and adapt to their environment through experience [32], [44]. Machine learning began to flourish in the 1990s. Before the 1990s, logic- and knowledge-based schemes, such as inductive logic programming and expert systems, dominated the AI scene, relying on high-level human-readable symbolic representations of tasks and logic. In the 2000s, researchers gradually renewed their interest in deep learning (DL) with the aid of advanced hardware-based computational capacity, and the machine learning paradigm became popular at that time, supporting a wide range of services and applications in different areas [32], [44], [45].
Machine learning algorithms can be classified into three groups based on the training data: supervised learning, unsupervised learning, and reinforcement learning (RL).
The supervised learning algorithm enables machines to be trained using labeled data. With labeled data, both the input data and the desired output data are known to the system. Supervised learning is commonly used in applications that have enough historical data. The algorithm is used to infer a function that maps the input data to the output label, relying on training with sample data-label pairs. Practically, consider a set of $N$ sample data-label pairs of the form $\{(x_1,y_1),(x_2,y_2),\ldots,(x_N,y_N)\}$, where $x_n$ is the $n$-th sample input and $y_n$ is its label. Let $X=\{x_1,x_2,\ldots,x_N\}$ denote the input data set and $Y=\{y_1,y_2,\ldots,y_N\}$ denote the output label set. The sample pairs are independent and identically distributed (i.i.d.). The learning algorithm seeks a function $g(x)$ that yields the highest value of the score function $f(x,y)$; hence we have $g(x)=\arg\max_{y} f(x,y)$. Supervised learning algorithms are widely used for classification, regression, and prediction.
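As an illustration of the rule $g(x)=\arg\max_y f(x,y)$, the following minimal sketch trains a nearest-centroid classifier on a few labeled pairs. The choice of score function (negative squared distance to a per-label centroid), the helper names, and the toy data are assumptions made for this example only; they are not taken from the surveyed works.

```python
from collections import defaultdict


def train_centroids(samples):
    """Fit one centroid per label from (x, y) pairs, where x is a tuple of floats."""
    sums = {}
    counts = defaultdict(int)
    for x, y in samples:
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + xi for s, xi in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def score(centroids, x, y):
    """Assumed score function f(x, y): negative squared distance to the centroid of y."""
    return -sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[y]))


def predict(centroids, x):
    """g(x) = argmax over labels y of f(x, y)."""
    return max(centroids, key=lambda y: score(centroids, x, y))


# Hypothetical labeled training pairs (x_n, y_n).
training = [
    ((0.0, 0.1), "low"),
    ((0.2, 0.0), "low"),
    ((1.0, 0.9), "high"),
    ((0.8, 1.1), "high"),
]
centroids = train_centroids(training)
print(predict(centroids, (0.1, 0.1)), predict(centroids, (0.9, 1.0)))
```

Any other score function (for example, the output of a trained neural network) could replace `score` without changing the decision rule in `predict`.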
The unsupervised learning algorithm enables machines to be trained without labeled data. Unsupervised learning is typically about finding structure hidden in collections of unlabeled data. For analyzing $N$ input data $X=\{x_1,x_2,\ldots,x_N\}$, two popular methods have been conceived to reveal the underlying unknown features of the input data, namely density estimation and feature extraction.
RL enables machines to learn what to do and how to map situations to actions so as to maximize a numerical reward signal. It differs from the two algorithms above and is currently one of the most popular research topics in machine learning. The elements necessary for reinforcement learning are an agent, states, and actions in a given environment. At each step, the environment is in some state $S$ and the agent selects a legitimate action $A$. The system responds at the next step by moving into a new state $S'$ with a certain probability that is influenced both by the specific action chosen and by the inherent transitions of the system. Meanwhile, the agent receives a corresponding reward $r(S,A)$ from the system as time evolves. RL, an important branch of machine learning, is an effective and widely used tool for solving Markov decision processes (MDPs) [46]. In the RL process, an agent can learn its optimal policy through interaction with its environment. Q-learning is one of the most effective and widely used RL algorithms.
One of the most popular and widely used learning techniques is deep learning, which allows the computer to build complex concepts out of simpler ones. It is a set of algorithms and techniques that attempt to find important features of data and to model its high-level abstractions [40]. However, the RL learning process takes a long time to reach an optimal policy, because the agent must explore and build up knowledge of the environment, which makes plain RL unsuitable for large, complex problems. An artificial neural network (ANN) is a computational nonlinear model based on the neural structure of the brain, which is able to learn to perform tasks such as classification, prediction, decision-making, and visualization. The basic model of a neuron is mathematically expressed as follows:
$$Z_n = f\left(\sum_{k=1}^{J} w_{nk}\,x_{nk} + b_n\right), \qquad (1)$$
where $x_{nk}$ is the input signal from neuron $k$ to neuron $n$, $\mathbf{x}_n=[x_{n1},x_{n2},x_{n3},\ldots,x_{nJ}]$ is the vector of input signals of neuron $n$, $w_{nk}$ is the corresponding input weight, $\mathbf{w}_n=[w_{n1},w_{n2},w_{n3},\ldots,w_{nJ}]$ is the vector of input weights of neuron $n$, $Z_n$ is the output signal of neuron $n$, $b_n$ is the bias of neuron $n$, and $f(\cdot)$ is a nonlinear activation function. The bias shifts the activation function, which is critical for successful learning. The activation function in a neural network represents the rate of action potential firing in the cell of a neuron. An ANN constructed using linear activation functions in (1) cannot reach a stable state after training; this problem can be controlled by using bounded or nonlinear activation functions such as the sigmoid function, the tanh function, and the rectified linear unit (ReLU) function.
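The neuron model described above (a weighted sum of the inputs plus a bias, passed through an activation function) can be sketched in a few lines. The input, weight, and bias values below are arbitrary numbers chosen for illustration, and the function names are hypothetical.

```python
import math


def sigmoid(v):
    """Logistic sigmoid, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def relu(v):
    """Rectified linear unit."""
    return max(0.0, v)


def neuron_output(x, w, b, f):
    """Z_n = f(sum_k w_nk * x_nk + b_n) for a single neuron."""
    if len(x) != len(w):
        raise ValueError("input and weight vectors must have the same length")
    pre_activation = sum(wk * xk for wk, xk in zip(w, x)) + b
    return f(pre_activation)


# Illustrative values only: three inputs, three weights, one bias.
x = [0.5, -1.0, 2.0]
w = [0.4, 0.3, -0.2]
b = 0.1

for name, f in (("sigmoid", sigmoid), ("tanh", math.tanh), ("relu", relu)):
    print(name, neuron_output(x, w, b, f))
```

With these values the pre-activation is negative, so the three activation functions give visibly different outputs for the same weighted sum, which is the point of choosing $f$ carefully.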
Deep learning was recognized as the first among the top ten AI technology trends for 2018 [45] and is already the leading machine learning technique, used successfully in many scientific fields such as image recognition, text recognition, speech recognition, audio and language processing, and robotics [32], [44], [45]. Deep learning models are based on ANNs. As mentioned above, RL alone is insufficient for today's complicated problems. The combination of RL and deep learning, known as deep reinforcement learning (DRL), can overcome the limitations of RL in different areas. DRL takes advantage of deep neural networks (DNNs) to train the learning process, improving the learning efficiency and performance of RL algorithms.
Q-learning is one of the most commonly used RL algorithms. It attempts to learn the value $Q(s,a)$ of a specific action taken by the agent in a particular state. Consider a table, called a Q-table, in which the number of rows represents the number of states and the number of columns represents the number of actions [45], [47]. The RL agent interacts with the environment to learn the Q-values, based on which it takes actions. The Q-value is defined as the discounted cumulative reward starting from a state-action pair. Once the Q-values have been learned after the maximum number of episodes, the agent can make a quick decision in the current state by taking the action with the largest Q-value. A large state and action space in the environment makes the Q-table unmanageable. In current real-world examples such as cellular edge computing, the state space is so large that it is effectively infinite. To eliminate this shortcoming of Q-learning, a neural network is used to predict the Q-values. One popular DRL algorithm is the deep Q-network (DQN), which uses a DNN to approximate the Q-values. DQN generalizes much better than tabular Q-learning. DQN inherits and promotes the advantages of both reinforcement learning and deep learning, and thus it has a wide range of practical applications such as game development, transportation, and robotics [44], [45], [47]. The study of DQNs has led to many improvements; new architectures have been designed for better performance and stability, including double DQN (DDQN), dueling DQN, and other asynchronous DRL algorithms studied in [47]–[49].
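To make the Q-table idea concrete, the sketch below runs tabular Q-learning on a toy five-state chain in which the agent earns a reward of 1 for reaching the rightmost state. The environment, the hyperparameters, and all names are assumptions chosen for illustration; they do not come from the cited works. The update applied at each step is the standard rule $Q(s,a) \leftarrow Q(s,a) + \alpha\,[r + \gamma \max_{a'} Q(s',a') - Q(s,a)]$.

```python
import random

N_STATES = 5       # states 0..4 on a line; state 4 is the terminal goal
ACTIONS = (0, 1)   # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic toy environment: returns (next_state, reward, done)."""
    if action == 0:
        nxt = max(0, state - 1)
    else:
        nxt = min(N_STATES - 1, state + 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def greedy_action(q_row, rng):
    """Pick the action with the largest Q-value, breaking ties at random."""
    best = max(q_row)
    return rng.choice([a for a in ACTIONS if q_row[a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=100, seed=0):
    """Epsilon-greedy tabular Q-learning; rows are states, columns are actions."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _episode in range(episodes):
        s = 0
        for _t in range(max_steps):
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)          # explore
            else:
                a = greedy_action(q[s], rng)     # exploit
            s2, r, done = step(s, a)
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            if done:
                break
    return q


q_table = train()
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The table here has only 5 × 2 entries. In an edge computing setting, where the state includes channel conditions, queue lengths, and battery levels, the number of rows grows far too quickly for a table, which is the motivation for replacing `q` with a neural network as in DQN.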
In the future, the ABS infrastructure will play a great role in 5G and beyond-5G communications. The ML algorithms applied in current and future cellular technologies and aerial networks will be used to manage the dynamic network environment. Figs. 1 and 2 depict the integration of UAV networks and terrestrial networks, where resources from cloud networks are accessed through the virtualized baseband unit (VBBU). In the VBBU, the network resources used for both aerial and terrestrial network infrastructures are virtualized in an intelligent manner. The resources are allocated to the infrastructures depending on the network demands. We categorize the architecture into three layers.
In the H-CRAN, cloud computing resources are delivered by server-based applications through digital networks or the public Internet itself. The resources available in the cloud are far from the edge IoT devices. For this reason, the edge IoT devices need localized computational nodes and resources to achieve features of 5G and B5G such as ultra-reliability, low latency, and massive (ubiquitous) connectivity.
▲Figure 1. UAV-enabled fog radio access network (F-RAN) system architecture.
The virtual BBU pool is located at the data center, and multiple BBU nodes dynamically allocate resources to different network operators. The resources are allocated to aerial networks and terrestrial networks based on current network demands. On this layer, the resources are virtualized into $N$ network slices hosted in the cloud. Network virtualization allows network resources to be sliced and granted to multiple tenants. We assume that the DRL operates in a decentralized manner and that the fog-edge network can make decisions independently based on the local learning environment and inputs. The resulting decision is then sent to the central controller.
The main network operations such as DRL, resource management, and computation offloading are performed at this layer. It has three levels: the network controller, UAVs and small base stations (SBSs), and edge IoT devices.
1) Level 1: The network controller (NC) is the central controller of the two network infrastructures and a communication platform where the aerial networks assist the terrestrial networks and DRL provides intelligent coordination depending on network traffic, emergencies, and resource scarcity. A macro base station (MBS) with an MEC server is used to manage the resources allocated by the VBBU, allocate these resources to different network operators, and make decisions about network conditions using the DRL approach. To satisfy the QoS and QoE of the heterogeneous connected edge devices in each slice, the network is assisted by the UAV network in an intelligent manner. Under the MBS there are a number of SBSs, each with a local server in its small cell, which are used to connect ultra-dense heterogeneous devices.
2) Level 2: UAVs and SBSs at this level are used to assist communication in a given small cell, mainly when the network is congested at a specific time and in emergency situations; the UAV acts as a flying base station to replace the destroyed BS, perform computational tasks, and recharge edge IoT devices. In this setting, the edge IoT devices are mainly wireless sensors, wearable devices, and surveillance cameras, which offload the collected data to the UAV for further analysis and decision making. Therefore, we consider a UAV-enabled F-RAN in which the UAV is regarded as a flying remote radio head (RRH) or base station with computation capability to assist the edge IoT devices. The UAV is part of the cellular network; it recharges IoT sensor batteries and also sends collected data to MBSs.
▲Figure 2. UAV-enabled fog radio access network (F-RAN) and edge computing system model in public safety communication (PSC).
3) Level 3: Edge IoT devices at this level are ultra-dense heterogeneous devices (mIoT devices), which are connected with each other and with SBSs. These devices share common resources, exchange information with the nearest devices, and have different interests. The MEC server may be crowded or even damaged when the devices request resources and need to offload their own tasks at the same time. This level carries more network traffic than the others, so cooperation between the aerial network and the terrestrial network is needed. The UAV assists the edge IoT network when either the devices are far from the coverage of a base station or a natural disaster has affected the network.
In the current edge technology era, there is a trend toward direct communication between devices connected to the network infrastructure, without traffic travelling through base stations or core networks. The D2D communication system is one of the most common such networks and has been widely used in recent years; it is a milestone on the road towards self-organization and peer-to-peer (P2P) collaboration. Currently, most edge IoT devices need latency-sensitive computing support, and the latency incurred at the cloud level is not tolerable for them. In 2012, a group of researchers from Cisco proposed a new paradigm known as fog computing. Fog computing and edge computing appear similar since they both involve bringing intelligence and processing closer to the UE. Most edge IoT devices have limited computational capacity and battery life. Because of this limitation, edge IoT devices may fail to perform different operations properly. However, using the emerging MEC paradigm, the edge device can offload computation-intensive tasks to the MEC server in different ways. The study of computation offloading and resource allocation in MEC and fog computing is a complicated system analysis because of mobility patterns, radio access interfaces, and strong couplings among mobile users with heterogeneous application demands, QoS provisioning, and wireless resources. A machine learning approach, especially one using RL, is a promising candidate for managing huge state spaces and large numbers of optimization variables, particularly when combined with different types of ANN.
DRL is an emerging tool for sophisticated problems in communication and networking in IoT, MEC, HetNet, and UAV networks. Network entities such as IoT devices, mobile users, and UAVs need to make local and autonomous decisions, such as spectrum access, data rate selection, transmit power control, computation offloading decisions, and base station association, to achieve the goals of different networks, including throughput maximization, delay minimization, energy consumption minimization, and UAV deployment. The main difficulty is the uncertain and stochastic environment, but problems formulated as an MDP can be solved using dynamic programming, value iteration, and RL [45]. LUONG et al. [31] studied the role of DRL in communication and networking. DRL reduces the complexity of optimization and solves the problem from different perspectives. DRL allows network entities to learn and build knowledge about the communication and networking environment. By using DRL algorithms, mobile users can learn optimal policies for base station selection, channel selection, handover decisions, caching and offloading decisions, UAV deployment, path planning, and trajectory optimization without knowing the channel model or mobility pattern. In [31], the research works related to DRL were broken down by topic in percentages, for example, space communication 13%, ad hoc networks 19%, cellular networks 31%, IoT networks 9%, and others 31%; the issues to be solved were also presented in percentages, for example, wireless capacity 19%, computation offloading 13%, rate control 8%, network access 13%, data collection 9%, resource scheduling 9%, connectivity preservation 8%, and network security 12%. Although there are a number of works on machine learning approaches for wireless communication networks [29], [31], [34], no research has yet focused on machine learning based UAV-enabled F-RAN infrastructures.
Edge IoT devices such as sensors and wearable devices have limited computational capacity, battery life, and storage. Because of these limitations, IoT devices cannot support advanced applications such as face recognition and online gaming (VR/AR). To tackle the problems in edge IoT devices and in the network, an offloading mechanism is used to offload computational tasks and data to the nearest computational nodes (MEC server, UAV, or local servers). Offloading the data and computation tasks of IoT devices can minimize the processing delay and energy consumption, and may enhance security. Under these circumstances, there are some critical challenges in computation offloading, such as choosing one computational node among several and determining the offloading rate. Selecting an overloaded computational node also affects the computation time and energy consumption of IoT devices. Previous works on computation offloading and resource allocation used heuristic or iterative algorithms, but these have high complexity. Alternatively, machine learning is a promising tool for solving the complex problem of computation offloading and resource allocation.
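The node selection problem described above can be illustrated with a simple cost comparison. The sketch below is not a learning algorithm and is not taken from any cited work; it assumes a commonly used style of model in which the cost of each option is a weighted sum of delay and device energy, local energy is modeled as `kappa * f^2 * cycles`, and a node's load reduces its usable CPU share. All class names, parameters, and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    data_bits: float    # amount of input data to upload
    cpu_cycles: float   # CPU cycles required to finish the task


@dataclass
class Node:
    name: str
    cpu_hz: float       # total CPU frequency of the node
    rate_bps: float     # assumed uplink rate from the device to this node
    load: float         # fraction of the node's CPU already in use (0 <= load < 1)


def local_cost(task, device_cpu_hz, kappa, w_delay, w_energy):
    """Weighted delay/energy cost of executing the task on the device itself."""
    delay = task.cpu_cycles / device_cpu_hz
    energy = kappa * device_cpu_hz ** 2 * task.cpu_cycles
    return w_delay * delay + w_energy * energy


def offload_cost(task, node, tx_power_w, w_delay, w_energy):
    """Weighted cost of uploading the task and executing it on a remote node."""
    tx_time = task.data_bits / node.rate_bps
    exec_time = task.cpu_cycles / (node.cpu_hz * (1.0 - node.load))
    energy = tx_power_w * tx_time   # device only spends energy on transmission
    return w_delay * (tx_time + exec_time) + w_energy * energy


def choose_target(task, nodes, device_cpu_hz=0.5e9, kappa=1e-27,
                  tx_power_w=0.1, w_delay=0.5, w_energy=0.5):
    """Return the cheapest option and the cost of every option."""
    options = {"local": local_cost(task, device_cpu_hz, kappa, w_delay, w_energy)}
    for node in nodes:
        options[node.name] = offload_cost(task, node, tx_power_w, w_delay, w_energy)
    return min(options, key=options.get), options


task = Task(data_bits=1e6, cpu_cycles=1e9)
nodes = [
    Node("mec", cpu_hz=10e9, rate_bps=5e6, load=0.9),   # powerful but overloaded
    Node("uav", cpu_hz=3e9, rate_bps=10e6, load=0.2),   # weaker but lightly loaded
]
choice, options = choose_target(task, nodes)
print(choice, options)
```

In this toy setting the heavily loaded MEC server loses to the lightly loaded UAV, which mirrors the remark that selecting an overloaded node hurts both delay and energy. A DRL agent would have to learn such a choice from observed rewards, since loads and channel rates are not known in advance and change over time.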
Recently, machine learning algorithms have been applied to fog and edge computing to solve optimization problems. The authors of [50] proposed an SDN- and NFV-based DQN framework for caching and computation offloading to achieve energy efficiency in the network. The authors of [51] proposed a deep learning-based offloading framework to minimize the offloading cost for MEC networks. Deep supervised learning was also used to obtain the optimal offloading policy for mobile users. The authors of [52] addressed resource allocation by jointly optimizing caching, networking, and computation for video content compression and encoding, using a feedforward neural network (FNN) based DQN. DDQN and dueling DQN approaches were proposed to improve the stability and performance of the DQN algorithm [53]. A DQN framework was also proposed for smart city applications, which dynamically orchestrates caching, bandwidth, and computation to achieve QoS for different services [54].
The authors of [55] proposed offloading cellular traffic to WLAN by adopting the DQL algorithm and an MDP model to minimize energy consumption and mobile user cost. The MEC server has limited resources to allocate to all edge devices; for this reason, the MEC server also minimizes cost and energy. In a vehicular network, the action space is huge and the complexity is high because of vehicle mobility and service delay. In [56], a multi-timescale DQN framework was proposed to minimize the system cost by jointly designing caching, communication, and computing. The authors of [57] proposed DQN-based joint optimization of computation offloading and resource allocation in MEC-enabled cellular networks, minimizing the cost of delay and power consumption for all mobile users. In cellular networks, a DQL-based optimal offloading policy was proposed to minimize the mobile users' cost and energy consumption [58]. In [59], a virtualized computation offloading framework using DRL was designed, and a DDQN-based DQL algorithm was proposed for an agent to learn the optimal offloading policy dynamically without prior knowledge of the network environment. This work also focused on the utility function: by decomposing the Q-function and combining it with DDQN, a novel online SARSA-based DRL algorithm was proposed [59]. In addition, computation offloading with multiple MEC servers has been considered [60]–[62]. The authors of [60] designed Q-learning and fast DQL offloading schemes to achieve the optimal policy for IoT devices with energy harvesting capacity.
In [61], a two-layered DQL algorithm for offloading was studied to maximize the utilization of cloud resources: the first layer uses a convolutional neural network (CNN)-based DQL framework to estimate an optimal cluster for each computational task, and the second layer uses Q-learning to determine the optimal serving physical machine in the cluster. The authors of [62] proposed distributed deep learning-based offloading (DDLO) for multiple computing servers, users, and tasks in MEC networks to minimize wireless device (WD) energy consumption by offloading WD tasks to the MEC server or cloud and allocating bandwidth. Table 2 shows different machine learning algorithms in vehicular networks and cellular networks.
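Several of the works above rely on DDQN rather than plain DQN. As a hedged illustration of the difference, the sketch below contrasts the two bootstrapped targets: DQN takes the maximum of the target network's own estimates, whereas double DQN lets the online network select the action and the target network evaluate it. The function names and numeric Q-values are hypothetical; real implementations compute these targets over batches with a deep learning framework.

```python
def dqn_target(reward, next_q_target, gamma=0.9, done=False):
    """Standard DQN target: max over the target network's estimates for the next state."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)

def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.9, done=False):
    """Double DQN target: the online network selects, the target network evaluates."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]

# Hypothetical Q-value estimates for three offloading actions in the next state.
next_q_online = [0.2, 0.5, 0.1]
next_q_target = [0.9, 0.3, 0.4]

y_dqn = dqn_target(-1.0, next_q_target)                         # uses 0.9, the largest estimate
y_ddqn = double_dqn_target(-1.0, next_q_online, next_q_target)  # uses 0.3, the value of action 1
print(y_dqn, y_ddqn)
```

Because the action is chosen by one network and scored by the other, a single overestimated entry (0.9 in this example) does not automatically propagate into the target, which is one commonly cited reason why DDQN tends to be more stable than DQN.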
The application of machine learning to UAVs is known as the drone system. Over the past years, many studies have been conducted on either the integration of UAV networks with terrestrial networks or UAV networks in different application streams such as energy efficiency, computation offloading, resource allocation, and network coverage extension. However, most of the previous works solved the existing problems using heuristic algorithms. Current research focuses more on using machine learning algorithms to solve aerial and terrestrial network integration for UAV-assisted cellular networks, IoT, BSs, and others to achieve a specific goal in the network. The cellular-connected UAV will be a hot research topic in the future because it can be integrated with future cellular networks and machine learning approaches to create a new intelligent aerial mobile user.
Many studies have been conducted on machine learning algorithms used in UAV or cellular-connected UAV networks for optimizing UAV deployment, path planning, and trajectory, as well as improving energy efficiency, UAV coverage, throughput, and resource allocation. GHANAVI et al. [63] proposed optimal 3D UAV deployment to implement UAV-BSs that use RL to assist or serve the terrestrial network of mobile equipment, maintaining connection reliability and increasing the QoS of users. The authors of [64] proposed an efficient 3D ABS positioning solution, in which DRL with DQN is used to assist the terrestrial BS in a small cell where the BS is overloaded and no LoS link exists, with the aim of maximizing the spectral efficiency of the system. In [65], a novel framework was proposed to deploy ABSs to assist overloaded or congested base stations in small cells. The researchers also adopted a machine learning approach to predict the traffic demand of each base station from its history; ABSs are then deployed on the basis of these predictions to serve users in small cells, and contract theory is applied to jointly maximize the individual utility of each BS and UAV. In [66], an ANN-based opportunistic computation offloading framework was proposed, in which a clustered UAV network assists a vehicular traffic network and the ground controller predicts the response time of each UAV cluster to offload intensive tasks. A clustered UAV network can compute intensive tasks by itself or borrow resources from another UAV cluster [66]. The authors of [67] studied a model-free RL algorithm using Q-learning to optimize the trajectory of a UAV acting as a flying BS that serves multiple terrestrial network users. The UAV also acts as an autonomous agent in the environment, learning the trajectory that maximizes the sum rate of transmission during its flight from one location to another. CUI et al. [68] studied multi-agent RL using Q-learning and a stochastic game theory model for dynamic resource allocation in multi-UAV networks connected to multiple users. Each UAV acts as an agent that makes decisions independently to maximize its own long-term reward and provide reliable communications. User, power level, and sub-channel selection strategies were also jointly studied in [68]. For cellular-connected UAVs in beyond-5G systems, a DRL algorithm based on the echo state network (ESN) was proposed for interference-aware path planning and management [69]. Each UAV acts as an agent that uses a deep ESN to learn the optimal path, transmission power level, and cell association at each location along its path, minimizing a sequence of time-dependent utility functions. The authors of [69] also studied energy efficiency, the control of UAVs, and the fair coverage of active areas where users are present and UAVs are required to act as base stations, using a DRL algorithm. In this work, a fairness index was applied to control UAV network coverage so as to minimize UAV energy consumption and improve UE QoS.
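The text does not state which fairness index is used for coverage control. As an assumption for illustration only, the sketch below computes Jain's fairness index, a widely used measure, over hypothetical per-area coverage scores; a DRL reward could include such a term to discourage UAVs from serving only a few areas.

```python
def jain_fairness(values):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), between 1/n and 1."""
    if not values:
        raise ValueError("values must be non-empty")
    total = sum(values)
    squares = sum(v * v for v in values)
    if squares == 0:
        return 0.0   # convention chosen here for the all-zero case (an assumption)
    return (total * total) / (len(values) * squares)

# Hypothetical coverage scores: fraction of time each of four areas is served by a UAV.
even_coverage = [1, 1, 1, 1]
skewed_coverage = [1, 0, 0, 0]

print(jain_fairness(even_coverage))     # 1.0: every area is covered equally
print(jain_fairness(skewed_coverage))   # 0.25: only one of four areas is covered
```

The index equals 1 when all areas receive the same coverage and falls to 1/n when a single area receives everything, so it gives a bounded, scale-independent signal that can be combined with energy and QoS terms.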
According to recent studies of various issues for future-generation network infrastructures, we outline some challenges and future research directions for the integration of aerial networks and terrestrial networks with machine learning approaches in the F-RAN, NFV, and MEC paradigms.
▼Table 2. Machine learning algorithms for computation offloading and resource allocation in vehicular networks and cellular networks
1) Machine learning used in virtualized UAV-enabled F-RAN: RL (commonly DQN, Q-learning, and others) in virtualized MEC systems has been used to tackle many issues at different layers of cellular networks. Deploying machine learning algorithms at different layers of the virtualized H-CRAN of a UAV-enabled F-RAN will bring intelligence to the future network infrastructure of 5G and beyond. However, this scenario involves a number of network infrastructures and concepts, and handling such a multi-paradigm concept is complex in current 5G technology and future 6G networks.
2) Multi-agent in multi-layer UAV-enabled F-RAN: Most current studies of cellular mobile networks, MEC systems, and UAV networks focus on efficient resource allocation, energy efficiency, computation offloading, and caching to minimize delay and energy consumption or maximize revenue. Machine learning (commonly RL) algorithms have been used to tackle these issues, but most of them use a single agent at the base station or service provider. Recent years have witnessed the rapid evolution of network infrastructure and technologies from one generation to the next every ten years. In the 5G era, ultra-dense heterogeneous networks, which consist of different layers of IoT or fog networks supporting ultra-low-latency (ULL) devices, are connected to each other at a given time step. In the future, beyond 5G or 6G (5G+AI) will support the intelligent personal edge (IPE), genome database, autonomous health, sensors to AI fusion block-chain, etc. [70]–[72]. To perform complex multi-dimensional tasks in these networks, a multi-agent decentralized DRL approach needs to be adopted. Adopting this multi-agent concept at each layer of the UAV-enabled F-RAN is complex and needs clear framework modeling.
3) Determination of the state of network traffic in different small cells: In the 5G and beyond-5G era, ultra-dense heterogeneous networks with massive IoT devices and smart mobile users generate a huge amount of traffic in different circumstances. These ultra-dense devices will be assisted by UAV-cluster networks, rather than terrestrial base stations, to satisfy the QoS and QoE. In the UAV-connected cellular network at lower layers such as the fog or edge computing level, a single UAV or multiple UAVs are deployed and heuristic algorithms are used to identify network traffic in small cells, depending on the UAV capacity and coverage area. However, applying machine learning in such a dynamic network is difficult: the network is unpredictable, the state space for determining the network traffic state in different cells is large and continuous, and the deployment of UAV clusters is complex.
4) Handover for transmitting data and tasks of mIoT devices in emergency situations: One of the attractive and promising paradigms of the UAV-connected cellular network is acting as a flying base station to assist emergency services. In this situation, the mIoT devices send computational tasks and a huge amount of request data traffic to the local base station at a specific time step. After a natural disaster, however, a good and intelligent handover framework is needed to manage handovers in the terrestrial network environment of the disaster area. Machine learning algorithms are well suited to the handover process.
1) Distributed machine learning based virtualized UAV-enabled F-RAN: One of the popular machine learning frameworks in wireless communication and networking is RL with a deep neural network (DNN), which requires a large amount of training. Most of the time, the large DNN is implemented at the central network controller, which has sufficient resources such as computational capacity and is capable of training over a large continuous state space and action space in the dynamic network environment. The central controller reduces the burden on aerial mobile users and IoT devices by taking their limited capacities and capabilities into account. The main functionalities of UAV networks and terrestrial or cellular networks can be integrated with the central network controller. The virtualized DRL framework for UAV-enabled F-RAN or UAV-connected cellular systems is an open issue. The network traffic exchanges from one layer to another, and from aerial mobile users to terrestrial mobile users (mIoT devices), need to be efficient.
2) Dynamic deployment of multi-UAV clusters in F-RAN: In UAV networks, one of the open issues is the optimal 3D placement of UAVs for different dynamic terrestrial network infrastructures. A number of previous works focused on UAV deployment with trajectory optimization, path planning, and energy efficiency maximization. Because the network infrastructure in 5G and beyond 5G (6G) is dynamic, with rapid changes in coverage, the number of connected devices, and network platforms, a DRL-based approach for optimal 3D placement of UAVs will be a necessity when integrating with the cellular or IoT network. Under this consideration, there are other issues such as resource management (aerial mobile users and terrestrial network devices), optimal computation offloading, network coverage area, minimizing the energy consumption of the network, and cell association to maximize flight time.
3) Machine learning based resource management in UAV-enabled F-RAN: A number of studies have been conducted on resource management at different layers in cellular networks, vehicular networks, and UAV networks to solve complex problems such as optimization, energy efficiency maximization, resource allocation for UAVs, and bandwidth management. These studies aim to maximize revenue or minimize the cost of delay and energy in the system. Other works that used heuristic algorithms to tackle complex problems in cellular networks, vehicular networks, and fog and edge computing are now adopting machine learning, commonly RL (DQN, Q-learning, DDQN, DDPG, actor-critic), for resource management and computation offloading. However, for mixed network infrastructures such as UAV-enabled F-RAN, a machine learning based joint resource management and computation offloading framework still needs to be designed.
4) Machine learning for dynamic deployment of ABSs in emergencies (PSC): UAVs can play an important role in the promising future paradigm for emergency situations known as PSC. Current communication relies heavily on backbone networks. When base stations fail because of natural disasters or malevolent attacks, PSC can use machine learning to deploy a group of UAVs in an ultra-dense HetNet architecture as ABSs that dynamically replace the destroyed or overloaded base stations in the terrestrial network. From the communication perspective, the UAVs are used to support reliable connections for edge IoT devices, extend the network coverage, control the end user devices, etc. If a destroyed BS has computational resources (local server), an MEC server, and a power source that can no longer be accessed by edge IoT devices, the intelligent ABSs also replace the destroyed terrestrial BS to perform computing tasks and allocate transmission power to satisfy the QoS and QoE of end users/IoT devices at the fog/edge level of RAN networks.
5) Machine learning based mobility control of multi-UAV connected cellular network/F-RAN: In a multi-UAV assisted cellular network/F-RAN, each UAV flies from one location to another within a given time frame. While a UAV is flying over the terrestrial network, mobile users/IoT devices may wait a long time to get access to the UAV terminal, which could degrade the QoS and QoE of the network. To tackle this issue, an intelligent machine learning based model can be designed for multi-UAV mobility management, where the agents learn by themselves to adjust their mobility toward the predicted location in the terrestrial network infrastructure. The model also considers devices connected to the terrestrial network, such as mobile users, vehicles, and other mobile entities. In this scenario, the management of resources (computation, bandwidth, and energy) is also considered in the mixed network infrastructures.
This paper presents a short review of the machine learning techniques used to solve complex problems in modern network infrastructures and suggests a machine learning based multi-UAV-enabled F-RAN. First, we introduce F-RAN and UAVs for current and future network technologies. Second, we discuss UAVs in cellular networks and their replacement of base stations in terrestrial networks. Third, we review machine learning algorithms and RL and suggest a machine learning based UAV-enabled F-RAN framework architecture in the H-CRAN network infrastructure for computation offloading and resource allocation. We also mention some previous works on edge computing and UAVs that use RL with DNNs to solve different problems such as resource allocation, computation offloading, and base station replacement in different networks. Finally, we outline the challenges and future research directions.