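Liu Zeshi, Li Sining, Zhang Jinpeng, Zhang Xiaofeng, Cheng Haoyu
Abstract: An intelligent robust control algorithm based on deep reinforcement learning is proposed for the controller design of flight vehicles subject to asynchronous switching. First, starting from the nonlinear dynamic model of the flight vehicle, a full-envelope switched system model is established by Jacobian linearization. Considering the asynchronous switching caused by packet loss in network transmission, an asynchronous dynamic model of the controllers and subsystems is built, and a robust controller is designed on this basis. Using the average dwell time method and the multiple Lyapunov function method, the stability of the system is analyzed, and sufficient conditions are given that guarantee stability together with a prescribed disturbance attenuation index. The controller is solved by means of linear matrix inequalities. Furthermore, the resulting controller is optimized by deep reinforcement learning, which improves the dynamic response of the system while preserving its stability and robustness. Finally, simulations verify the effectiveness of the proposed method.
Key words: switched system; robust control; intelligent control; asynchronous switching; average dwell time; deep learning; flight vehicle
CLC number: TJ765.2; V249.1
Document code: A
Article ID: 1673-5048(2022)05-0035-08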
DOI:10.12132/ISSN.1673-5048.2021.0165
0 Introduction
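A switched system consists of a family of subsystems and the switching logic among them, and is an important branch of hybrid systems. It can recast a complex nonlinear system as switching among a set of linear subsystems, which provides an effective means for the modeling, analysis, and controller design of complex, highly dynamic nonlinear systems. Switched system theory has therefore attracted wide attention in aerospace, process control, robot control, and other fields. In recent years, extensive and in-depth research has been carried out on the modeling, stability analysis[1-3], controller design[4-6], filter design[7-8], fault detection[9], and fault-tolerant control[10-11] of switched systems, and a series of advances has been made.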
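The altitude and Mach number of a full-envelope flight vehicle vary rapidly and over a wide range within the flight envelope. The resulting fast time variation and strong nonlinearity pose many challenges for modeling and controller design. As a bridge between nonlinear and linear systems, switched systems offer a feasible solution to problems such as abrupt model changes and fast time-varying parameters, and have received wide attention from researchers worldwide in recent years. Refs. [12-13] model the continuous morphing process of a morphing aircraft as a class of switched systems, treating the change of sweep angle as switching of the aircraft among subsystems. Building on a chained smooth switched system model of the longitudinal motion of the morphing aircraft, they derive sufficient conditions under which the chained smooth system is finite-time bounded with a prescribed robustness index, and give the controller solution in the form of linear matrix inequalities. Ref. [14] models a near space vehicle as a family of nonlinear switched systems and designs a robust controller based on the backstepping sliding mode method and a nonlinear disturbance observer. Sufficient conditions for closed-loop stability are given using the Lyapunov function method, and a variable gain control strategy selects suitable gains for the different modes of the vehicle. Ref. [15] addresses control of the morphing process of a morphing aircraft by combining switched system theory with multivariable adaptive control theory. It proposes a robust adaptive controller design method based on switched systems that compensates for external disturbances and uncertainties, and analyzes the stability of the system using the Lyapunov function method. However, all of the above works study modeling and controller design under ideal information transmission. In practical engineering applications, network bandwidth is limited, so packet loss is unavoidable when signals are transmitted over a network, which degrades system performance and may even cause instability. Ref. [16] considers simultaneous packet loss in the measurement link and the control link and models the packet loss as a Bernoulli random process, which yields a switched system model of the flight vehicle under packet loss. Based on the mode-dependent average dwell time method and the multiple Lyapunov function method, a robust controller is designed that guarantees the stability and robustness of the system. Ref. [17] considers external disturbances together with packet loss, designs a robust controller based on state feedback, and gives sufficient conditions that guarantee finite-time stability of the flight vehicle with a prescribed performance index.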
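To make the link between packet loss and asynchronous switching concrete, the following is a minimal sketch and not the model used in Ref. [16] or in this paper. It assumes that the controller learns the active subsystem only through transmitted packets, that each packet is lost independently with a fixed probability (a Bernoulli process), that the number of consecutive losses is capped, and that the controller holds the last mode it received. The function names `simulate_dropouts`, `hold_last`, and `async_steps` are hypothetical.

```python
import random


def simulate_dropouts(n_steps, loss_prob, max_consecutive, seed=0):
    """Return a list of booleans; True means the packet at that step arrived.

    Each packet is lost with probability loss_prob (Bernoulli), but a delivery
    is forced once max_consecutive losses have occurred in a row (assumption).
    """
    rng = random.Random(seed)
    arrived = []
    run = 0  # current number of consecutive losses
    for _ in range(n_steps):
        lost = rng.random() < loss_prob
        if lost and run >= max_consecutive:
            lost = False  # bound on consecutive losses reached
        run = run + 1 if lost else 0
        arrived.append(not lost)
    return arrived


def hold_last(values, arrived, initial):
    """Hold the most recently received value whenever a packet is lost."""
    held, last = [], initial
    for value, ok in zip(values, arrived):
        if ok:
            last = value
        held.append(last)
    return held


def async_steps(plant_modes, controller_modes):
    """Indices at which the controller mode differs from the subsystem mode."""
    return [k for k, (p, c) in enumerate(zip(plant_modes, controller_modes)) if p != c]


if __name__ == "__main__":
    plant_modes = [1] * 10 + [2] * 10 + [3] * 10
    arrived = simulate_dropouts(len(plant_modes), loss_prob=0.5, max_consecutive=5)
    controller_modes = hold_last(plant_modes, arrived, initial=plant_modes[0])
    print("asynchronous steps:", async_steps(plant_modes, controller_modes))
```

The steps returned by `async_steps` are those in which a subsystem has already switched but the controller is still running the gain of the previous mode. Under the assumed cap on consecutive losses, each such interval is no longer than `max_consecutive` steps.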
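Flight vehicles maneuver rapidly over a large flight envelope in a complex environment and are subject to strong external disturbances and strong nonlinearities. At the same time, flight missions are becoming more complex and flight performance requirements are rising, which places higher demands on the robustness and dynamic performance of the flight control system. High precision and strong robustness have become prominent requirements in the development of flight control systems. Ref. [18] considers the integrated design of fault detection and fault-tolerant control for flight vehicles with time delay. For the asynchronous switching caused by the delay, it analyzes the stability and robustness of the vehicle under asynchronous switching using the multiple Lyapunov-Krasovskii functional method and the average dwell time method, so that faults are detected quickly and effectively and then compensated. Ref. [19] addresses the control of full-envelope flight vehicles by designing a locally overlapping switching control system, which improves the robustness and dynamic performance of the system.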
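With the growth of computing power, machine learning techniques represented by deep learning and reinforcement learning have attracted wide attention. They have achieved notable results in image recognition[20], target tracking[21], speech recognition[22], and the design of navigation, guidance, and control systems[23-24], and have effectively raised the level of intelligence in these fields. Ref. [25] designs a controller based on state feedback and, to improve control accuracy, adjusts the controller parameters dynamically with a deep reinforcement learning algorithm. Ref. [26] combines deep learning with optimal control: the landing problem is converted into a two-point boundary value problem, and deep learning is used to learn and fit the optimal control sequences for different initial values, which balances real-time capability and optimality of the control.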
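In summary, to improve the robustness and dynamic performance of the flight control system and to strengthen its ability to cope with uncertain environmental disturbances, this paper considers the asynchronous switching caused by network packet loss. On the basis of a switched system model of the flight vehicle, an H∞ robust controller is designed. The stability of the system is analyzed using the average dwell time (ADT) method and the multiple Lyapunov function method, and sufficient conditions are given that guarantee a prescribed robustness index. Furthermore, to improve the dynamic response of the system, the controller parameters are optimized with a deep reinforcement learning algorithm, which effectively improves the dynamic performance of the control system.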
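For orientation, the notions of a switched system, average dwell time, and H∞ disturbance attenuation can be written compactly. The block below is a textbook-style statement in continuous time, not the formulation derived in this paper, which is not reproduced in this excerpt. The symbols $A_i$, $B_i$, $E_i$, $C_i$, $M$, $N_0$, $\tau_a$, and $\gamma$ are assumed notation.

```latex
\begin{align}
\dot{x}(t) &= A_{\sigma(t)} x(t) + B_{\sigma(t)} u(t) + E_{\sigma(t)} w(t), \qquad
z(t) = C_{\sigma(t)} x(t), \qquad \sigma(t) \in \{1, \dots, M\}, \\
N_{\sigma}(t_0, t) &\le N_0 + \frac{t - t_0}{\tau_a}, \qquad \forall\, t \ge t_0 \ge 0, \\
\int_0^{\infty} z^{\mathsf{T}}(t)\, z(t)\, \mathrm{d}t &\le \gamma^2 \int_0^{\infty} w^{\mathsf{T}}(t)\, w(t)\, \mathrm{d}t
\qquad \text{(zero initial state)}.
\end{align}
```

Here $\sigma(t)$ is the switching signal, $N_{\sigma}(t_0, t)$ is the number of switches on the interval $(t_0, t)$, $\tau_a$ is the average dwell time, $N_0$ is the chatter bound, $w$ is the disturbance, $z$ is the controlled output, and $\gamma$ is the disturbance attenuation level. Results obtained with the ADT method are often stated with an exponentially weighted version of the last inequality, so the unweighted form should be read only as the basic idea.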
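The packet loss rate of the system is set to 0.95 and the maximum number of consecutive packet losses to 5. The matrices U1i, U2i, S1i, and S2i then follow from Theorem 3, and the parameter values of the robust controller follow from Eqs. (33)-(34). Taking these robust controller parameters as the baseline, the controller is optimized with the DDPG algorithm. All deep neural networks are fully connected and use the ReLU activation function. The learning rate of both the actor network and the critic network is 0.001, the weighting coefficients of the reward function are λ1=0.8 and λ2=0.2, the actuator saturation value is ū_a=15°, and the penalty value is ū_p=200. The simulation results are shown in Figs. 3-8. In the figures, "DRobust" denotes the intelligent robust control algorithm proposed in this paper, and "Robust" denotes the conventional robust control method.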
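The reward function itself is not reproduced in this excerpt, so the following is only a sketch of how the quoted settings could enter a reward. It assumes a weighted sum of the absolute tracking error (weight λ1) and the total absolute actuator deflection (weight λ2), plus a fixed penalty whenever any actuator command exceeds the saturation value. This functional form and the name `reward` are assumptions, not the authors' definition.

```python
def reward(tracking_error_deg, controls_deg,
           lam1=0.8, lam2=0.2, u_sat_deg=15.0, penalty=200.0):
    """Hypothetical reward: negative weighted cost with a saturation penalty.

    tracking_error_deg: angle-of-attack tracking error, in degrees.
    controls_deg: iterable of actuator commands (e.g. elevator, aileron,
        canard), in degrees.
    """
    controls = list(controls_deg)
    # Weighted cost on tracking error and control effort (assumed form).
    cost = lam1 * abs(tracking_error_deg) + lam2 * sum(abs(u) for u in controls)
    # Fixed penalty if any command exceeds the actuator saturation value.
    if any(abs(u) > u_sat_deg for u in controls):
        cost += penalty
    return -cost


if __name__ == "__main__":
    print(reward(1.0, [10.0, 0.0, 0.0]))  # within limits
    print(reward(1.0, [20.0, 0.0, 0.0]))  # saturated, penalized
```

With a penalty that is large relative to the other terms, a learned policy is discouraged from issuing commands beyond the actuator limits, which is consistent with the observation that the actuator responses stay within their physical limits.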
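Figs. 3-4 show the angle-of-attack tracking signal and the tracking error. Optimizing the robust controller parameters with deep reinforcement learning effectively reduces both the tracking error at the switching instants and the steady-state error while keeping the closed-loop system stable, which improves the closed-loop response of the control system. Fig. 5 shows the pitch rate response, which does not saturate. Figs. 6-8 show the responses of the elevator, aileron, and canard, respectively. The actuator responses stay within their physical limits, so the actuators are able to execute the commands of the control system. Fig. 9 shows the reward function curve, which indicates that the proposed deep reinforcement learning algorithm achieves good tracking performance.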
In summary, the proposed algorithm builds on the conventional robust control method and uses the DDPG algorithm to improve the dynamic response of the closed-loop system, balancing stability, robustness, and dynamic performance.
4 Conclusion
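This paper studies the intelligent robust control problem for flight vehicles. A switched system model for full-envelope flight is obtained from the nonlinear dynamic model. Considering the asynchronous switching caused by packet loss, an intelligent robust controller is designed that consists of two parts: a robust tracking controller and an intelligent controller. For the robust part, the multiple Lyapunov function method and the average dwell time method guarantee that the closed-loop system is stable with a prescribed disturbance attenuation index, and the robust controller is solved via linear matrix inequalities. To improve the dynamic performance and disturbance rejection of the control system, an intelligent controller is designed based on deep reinforcement learning to compensate for internal and external disturbances. The DDPG algorithm, which follows the actor-critic framework, is applied on top of the robust controller design to optimize it, so that the overall controller retains stability and robustness while its dynamic performance improves.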
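One ingredient of DDPG that helps keep learning stable is the soft update of the target actor and critic networks, in which each target parameter moves a small step toward its online counterpart. The sketch below shows that update on plain lists of parameters. The value of `tau` is an assumption, since the paper's DDPG hyperparameters other than the learning rate are not given in this excerpt, and `soft_update` is a hypothetical helper name.

```python
def soft_update(target_params, online_params, tau=0.005):
    """Return updated target parameters: tau * online + (1 - tau) * target."""
    if len(target_params) != len(online_params):
        raise ValueError("parameter lists must have the same length")
    return [tau * w + (1.0 - tau) * w_t
            for w_t, w in zip(target_params, online_params)]


if __name__ == "__main__":
    target = [0.0, 0.0]
    online = [1.0, -2.0]
    for _ in range(3):
        target = soft_update(target, online, tau=0.1)
    print(target)  # target drifts slowly toward the online parameters
```

A small `tau` makes the targets used in the critic's learning signal change slowly, which is the usual motivation for this update in actor-critic methods of this type.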
References:
[1] Wang Z Y, Gao L J, Liu H Y. Stability and Stabilization of Impulsive Switched System with Inappropriate Impulsive Switching Signals under Asynchronous Switching[J]. Nonlinear Analysis: Hybrid Systems, 2021, 39: 100976.
[2] Liu Z, Zhang X F, Lu X D, et al. Stabilization of Positive Switched Delay Systems with All Modes Unstable[J]. Nonlinear Analysis: Hybrid Systems, 2018, 29: 110-120.
[3] Hong S S, Zhang Y. Input/Output-to-State Stability of Impulsive Switched Delay Systems[J]. International Journal of Robust and Nonlinear Control, 2019, 29(17): 6031-6052.
[4] Zheng Y, Wang Y N. Full-Order and Reduced-Order l1 Filtering for Positive Switched Delay Systems under the Improved MADT[J]. Nonlinear Analysis: Hybrid Systems, 2019, 32: 147-156.
[5] Zhong G X, Yang G H. Dynamic Output Feedback Control of Saturated Switched Delay Systems under the PDT Switching[J]. International Journal of Robust and Nonlinear Control, 2017, 27(15): 2567-2588.
[6] Zhu C H, Li X D, Cao J D. Finite-Time H∞ Dynamic Output Feedback Control for Nonlinear Impulsive Switched Systems[J]. Nonlinear Analysis: Hybrid Systems, 2021, 39: 100975.
[7] Park J H, Mathiyalagan K, Sakthivel R. Fault Estimation for Discrete-Time Switched Nonlinear Systems with Discrete and Distributed Delays[J]. International Journal of Robust and Nonlinear Control, 2016, 26(17): 3755-3771.
[8] Liu H Y, Gao L J, Wang Z Y, et al. Asynchronous l2-l∞ Filtering of Discrete-Time Impulsive Switched Systems with Admissible Edge-Dependent Average Dwell Time Switching Signal[J]. International Journal of Systems Science, 2021, 52(8): 1564-1585.
[9] Zhang M, Shi P, Shen C, et al. Static Output Feedback Control of Switched Nonlinear Systems with Actuator Faults[J]. IEEE Transactions on Fuzzy Systems, 2020, 28(8): 1600-1609.
[10] Yin Y H, Wang F Y, Liu Z X, et al. Fault-Tolerant Consensus for Switched Multiagent Systems with Input Saturation[J]. International Journal of Robust and Nonlinear Control, 2021, 31(11): 5047-5068.
[11] Wang Y Q, Xu N, Liu Y J, et al. Adaptive Fault-Tolerant Control for Switched Nonlinear Systems Based on Command Filter Technique[J]. Applied Mathematics and Computation, 2021, 392: 125725.
[12] Jiang Weilai, Dong Chaoyang, Wang Tong, et al. Smooth Switching LPV Robust Control for Morphing Aircraft[J]. Control and Decision, 2016, 31(1): 66-72. (in Chinese)
[13] Wang Qing, Wang Tong, Dong Chaoyang, et al. Chained Smooth Switching Control for Morphing Aircraft[J]. Control Theory & Applications, 2015, 32(7): 949-954. (in Chinese)
[14] Lu Yao, Dong Chaoyang, Wang Qing, et al. Variable Gain Nonlinear Switching Controller Design for Near Space Vehicles[J]. Control and Decision, 2017, 32(4): 613-618. (in Chinese)
[15] Liang Xiaohui, Wang Qing, Dong Chaoyang. Robust Adaptive Control for Morphing Aircraft Based on Switching System[J]. Journal of Beijing University of Aeronautics and Astronautics, 2019, 45(3): 538-545. (in Chinese)
[16] Cheng H Y, Dong C Y, Jiang W L, et al. Non-Fragile Switched H∞ Control for Morphing Aircraft with Asynchronous Switching[J]. Chinese Journal of Aeronautics, 2017, 30(3): 1127-1139.
[17] Cheng H Y, Fu W X, Dong C Y, et al. Asynchronously Finite-Time H∞ Control for Morphing Aircraft[J]. Transactions of the Institute of Measurement and Control, 2018, 40(16): 4330-4344.
[18] Cheng Haoyu, Dong Chaoyang, Jiang Weilai, et al. Integrated Fault Detection and Fault Tolerant Control for Morphing Aircraft[J]. Acta Armamentarii, 2017, 38(4): 711-721. (in Chinese)
[19] Yang H, Guan Y C, Ma Y J, et al. Overlapping-Decomposition-Based Control Design for Switched Full-Envelope Flight[J]. Journal of Guidance, Control, and Dynamics, 2018, 41(12): 2658-2665.
[20] Fu Zhequan, Li Xiangping, Li Shangsheng, et al. Review of Radar HRRP Target Recognition Based on Deep Learning[J]. Aero Weaponry, 2020, 27(3): 37-43. (in Chinese)
[21] Munjani J, Joshi M. A Non-Conventional Lightweight Auto Regressive Neural Network for Accurate and Energy Efficient Target Tracking in Wireless Sensor Network[J]. ISA Transactions, 2021, 115: 12-31.
[22] Xue Yanfei, Mao Qirong, Zhang Jianming. Multi-Language Speech Emotion Recognition Based on Multi-Task Learning[J]. Application Research of Computers, 2021, 38(4): 1069-1073. (in Chinese)
[23] Li R F, Hu L, Cai L. Adaptive Tracking Control of a Hypersonic Flight Aircraft Using Neural Networks with Reinforcement Synthesis[J]. Aero Weaponry, 2018(6): 3-10.
[24] Gaudet B, Linares R, Furfaro R. Deep Reinforcement Learning for Six Degree-of-Freedom Planetary Landing[J]. Advances in Space Research, 2020, 65(7): 1723-1741.
[25] Xu J, Hou Z M, Wang W, et al. Feedback Deep Deterministic Policy Gradient with Fuzzy Reward for Robotic Multiple Peg-in-Hole Assembly Tasks[J]. IEEE Transactions on Industrial Informatics, 2019, 15(3): 1658-1667.
[26] Sánchez-Sánchez C, Izzo D. Real-Time Optimal Control via Deep Neural Networks: Study on Landing Problems[J]. Journal of Guidance, Control, and Dynamics, 2018, 41(5): 1122-1135.
Intelligent Robust Control for Flight Vehicles with Asynchronous Switching
Liu Zeshi1*, Li Sining1, Zhang Jinpeng2, Zhang Xiaofeng3, Cheng Haoyu3
(1. Shenyang Aircraft Design Institute, Shenyang 110035, China; 2. China Airborne Missile Academy, Luoyang 471009, China; 3. Northwestern Polytechnical University, Xi'an 710072, China)
Abstract: The problem of intelligent robust controller design for flight vehicles with asynchronous switching is investigated based on deep reinforcement learning. A switched model of the flight vehicle over the full envelope is established from the nonlinear dynamic model by Jacobian linearization. The asynchronous switching caused by packet loss is taken into consideration, and an asynchronous dynamic model of the controllers and subsystems is introduced. A robust controller is then designed. The stability of the system is analyzed, and sufficient conditions that ensure stability with a prescribed disturbance attenuation index are given based on the average dwell time method and the multiple Lyapunov function method. The controllers are obtained by solving linear matrix inequalities. Moreover, the obtained controller is optimized by deep reinforcement learning, and the dynamic response of the system is improved while stability and robustness are preserved. Finally, numerical examples illustrate the effectiveness of the proposed method.
Key words: switching system; robust control; intelligent control; asynchronous switching; average dwell time; deep learning; flight vehicle
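Received: 2021-08-27
Foundation item: Aeronautical Science Foundation of China (20180153001; 201907053001)
Author biography: Liu Zeshi (1989-), male, from Shenyang, Liaoning Province, master's degree, senior engineer.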