Technology Application
Rale Detection Method Combining a Residual Network with an Attention Mechanism
Yang Linjian  Zhang Yu
(Guangdong University of Technology, Guangzhou 510006, Guangdong, China)
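To address the difficulty of manually selecting support vector machine (SVM) parameters and the low detection accuracy caused by the variable intensity and character of rales, a rale detection method combining a residual network with an attention mechanism is proposed. The residual network deepens the network structure to extract information at more levels, while the attention mechanism further mines channel-level and spatial features to detect rales. Experiments use breath sounds recorded with a self-developed digital stethoscope. The results show that the proposed method improves rale detection accuracy by 6.83% and 1.58% over SVM and ResNet50, respectively.
rale detection; signal processing; residual network; attention mechanism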
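According to reports from the Forum of International Respiratory Societies [1-2] and the World Health Statistics 2020 report [3], more than 1 billion people worldwide suffer from respiratory diseases, of whom 4 million die each year, making respiratory disease the third leading cause of death globally. The gap between the numbers of doctors and patients is large: 45% of member states have fewer than 1 doctor per 1000 people. In China, awareness of lung disease and the uptake of lung function testing are low [4-5].
Lung sounds, also called breath sounds, directly reflect physiological and pathological changes in the lungs and can be heard with a stethoscope [6]. Breath sounds are generally divided into normal sounds and adventitious sounds. Adventitious sounds are abnormal sounds superimposed on normal breath sounds; rales are a common adventitious breath sound [7-9] and an early sign of respiratory diseases such as asthma, pneumonia, and chronic obstructive pulmonary disease [10-13]. Rales are usually identified by a physician through auscultation, which is subjective and easily affected by external factors, whereas automatic rale detection is more objective and stable [14]. Studying rale detection methods that automatically analyze breath sounds from a digital stethoscope and flag rales promptly therefore plays a key role in the prevention and diagnosis of lung disease [15].
Researchers in China and abroad have proposed many methods for breath sound recognition. ZHANG J et al. used entropy to describe the spectral pattern of the signal to detect whether rales are present, classifying on the basis of an empirical threshold [16]. ZHANG K et al. extracted and analyzed spectral information from digitized lung recordings, analyzed the wavelet spectrum of lung signals, and extracted a set of mathematical morphology features to recognize breath sounds automatically [17]. JAKOVLJEVIĆ et al. removed noise by spectral subtraction and used mel-frequency cepstral coefficients (MFCCs) and their first derivatives as input features for a breath sound classification method based on hidden Markov models and Gaussian mixture models [18]. SERBES et al. proposed a new nonlinear spectral feature extraction algorithm that uses a non-dynamic tunable Q-factor wavelet transform to split the signal into high-frequency, low-frequency, and noise channels, then applies the short-time Fourier transform for feature extraction and feature fusion to classify breath sounds [19]. PERNA D et al. proposed a learning framework based on MFCC preprocessing and recurrent neural network (RNN) models, using long short-term memory networks and gated recurrent units as advanced RNN architectures, to detect adventitious breath sounds [20].
The data in these studies all come from textbook CDs, online tutorials, and a small amount of self-recorded audio. Because the material was produced for teaching, the sounds are relatively clean and the subjects few, so the resulting models are not robust and are poorly suited to clinical use [21]. This paper uses breath sound signals collected clinically in hospital with a digital stethoscope developed in our laboratory, and proposes a rale detection method combining a residual network (ResNet) with the convolutional block attention module (CBAM) to improve rale detection accuracy.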
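Normal breath sounds were collected from volunteers with healthy lungs, and abnormal breath sounds (rales) from patients seeking treatment. The patients include both males and females aged 0.5 to 70 years, with different lung diseases of varying severity; the collected samples were annotated by physicians. The stethoscope samples the signal at 8000 Hz, and each sample lasts 12 s. In total, 2620 lung sound samples were collected from 325 subjects, of which 2000 are normal breath sounds and 620 are rales. The dataset is divided into 5 parts by subject, and the results are evaluated with cross-subject five-fold cross-validation.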
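The cross-subject split can be illustrated with a minimal sketch. The paper does not describe how subjects were assigned to folds, so the round-robin assignment, the function name `subject_kfold`, and the toy subject IDs below are all assumptions made for illustration; the only property taken from the text is that recordings of one subject never appear in both the training and the test part of a fold.

```python
import random
from collections import defaultdict

def subject_kfold(subject_ids, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs in which no subject is shared.

    subject_ids[i] is the (hypothetical) subject label of recording i.
    """
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)
    # Assumption: subjects are dealt round-robin into k folds.
    folds = [subjects[i::k] for i in range(k)]
    for held_out in folds:
        held = set(held_out)
        test_idx = [i for s in held_out for i in by_subject[s]]
        train_idx = [i for s in subjects if s not in held
                     for i in by_subject[s]]
        yield train_idx, test_idx

# Toy data: 10 subjects with 3 recordings each.
subject_ids = [s for s in range(10) for _ in range(3)]
splits = list(subject_kfold(subject_ids, k=5))
```

Splitting by subject rather than by recording avoids the optimistic bias that arises when several recordings of the same patient fall on both sides of the split.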
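Breath sounds contain low- and high-frequency components that overlap in both the time and frequency domains. Normal breath sounds lie between 60 Hz and 600 Hz; rales lie between 100 Hz and 2500 Hz [22-24]. To confirm that the collected breath sounds are valid, the distribution of the signal is checked by spectrum analysis, as shown in Fig. 1.
Fig. 1 Spectrum analysis of breath sounds
Lung sounds are faint and easily contaminated by noise (heartbeat, friction from stethoscope contact, and so on), so an 8th-order Butterworth band-pass filter [25] is used to keep the target frequencies within 100 Hz to 2500 Hz. To exclude noise that may arise when the stethoscope touches or leaves the body, only the middle 10 s of each signal is kept.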
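A minimal sketch of this preprocessing step, assuming NumPy and SciPy. The sampling rate, pass band, and 10 s window come from the text; the single-pass `sosfilt` filtering, the function name `preprocess`, and the reading of "8th-order" as a band-pass built from a 4th-order prototype (SciPy doubles the order for band-pass designs) are assumptions, not the authors' stated implementation.

```python
import numpy as np
from scipy.signal import butter, sosfilt

FS = 8000                        # sampling rate given in the text (Hz)
LOW_HZ, HIGH_HZ = 100.0, 2500.0  # pass band given in the text
KEEP_S = 10                      # seconds kept from the middle of a recording

def preprocess(x, fs=FS):
    """Band-pass filter a recording and keep its middle KEEP_S seconds."""
    # Assumption: N=4 gives an 8th-order band-pass filter in SciPy.
    sos = butter(4, [LOW_HZ, HIGH_HZ], btype="bandpass", fs=fs, output="sos")
    y = sosfilt(sos, x)
    keep = KEEP_S * fs
    start = (len(y) - keep) // 2  # 1 s trimmed from each end of a 12 s clip
    return y[start:start + keep]

# Synthetic 12 s test signal: 50 Hz (out of band) plus 1000 Hz (in band).
t = np.arange(12 * FS) / FS
x = np.sin(2 * np.pi * 50 * t) + np.sin(2 * np.pi * 1000 * t)
y = preprocess(x)
```

For a 12 s recording at 8000 Hz (96000 samples), the crop keeps samples 8000 to 87999, which also discards the filter's start-up transient.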
For the traditional machine learning method, MFCCs are used as the input features of the SVM; for the deep residual network, the mel spectrogram [26-27] is used as the input feature of ResNet.
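Both feature types rest on the mel frequency scale. The sketch below shows the commonly used conversion m = 2595 log10(1 + f/700) and how filter center frequencies are spaced over the 100 Hz to 2500 Hz band kept after filtering. The paper does not report its mel parameters, so the choice of this mel formula, the function names, and the value `n_mels=8` are illustrative assumptions only.

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (commonly used HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_band_centers(f_min, f_max, n_mels):
    """Center frequencies (Hz) of n_mels filters equally spaced in mels."""
    m_lo, m_hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_hi - m_lo) / (n_mels + 1)
    return [mel_to_hz(m_lo + step * (i + 1)) for i in range(n_mels)]

# Hypothetical setting: 8 mel bands across the filtered band.
centers = mel_band_centers(100.0, 2500.0, n_mels=8)
```

The centers are dense at low frequencies and sparse at high frequencies, so a mel spectrogram gives finer resolution to the lower part of the band.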
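The ResNet [28] structure is easy to modify and extend: adjusting the number of channels within the residual blocks and the number of stacked blocks changes the width and depth of the network, giving networks of different representational capacity. Given enough training data, the network can be deepened step by step to obtain better performance without concern for the degradation problem.
CBAM [29] is a convolutional block attention module that combines spatial and channel attention. Tests on ImageNet-1K showed that adding CBAM reduces the classification error rate of most networks to some extent, and Grad-CAM visualization showed that networks with CBAM focus their attention more accurately on the target object to be classified [29].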
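The two attention steps of CBAM can be sketched in NumPy as follows, following the description in [29]: channel attention from a shared MLP applied to average- and max-pooled channel descriptors, then spatial attention from a convolution over the channel-wise average and max maps. The weights here are random placeholders, and the tensor sizes, reduction ratio `r`, and function names are assumptions for illustration; this is not the trained module used in the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(feat, w1, w2):
    """feat: (C, H, W); w1: (C//r, C) and w2: (C, C//r) form the shared MLP."""
    avg = feat.mean(axis=(1, 2))          # (C,) average-pooled descriptor
    mx = feat.max(axis=(1, 2))            # (C,) max-pooled descriptor
    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)   # one hidden layer with ReLU
    return sigmoid(mlp(avg) + mlp(mx))    # (C,) weights in (0, 1)

def spatial_attention(feat, kernel):
    """kernel: (2, k, k), convolved over the [avg, max] channel-pooled maps."""
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    h, w = feat.shape[1:]
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)                   # (H, W) weights in (0, 1)

def cbam(feat, w1, w2, kernel):
    """Apply channel attention first, then spatial attention."""
    feat = feat * channel_attention(feat, w1, w2)[:, None, None]
    return feat * spatial_attention(feat, kernel)[None, :, :]

# Placeholder sizes and random weights (assumptions, not from the paper).
rng = np.random.default_rng(0)
C, H, W, r = 8, 6, 5, 4
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1
refined = cbam(x, w1, w2, kernel)
```

Both attention maps lie in (0, 1), so the module only rescales the feature map: channels and positions judged informative are kept close to their original values while the rest are attenuated.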
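The intensity and character of rales change easily, their location shifts readily, and their number can rise or fall markedly within moments. To mine deeper features of rales, this paper combines ResNet50 with CBAM into the rale detection model CBAM-ResNet. Fig. 2 shows CBAM added to a residual block (ResBlock), and Fig. 3 shows the rale detection pipeline. The raw audio signal is preprocessed and converted by feature extraction into a mel spectrogram, which serves as the network input; the output indicates whether the signal contains rales.
Fig. 2 CBAM added to a residual block [29]
Fig. 3 Rale detection pipeline combining ResNet50 and CBAM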
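To improve model accuracy and avoid overfitting, the adaptive learning rate optimization algorithm Adam [30] is used. Because normal breath sounds and rales are imbalanced, the α-balanced focal loss [31] is used as the loss function, which as defined in [31] is

$$\mathrm{FL}(p_t) = -\alpha_t \,(1 - p_t)^{\gamma} \log(p_t)$$

where $p_t$ is the predicted probability of the true class, $\alpha_t$ is the class-balancing weight, and $\gamma$ is the focusing parameter.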
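A minimal sketch of the α-balanced focal loss of [31] for a single binary prediction, computing -α_t (1 - p_t)^γ log(p_t). The paper does not report its α and γ, so the defaults below (α = 0.25, γ = 2, the values commonly quoted from [31]) and the function name `focal_loss` are assumptions for illustration.

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Alpha-balanced focal loss for one binary prediction.

    p: predicted probability of the positive class (here, rale present).
    y: ground-truth label, 1 (rale) or 0 (normal).
    alpha, gamma: assumed defaults, not values reported in the paper.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

easy = focal_loss(0.9, 1)  # confidently correct rale prediction
hard = focal_loss(0.1, 1)  # rale predicted as normal
```

The factor (1 - p_t)^γ shrinks the loss of well-classified samples, so the many easy normal recordings contribute little and training concentrates on the harder, rarer rale samples. With γ = 0 and α_t = 1 the expression reduces to ordinary cross-entropy.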
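Table 1 shows the five-fold cross-validation results of SVM, ResNet50, and CBAM-ResNet for rale detection.
Table 1 Experimental results
Table 1 shows that CBAM-ResNet improves rale detection accuracy by 6.83% and 1.58% over SVM and ResNet50, respectively. This indicates that ResNet50 performs better than SVM for rale detection, and that CBAM, which weighs both the importance of pixels across channels and the importance of pixels at different positions within a channel, yields better classification performance. Overall, the CBAM-ResNet method achieves good rale detection results.
To handle the variable intensity and character of rales, this paper proposes CBAM-ResNet, a rale detection method combining a residual network (ResNet) with the convolutional block attention module (CBAM), and evaluates it on clinical breath sounds recorded with a self-developed digital stethoscope. The experimental results show that the proposed method achieves higher rale detection accuracy than the support vector machine method, which requires manually chosen parameters.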
[1] Forum of International Respiratory Societies. The Global Impact of Respiratory Disease: Second Edition[M]. Sheffield: European Respiratory Society, 2017.
[2] “World Lung Day 2019,” Forum of International Respiratory Societies[DB/OL]. (2019-09-25) [2020-10-14]. https://www.thoracic.org/about/newsroom/press-releases/journal/2019/world-lung-day-2019-respiratory-groups-unite-to-call-for-healthy-lungs-for-all.php.
[3] “World Health Statistics 2020: Monitoring health for the SDGs, sustainable development goals,” World Health Organization, Tech. Rep., 2020 [DB/OL]. (2020-05-23) [2020-10-14]. http://www.who.int/gho/publications/world_health_statistics/2020/en/.
[4] LIWEN F, PEI G, HELING B, et al. Chronic obstructive pulmonary disease in China: a nationwide prevalence study[J]. Lancet Respiratory Medicine, 2018, 6(6): 421-430.
[5] BRUSSELLE G G, KO F W. Prevalence and burden of asthma in China: time to act[J]. The Lancet, 2019, 394(10196): 364-366.
[6] EARIS J E, CHEETHAM B M G. Current methods used for computerized respiratory sound analysis[J]. European Respiratory Review, 2000, 10(77):586-590.
[7] SOVIJARVI A R A, DALMASSO F, VANDERSCHOOT J, et al. Definition of terms for applications of respiratory sounds[J]. Nki Distance Education, 2000, 10(6): 138-165.
[8] SARKAR M, MADABHAVI I, NIRANJAN N, et al. Auscultation of the respiratory system[J]. Annals of Thoracic Medicine, 2015, 10(3): 158-168.
[9] BOHADANA A, IZBICKI G, KRAMAN S S. Fundamentals of lung auscultation[J]. New England Journal of Medicine, 2014, 370(8): 744-751.
[10] GURUNG A, SCRAFFORD C G, TIELSCH J M, et al. Computerized lung sound analysis as diagnostic aid for the detection of abnormal lung sounds: a systematic review and meta-analysis[J]. Respiratory Medicine, 2011, 105(9): 1396-1403.
[11] PACIEJ R, VYSHEDSKIY A, BANA D, et al. Squawks in pneumonia[J]. Thorax, 2004, 59(2): 177-178.
[12] MUNAKATA M, UKITA H, DOI I, et al. Spectral and waveform characteristics of fine and coarse crackles[J]. Thorax, 1991, 46(9): 651-657.
[13] SOVIJARVI A R A. Characteristics of breath sounds and adventitious respiratory sounds[J]. Eur Respir Rev, 2000, 10: 591-596.
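[14] LI Zhenzhen, WU Xiaoming. Rale signal detection algorithm based on S-transform[J]. Journal of South China University of Technology (Natural Science Edition), 2013, 41(06): 1-5.
[15] LI Zhenzhen, WU Xiaoming. Research progress in signal processing of adventitious respiratory sounds[J]. Journal of Biomedical Engineering, 2013, 30(05): 1131-1135.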
[16] ZHANG J, SER W, YU J, et al. A novel wheeze detection method for wearable monitoring systems[C]//2009 International Symposium on Intelligent Ubiquitous Computing and Education. IEEE, 2009: 331-334.
[17] ZHANG K, WANG X, HAN F, et al. The detection of crackles based on mathematical morphology in spectrogram analysis[J]. Technology and Health Care, 2015, 23(S2): S489-S494.
[18] JAKOVLJEVIĆ N, LONČAR-TURUKALO T. Hidden Markov Model Based Respiratory Sound Classification[M]//Precision Medicine Powered by pHealth and Connected Health. 2017.
[19] SERBES G, ULUKAYA S, KAHYA Y P. An automated lung sound preprocessing and classification system based on spectral analysis methods[M]//Precision Medicine Powered by pHealth and Connected Health. Springer, Singapore, 2018: 45-49.
[20] PERNA D, TAGARELLI A. Deep auscultation: Predicting respiratory anomalies and diseases via recurrent neural networks[C]//2019 IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS). IEEE, 2019: 50-55.
[21] ADHI P R X, STUART B, ESTHER R V, et al. Automatic adventitious respiratory sound analysis: a systematic review[J]. PLOS ONE, 2017, 12(5): e0177926.
[22] WEISS E B, CARLSON C J. Recording of breath sounds[J]. American Review of Respiratory Disease, 1972, 105(5): 835-839.
[23] GAVRIELY N, PALTI Y, ALROY G. Spectral characteristics of normal breath sounds[J]. Journal of applied physiology, 1981, 50(2): 307-314.
[24] FORGACS P, NATHOO A R, RICHARDSON H D. Breath sounds[J]. Thorax, 1971, 26(3): 288-295.
[25] SELESNICK I W, BURRUS C S. Generalized digital butterworth filter design[J]. IEEE Trans. Signal Process, 1998, 46(6):1688–1694.
[26] ROCHA B M, FILOS D, MENDES L, et al. An open access database for the evaluation of respiratory sound classification algorithms[J]. Physiological Measurement, 2019, 40(3): 035001.
[27] SHI L, DU K, ZHANG C, et al. Lung Sound Recognition Algorithm Based on VGGish-BiGRU[J]. IEEE Access, 2019, (99): 1.
[28] HE K, ZHANG X, REN S, et al. Deep Residual Learning for Image Recognition[C]//IEEE Conference on Computer Vision & Pattern Recognition. IEEE Computer Society, 2016.
[29] WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional block attention module[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 3-19.
[30] KINGMA D P, BA J. Adam: A method for stochastic optimization[J]. arXiv preprint arXiv:1412.6980, 2014.
[31] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal Loss for Dense Object Detection[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2020, 42(2): 318-327.
Rale Detection Method Based on Residual Network and Attention Mechanism
Yang Linjian  Zhang Yu
(Guangdong University of Technology, Guangzhou 510006, China)
To address the difficulty of manually selecting support vector machine (SVM) parameters and the poor detection accuracy caused by the variable intensity and character of rales, a rale detection method based on a residual network and an attention mechanism is proposed. The residual network deepens the network structure to extract information at more levels, while the attention mechanism further mines channel-level and spatial features to detect rales. A self-developed digital stethoscope was used to record a total of 2620 breath sounds from 325 subjects. The experimental results show that, compared with SVM and ResNet50, the proposed method improves rale detection accuracy by 6.83% and 1.58%, respectively.
rales detection; signal processing; ResNet; attention mechanism
CLC number: TP391
Document code: A
Article ID: 1674-2605(2021)01-0007-05
DOI: 10.3969/j.issn.1674-2605.2021.01.007
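Yang Linjian, male, born in 1994, master's student. Main research interests: pattern recognition, machine learning, and biosignal processing. E-mail: 429667439@qq.com
Zhang Yu, male, born in 1992, master's student. Main research interests: pattern recognition, machine learning, and biosignal processing.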