李鮮 王艷 羅勇 周激流
Abstract: To address the low gray-level contrast and blurred organ and tissue boundaries in medical images, a new Random Forest (RF) feature selection algorithm is proposed for segmenting nasopharyngeal neoplasms in MR images. First, gray-level, texture, and geometry features are extracted from the images to construct an initial random forest classifier. Then, using the feature importance measure of the random forest, an improved feature selection method is applied to the original handcrafted feature set. Finally, the resulting optimal feature subset is used to construct a new random forest classifier that segments the test images. Experimental results show that the proposed algorithm achieves a Dice coefficient of 79.197%, an accuracy (Acc) of 97.702%, a sensitivity (Sen) of 72.191%, and a specificity (Sp) of 99.502%. Comparison with segmentation algorithms based on the conventional random forest and on a Deep Convolutional Neural Network (DCNN) shows that the proposed feature selection algorithm effectively extracts useful information from nasopharyngeal neoplasm MR images and considerably improves segmentation accuracy when only a small number of samples is available.
Key words: nasopharyngeal neoplasms; random forest; feature importance; feature selection; optimal feature subset
CLC number: TP391.41
Document code: A
0 Introduction
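Medical image segmentation is one of the most active topics in image processing, and accurate segmentation is an important prerequisite for subsequent treatment. However, because current medical imaging commonly suffers from low gray-level contrast and blurred organ and tissue boundaries, segmentation accuracy has remained difficult to improve.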
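Nasopharyngeal neoplasms are among the most common head and neck tumors, with a relatively high incidence worldwide and particularly in Guangdong, China. Compared with tumors at other sites, nasopharyngeal neoplasms have a complex structure, are surrounded by many blood vessels, lymphatic vessels, and glands, and vary considerably in shape and size between patients. In current clinical practice, segmentation therefore usually relies on physicians delineating the tumor manually using their knowledge of anatomy and tumor morphology, a tedious and time-consuming process that is highly subjective and poorly reproducible [1-4]. Researchers in the field have consequently long sought automatic or semi-automatic methods for the accurate segmentation of nasopharyngeal neoplasms.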
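Zhou et al. [5-7] have done extensive work on nasopharyngeal neoplasm segmentation over more than a decade. In [5] they proposed a knowledge-based fuzzy clustering method: a semi-supervised fuzzy C-means algorithm first produces an initial segmentation, and the final result is then obtained from three kinds of spatial anatomical information, namely symmetry, connectivity, and cluster centers. In [6] a new image texture measure was proposed to improve the algorithm of [5]. In [7] a two-class Support Vector Machine (SVM) model was constructed to segment nasopharyngeal neoplasm images. Lee et al. [8] segmented nasopharyngeal neoplasms using image masking, Bayesian probability statistics, threshold smoothing, and seed growing. Chanapai [9] first localized the slices of the nasopharyngeal neoplasm images and divided the tumor into three parts according to slice position, then used the Self-Organizing Map (SOM) technique to construct representation maps of the three parts from which the initial tumor region was built, and finally obtained the segmentation with a region growing algorithm. Huang et al. [10] proposed a hybrid algorithm that combines Adaboost, SVM, and a Bayes classifier. Hong et al. [11] proposed an improved region growing method: starting from automatic region growing segmentation, it uses a probability matrix to generate the initial seeds automatically and the SUSAN (Smallest Univalue Segment Assimilating Nucleus) operator as the termination criterion for region growing, and thereby segments Magnetic Resonance (MR) images of nasopharyngeal neoplasms. Huang et al. [12] first delineated an initial tumor boundary using distance regularized level set evolution and then obtained the final segmentation with a maximum-entropy hidden Markov random field. References [13-15] introduced current mainstream deep learning methods into nasopharyngeal neoplasm segmentation and achieved some success.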
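The Random Forest (RF) algorithm was first proposed by Breiman [16]. Because it is simple to implement, fast to train, resistant to overfitting, and amenable to parallel processing, it has been widely applied in data processing, text classification, semantic segmentation, and other fields.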
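On this basis, this paper proposes a random forest feature selection algorithm for the segmentation of nasopharyngeal neoplasm MR images; the workflow of the algorithm is shown in Fig. 1. Comparison with segmentation methods based on the ordinary RF and on a Deep Convolutional Neural Network (DCNN) shows that the proposed algorithm effectively improves the segmentation accuracy of nasopharyngeal neoplasms.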
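The workflow outlined above (initial RF, importance ranking, feature selection, final RF) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn and NumPy, replaces the handcrafted per-pixel features with synthetic data from `make_classification`, and substitutes a plain top-k search over the impurity-based importance ranking for the paper's improved selection method, which this section does not specify. The helper names `segmentation_metrics` and `select_features` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def segmentation_metrics(y_true, y_pred):
    """Dice, accuracy, sensitivity and specificity for binary labels (1 = tumor)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = int(np.sum(y_true & y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    # max(..., 1) avoids division by zero; an empty denominator yields 0.0
    return {
        "dice": 2 * tp / max(2 * tp + fp + fn, 1),
        "acc": (tp + tn) / max(tp + tn + fp + fn, 1),
        "sen": tp / max(tp + fn, 1),
        "sp": tn / max(tn + fp, 1),
    }


def select_features(X_train, y_train, X_val, y_val, n_trees=50, seed=0):
    """Rank features with an initial RF, then keep the top-k prefix of the
    ranking that gives the best validation Dice (assumed stand-in criterion)."""
    initial_rf = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    initial_rf.fit(X_train, y_train)
    ranking = np.argsort(initial_rf.feature_importances_)[::-1]  # most important first

    best_dice, best_subset = -1.0, ranking
    for k in range(1, len(ranking) + 1):
        subset = ranking[:k]
        rf = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
        rf.fit(X_train[:, subset], y_train)
        dice = segmentation_metrics(y_val, rf.predict(X_val[:, subset]))["dice"]
        if dice > best_dice:
            best_dice, best_subset = dice, subset
    return best_subset


# Synthetic stand-in for per-pixel handcrafted features (assumption, not the paper's data)
X, y = make_classification(n_samples=600, n_features=12, n_informative=4,
                           n_redundant=2, weights=[0.8, 0.2], random_state=0)
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=0)

# Final classifier trained only on the selected feature subset
subset = select_features(X_train, y_train, X_val, y_val)
final_rf = RandomForestClassifier(n_estimators=50, random_state=0)
final_rf.fit(X_train[:, subset], y_train)
scores = segmentation_metrics(y_test, final_rf.predict(X_test[:, subset]))
print(len(subset), scores)
```

In a real segmentation setting each row of `X` would be the feature vector of one pixel or voxel, and the validation split would normally be made per patient rather than per pixel so that pixels from one image do not appear on both sides of the split.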
5 Conclusion
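Using the feature importance measure of the random forest, this paper proposes a new method for segmenting nasopharyngeal neoplasm MR images. The method selects and optimizes the original handcrafted features and thereby achieves better segmentation of nasopharyngeal neoplasm MR images. The algorithm still has room for improvement. As a next step, we plan to exploit the feature learning capability of deep learning by combining deep learning algorithms with RF, so as to extract mid-level and high-level semantic features from the images and further improve the segmentation accuracy of nasopharyngeal neoplasm MR images.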
References
[1] JIANG J. The research on tumor cosegmentation using multimodal images[D]. Guangzhou: Southern Medical University, 2014: 40-45. (in Chinese)
[2] CHU E A, WU J M, TUNKEL D E, et al. Nasopharyngeal carcinoma: the role of the Epstein-Barr virus[J]. Medscape Journal of Medicine, 2008, 10(7): 165.
[3] KLEIN G, KASHUBA E. Nasopharyngeal carcinoma[J]. Brenner's Encyclopedia of Genetics, 2013, 13(1): 4-5.
[4] CHUA M L K, WEE J T S, HUI E P, et al. Nasopharyngeal carcinoma[J]. Lancet, 2015, 387(10022): 1012.
[5] ZHOU J, LIM T K, CHONG V. Tumor volume measurement for nasopharyngeal carcinoma using knowledge-based fuzzy clustering MRI segmentation[J]. Proceedings of SPIE, 2002, 4684: 1698-1708.
[6] ZHOU J, LIM T K, CHONG V, et al. A texture combined multispectral magnetic resonance imaging segmentation for nasopharyngeal carcinoma[J]. Optical Review, 2003, 10(5): 405-410.
[7] ZHOU J, CHAN K L, XU P, et al. Nasopharyngeal carcinoma lesion segmentation from MR images by support vector machine[C]// Proceedings of the 3rd IEEE International Symposium on Biomedical Imaging: Nano to Macro. Piscataway, NJ: IEEE Press, 2006: 1364-1367.
[8] LEE F K, YEUNG D K, KING A D, et al. Segmentation of NasoPharyngeal Carcinoma (NPC) lesions in MR images[J]. International Journal of Radiation Oncology Biology Physics, 2005, 61(2): 608-620.
[9] CHANAPAI W. Nasopharyngeal carcinoma segmentation using a region growing technique[J]. International Journal of Computer Assisted Radiology and Surgery, 2012, 7(3): 413-422.
[10] HUANG W C, LIU C L. A hybrid supervised learning nasal tumor discrimination system for DMRI[J]. Journal of the Chinese Institute of Engineers, 2012, 35(6): 723-733.
[11] HONG R R, YE S Z. Segmentation of nasopharyngeal MR medical image based on improved region growing[J]. Journal of Fuzhou University (Natural Science Edition), 2014, 42(5): 683-687. (in Chinese)
[12] HUANG K W, ZHAO Z Y, GONG Q, et al. Nasopharyngeal carcinoma segmentation via HMRF-EM with maximum entropy[C]// Proceedings of the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. Piscataway, NJ: IEEE, 2015: 2968-2972.
[13] FENG A, CHEN Z H, WU X, et al. From convolutional to recurrent: case study in nasopharyngeal carcinoma segmentation[C]// Proceedings of the 2017 International Conference on the Frontiers and Advances in Data Science. Piscataway, NJ: IEEE, 2017: 18-22.
[14] MEN K, CHEN X Y, ZHANG Y, et al. Deep deconvolutional neural network for target segmentation of nasopharyngeal cancer in planning computed tomography images[J]. Frontiers in Oncology, 2017, 7: 315.
[15] WANG Y, ZU C, HU G, et al. Automatic tumor segmentation with deep convolutional neural networks for radiotherapy applications[J]. Neural Processing Letters, 2018, 48(3): 1323-1334.
[16] BREIMAN L. Random forests[J]. Machine Learning, 2001, 45(1): 5-32.
[17] BREIMAN L. Bagging predictors[J]. Machine Learning, 1996, 24(2): 123-140.
[18] BREIMAN L. Manual on setting up, using, and understanding random forests V3.1[EB/OL]. [2012-05-05]. http://oz.berkeley.edu/users/breiman/Using_random_forests_V3.1.pdf.
[19] UWE H, RALF M, MICHAEL K B, et al. A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data[J]. BMC Bioinformatics, 2009, 10(1): 1-16.
[20] ALTMANN A, TOLOSI L, SANDER O, et al. Permutation importance: a corrected feature importance measure[J]. Bioinformatics, 2010, 26(10): 1340-1347.
[21] CAROLIN S, ANNELAURE B, THOMAS K, et al. Conditional variable importance for random forests[J]. BMC Bioinformatics, 2008, 9(1): 307.
[22] YAO D J. Research on feature selection and classification method based on random forest for medical datasets[D]. Harbin: Harbin Engineering University, 2016: 71-88. (in Chinese)