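Optimized SOLO segmentation algorithm for the green fruits of persimmons and apples in complex environments
Jia Weikuan1,2, Li Qianwen1, Zhang Zhonghua1, Liu Guoliang3, Hou Sujuan1, Ji Ze4, Zheng Yuanjie1※
(1. School of Information Science and Engineering, Shandong Normal University, Jinan 250358, China; 2. Key Laboratory of Facility Agriculture Measurement and Control Technology and Equipment of Machinery Industry, Zhenjiang 212013, China; 3. School of Control Science and Engineering, Shandong University, Jinan 250061, China; 4. School of Engineering, Cardiff University, Cardiff CF24 3AA, UK)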
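To achieve accurate segmentation of green persimmon and green apple fruits in complex orchard environments, this study proposes an optimized green fruit segmentation algorithm based on SOLO. First, a split-attention network (ResNeSt) is used as the backbone of SOLO to extract green fruit features. Second, to better handle the multi-scale nature of these features, a Feature Pyramid Network (FPN) is introduced to form a combined ResNeSt+FPN structure. Finally, the SOLO head is divided into a category prediction branch and a mask generation branch: while the category prediction branch predicts the semantic category, the mask generation branch produces the instance segmentation of the green fruits. Experimental results show that the average recall and precision of the optimized SOLO segmentation algorithm reach 94.84% and 96.16%, respectively, and the average segmentation time per green fruit image on a Graphics Processing Unit (GPU) is 0.14 s. In comparative experiments, the recall of the optimized SOLO segmentation algorithm is 1.63, 1.74, 2.23, and 6.52 percentage points higher than that of the Optimized Mask Region Convolutional Neural Network (Optimized Mask R-CNN), SOLO, the Mask Region Convolutional Neural Network (Mask R-CNN), and Fully Convolutional Instance-aware Semantic Segmentation (FCIS), respectively; precision is 1.10, 1.47, 2.61, and 6.75 percentage points higher; and segmentation time is 0.06, 0.04, 0.11, and 0.13 s shorter. The algorithm can serve as a theoretical reference for fruit segmentation in other fruits and vegetables and broaden the application of orchard yield estimation and robotic harvesting.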
image segmentation; image processing; algorithms; feature pyramid networks; green fruits
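The vision system is a key component of a fruit and vegetable harvesting robot, and its segmentation accuracy and speed strongly affect harvesting efficiency; accurate segmentation of the target fruit has therefore become central to vision system research. Real orchard environments, however, are complex and unstructured. Camera angle and fruit growth posture lead to occluded or overlapping fruit; changes in lighting conditions and illumination angle produce backlit images; and green fruit is similar in color to the surrounding branches and leaves, so fruit is easily missed. All of these factors degrade segmentation and make accurate, fast fruit segmentation challenging. Even so, the problem has attracted researchers in China and abroad, and some progress has been made [1], for example in automatic counting of green citrus [2-3], recognition of green peaches [4], and recognition of overlapping green apples [5].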
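Traditional machine learning algorithms have made important contributions to target fruit segmentation. Ahmad et al. [6] proposed a robust apple image segmentation algorithm based on a fuzzy inference system and fuzzy C-means, which segments target fruit of different colors over the apple growing season and improves generalization. Liu Xiaoyang et al. [7] proposed an apple segmentation algorithm based on superpixel features that handles unevenly colored fruit, with a segmentation accuracy of 0.921 4. Lyu et al. [8] proposed a segmentation algorithm for bagged green apple images that extracts the normally lit and highlighted regions of the target fruit and combines them to segment the fruit efficiently. Ji et al. [9] proposed an apple image segmentation algorithm based on region growing and color features and designed an apple recognition algorithm based on a Support Vector Machine (SVM), with a recognition success rate of about 89% and an average recognition time of 0.352 s. Ji Wei et al. [10] used an edge-preserving Retinex image enhancement algorithm based on guided filtering to segment apple targets, which supports segmentation and recognition in night-time images. These algorithms achieve good segmentation under certain conditions, but some of them place relatively strict requirements on environmental conditions when learning target features, so their performance falls short when segmenting green target fruit in real, complex orchard environments.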
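With the rapid development of deep learning theory and computer hardware, many computer vision problems are now solved with deep neural networks. Their end-to-end processing greatly improves accuracy and robustness, and many derived algorithms have been widely applied to image segmentation with good results [11-13]. Deep learning has also attracted researchers in agriculture [14-17] and opened new directions for fruit segmentation. Jia et al. [18] addressed overlapping apples with a fruit recognition algorithm based on an optimized Mask R-CNN, which fuses a residual network (ResNet) and a densely connected convolutional network (DenseNet) as the feature extraction backbone and improves detection accuracy for apples that overlap or are occluded by branches and leaves. Wang Dandan et al. [19] proposed an apple recognition method based on a region-based fully convolutional network with ResNet-44 as the backbone, reaching 95.1% recognition accuracy on a test set containing occluded, blurred, and overlapping apples. Kang et al. [20] optimized a dual-attention fully convolutional Siamese network that also performs semantic segmentation of branches; it obtained a score of 0.83 for apple detection and scores of 86.5% and 75.5% for apple and branch segmentation, respectively. Bargoti et al. [21] segmented apple images with a multi-scale multilayer perceptron and a convolutional neural network to extract apple targets, with detection accuracy above 0.9, but could not identify every fruit in a cluster. Liu et al. [22] proposed a method for identifying and counting visible citrus and apple fruit in image sequences, using the Hungarian algorithm to track fruit across frames and structure from motion to estimate the 3D position and size of each fruit and remove false positives. In unstructured apple orchards, where natural light, weather, green fruit color, acquisition angle, and sample size all play a role, the accuracy, robustness, and applicability of these algorithms are much better than those of traditional machine vision algorithms, yet they still struggle to meet the real-time requirements of orchard yield estimation and automatic harvesting; both recognition accuracy and running efficiency need further improvement.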
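In summary, to address green fruit image segmentation in complex orchard environments, this study builds separate green persimmon and green apple datasets and proposes an optimized SOLO segmentation algorithm. The main idea is to bring the location and size information of green fruit into feature extraction and to run the category prediction branch and the mask generation branch in parallel at the segmentation stage, improving both the accuracy and the efficiency of green fruit segmentation.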
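Images of green persimmons and green apples were collected on the hill behind Shandong Normal University (Changqing Lake Campus) and at the Longwangshan apple production base in Fushan District, Yantai City, Shandong Province, respectively. The camera was a Canon EOS 80D digital SLR (80D, Canon Inc., Japan); images had a resolution of 6 000×4 000 pixels and were saved as 24-bit color .jpg files. Images were taken under natural light during the day (7:00-17:00) and under LED lighting at night (19:00-22:00). In total, 568 green persimmon images and 498 green apple images were collected, covering night, overlap, backlight, front light, occlusion, and after-rain conditions, as shown in Fig. 1.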
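Table 1 gives the distribution of green fruit image samples across collection conditions. To meet real-time requirements and reduce computation, the image resolution was reduced from 6 000×4 000 to 600×400 pixels. The images were annotated with LabelMe and converted into a COCO-format dataset [23]. First, the edge contour of each green fruit was traced with LabelMe annotation points to generate a label. These points divide the image into two parts: the inside is the green target fruit and the outside is background. All annotation information, such as labels and point coordinates, was then saved to a .json file corresponding to the original image. Finally, the .json files were converted into a COCO-format dataset with LabelMe. The persimmon and apple datasets were each split into training and test sets at a ratio of 7:3, giving 398 training and 170 test images for persimmon and 348 training and 150 test images for apple.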
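A simpler background makes fruit segmentation easier, but real orchards are complex: the collected images have cluttered backgrounds and fruit in varied poses. Occlusion by branches and leaves, overlap, backlight, night, and after-rain conditions, together with the color similarity between green fruit and background, blur fruit boundaries and hinder accurate segmentation. Given these characteristics of green fruit and orchard environments, an optimized SOLO segmentation algorithm is proposed for efficient and accurate green fruit segmentation. Its backbone uses a split-attention network (ResNeSt) to extract image features, strengthening feature propagation, reuse, and fusion between earlier and later layers. Because fruit sizes vary, a Feature Pyramid Network (FPN) is added after ResNeSt to map fruit of different sizes to feature maps at different levels, which addresses the multi-scale problem. The image features from the combined ResNeSt+FPN structure are fed to the two branches of the optimized SOLO segmentation algorithm: the category prediction branch predicts the semantic category, and the mask generation branch segments object instances. The core idea is to segment the image by fruit location. The image is divided into an $S \times S$ grid; if the center of an object falls in a grid cell, that cell is responsible for predicting the object's semantic category and for assigning a location category to each pixel, which finally yields the segmentation map of the green fruit. The workflow of the optimized SOLO segmentation algorithm is shown in Fig. 2.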
1.2.1 Backbone network (ResNeSt)
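The optimized SOLO segmentation algorithm uses ResNeSt as the backbone to extract features of the green fruit in an image. This network outperforms other networks of similar complexity and can substantially improve accuracy. ResNeSt is a Convolutional Neural Network (CNN) that improves on the residual network (ResNet). It is built from multiple split-attention blocks, each of which enables information exchange across feature-map groups. Stacking split-attention blocks in the ResNet style gives ResNeSt, which retains the overall ResNet structure.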
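The split-attention block is the computational unit and consists of a feature-map grouping step and a split-attention operation. Feature-map grouping divides the features into groups; the number of feature-map groups is given by the cardinality hyperparameter $K$, and the resulting groups are called cardinal groups. A radix hyperparameter $R$ is then introduced, which gives the number of splits within each cardinal group. The input feature map is thus divided along the channel dimension into $G = KR$ feature-map groups. A series of transformations $\{\mathcal{F}_1, \mathcal{F}_2, \ldots, \mathcal{F}_G\}$ is applied to the individual groups, so that for $i \in \{1, 2, \ldots, G\}$ the intermediate representation of each group is $U_i = \mathcal{F}_i(X)$. Each transformation is implemented with 1×1 and 3×3 convolutions.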
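Next, the split-attention operation fuses the splits by element-wise summation, giving a combined representation for each cardinal group.
The representation of the $k$-th cardinal group is given by Eq. (1).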
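Equation (1) itself did not survive in this copy. In the ResNeSt formulation that this section follows, with $K$ cardinal groups, $R$ splits per cardinal group, and $U_j$ the intermediate representation of the $j$-th of the $G = KR$ feature-map groups, the fused representation of the $k$-th cardinal group is an element-wise sum over its splits. The expression below is therefore a reconstruction under that assumption, not a verbatim copy of the authors' equation:

```latex
\hat{U}^{k} = \sum_{j=R(k-1)+1}^{Rk} U_{j}, \qquad k \in \{1, 2, \ldots, K\}
```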
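Finally, the cardinal-group representations are concatenated along the channel dimension: $V = \mathrm{Concat}\{V^1, V^2, \ldots, V^K\}$, where Concat denotes concatenation and $V^1, V^2, \ldots, V^K$ are the cardinal-group representations. As in a standard residual block, if the input and output feature maps have the same shape, a shortcut connection produces the final output of the split-attention block: $Y = V + X$. For blocks with a stride, a suitable transformation is applied to the shortcut to align the output shape: $Y = V + \mathcal{T}(X)$, where $\mathcal{T}$ is a strided convolution or a convolution combined with pooling. ResNeSt strengthens feature propagation, reuse, and fusion between earlier and later layers, reduces overfitting, and generalizes well; it can be used directly in downstream tasks without extra computational cost.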
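The grouping, per-group summation, and concatenation described above can be sketched in a few lines of NumPy. This is only an illustration under simplifying assumptions: the learned 1×1 and 3×3 convolutions are replaced by the identity, the splits are fused by a plain sum with no learned attention weights, and the shortcut connection is left out because without the convolutions the output has fewer channels than the input. The function name `cardinal_group_sums` is hypothetical.

```python
import numpy as np

def cardinal_group_sums(x, K, R):
    """Split a (C, H, W) feature map into G = K * R groups along channels,
    sum the R splits inside each of the K cardinal groups, and concatenate
    the K results along the channel axis. Returns shape (C // R, H, W)."""
    C = x.shape[0]
    G = K * R
    if C % G != 0:
        raise ValueError("channel count must be divisible by K * R")
    groups = np.split(x, G, axis=0)            # G arrays of shape (C/G, H, W)
    fused = []
    for k in range(K):
        splits = groups[k * R:(k + 1) * R]     # the R splits of cardinal group k
        fused.append(np.sum(splits, axis=0))   # element-wise sum of the splits
    return np.concatenate(fused, axis=0)       # concatenate cardinal groups

# Example: 8 channels, K = 2 cardinal groups, R = 2 splits each.
x = np.arange(8, dtype=np.float64).reshape(8, 1, 1)
v = cardinal_group_sums(x, K=2, R=2)
```

In the full block, the per-group transformations and attention weights are learned, and a final convolution restores the channel count so that the shortcut can be added.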
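Shallow features of green fruit can separate fruit from background, but because fruit sizes differ, fruit boundaries are blurred, and deeper features are needed to recover clear boundaries for fruit of different sizes. Therefore, after ResNeSt extracts initial image features, an FPN is added to handle the multi-scale problem in green fruit segmentation. The FPN defines an assignment strategy across scales and assigns each fruit to a pyramid level according to its scale: large fruit is segmented from the topmost feature map, and as fruit scale decreases, the pyramid level responsible for it decreases accordingly. The multi-level feature maps produced by the FPN all take part in segmentation, which improves results for fruit of different scales and, to some extent, alleviates overlap between fruit.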
1.2.2 Category prediction and mask generation
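In semantic category prediction, for each grid cell the optimized SOLO algorithm predicts a $C$-dimensional output that represents the semantic class probabilities, where $C$ is the number of classes. With the input green fruit feature map divided into an $S \times S$ grid, the output space is $S \times S \times C$. This design assumes that each cell of the $S \times S$ grid belongs to exactly one instance and therefore to exactly one semantic category. During inference, the $C$-dimensional output gives the class probability for each object instance. Grid cell $(i, j)$ is treated as a positive sample if it falls into the center region of any ground-truth mask, and as a negative sample otherwise.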
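While the category prediction branch runs, the mask generation branch generates the corresponding green fruit instance masks in parallel. For the $S \times S$ grid cells of an input image, at most $S^2$ masks are predicted, and these masks are encoded explicitly along the third (channel) dimension of the output tensor; that is, the instance mask output has dimension $H_I \times W_I \times S^2$, where $H_I$ and $W_I$ are the height and width of the mask output. The $k$-th channel is responsible for segmenting the instance at grid cell $(i, j)$, where $k = i \cdot S + j$. This establishes a one-to-one correspondence between each semantic category prediction and its mask.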
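The cell assignment and channel indexing described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `grid_cell` and `mask_channel` and the grid size S = 12 are assumptions made here, while the 600×400 image size is the resized resolution reported earlier in the paper.

```python
def grid_cell(center_x, center_y, width, height, S):
    """Return (i, j), the row and column of the S x S grid cell that
    contains the object center (center_x, center_y) in pixel coordinates."""
    j = min(int(center_x / width * S), S - 1)    # column index
    i = min(int(center_y / height * S), S - 1)   # row index
    return i, j

def mask_channel(i, j, S):
    """Channel of the mask output that segments the instance at cell (i, j)."""
    return i * S + j

# Example: a fruit centered in a 600 x 400 image, with a hypothetical S = 12.
i, j = grid_cell(300, 200, width=600, height=400, S=12)
k = mask_channel(i, j, S=12)
```

Because each cell maps to exactly one channel, the category predicted at cell (i, j) and the mask in channel k refer to the same instance.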
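Instance masks are usually predicted with Fully Convolutional Networks (FCN), which consist of convolution and deconvolution operations and are translation invariant. In this study, however, the masks are conditioned on grid location (the $S^2$ channels), so translation variance is required. Following the CoordConv operation, which addresses the coordinate transform problem of convolutional neural networks, normalized pixel coordinates are fed directly to the network. A tensor with the same spatial size as the input is created that contains the pixel coordinates normalized to [-1, 1]; this tensor is concatenated to the input features and passed to the following layers. Simply giving the convolution access to its own input coordinates adds spatial awareness to the conventional FCN, which then generates the corresponding masks.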
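The coordinate-channel step can be illustrated with NumPy. The sketch below assumes a channel-first (C, H, W) feature layout and appends two channels holding the x and y coordinates normalized to [-1, 1]; the function name `add_coord_channels` is hypothetical, and the subsequent convolution layers are not shown.

```python
import numpy as np

def add_coord_channels(features):
    """Append normalized x and y coordinate channels to a (C, H, W) array.
    Returns an array of shape (C + 2, H, W)."""
    _, h, w = features.shape
    ys = np.linspace(-1.0, 1.0, h)                        # row coordinates
    xs = np.linspace(-1.0, 1.0, w)                        # column coordinates
    y_grid, x_grid = np.meshgrid(ys, xs, indexing="ij")   # each of shape (H, W)
    coords = np.stack([x_grid, y_grid]).astype(features.dtype)
    return np.concatenate([features, coords], axis=0)

# Example: 3 feature channels on a 4 x 5 map become 5 channels.
feats = np.zeros((3, 4, 5), dtype=np.float32)
out = add_coord_channels(feats)
```

A convolution applied to `out` can then condition its response on position, which is what makes the mask branch translation variant.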
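The category prediction and the corresponding mask are naturally associated through their reference grid cell ($k = i \cdot S + j$), so the final instance segmentation result for each grid cell can be formed directly. The raw segmentation result is obtained by collecting the results of all grid cells. Each grid cell activates only one instance, but several adjacent mask channels may predict the same instance, so non-maximum suppression (NMS) is used to suppress redundant masks and obtain the final green fruit segmentation result.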
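A greedy, mask-based NMS of the kind referred to above can be sketched as follows. The paper does not state which NMS variant or threshold it uses, so the mask-IoU criterion, the 0.5 threshold, and the function names are assumptions for illustration only.

```python
import numpy as np

def mask_iou(a, b):
    """Intersection over union of two boolean masks of equal shape."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union > 0 else 0.0

def mask_nms(masks, scores, iou_threshold=0.5):
    """Keep the highest-scoring masks, dropping any mask whose IoU with an
    already kept mask exceeds the threshold. Returns the kept indices."""
    order = np.argsort(scores)[::-1]          # indices by descending score
    keep = []
    for idx in order:
        if all(mask_iou(masks[idx], masks[k]) <= iou_threshold for k in keep):
            keep.append(int(idx))
    return keep

# Example: masks 0 and 1 cover nearly the same fruit; mask 2 is separate.
m0 = np.zeros((4, 4), dtype=bool); m0[0:2, 0:2] = True
m1 = np.zeros((4, 4), dtype=bool); m1[0:2, 0:3] = True
m2 = np.zeros((4, 4), dtype=bool); m2[2:4, 2:4] = True
kept = mask_nms([m0, m1, m2], scores=[0.9, 0.8, 0.7])
```

Here the lower-scoring duplicate of the first fruit is suppressed and the two distinct instances remain.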
1.2.3 Loss function
The total loss function of the optimized SOLO segmentation algorithm is given by Eq. (5).
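The loss equations are missing from this copy. In the original SOLO formulation [24], on which the algorithm is based, the total loss combines a category classification loss with a mask loss, and the mask loss averages a per-mask distance over the positive grid cells. The form below is reproduced from that formulation as an assumption about what the missing equations contain; the equation numbering of the paper is not recoverable here. $L_{\mathrm{cate}}$ is the category loss (Focal Loss in SOLO), $\lambda$ is a weighting factor, $N_{\mathrm{pos}}$ is the number of positive samples, $p^{*}$ and $m^{*}$ are the category and mask targets, $i = \lfloor k/S \rfloor$, $j = k \bmod S$, and $d_{\mathrm{mask}}$ is the per-mask loss:

```latex
L = L_{\mathrm{cate}} + \lambda L_{\mathrm{mask}}, \qquad
L_{\mathrm{mask}} = \frac{1}{N_{\mathrm{pos}}} \sum_{k} \mathbb{1}_{\{p^{*}_{i,j} > 0\}} \, d_{\mathrm{mask}}\left(m_k, m^{*}_k\right)
```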
The Dice loss is defined in Eqs. (7) and (8).
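The Dice equations are also missing from this copy. In the SOLO formulation [24], the Dice loss is $L_{\mathrm{Dice}} = 1 - D(p, q)$ with $D(p, q) = 2\sum p\,q \,/\, (\sum p^2 + \sum q^2)$, where the sums run over all pixels of the predicted mask $p$ and the ground-truth mask $q$. The sketch below implements that form, assuming it matches the paper's definition; the small constant `eps` is an addition made here to avoid division by zero on empty masks.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Dice loss 1 - D(p, q), with D = 2 * sum(p * q) / (sum(p^2) + sum(q^2)).
    pred holds predicted mask probabilities, target the ground-truth mask."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    overlap = 2.0 * np.sum(pred * target)
    denom = np.sum(pred ** 2) + np.sum(target ** 2) + eps   # eps: assumption
    return 1.0 - overlap / denom

# Example: a perfect prediction gives a loss near 0, a disjoint one gives 1.
perfect = dice_loss([[1, 0], [0, 1]], [[1, 0], [0, 1]])
disjoint = dice_loss([[1, 0], [0, 0]], [[0, 0], [0, 1]])
```

Unlike per-pixel cross-entropy, this loss depends on the overlap ratio, so it is less sensitive to the imbalance between small fruit regions and large backgrounds.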
Experiments were run on a server with the Ubuntu 16.04 operating system, a 32 GB Tesla V100 GPU, and CUDA v10.0. The PyTorch deep learning framework was used, and training and testing of the green fruit segmentation algorithm were implemented in Python.
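Transfer learning can reduce overfitting and computation during training. The experiments initialize the model with weights pre-trained on the COCO dataset, which lets the loss converge to a stable value sooner and speeds up training. The initial learning rate is set to 0.01, the weight decay to 0.000 1, and the momentum to 0.9; the maximum number of iterations is 500, and the model parameters are saved every 20 iterations. Training accuracy rises rapidly with the number of iterations and then stabilizes.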
Recall (%) and precision (%) are used to evaluate the segmentation algorithms; they are computed as in Eqs. (9) and (10).
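Equations (9) and (10) are missing from this copy. The standard definitions of recall and precision, which are consistent with the symbols TP, FP, and FN defined immediately after this point, are:

```latex
\mathrm{Recall} = \frac{TP}{TP + FN} \times 100\%, \qquad
\mathrm{Precision} = \frac{TP}{TP + FP} \times 100\%
```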
Here TP is the number of true positives, FP the number of false positives, and FN the number of false negatives.
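As a small worked example of these two metrics, the sketch below computes recall and precision as percentages from the three counts. The function name and the counts in the example are hypothetical and are not taken from the paper's experiments.

```python
def recall_precision(tp, fp, fn):
    """Return (recall, precision) in percent from true positives,
    false positives, and false negatives."""
    recall = 100.0 * tp / (tp + fn)
    precision = 100.0 * tp / (tp + fp)
    return recall, precision

# Example with made-up counts: 90 fruits found, 5 false detections, 10 missed.
recall, precision = recall_precision(tp=90, fp=5, fn=10)
```

With these counts, recall is 90 out of 100 actual fruits, and precision is 90 out of 95 detections.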
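To simulate a real orchard harvesting scenario, the optimized SOLO segmentation algorithm was applied to the green fruit images. Fruit appearance varies widely in real orchard images, and segmentation is affected accordingly. In images with sparse, fully visible fruit, the targets are complete and clear. In images with overlap and occlusion, fruit may be very small, touching, mutually occluding, or hidden by branches and leaves, and is harder to segment. Night-time and backlit images also pose some difficulty. The segmentation results for green persimmons and green apples are shown in Fig. 3. The persimmon images contain dense fruit in large numbers, whereas the apple images contain sparser fruit in smaller numbers; the persimmon images were clearly collected in a more complex environment than the apple images.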
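To characterize the performance of the optimized SOLO segmentation algorithm more objectively, recall and precision were computed for persimmon and apple images under overlap, front light, after-rain, backlight, and other complex conditions; the results are listed in Table 2. Under these conditions, recall for apple images is 0.23 to 2.01 percentage points higher than for persimmon images, and precision is 0.26 to 1.67 percentage points higher. Overlap, backlight, night, and occlusion affect segmentation to some degree and give slightly worse results. Under these four conditions, persimmon images have a recall of about 92.00% and a precision of about 94.00%, while apple images have a recall of about 94.00% and a precision above 95.00%. Segmentation is better under front light and after rain. Isolated fruit without occlusion or overlap is segmented best, with recall and precision above 99.00% for both persimmon and apple images.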
To further validate the optimized SOLO segmentation algorithm, it was compared with four representative algorithms: the Optimized Mask Region Convolutional Neural Network (Optimized Mask R-CNN), SOLO, the Mask Region Convolutional Neural Network (Mask R-CNN), and Fully Convolutional Instance-aware Semantic Segmentation (FCIS). Average precision and recall were computed, and the results are listed in Table 3.
Table 2 Segmentation results for green persimmon and green apple images in complex environments
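As Table 3 shows, despite some false and missed detections, the proposed algorithm segments the green fruit in the images more accurately than the other algorithms. Its recall and precision reach 94.84% and 96.16%, which are 1.63 to 6.52 and 1.10 to 6.75 percentage points higher, respectively, than those of the other segmentation algorithms. Beyond accuracy, the actual segmentation time also matters: it should be reduced while accuracy is maintained. The average segmentation time per image on a Graphics Processing Unit (GPU) for Optimized Mask R-CNN, SOLO, Mask R-CNN, FCIS, and the proposed algorithm is 0.20, 0.18, 0.25, 0.27, and 0.14 s, respectively; the proposed algorithm is the fastest.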
Table 3 Segmentation performance of different segmentation algorithms
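The reported time savings can be checked directly from the per-image segmentation times given in the text. The short sketch below recomputes the differences; the dictionary layout and variable names are illustrative choices, while the times themselves are the values reported in the paper.

```python
# Average GPU segmentation time per image, in seconds, as reported in the text.
times = {
    "Optimized Mask R-CNN": 0.20,
    "SOLO": 0.18,
    "Mask R-CNN": 0.25,
    "FCIS": 0.27,
}
proposed = 0.14  # optimized SOLO

# Time saved by the proposed algorithm relative to each baseline,
# rounded to two decimals to remove floating-point noise.
savings = {name: round(t - proposed, 2) for name, t in times.items()}
```

The resulting differences agree with the reductions of 0.06, 0.04, 0.11, and 0.13 s stated in the abstract.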
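Accurate fruit segmentation is a prerequisite for orchard yield estimation and automatic harvesting. To address green fruit segmentation in complex orchard environments, this study proposes an optimized SOLO segmentation algorithm that improves segmentation performance by introducing fruit location information.
1) The optimized SOLO segmentation algorithm achieves an average recall of 94.84% and an average precision of 96.16%, with an average segmentation time of 0.14 s per green fruit image on a GPU.
2) Compared with the original SOLO segmentation algorithm, the optimized SOLO segmentation algorithm has a recall 1.74 percentage points higher and a precision 1.47 percentage points higher.
3) Compared with the Optimized Mask R-CNN, Mask R-CNN, and FCIS segmentation algorithms, the optimized SOLO segmentation algorithm has a recall 1.63, 2.23, and 6.52 percentage points higher and a precision 1.10, 2.61, and 6.75 percentage points higher, respectively.
The optimized SOLO segmentation algorithm can meet the need for real-time, accurate segmentation of green fruit in complex orchard environments.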
[1] Jia W K, Zhang Y, Lian J, et al. Apple harvesting robot under information technology: A review[J/OL]. International Journal of Advanced Robotic Systems, [2020-04-16], https://www.researchgate.net/publication/342209598_Apple_harvesting_robot_under_information_technology_A_review.
[2] Maldonado J W, Barbosa J C. Automatic green fruit counting in orange trees using digital images[J]. Computers and Electronics in Agriculture, 2016, 127: 572-581.
[3] Wang C L, Lee W S, Zou X J, et al. Detection and counting of immature green citrus fruit based on the Local Binary Patterns (LBP) feature using illumination-normalized images[J]. Precision Agriculture, 2018, 19: 1062-1083.
[4] Huang Xiaoyu, Li Guanglin, Ma Chi, et al. Green peach recognition based on improved discriminative regional feature integration algorithm in similar background[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(23): 142-148. (in Chinese with English abstract)
[5] Li Dahua, Zhao Hui, Yu Xiao. Overlapping green apple recognition based on improved spectral clustering[J]. Spectroscopy and Spectral Analysis, 2019, 39(9): 2974-2981. (in Chinese with English abstract)
[6] Ahmad T, Greenspan M, Asif M, et al. Robust apple segmentation using fuzzy logic[C]//5th International Multi-Topic ICT Conference (IMTIC), Karachi: IEEE, 2018.
[7] Liu Xiaoyang, Zhao De’an, Jia Weikuan, et al. Fruits segmentation method based on superpixel features for apple harvesting robot[J]. Transactions of the Chinese Society for Agricultural Machinery, 2019, 50(11): 15-23. (in Chinese with English abstract)
[8] Lyu J D, Wang F, Xu L M, et al. A segmentation method of bagged green apple image[J]. Scientia Horticulturae, 2019, 246: 411-417.
[9] Ji W, Zhao D A, Cheng F Y, et al. Automatic recognition vision system guided for apple harvesting robot[J]. Computers & Electrical Engineering, 2012, 38(5): 1186-1195.
[10] Ji Wei, Lü Xingqin, Zhao De’an, et al. Edge-preserving Retinex enhancement algorithm of night vision image for apple harvesting robot[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2016, 32(6): 189-196. (in Chinese with English abstract)
[11] Garcia-Garcia A, Orts-Escolano S, Oprea S, et al. A survey on deep learning techniques for image and video semantic segmentation[J]. Applied Soft Computing, 2018, 70: 41-65.
[12] Minaee S, Boykov Y Y, Porikli F, et al. Image segmentation using deep learning: A survey[J/OL]. IEEE Transactions on Pattern Analysis and Machine Intelligence, [2021-02-17], https://ieeexplore.ieee.org/abstract/document/9356353.
[13] Qi C R, Su H, Mo K C, et al. PointNet: Deep learning on point sets for 3D classification and segmentation[C]//IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Hawaii: IEEE, 2017.
[14] Hossain M S, Al-Hammadi M, Muhammad G. Automatic fruit classification using deep learning for industrial applications[J]. IEEE Transactions on Industrial Informatics, 2018, 15(2): 1027-1034.
[15] Koirala A, Walsh K B, Wang Z L, et al. Deep learning–Method overview and review of use for fruit detection and yield estimation[J]. Computers and Electronics in Agriculture, 2019, 162: 219-234.
[16] Fu Longsheng, Feng Yali, Tola E. Image recognition method of multi-cluster kiwifruit in field based on convolutional neural networks[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(2): 205-211. (in Chinese with English abstract)
[17] Sun Hong, Li Song, Li Minzan, et al. Research progress of image sensing and deep learning in agriculture[J]. Transactions of the Chinese Society for Agricultural Machinery, 2020, 51(5): 1-17. (in Chinese with English abstract)
[18] Jia W K, Tian Y Y, Luo R, et al. Detection and segmentation of overlapped fruits based on optimized Mask R-CNN application in apple harvesting robot[J/OL]. Computers and Electronics in Agriculture, [2020-03-18], https://doi.org/10.1016/j.compag.2020.105380.
[19] Wang Dandan, He Dongjian. Recognition of apple targets before fruits thinning by robot based on R-FCN deep convolution neural network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(3): 156-163. (in Chinese with English abstract)
[20] Kang H W, Chen C. Fruit detection, segmentation and 3D visualisation of environments in apple orchards[J/OL]. Computers and Electronics in Agriculture, [2020-02-20], https://doi.org/10.1016/j.compag.2020.105302.
[21] Bargoti S, Underwood J. Deep fruit detection in orchards[C]//IEEE International Conference on Robotics and Automation (ICRA), Singapore: IEEE, 2017.
[22] Liu X, Chen S W, Aditya S, et al. Robust fruit counting: Combining deep learning, tracking, and structure from motion[C]//International Conference on Intelligent Robots and System, Madrid: IEEE, 2018.
[23] Lin T Y, Maire M, Belongie S, et al. Microsoft COCO: Common objects in context[C]//European Conference on Computer Vision, Zurich: Springer, 2014.
[24] Wang X L, Kong T, Shen C H, et al. SOLO: Segmenting objects by locations[C]//European Conference on Computer Vision, Glasgow: Springer, 2020.
[25] He K M, Gkioxari G, Dollár P, et al. Mask R-CNN[C]//IEEE International Conference on Computer Vision, Venice: IEEE, 2017.
[26] Li Y, Qi H Z, Dai J F, et al. Fully convolutional instance-aware semantic segmentation[C]//IEEE Conference on Computer Vision and Pattern Recognition, Hawaii: IEEE, 2017.
Optimized SOLO segmentation algorithm for the green fruits of persimmons and apples in complex environments
Jia Weikuan1,2, Li Qianwen1, Zhang Zhonghua1, Liu Guoliang3, Hou Sujuan1, Ji Ze4, Zheng Yuanjie1※
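(1. School of Information Science and Engineering, Shandong Normal University, Jinan 250358, China; 2. Key Laboratory of Facility Agriculture Measurement and Control Technology and Equipment of Machinery Industry, Zhenjiang 212013, China; 3. School of Control Science and Engineering, Shandong University, Jinan 250061, China; 4. School of Engineering, Cardiff University, Cardiff CF24 3AA, UK)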
To address the recognition of green persimmon and green apple fruits, a green fruit segmentation algorithm based on an optimized SOLO (Segmenting Objects by Locations) was proposed in this study to achieve accurate segmentation of green fruits in complex environments. The proposed algorithm is a single-stage instance segmentation method, which avoids the drawback of two-stage methods, where segmentation depends on the performance of a preceding detection step. By introducing the concept of instance category, each pixel in an instance was assigned a category according to the location and size of the instance, so that instance segmentation was transformed into a classification problem. Green persimmons and green apples were taken as the research objects. The images were collected on the hill behind Shandong Normal University (Changqing Lake Campus) and at the Longwangshan apple production base in Fushan District, Yantai City, Shandong Province. The acquisition device was a Canon EOS 80D SLR camera with an image resolution of 6 000×4 000 pixels. Images were collected under natural light during the day (7:00-17:00) and under LED light at night (19:00-22:00). A total of 568 images of green persimmons and 498 images of green apples were collected, covering night, overlap, backlight, front light, occlusion, and after-rain conditions. The collected images were annotated with LabelMe and converted into a COCO-format dataset. First, a split-attention network (ResNeSt) was used as the backbone of the optimized SOLO to extract features of the target fruit, which enhanced the propagation, reuse, and fusion of features between earlier and later layers. Then, ResNeSt was combined with a Feature Pyramid Network (FPN) to solve the multi-scale problem of green fruits, because the FPN defines allocation strategies for features of different scales and assigns them to the appropriate pyramid levels.
Finally, the image features extracted by the ResNeSt+FPN structure were used for the subsequent prediction. The optimized SOLO segmentation algorithm was divided into two branches: category prediction and mask generation. While the category prediction branch predicted the semantic category, the mask generation branch segmented the object instances, completing the segmentation of the target fruit. The experimental results showed that the average recall and precision of the optimized SOLO segmentation algorithm reached 94.84% and 96.16%, respectively, with an average segmentation time of 0.14 s per green fruit image on a Graphics Processing Unit (GPU). Compared with four other algorithms, namely the optimized Mask R-CNN fruit recognition algorithm, SOLO, the Mask Region Convolutional Neural Network (Mask R-CNN), and Fully Convolutional Instance-aware Semantic Segmentation (FCIS), the recall of the optimized SOLO segmentation algorithm was higher by 1.63, 1.74, 2.23, and 6.52 percentage points, the precision was higher by 1.10, 1.47, 2.61, and 6.75 percentage points, and the segmentation time was shorter by 0.06, 0.04, 0.11, and 0.13 s, respectively. These results show that the proposed optimized SOLO segmentation algorithm meets the real-time requirement of green fruit segmentation while improving its accuracy. The algorithm can provide a theoretical reference for the segmentation of other target fruits and vegetables and extend the application of orchard yield measurement and robotic harvesting.
image segmentation; image processing; algorithms; feature pyramid networks; green fruits
Jia Weikuan, Li Qianwen, Zhang Zhonghua, et al. Optimized SOLO segmentation algorithm for the green fruits of persimmons and apples in complex environments[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(18): 121-127. (in Chinese with English abstract) doi:10.11975/j.issn.1002-6819.2021.18.014 http://www.tcsae.org
2020-11-30
2021-07-19
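National Natural Science Foundation of China (62072289, 81871508); Natural Science Foundation of Shandong Province (ZR2020MF076, ZR2020MF133); Key Research and Development Program of Shandong Province (2019GNC106115); Taishan Scholar Program of Shandong Province
Jia Weikuan, Ph.D., associate professor; research interests: artificial intelligence, image processing, and agricultural information technology and equipment. Email: jwk_1982@163.com
Zheng Yuanjie, Ph.D., professor and doctoral supervisor; research interests: artificial intelligence and image processing. Email: yjzheng@sdnu.edu.cn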
10.11975/j.issn.1002-6819.2021.18.014
TP24;TP391
A
1002-6819(2021)-18-0121-07