·Agricultural Information and Electrical Technology·
Detection of the quality of famous green tea based on improved YOLOv5s
YIN Chuan, SU Yihui, PAN Mian※, DUAN Jinsong
(School of Electronics and Information, Hangzhou Dianzi University, Hangzhou 310018, China)
In view of the characteristics of practical detection scenarios, where tea leaves are numerous, small in size, and similar to one another in color and texture, this study proposes a quality detection algorithm for famous green tea based on YOLOv5s. First, a dilated convolution network is introduced into the backbone to strengthen the extraction of fine tea features by enlarging the receptive field. Second, the feature fusion process is improved: a CBAM attention module built on channel attention and spatial attention suppresses the interference of irrelevant information and optimizes the detector. Next, a Swin Transformer structure interacts with and fuses the features of small-scale tea leaves across multiple dimensions. Finally, the SimOTA matching algorithm dynamically assigns positive tea samples, improving the ability to distinguish teas of different quality. The results show that the improved algorithm achieves a precision of 97.4%, a recall of 89.7%, a mean average precision (mAP) of 91.9%, a model size of 7.11 MB, and a detection speed of 51 frames/s; compared with the baseline YOLOv5s, mAP is improved by 3.8 percentage points and detection speed by 7 frames/s. Comparative experiments on the same dataset against Faster R-CNN, SSD, YOLOv3 and YOLOv4 show mAP gains of 10.8, 22.9, 18.6 and 8.4 percentage points respectively, further verifying the effectiveness and reliability of the proposed method.
machine vision; image recognition; tea quality detection; YOLOv5s; receptive field; Swin Transformer; attention mechanism; SimOTA
China is one of the world's major tea producers and exporters. Tea production has grown rapidly in recent years [1-3]; with output rising steadily, tea occupies an important position in the national economy. Tea is a traditional beverage with a long history, and its quality bears directly on consumers' recognition and trust. Tea quality is usually assessed by sensory evaluation [4-6], in which reviewers grade the tea according to their own experience to determine its level and quality; the results are highly random and poorly repeatable. To remedy the current uneven and mixed state of the tea market, beyond strengthening supervision, quantitative quality assessment by technical means is another effective approach.
With continuous progress in image processing technology [7-8], computer vision research on tea quality detection has steadily increased; most current studies of tea quality use machine learning [9]. Researchers at home and abroad have carried out extensive work on extracting tea color features with traditional image segmentation [10], labeling, and HSI color-model techniques [11-14]. XU et al. [15] proposed a tea feature extraction method based on the gray-level co-occurrence matrix and Tamura features and applied it to a BP neural network, with some success in simulation experiments. LIU et al. [16] collected fresh leaves of 10 tea varieties, used the gray-level co-occurrence matrix to extract four classes of features (color, morphology, fractal and texture) and trained six classifiers; SVMKM and random forest gave the best detection results. WANG et al. [17] proposed an HSI-model feature extraction method, feeding tea color and shape parameters into a BP neural network with good detection results. However, traditional image recognition relies on hand-crafted features and is sensitive to illumination, viewpoint and scale; it is both labor-intensive and weak in generalization and robustness.
Compared with traditional machine vision, deep-learning-based algorithms [18-19] automatically extract multi-dimensional features from large sample sets and offer stronger adaptability and portability; common detectors include R-CNN [20], YOLO [21] and SSD. LYU et al. [22] used an AlexNet convolutional neural network to recognize the state of tea buds, achieving bud classification under natural light. SUN et al. [23] proposed a YOLO-based tea image detection model that uses medium- and large-scale prediction layers to improve detection efficiency. YANG et al. [24] improved the YOLOv3 network with residual structures and new convolution operations, effectively raising tea detection accuracy.
The literature above shows that mainstream deep learning models [25-29] are highly robust, but their targets are mainly tender buds; research on quality detection of famous green tea remains scarce. Addressing the similar color and texture, large number and small size of tea leaves in practical detection, this paper proposes YOLOv5s-tea, a famous-green-tea quality detection algorithm based on an improved YOLOv5s. The algorithm first adds a new TDC (three-branch dilated convolution) structure at the end of the backbone, enlarging the receptive field to strengthen tea feature extraction and capture information from neighboring tea regions. Second, an attention mechanism adaptively learns the channel and spatial information in the input data. A Swin Transformer is then introduced to enhance the semantic information and global perception of small tea targets. Finally, the SimOTA positive-sample matching algorithm assigns a ground-truth box to each positive tea sample. The results show that the proposed method markedly improves the efficiency and accuracy of famous green tea quality detection.
The detection targets were flat-shaped famous green tea from Xinchang. Combining tea-master appraisal with market standards, the tea was divided into five grades according to four indicators: bud tip, strip shape, cleanliness and color. As shown in Fig. 1, grade-1 tea has tight strips, narrow bud tips, smooth veins and a bright green color; grade-2 tea has full strips with paired, mostly closed main and lateral veins and a slightly small opening; grade-3 tea has interlaced strips with spread, multi-toothed veins, a few short lateral veins and a slightly larger opening; grade-4 tea is small, irregular in shape, uneven and dull in color; grade-5 tea consists of powdery fragments, mostly impurities.
Images were collected at the digital tea production base in Xinchang County, Zhejiang Province, against a uniform background under stable illumination, avoiding both direct sunlight and insufficient light; the camera was positioned 30-40 cm directly above the samples. A total of 412 tea images were collected. The object annotation tool Labelling was used to annotate each grade of tea manually, using minimum bounding rectangles so as to exclude irrelevant pixels from the boxes as far as possible. The tea in the original images was annotated into five grades, XML files were generated, and the data were split into training and test sets at a ratio of 9:1.
Fig. 1 Appearance characteristics of tea of different grades
To enrich the diversity of the training data, simulate the appearance of tea under different environments, and strengthen the model's generalization, the dataset was expanded in four ways: horizontal mirroring, rotation, contrast enhancement and brightness enhancement. The brightness and contrast enhancement weights were each set to a random number between 0.9 and 1.1. Data augmentation produced 2 060 images, each containing 40 to 70 tea leaves, for a total of 95 580 leaves, as shown in Fig. 2. The resulting number of leaves per grade is given in Table 1.
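The four augmentation operations can be sketched as follows. This is a minimal NumPy sketch, not the authors' code; the 90° rotation angle is an illustrative choice, while the random factors follow the 0.9-1.1 weights stated above:

```python
import numpy as np

def augment(img, rng):
    """Produce the four augmented variants used to expand the tea dataset."""
    out = {}
    out["hflip"] = img[:, ::-1]                    # horizontal mirroring
    out["rot90"] = np.rot90(img)                   # rotation (90 deg as an example)
    b = rng.uniform(0.9, 1.1)                      # brightness weight in [0.9, 1.1]
    out["bright"] = np.clip(img * b, 0, 255)
    c = rng.uniform(0.9, 1.1)                      # contrast weight in [0.9, 1.1]
    mean = img.mean()
    out["contrast"] = np.clip((img - mean) * c + mean, 0, 255)
    return out

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3)).astype(np.float32)
variants = augment(img, rng)
```

Applying all four operations to each of the 412 source images (possibly several times with fresh random weights) yields the expanded set of 2 060 images.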
Fig. 2 Examples of data augmentation
Table 1 Tea dataset after data augmentation
The YOLOv5s framework consists of four modules: input, backbone, feature fusion and prediction layers, as shown in Fig. 3. The input is a training image of size 640×640.
Fig. 3 Structure of the YOLOv5s algorithm
YOLOv5s uses CSPDarknet as its backbone; each module contains several convolutional layers and a cross-stage connection block, and features are extracted jointly with a slicing structure (Focus) and spatial pyramid pooling (SPP). Feature fusion uses an FPN+PAN structure: the FPN passes semantic information down from the top, while the PAN passes localization information up from the bottom. The fused multi-scale features are passed to the prediction layers with output scales of 80×80, 40×40 and 20×20 to predict the class and position of targets of different sizes; the final prediction boxes are obtained by non-maximum suppression.
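The final non-maximum suppression step can be sketched as a greedy loop over score-sorted boxes. This is a minimal sketch, not YOLOv5's implementation; the 0.45 IoU suppression threshold is an assumed default, not a value from the paper:

```python
def box_iou(a, b):
    """IoU of two axis-aligned [x1, y1, x2, y2] boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, iou_thr=0.45):
    """Greedy NMS: keep the highest-scoring box, then drop every
    remaining box that overlaps it by more than iou_thr."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if box_iou(boxes[best], boxes[i]) <= iou_thr]
    return keep

boxes = [[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)   # the second box overlaps the first and is suppressed
```

In practice NMS is run per class, so overlapping boxes of different tea grades are not suppressed against each other.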
The tea quality detection network in this study takes the YOLOv5s of Fig. 3 as the baseline; the improved overall framework is shown in Fig. 4. The TDC dilated convolution structure is added at the end of the backbone, enlarging the receptive field to strengthen tea feature extraction and capture information from neighboring tea regions. Because tea targets are dense and false detections are likely, a CBAM attention module is used to focus the network on the regions of interest and further strengthen the extraction of small-target information. A Swin Transformer is then added to the C3 module in the neck, overcoming the limitations of CNN convolution, recognizing abstract information in low-level features, and enhancing the semantic information of small tea targets. Finally, the SimOTA positive-sample matching algorithm dynamically assigns positive samples, determining candidate regions for positive tea samples from a center prior, improving detection speed while also raising tea quality detection accuracy.
Note: Conv is convolution; BN is batch normalization; SiLU is the activation function; SPP is the pooling layer; Concat is the feature concatenation layer; Slice is the slicing operation on the input tensor; NMS is the non-maximum suppression algorithm.
2.2.1 Dilated convolution module
Because teas of different grades in the dataset are similar in color and texture, the YOLOv5 backbone cannot extract effective features. To strengthen the backbone's extraction of global features from tea images, this paper exploits the receptive-field advantage of dilated convolution and proposes TDC, an improved three-branch feature-fusion structure, shown in Fig. 5.
The module uses a TridentBlock network to obtain feature maps with different receptive fields. The feature maps are fed into two 1×1 convolution layers to reduce model parameters; the upper feature map passes through the TridentBlock structure to strengthen feature extraction and is then added to the lower feature map, establishing a direct mapping between the input and output layers to realize feature transformation and extraction.
Fig. 5 TDC structure
The TridentBlock network enlarges the receptive field with dilated convolutions of dilation rates 1, 2 and 3; without increasing the kernel size, the feature maps cover a larger area, improving detection accuracy. For a kernel of size k with dilation rate r, the effective kernel size of the dilated convolution is
k_e = k + (k - 1)(r - 1)
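As a quick check of the receptive-field gain, the effective kernel size of each branch can be computed from the standard dilated-convolution formula k + (k - 1)(r - 1); this is a sketch of that arithmetic, not the authors' code:

```python
def effective_kernel(k, r):
    """Effective kernel size of a k×k convolution with dilation rate r."""
    return k + (k - 1) * (r - 1)

# The three TDC branches use dilation rates 1, 2 and 3; assuming 3×3 kernels,
# the branches see 3×3, 5×5 and 7×7 regions of the input feature map.
fields = [effective_kernel(3, r) for r in (1, 2, 3)]
```

The parameter count stays that of a 3×3 convolution in every branch, which is why TDC enlarges the receptive field without growing the model.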
2.2.2 Attention mechanism module
Channel attention pools each channel globally and then learns the inter-channel dependencies adaptively through fully connected layers, reducing the weights of redundant or irrelevant channels and improving feature discriminability. In CBAM, the channel attention takes the form
Mc(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F)))
where F is the input feature map, σ is the sigmoid function, and the average-pooled and max-pooled channel descriptors pass through a shared multi-layer perceptron.
Spatial attention learns weights for different parts of the input image, strengthening the attention paid to features at different positions. In the tea quality detection task, spatial attention is introduced to focus on the spatial position of the tea; it is computed as
Ms(F) = σ(f7×7([AvgPool(F); MaxPool(F)]))
where average and max pooling are applied along the channel axis, the two maps are concatenated, and f7×7 denotes a 7×7 convolution.
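The two attention steps can be sketched in NumPy. This is a simplified illustration of CBAM-style re-weighting, not the paper's implementation: the MLP weight shapes are hypothetical, and the spatial branch replaces the 7×7 convolution with a scalar-weighted combination for brevity:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """Squeeze each channel by global average/max pooling, pass both
    descriptors through a shared two-layer MLP, and return per-channel
    weights in (0, 1)."""
    avg = x.mean(axis=(1, 2))                    # (C,)
    mx = x.max(axis=(1, 2))                      # (C,)
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0)   # shared MLP with ReLU
    return sigmoid(mlp(avg) + mlp(mx))

def spatial_attention(x, k):
    """Pool across channels, then weight each spatial position
    (scalar mix k stands in for the 7×7 convolution)."""
    avg = x.mean(axis=0)                         # (H, W)
    mx = x.max(axis=0)                           # (H, W)
    return sigmoid(k[0] * avg + k[1] * mx)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))   # (C, H, W) feature map
w1 = rng.standard_normal((2, 8))       # hypothetical reduction-MLP weights
w2 = rng.standard_normal((8, 2))
ca = channel_attention(x, w1, w2)
x = x * ca[:, None, None]              # suppress irrelevant channels
sa = spatial_attention(x, (0.5, 0.5))
x = x * sa[None]                       # emphasize informative positions
```

Channel weighting is applied first and spatial weighting second, matching the CBAM ordering described above.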
2.2.3 C3STR structure
As the network deepens, after repeated convolutions the position information of the small-scale targets in the tea dataset becomes coarse and feature information is easily lost. This study therefore introduces a Swin Transformer in the feature fusion stage, embedding it in the C3 convolution module and using window self-attention to enhance the semantic information and feature representation of small targets; the improved convolution structure is shown in Fig. 6.
The Swin Transformer adopts a new hierarchical construction, downsampling through patch-merging layers to obtain deeper feature maps. It consists mainly of multi-layer perceptrons, window multi-head self-attention layers, shifted-window multi-head self-attention layers and normalization layers. Attention is computed as
Attention(Q, K, V) = SoftMax(QK^T/√d + B)V
where Q, K and V are the query, key and value matrices, d is their dimension, and B is the relative position bias.
Compared with the multi-head self-attention (MSA) of a conventional Transformer, the C3STR module computes self-attention within local windows, dividing the input image into sub-windows, and introduces shifted-window self-attention so that information is exchanged between windows, reducing the network's computation to some extent and improving efficiency.
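The window mechanism can be sketched as follows: partition the feature map into non-overlapping windows and run self-attention independently inside each. This is a single-head sketch with identity Q/K/V projections and no relative position bias, assumptions made for brevity rather than Swin's full design:

```python
import numpy as np

def window_partition(x, w):
    """Split an (H, W, C) feature map into non-overlapping w×w windows,
    returning (num_windows, w*w, C) token groups."""
    H, W, C = x.shape
    x = x.reshape(H // w, w, W // w, w, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, w * w, C)

def window_self_attention(windows):
    """Plain self-attention computed independently within each window,
    so cost scales with window size rather than the whole image."""
    q = k = v = windows                       # identity projections for brevity
    scale = windows.shape[-1] ** -0.5
    attn = np.einsum("bnc,bmc->bnm", q, k) * scale
    attn = np.exp(attn - attn.max(-1, keepdims=True))
    attn /= attn.sum(-1, keepdims=True)       # softmax over each window
    return np.einsum("bnm,bmc->bnc", attn, v)

x = np.random.default_rng(0).standard_normal((8, 8, 4))
wins = window_partition(x, 4)     # an 8×8 map yields four 4×4 windows
out = window_self_attention(wins)
```

Shifting the window grid by w/2 on alternate layers, as Swin does, lets tokens near window borders attend across the partition boundaries.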
Fig. 6 C3STR structure
2.2.4 SimOTA algorithm
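As described in the abstract and Section 2.2, SimOTA dynamically assigns positive anchors to each ground-truth tea box. The following is a minimal sketch of the dynamic-k assignment step in the SimOTA scheme popularized by YOLOX; the random cost/IoU matrices and the q=10 candidate count are illustrative assumptions, and the center-prior pre-filtering is omitted:

```python
import numpy as np

def simota_assign(cost, ious, q=10):
    """Dynamic-k label assignment. For each ground truth, k is estimated
    from the sum of its top-q IoUs, then its k lowest-cost anchors become
    positives. `cost` and `ious` are (num_gt, num_anchors) matrices."""
    num_gt, num_anchors = cost.shape
    matching = np.zeros_like(cost, dtype=bool)
    for g in range(num_gt):
        topq = np.sort(ious[g])[::-1][:q]
        k = max(int(topq.sum()), 1)          # dynamic k per ground truth
        pos = np.argsort(cost[g])[:k]        # the k cheapest candidates
        matching[g, pos] = True
    # resolve anchors claimed by several ground truths: keep the cheapest
    multi = matching.sum(0) > 1
    if multi.any():
        best = cost[:, multi].argmin(axis=0)
        matching[:, multi] = False
        matching[best, np.where(multi)[0]] = True
    return matching

rng = np.random.default_rng(0)
cost = rng.random((3, 20))   # hypothetical cls+reg cost per (gt, anchor) pair
ious = rng.random((3, 20))
assignment = simota_assign(cost, ious)
```

Because k adapts to how well the anchors already fit each box, densely packed tea leaves receive only as many positives as their overlaps justify, which is what raises both matching quality and training efficiency.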
試驗?zāi)P偷挠布x擇Intel Core i9-10900X CPU@ 3.70 GHz、GeForce RTX3090 Ti GPU,軟件選擇windows10操作系統(tǒng)、PyTorch深度學(xué)習(xí)框架、YOLOv5s網(wǎng)絡(luò)模型、Python編程語言。訓(xùn)練時初始學(xué)習(xí)率設(shè)置為0.001,迭代次數(shù)設(shè)置為200次,批次大小設(shè)置為 16。
選用精準(zhǔn)度(precision,)、召回率(recall,)、平均精度均值(mean average precision,mAP)、模型體積以及檢測速度作為評估指標(biāo)[25]。設(shè)定預(yù)測目標(biāo)和實際目標(biāo)的平均交并比(intersection over union,IoU)閾值為0.5,若IoU超過該閾值則為正樣本,反之則為負(fù)樣本。
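The IoU criterion and the precision/recall definitions behind these metrics can be sketched directly; this is a generic illustration of the standard formulas, not evaluation code from the paper:

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def precision_recall(tp, fp, fn):
    """Precision = TP/(TP+FP); recall = TP/(TP+FN)."""
    return tp / (tp + fp), tp / (tp + fn)

# A prediction counts as a true positive only when its IoU with a
# ground-truth box exceeds the 0.5 threshold used in the paper.
matched = iou([0, 0, 2, 2], [1, 1, 3, 3]) > 0.5   # IoU = 1/7, so not matched
```

mAP then averages, over the five tea grades, the area under each grade's precision-recall curve.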
To verify the effect of each improvement on tea quality detection, the effectiveness of the TDC dilated convolution structure, the Swin Transformer network and the SimOTA algorithm was studied on the YOLOv5s base network; the results are given in Table 2.
Replacing the C3 structure of the feature extraction network in the original YOLOv5s with TDC reduced the model size somewhat, raised detection speed by 12 frames/s and mAP by 0.8 percentage points, showing that the TDC dilated convolution structure effectively enlarges the receptive field and improves tea feature extraction. Introducing the Swin Transformer into the feature fusion part alone, with everything else unchanged, raised mAP by 1.3 percentage points while slightly increasing model size and lowering detection speed by 16 frames/s, showing that the Swin Transformer trades some detection speed for markedly better global perception of small-scale tea features. Adding the SimOTA positive-sample matching algorithm to the original network raised mAP by 1.1 percentage points and detection speed by 24 frames/s, showing that dynamically assigning positive samples matches ground-truth boxes better and improves both detection efficiency and accuracy.
Table 2 Comparison of experimental results after improving YOLOv5s
Considering that attention mechanisms are sensitive to the network structure, the merits of different attention modules were studied on the YOLOv5s+TDC+STR base network by adding CA, SE, CBAM and ECA attention modules separately. The results show that all four slightly increase model size but barely affect detection speed. The CA and ECA modules hurt detection accuracy, lowering mAP by 0.2 and 0.1 percentage points respectively, and are clearly unsuited to this tea image detection task. The CBAM and SE modules improve mAP, by 0.5 and 0.2 percentage points respectively. The CBAM attention mechanism adopted in this paper is therefore the superior choice for this model.
The four improvements proposed in this study are TDC, Swin Transformer, CBAM and SimOTA. On the self-made tea dataset, the following ablation experiments were designed: 1) add each of the four improvements to the unmodified YOLOv5s, to evaluate its effect on the original algorithm; 2) remove each of the four improvements from the final YOLOv5s-tea, to evaluate its contribution to the final algorithm. Table 3 shows that, relative to the original YOLOv5s, the Swin Transformer structure brings the largest accuracy gain, raising mAP by 1.3 percentage points at a cost of 0.22 MB in model size and 14 frames/s in speed. SimOTA brings the largest speed gain, 24 frames/s, and also raises mAP by 1.1 percentage points. Relative to the final YOLOv5s-tea, removing the Swin Transformer lowers mAP by 1.2 percentage points, the largest effect on accuracy; removing TDC increases model size by 0.33 MB and lowers speed by 12 frames/s, the largest effect on speed. Compared with the original YOLOv5s, the final YOLOv5s-tea raises mAP on the tea dataset by 3.8 percentage points with only a 0.04 MB increase in model size and a 7 frames/s gain in speed, improving detection accuracy while preserving detection efficiency.
Table 3 Results of the ablation experiments
Note: "√" indicates that the corresponding improvement is introduced.
The final improved algorithm was also compared across the grade datasets, as shown in Fig. 7. When the original images are fed to the model, tea of every grade is detected and marked with boxes of different colors. Before the improvement, a few grade-1 leaves were misidentified as grade 3 or grade 4, a few grade-2 leaves as grade 1 or grade 3, and a few grade-4 leaves as grade 1; after the improvement the false-detection rates all drop, the predicted confidence is higher, and recognition is better.
On the self-made tea dataset, the proposed model was compared with current mainstream models in the same environment. Five mainstream networks were selected: Faster R-CNN, SSD, YOLOv3, YOLOv4 and YOLOv5s. The results are shown in Table 4.
Note: light blue boxes mark grade-1 tea, yellow boxes grade-2, red boxes grade-3, dark blue boxes grade-4 and green boxes grade-5; arrows point to falsely detected or missed leaves.
Table 4 Experimental results compared with current mainstream methods
The results show that the proposed algorithm substantially outperforms the other mainstream detectors in both accuracy and speed, reaching an mAP of 91.9% and a detection speed of 51 frames/s. Compared with YOLOv5s, YOLOv4, YOLOv3 and SSD, mAP is higher by 3.8, 8.4, 18.6 and 22.9 percentage points and detection speed by 7, 28, 20 and 16 frames/s respectively. Compared with the traditional two-stage Faster R-CNN, mAP is 10.8 percentage points higher and detection speed 44 frames/s faster. In summary, the proposed YOLOv5s-tea algorithm has clear advantages in both detection accuracy and detection efficiency.
This study proposed a famous-green-tea quality detection algorithm based on an improved YOLOv5s, providing a new technical means for quantitative tea quality assessment. Dilated convolution is first added to enlarge the network's receptive field. The attention mechanism then improves the feature fusion module. Next, the Swin Transformer's new hierarchical construction and window self-attention enhance the semantic information and feature representation of small tea targets. Finally, the SimOTA matching algorithm dynamically assigns positive tea samples, improving the recognition of teas of different quality.
The experimental results show that the proposed method detects the quality of famous green tea effectively, with an mAP of 91.9%. Compared with Faster R-CNN, SSD, YOLOv3 and YOLOv4, mAP is higher by 10.8, 22.9, 18.6 and 8.4 percentage points and detection speed by 44, 16, 20 and 28 frames/s respectively. The proposed algorithm brings large gains in both detection accuracy and efficiency, has high application value, and can serve as a reference for subsequent tea quality detection.
[1] LIU T, ZHANG L, YANG D, et al. Evaluation of uncertainty in determination of four organophosphorus pesticide residues in fresh tea leaves by gas chromatography[J]. Science and Technology of Food Industry, 2023, 44(1): 323-331.
[2] YE T, YAN H, WANG X, et al. Determination of four aflatoxins on dark tea infusions and aflatoxin transfers evaluation during tea brewing[J]. Food Chemistry, 2023, 405: 134969.
[3] 陶德臣. 近代中國茶葉對外貿(mào)易興盛的社會影響[J]. 重慶大學(xué)學(xué)報(社會科學(xué)版),2023(1):188-200. TAO Dechen. The social influence of the prosperity of tea trade in modern China[J]. Journal of Chongqing University (Social Science Edition),2023(1):188-200. (in Chinese with English abstract)
[4] 郭藝丹. 茶葉檢測評審研究進展與關(guān)鍵技術(shù)分析[J]. 茶葉學(xué)報,2023,64(1):10-20.
[5] 童陽,艾施榮,吳瑞梅,等. 茶葉外形感官品質(zhì)的計算機視覺分級研究[J]. 江蘇農(nóng)業(yè)科學(xué),2019,47(5):170-173.
[6] 何環(huán)珠,蘇成家. 水質(zhì)差異對鐵觀音沖泡品質(zhì)的影響研究[J]. 福建茶葉,2020,42(6):11-12.
[7] 張新. 計算機圖像處理中人工智能算法的運用探究[J]. 信息與電腦(理論版),2022,34(24):43-45. ZHANG Xin. Research on the application of artificial intelligence algorithm in computer image processing[J]. Information and Computer (Theoretical Edition), 2022, 4(24): 43-45. (in Chinese with English abstract)
[8] 張波. 在智能優(yōu)化算法模式下探索計算機圖像處理技術(shù)的應(yīng)用[J]. 電子技術(shù)與軟件工程,2021(1):133-134.
[9] 薛懿威,王玉,王緩,等. 基于高光譜的綠茶加工原料生化成分檢測模型建立[J]. 食品工業(yè)科技,2023,44(10):282-291. XUE Yiwei, WANG Yu, WANG Huan, et al. Establishment of a hyperspectral spectroscopy-based biochemical component detection model for green tea processing materials[J]. Science and Technology of Food Industry, 2023, 44(10): 282-291. (in Chinese with English abstract)
[10] 陳超,齊峰. 卷積神經(jīng)網(wǎng)絡(luò)的發(fā)展及其在計算機視覺領(lǐng)域中的應(yīng)用綜述[J]. 計算機科學(xué),2019,46(3):63-73. CHEN Chao, QI Feng. Review on development of convolutional neural network and its application in computer vision[J]. Computer Science, 2019, 46(3): 63-73. (in Chinese with English abstract)
[11] 孫旭東,廖琪城,韓熹,等. 基于電磁振動上料的茶梗和昆蟲異物近紅外光譜和熒光圖像在線檢測研究[J]. 光譜學(xué)與光譜分析,2023,43(1):100-106. SUN Xudong, LIAO Qicheng, HAN Xi, et al. Research on online detection of tea stalks and insect foreign bodies by near-infrared spectroscopy and fluorescence image combined with electromagnetic vibration feeding[J]. Spectroscopy and Spectral Analysis, 2023, 43(1): 100-106. (in Chinese with English abstract)
[12] 宋彥,汪小中,趙磊,等. 基于近紅外光譜技術(shù)的眉茶拼配比例預(yù)測方法[J]. 農(nóng)業(yè)工程學(xué)報,2022,38(2):307-315. SONG Yan, WANG Xiaozhong, ZHAO Lei, et al. Predicting the blending ratio of Mee tea based on near infrared spectroscopy[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(2): 307-315. (in Chinese with English abstract)
[13] LI X, SUN C, LUO L, et al. Determination of tea polyphenols content by infrared spectroscopy coupled with iPLS and random frog techniques[J]. Computers and Electronics in Agriculture, 2015, 112: 28-35.
[14] HU D, ZHANG Y, LI X, et al. Detection of material on a tray in automatic assembly line based on convolutional neural network[J]. IET Image Processing, 2021, 15(13): 3400-3409.
[15] 徐海衛(wèi),胡常安,湯江文,等. 基于機器視覺的神經(jīng)網(wǎng)絡(luò)在茶葉鑒別中的應(yīng)用[J]. 中國測試,2014,40(3):89-92. XU Haiwei, HU Chang'an, TANG Jiangwen, et al. Application of BP neural network in classification of fresh tea grade based on machine vision[J]. China Measurement and Test, 2014, 40(3): 89-92. (in Chinese with English abstract)
[16] 劉自強,周鐵軍,傅冬和. 基于紋理和分形的鮮茶葉圖像特征提取在茶樹品種識別中的應(yīng)用[J]. 中阿科技論壇,2021(6):123-127.
[17] 汪建,杜世平. 基于顏色和形狀的茶葉計算機識別研究[J]. 茶葉科學(xué),2008,28(6):420-424. WANG Jian, DU Shiping. Identification investigation of tea based on HSI color space and figure[J]. Journal of Tea Science, 2008, 28(6): 420-424. (in Chinese with English abstract)
[18] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.Columbus, 2014: 580-587.
[19] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, 2016: 779-788.
[20] REN S, HE K, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[21] REDMON J, FARHADI A. YOLO9000: better, faster, stronger[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, 2017: 7263-7271.
[22] 呂軍,夏華鹍,方夢瑞,等. 基于AlexNet的茶葉嫩芽狀態(tài)智能識別研究[J]. 黑龍江八一農(nóng)墾大學(xué)學(xué)報,2019,31(2):72-78. LYU Jun, XIA Huayu, FANG Mengrui, et al. Research on intelligent identification of tea sprouts state based on AlexNet[J]. Journal of Heilongjiang Bayi Agricultural University, 2019, 31(2): 72-78. (in Chinese with English abstract)
[23] 孫肖肖,牟少敏,許永玉,等. 基于深度學(xué)習(xí)的復(fù)雜背景下茶葉嫩芽檢測算法[J]. 河北大學(xué)學(xué)報(自然科學(xué)版),2019,39(2):211-216. SUN Xiaoxiao, MOU Shaomin, XU Yongyu, et al. Detection algorithm of tea tender buds under complex background based on deep learning[J]. Journal of Hebei University (Natural Science Edition), 2019, 39(2): 211-216. (in Chinese with English abstract)
[24] YANG H, CHEN L, CHEN M, et al. Tender tea shoots recognition and positioning for picking robot using improved YOLO-V3 model[J]. IEEE Access, 2019, 7: 180998-181011.
[25] 林森,劉美怡,陶志勇. 采用注意力機制與改進YOLOv5的水下珍品檢測[J]. 農(nóng)業(yè)工程學(xué)報,2021,37(18):307-314. LIN Sen, LIU Meiyi, TAO Zhiyong. Detection of underwater treasures using attention mechanism and improved YOLOv5[J].Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(18): 307-314. (in Chinese with English abstract)
[26] 王金鵬,高凱,姜洪喆,等. 基于改進的輕量化卷積神經(jīng)網(wǎng)絡(luò)火龍果檢測方法[J]. 農(nóng)業(yè)工程學(xué)報,2020,36(20):218-225. WANG Jinpeng, GAO Kai, JIANG Hongzhe, et al. Method for detecting dragon fruit based on improved lightweight convolutional neural network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2020, 36(20): 218-225. (in English with Chinese abstract)
[27] 毛騰躍,張雯娟,帖軍. 基于顯著性檢測和Grabcut算法的茶葉嫩芽圖像分割[J]. 中南民族大學(xué)學(xué)報(自然科學(xué)版),2021,40(1):80-88. MAO Tengyue, ZHANG Wenjuan, TIE Jun. Image segmentation of tea buds based on salient object detection and Grabcut[J]. Journal of South-Central University for Nationalities (Natural Science Edition), 2021, 40(1): 80-88. (in Chinese with English abstract)
[28] 王子鈺,趙怡巍,劉振宇. 基于SSD算法的茶葉嫩芽檢測研究[J]. 微處理機,2020,41(4):42-48. WANG Ziyu, ZHAO Yiwei, LIU Zhenyu. Research on tea buds detection based on SSD algorithm[J]. Microprocessor, 2020, 41(4): 42-48. (in Chinese with English abstract)
[29] BANERJEE M B, ROY R B, TUDU B, et al. Black tea classification employing feature fusion of E-Nose and E-Tongue responses[J]. Journal of Food Engineering, 2018, 244: 55-63.
[30] RICHTER B, RURIK M, GURK S, et al. Food monitoring: Screening of the geographical origin of white asparagus using FT-NIR and machine learning[J]. Food Control, 2019, 104: 318-325.
[31] 安曉飛,王培,羅長海,等. 基于K-means聚類和分區(qū)尋優(yōu)的秸稈覆蓋率計算方法[J]. 農(nóng)業(yè)機械學(xué)報,2021,52(10):84-89. AN Xiaofei, WANG Pei, LUO Changhai, et al. Corn straw coverage calculation algorithm based on K-means clustering and zoning optimization method[J]. Transactions of the Chinese Society for Agricultural Machinery, 2021, 52(10): 84-89. (in Chinese with English abstract)
[32] 張振國,邢振宇,趙敏義,等. 改進YOLOv3的復(fù)雜環(huán)境下紅花絲檢測方法[J]. 農(nóng)業(yè)工程學(xué)報,2023,39(3):162-170. ZHANG Zhenguo, XING Zhenyu, ZHAO Minyi, et al. Detecting safflower filaments using an improved YOLOv3 under complex environments[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2023, 39(3): 162-170. (in Chinese with English abstract)
[33] 馮俊鑫,陳國坤,左麗君,等. 基于GF-6 WFV影像和CSLE模型的山區(qū)耕地侵蝕定量評價及特征分析[J]. 農(nóng)業(yè)工程學(xué)報,2022,38(21):169-179. FENG Junxin, CHEN Guokun, ZUO Lijun, et al. Quantitative evaluation and characteristic analysis of farmland erosion in mountainous areas based on GF-6 WFV image and CSLE model[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(21): 169-179. (in Chinese with English abstract)
[34] 段潔利,王昭銳,鄒湘軍,等. 采用改進YOLOv5的蕉穗識別及其底部果軸定位[J]. 農(nóng)業(yè)工程學(xué)報,2022,38(19):122-130. DUAN Jieli, WANG Zhaorui, ZOU Xiangjun, et al. Recognition of bananas to locate bottom fruit axis using improved YOLOv5[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(19): 122-130. (in Chinese with English abstract)
[35] 王根,江曉明,黃峰,等. 基于改進YOLOv3網(wǎng)絡(luò)模型的茶草位置檢測算法[J]. 中國農(nóng)機化學(xué)報,2023,44(3):199-207. WANG Gen, JIANG Xiaoming, HUANG Feng, et al. An algorithm for localizing tea bushes and green weeds based on improved YOLOv3 network model[J].Journal of Chinese Agricultural Machinery, 2023, 44(3): 199-207. (in Chinese with English abstract)
[36] 黃少華,梁喜鳳. 基于改進YOLOv5的茶葉雜質(zhì)檢測算法[J]. 農(nóng)業(yè)工程學(xué)報,2022,38(17):329-336. HUANG Shaohua, LIANG Xifeng. Detecting the impurities in tea using an improved YOLOv5 model[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(17): 329-336. (in Chinese with English abstract)
Detection of the quality of famous green tea based on improved YOLOv5s
YIN Chuan, SU Yihui, PAN Mian※, DUAN Jinsong
(School of Electronics and Information, Hangzhou Dianzi University, Hangzhou 310018, China)
The evaluation of tea quality directly determines market value in the tea industry. Sensory evaluation, often combined with rational analysis, has been widely used to assess tea quality in recent years, but it is prone to random judgments with large error and poor repeatability. Physical and chemical approaches are likewise limited for tea quality assessment, being costly, time-consuming and destructive. In this study, a computer-vision-based approach was proposed to assess the appearance and sensory quality of Xinchang's renowned flat green tea. Various characteristics of the tea were considered during detection, such as its small scale, high density and weak feature significance. A quality detection model for the sensory quality of tea shape was established using YOLOv5s deep learning and machine vision. A three-branch dilated convolution (TDC) structure with enlarged receptive fields was introduced to enhance the extraction of tea features in the backbone network. Additionally, the Convolutional Block Attention Module (CBAM) was introduced to determine the attention area in dense scenes using channel and spatial attention; the local perception of the network was thereby promoted to improve the detection accuracy of small-scale tea. Furthermore, the Swin Transformer network structure was introduced in the feature fusion stage to enhance the semantic information and feature representation of small targets with the help of window self-attention. Finally, positive-sample matching was improved by dynamically allocating positive samples with SimOTA: an optimal matching box was assigned to each positive tea sample, raising both the efficiency and the detection accuracy of the network. An ablation experiment was performed on the self-made tea dataset. The results show that the modified model significantly improved the average accuracy of target detection on tea images.
The improved YOLOv5s gave higher confidence scores in tea quality detection than the conventional model and achieved higher positioning accuracy. Detection accuracy on the applied dataset increased by 3.8 percentage points, indicating greatly reduced false detection. The mean average precision (mAP) and frame rate reached 91.9% and 51 frames/s respectively, the latter an improvement of 7 frames/s. Compared with current mainstream target detectors, the model achieved higher recognition accuracy and speed with excellent real-time performance, indicating its feasibility and superiority. These findings can provide a strong reference for improving quality detection in the tea market. In conclusion, the proposed computer-vision approach based on YOLOv5s can serve as a novel and effective way to assess the appearance and sensory quality of tea, with better accuracy, speed and efficiency for the tea industry.
machine vision; image recognition; tea quality detection; YOLOv5s; receptive field; swin transformer; attention mechanism; SimOTA
Received: 2022-12-21
Revised: 2023-04-24
Supported by the Zhejiang Provincial Natural Science Foundation (LQY20F010001)
YIN Chuan, Ph.D., associate professor, research interests: applications of deep learning algorithms in agriculture. Email: yinc@hdu.edu.cn
※Corresponding author: PAN Mian, Ph.D., associate professor, research interests: computer vision and intelligent agricultural information processing. Email: ai@hdu.edu.cn
10.11975/j.issn.1002-6819.202212151
TP391.4;S24
A
1002-6819(2023)-08-0179-09
尹川,蘇議輝,潘勉,等. 基于改進YOLOv5s的名優(yōu)綠茶品質(zhì)檢測[J]. 農(nóng)業(yè)工程學(xué)報,2023,39(8):179-187. doi:10.11975/j.issn.1002-6819.202212151 http://www.tcsae.org
YIN Chuan, SU Yihui, PAN Mian, et al. Detection of the quality of famous green tea based on improved YOLOv5s[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2023, 39(8): 179-187. (in Chinese with English abstract) doi:10.11975/j.issn.1002-6819.202212151 http://www.tcsae.org