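YANG Huiqu, YANG Guowei, HE Jinzhong, XU Jian
Abstract: To address the problems that quantized models do not support integer-only inference and that shared exponents are distorted by outliers, this paper proposes an incremental fixed-point quantization algorithm for neural networks that supports integer-only inference (integer-only incremental quantization, IOIQ). The algorithm compresses a floating-point model by converting the network weights and features from floating-point data to data with an integer shared exponent (integer-shared exponent, INT-SE). During pseudo-quantization training, IOIQ adopts an incremental strategy that quantizes the floating-point data step by step and updates them iteratively, which avoids the large accuracy loss of one-shot quantization. To prevent data overflow during inference, the differences among the shared exponents of the quantized data in each layer are collected to determine the optimal truncation point of each layer's output features, and a hardware implementation scheme for the quantized model on the inference side is given. A model quantized by IOIQ contains no floating-point data during inference; all operations are integer operations, so the model is easy to deploy at the edge. Experimental results show that at 8-bit precision, ResNet50 quantized by IOIQ loses 0.2% top-1 accuracy on the CIFAR dataset and 0.58% on the ImageNet dataset, outperforming quantization methods such as efficient integer-arithmetic-only inference and incremental network quantization. The method is of practical value for deployment on edge devices.
Keywords: neural network; quantization; integer-only; shared exponent; incremental quantization; optimal truncation point
CLC number: TP183 Document code: A
Article ID: 1006-9798(2023)02-0010-08; DOI: 10.13306/j.1006-9798.2023.02.002
Funding: General Program of the National Natural Science Foundation of China (62172229)
About the author: YANG Huiqu (1998-), male, master's student; research interests: AI acceleration chips and neural network model compression.
Corresponding author: YANG Guowei (1964-), male, Ph.D., professor, master's supervisor; research interests: intelligent information processing, pattern recognition, and intelligent control. Email: ygw_ustb@163.com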
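Neural network model compression aims to reduce the consumption of computing resources and storage without significantly degrading model performance [1-3]. The main approaches are pruning [4], quantization [5-10], knowledge distillation [11], and compact network design [12-15]. Because quantization is simple to deploy and accelerates hardware inference, it has attracted wide attention from academia and industry worldwide. LIN D et al. [16] proposed a fixed-point data format with a shared exponent, which is determined by monitoring the maximum weight value. JACOB B et al. [17] proposed a quantization algorithm for efficient integer-arithmetic-only (IAO) inference that adopts pseudo-quantization training and was validated on hardware devices. ZHOU A et al. [18] proposed incremental network quantization (INQ), which quantizes part of the data at a time, fine-tunes the data that remain unquantized, and then quantizes again until all data are quantized. Although some of these works achieve a fully fixed-point design, they choose the shared exponent directly from the maximum and minimum values, so the shared exponent is affected by outliers and the fixed-point data drift away from the floating-point data. Other researchers, in order to preserve model performance, do not quantize the model completely (only the weights are quantized), which makes hardware deployment harder. To address these two problems, namely the lack of support for integer-only inference and the influence of outliers on the shared exponent, this paper proposes an incremental fixed-point quantization algorithm for neural networks that supports integer-only inference. The algorithm represents floating-point data as integer data with a shared exponent, determines the best shared exponent from the statistics of the floating-point data distribution so that the effective precision of the data is preserved, and fine-tunes the fixed-point data through incremental quantization to obtain the final quantized model. The quantized model supports integer-only inference, which speeds up inference and lowers the difficulty of deployment on lightweight devices. The main contribution is that the shared exponent is no longer dictated by outliers.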
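1 IOIQ Algorithm
IOIQ is a quantization algorithm designed for integer-only inference. It supports quantizing both weights and activations to 8 bit or even fewer bits. The framework of IOIQ is shown in Fig. 1: floating-point training samples are the input, and the quantized model is output once quantization training is complete.
IOIQ consists of two parts: INT-SE data quantization and incremental quantization. The INT-SE quantization strategy runs through the whole quantization process, while incremental quantization updates a mask matrix in every iteration cycle so that the weight data are quantized step by step. On receiving a set of floating-point data, IOIQ first quantizes them to INT-SE data with the INT-SE quantization method and, depending on the layer type (convolutional/fully connected layer or batch normalization layer), sends the INT-SE data to the corresponding quantization module. In the convolutional/fully connected module, the floating-point weights are first quantized to INT-SE data, which are then combined with the original floating-point weights into composite weights for computation; finally, the output features of the layer are quantized to INT-SE data. In the batch normalization module, the variance and mean are first learned iteratively, after which the parameters are locked and the computation is simplified. Once all layers have been computed, the mask matrix is updated according to the incremental quantization strategy. This process is repeated until the maximum number of iterations is reached and training ends. Finally, the shared-exponent differences of each layer are collected and the optimal truncation points of the output features are computed, which yields the quantized neural network model.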
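1.1 INT-SE Data Quantization Strategy
To minimize storage consumption, this paper proposes the INT-SE data format, which is dedicated to integer computation. All computations between INT-SE data are multiply-accumulate operations between integers, which keeps the computation simple and convenient for hardware design.
INT-SE data consist of two parts: the data D (Data), which holds the data portion, and the shared exponent s (shared exponent), which gives the position of the binary point for all members of the data portion. Together, D and s determine the floating-point value T that the INT-SE data represent.
In INT-SE data, the shared exponent s expresses the order of magnitude and determines the range of quantized values, so it plays the role of a quantization scale factor. To reduce the number of parameters and simplify computation, IOIQ therefore sets only one shared exponent for the parameters of each layer. To represent the quantized values more accurately and reduce error, IOIQ uses the mean square error (MSE) as the criterion for choosing the best shared exponent. The MSE is the squared Euclidean distance between the parameter vectors before and after quantization, averaged over the elements; the smaller the MSE, the closer the quantized values are to the original values. The INT-SE quantization flow is shown in Fig. 2.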
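The MSE-based choice of the shared exponent can be sketched in plain Python as follows. The sketch assumes that an INT-SE value is reconstructed as T = D * 2**s, that D is a signed integer clipped to the chosen bit width, and that the initial exponent is the smallest one whose range covers the largest magnitude. None of these details is stated in the text above, and the helper names and the default candidate offsets are hypothetical.

```python
import math

def quantize_int_se(values, s, bits=8):
    """Quantize floats to integers D that share exponent s (assumed T = D * 2**s)."""
    q_min, q_max = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    scale = 2.0 ** s
    # Round to the nearest integer, then clip to the signed range of the bit width.
    return [max(q_min, min(q_max, round(v / scale))) for v in values]

def dequantize_int_se(data, s):
    """Recover the floating-point values represented by (D, s)."""
    return [d * 2.0 ** s for d in data]

def mse(a, b):
    """Mean square error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def initial_exponent(values, bits=8):
    """Assumed starting point: smallest exponent that covers the largest magnitude."""
    q_max = (1 << (bits - 1)) - 1
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return 0
    return math.ceil(math.log2(max_abs / q_max))

def best_exponent(values, bits=8, offsets=(-2, -1, 0, 1)):
    """Pick the candidate exponent with the lowest MSE (offsets are hypothetical)."""
    s0 = initial_exponent(values, bits)
    candidates = [s0 + k for k in offsets]
    return min(
        candidates,
        key=lambda s: mse(values, dequantize_int_se(quantize_int_se(values, s, bits), s)),
    )

# 101 small values plus one outlier of 8.0.
weights = [0.01 * i for i in range(-50, 51)] + [8.0]
print(initial_exponent(weights), best_exponent(weights))
```

In this toy example the exponent derived from the maximum value is -3, because it must cover the single outlier, whereas the MSE criterion selects -4: clipping the outlier slightly costs less than halving the resolution of every other value.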
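In the process above, a different mask matrix is used as the iterations proceed, so the weight matrix is quantized incrementally. IOIQ obtains the mask matrix with one of two strategies: a random strategy and a pruning strategy. The random strategy generates the mask matrix at random according to the quantization ratio; the pruning strategy sets the mask matrix according to the absolute values of the weights in the current layer and the quantization ratio.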
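A minimal sketch of the two mask strategies and of the composite weights is given below. It assumes that a mask entry of 1 marks a weight that is replaced by its quantized value and 0 marks a weight that stays in floating point, and that the pruning strategy selects the largest-magnitude weights first (as in INQ). The function names, the flat weight list, and the fixed exponent used for the fake quantization are illustrative assumptions, not details given in the paper.

```python
import random

def pruning_mask(weights, ratio):
    """Mark the `ratio` fraction of weights with the largest magnitude as quantized (1)."""
    n_quant = round(len(weights) * ratio)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    mask = [0] * len(weights)
    for i in order[:n_quant]:
        mask[i] = 1
    return mask

def random_mask(weights, ratio, seed=0):
    """Mark a randomly chosen `ratio` fraction of weights as quantized (1)."""
    n_quant = round(len(weights) * ratio)
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(weights)), n_quant))
    return [1 if i in chosen else 0 for i in range(len(weights))]

def fake_quantize(weights, s):
    """Round each weight to the nearest multiple of 2**s (value stays a float)."""
    step = 2.0 ** s
    return [round(w / step) * step for w in weights]

def composite_weights(float_w, quant_w, mask):
    """Use the quantized value where the mask is 1 and the float value elsewhere."""
    return [q if m else w for w, q, m in zip(float_w, quant_w, mask)]

w = [0.9, -0.1, 0.3, -0.7]
mask = pruning_mask(w, 0.5)                       # quantize half of the weights
mixed = composite_weights(w, fake_quantize(w, -2), mask)
print(mask, mixed)
```

Raising the ratio from one iteration cycle to the next (for example 0.25, 0.5, 0.75, 1.0, a hypothetical schedule) eventually marks every weight as quantized.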
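1.3 Quantization of Batch Normalization Layers and Biases
Convolutional and fully connected layers can convert their weights to INT-SE data directly, but a batch normalization layer has four different parameters, and the mean and variance used on the inference side are those of the training process, obtained by a moving average. A batch normalization layer also contains complex operations such as division and square root. To simplify its computation, make it better suited to hardware design, and avoid the accuracy loss caused by quantizing it, IOIQ introduces two computation modes for the quantization training of batch normalization layers: a learning-parameter mode and a locked-parameter mode.
The learning-parameter mode is used first to update the moving mean μ and the moving variance σ². The locked-parameter mode then fuses the four parameters and quantizes them, after which the floating-point output features are computed by Eq. (8) and quantized to INT-SE output features that are passed to the next layer. Because the bias has to be accumulated with the temporary result of a multiplication or convolution, computation becomes harder if the shared exponent of the bias is smaller than that of the temporary result. IOIQ therefore first collects statistics on the bias to compute a reference shared exponent and then restricts it: if the shared exponent of the value to which the bias is added is s, the shared exponent of the bias is limited to the range between s and s+7. Once the bias shared exponent is fixed, the bias is quantized from floating-point data to INT-SE data. The locked-parameter quantization training flow of the batch normalization layer is shown in Fig. 5.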
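The fusion of the four batch normalization parameters can be illustrated with the standard algebraic folding into one scale and one offset, which removes the division and the square root from inference. Eq. (8) is not reproduced in this text, so the sketch below is only the usual identity y = a * x + b with a = γ / sqrt(σ² + ε) and b = β - a * μ, not necessarily the exact form used by IOIQ. The helper `clamp_bias_exponent` is a hypothetical reading of the stated range for the bias shared exponent.

```python
import math

def batchnorm_reference(x, gamma, beta, mean, var, eps=1e-5):
    """Textbook batch normalization at inference time (division and square root)."""
    return gamma * (x - mean) / math.sqrt(var + eps) + beta

def fold_batchnorm(gamma, beta, mean, var, eps=1e-5):
    """Fuse the four parameters into a scale a and an offset b so that y = a * x + b."""
    a = gamma / math.sqrt(var + eps)
    b = beta - a * mean
    return a, b

def clamp_bias_exponent(reference_exponent, s):
    """Restrict the bias shared exponent to the range between s and s + 7 (assumed reading)."""
    return min(max(reference_exponent, s), s + 7)

a, b = fold_batchnorm(gamma=1.5, beta=0.2, mean=0.4, var=2.0)
x = 3.0
# The fused form agrees with the reference form up to floating-point rounding.
print(a * x + b, batchnorm_reference(x, 1.5, 0.2, 0.4, 2.0))
```

After folding, only a and b need to be quantized to INT-SE data, and the layer reduces to one multiplication and one addition per element.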
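1.4 Hardware Implementation of the Quantized Model on the Inference Side
When the quantized model is implemented in inference hardware, precision is lost only when a multiply-accumulate operation overflows; all other multiply-accumulate and shift operations are lossless. To avoid overflow and preserve precision, IOIQ therefore collects three quantities for every layer after quantization training: the weight shared exponent s_w, the moving average s_in of the input shared exponent during training, and the moving average s_out of the output shared exponent. From these it computes the optimal truncation point (OTP), denoted P_OT.
Fig. 6 shows the integer-only computation flow of the quantized convolutional, fully connected, and batch normalization layers in hardware. The input integer data D_in and the weight integer data D_w are repeatedly multiplied and accumulated to give the integer data D_t1 of the temporary result, whose shared exponent s_t1 is the sum of the input shared exponent s_in and the weight shared exponent s_w. To align with the temporary result, the bias integer data D_b are shifted according to s_t1 and s_b and then added to D_t1, which gives the accumulated result D_t2. After many accumulations, the bit width of the accumulated result can grow to as much as 32 bits. To keep the output at the same bit width as the input, D_t2, s_t2, and the optimal truncation point P_OT go through a round-and-truncate operation (R&T), which produces the output data D_out and s_out at the same bit width as the input.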
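The flow of Fig. 6 can be sketched for a single output value as follows. The formula for P_OT is not reproduced in this text, so the sketch assumes that values are reconstructed as D * 2**s and takes the truncation point to be the exponent difference P_OT = s_out - (s_in + s_w), i.e. the number of low-order bits dropped from the accumulator. The function name, the round-half-up rule, and the final saturation are likewise assumptions.

```python
def integer_layer_output(d_in, d_w, d_b, s_in, s_w, s_b, s_out, bits=8):
    """Integer-only dot product with bias, followed by round-and-truncate (R&T)."""
    s_t1 = s_in + s_w                                  # exponent of the products
    acc = sum(x * w for x, w in zip(d_in, d_w))        # D_t1: integer multiply-accumulate
    align = s_b - s_t1                                 # left shift that aligns the bias
    if align < 0:
        raise ValueError("bias exponent must not be below the accumulator exponent")
    acc += d_b << align                                # D_t2: accumulator plus aligned bias
    p_ot = s_out - s_t1                                # assumed truncation point
    if p_ot > 0:
        acc = (acc + (1 << (p_ot - 1))) >> p_ot        # round half up, then drop p_ot bits
    elif p_ot < 0:
        acc <<= -p_ot
    q_min, q_max = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(q_min, min(q_max, acc))                 # saturate to the output bit width

# Inputs 2**-4 * [10, -20, 30], weights 2**-6 * [3, 4, -5], bias 2**-8 * 5.
d_out = integer_layer_output([10, -20, 30], [3, 4, -5], 5,
                             s_in=-4, s_w=-6, s_b=-8, s_out=-5)
print(d_out, d_out * 2.0 ** -5)
```

The exact real-valued result of this example is -0.17578125, and the 8-bit output D_out = -6 with s_out = -5 represents -0.1875, which is within half an output step. Python integers are unbounded, so the 32-bit accumulator limit mentioned above is not enforced by this sketch.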
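2 Experiments and Analysis
2.1 Experimental Design
To verify the performance of IOIQ thoroughly, this paper designs a multi-bit experiment, a shared-exponent experiment, and an incremental-quantization experiment, and compares IOIQ with the typical neural network quantization algorithms BWN [19-20], TWN [21], INQ [18], FGQ [22], and IAO [17]. The experiments examine the quantization results of four neural network models: ResNet18 [23], ResNet34 [23], ResNet50 [23], and the lightweight network MobileNetv2 [14].
The datasets are CIFAR and ImageNet. The CIFAR10 and CIFAR100 datasets [24] are labeled subsets of the 80 Million Tiny Images collection and each contain 60 000 color images of 32×32 pixels; the ImageNet dataset [25] contains about 1.2 million training images and 50 000 validation images.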
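2.2 Experimental Results and Analysis
2.2.1 Multi-bit Experiment
The neural network model is quantized from floating-point data to INT-SE data of different bit widths. The results of IOIQ at different bit widths are listed in Table 1.
Table 1 shows that for ResNet18 on CIFAR10, accuracy after 16-bit quantization is 0.02% higher than that of FP32 and after 12-bit quantization 0.01% higher, while 8-bit quantization lowers it by 0.03% and 6-bit quantization causes the largest drop, 4.35%. At high bit widths, accuracy can therefore exceed the floating-point baseline. Current deep neural networks are generally redundant, and moderate quantization may reduce that redundancy and thus improve generalization, whereas too few bits carry too little information and degrade accuracy, as with 6-bit quantization. Compared with the original floating-point model, 8-bit quantization loses only 0.03%. IOIQ therefore achieves the best trade-off between accuracy and bit width at 8 bit, and the subsequent experiments are based on 8-bit quantization.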
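2.2.2 Shared-Exponent Experiment
Table 2 gives the effect of different shared-exponent candidate sets on the performance of the quantized model. In Table 2, s denotes the initial shared exponent, and +1 denotes the statistically obtained shared exponent plus 1.
Table 2 shows that when each of the five candidate values s-1, s, s+1, s+2, and s-2 is used alone as the candidate set, the best quantized model is obtained with the initial shared exponent s, for which the accuracy of ResNet18 drops by only 0.43%, and the worst with s-2, for which accuracy drops by 22.2%. Two candidate values give a better quantized model than one, with a drop of only about 0.3%; four candidate values give the best quantized model, with the accuracy of ResNet18 dropping by only 0.18%. These results indicate that a single shared exponent cannot make INT-SE data accurate, and that using more shared exponents as candidates overcomes the influence of outliers on the shared exponent and makes INT-SE data more accurate. Weighing performance against time cost, four candidate values are used as the candidate set from which the optimal shared exponent is chosen.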
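2.2.3 Incremental-Quantization Experiment
Table 3 compares the effect of the two incremental quantization strategies, pruning-based and random, on model performance. On the CIFAR10 dataset, the ResNet18 model quantized with the pruning-based incremental strategy loses only 0.18% accuracy, whereas the model quantized with the random incremental strategy loses 0.37%.
Table 3 shows that the pruning-based incremental strategy yields the better quantized model. A neural network is sensitive to its weight parameters, and weights with large absolute values have a large influence on the result. Under random quantization, such weights may not be quantized until the last round and then need a longer training period for fine-tuning. IOIQ therefore uses the pruning-based incremental strategy to quantize the weights.
To check whether the data are consistent before and after quantization, the probability density distributions of the weight parameters of the floating-point and quantized ResNet18 models on CIFAR10 are plotted. The distributions before and after quantization are shown in Fig. 7 for the convolutional-layer weights, in Fig. 8 for the fully connected-layer weights, and in Fig. 9 for the batch normalization-layer weights.
Figs. 7 to 9 show that the weight distributions of the floating-point and quantized models are essentially the same for the convolutional and fully connected layers; only the batch normalization layers show some quantization loss. The reason is that the four batch normalization parameters are locked directly after quantization, without fine-tuning through training, so some data drift away from the floating-point data.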
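2.2.4 Comparison with Other Algorithms
Before the comparison with other algorithms, IOIQ is first tested on additional networks; the results for different networks are listed in Table 4. Table 4 shows that on CIFAR10, the accuracy of ResNet50 drops by 0.2% after quantization; on CIFAR100, ResNet50 drops by 0.52%; on ImageNet, ResNet50 drops by 0.58% and MobileNetv2 by 0.35%.
To further demonstrate the performance of IOIQ, it is compared with BWN [19-20], TWN [21], INQ [18], FGQ [22], and IAO [17]; the results of the different methods for ResNet50 on ImageNet are listed in Table 5. Table 5 shows that IOIQ outperforms the other algorithms, and that its accuracy is 0.92% higher than that of IAO. In addition, IOIQ remedies three shortcomings of the other algorithms, namely the lack of support for integer-only inference, the influence of outliers on the shared exponent, and the accuracy loss of one-shot quantization, and it is better suited to hardware design.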
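3 Conclusion
This paper proposes IOIQ, an incremental fixed-point quantization algorithm for neural networks that supports integer-only inference. The INT-SE quantization method quantizes the parameters of the neural network to integer data, and incremental quantization fine-tunes the data. Choosing the optimal shared exponent with the MSE as the criterion solves the problem of outliers affecting the shared exponent. To counter the accuracy loss of one-shot quantization, IOIQ uses an improved incremental quantization algorithm that quantizes only part of the data at a time and allows the quantized data to be updated afterwards. Because integer arithmetic can overflow, the shared exponents of all layers are collected at the end of training to obtain the optimal truncation point of the output features. After quantization, all parameters of the model are INT-SE data, which greatly reduces storage, and inference consists entirely of integer multiply-accumulate operations, which speeds up computation and suits hardware design. The experimental results show that IOIQ performs well on the ResNet and MobileNet networks: the top-1 accuracy of ResNet50 quantized by IOIQ drops by 0.2% on the CIFAR dataset and by 0.58% on the ImageNet dataset, which demonstrates the effectiveness of the method. To explore the limits of hardware acceleration, future work can study quantization strategies for even lower bit widths.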
References:
[1] NAN K,LIU S,DU J,et al. Deep model compression for mobile platforms:A survey[J]. Tsinghua Science and Technology,2019,24(6):677-693.
[2] HU D,KRISHNAMACHARI B. Fast and accurate streaming CNN inference via communication compression on the edge[C]∥2020 IEEE/ACM Fifth International Conference on Internet-of-Things Design and Implementation (IoTDI). Sydney,NSW,Australia:ACM,2020.
[3] LAI Y,HAO S,HUANG D. Methods and progress in deep neural network model compression[J]. Journal of East China Normal University (Natural Science),2020(5):68-82.
[4] ALVAREZ J M,SALZMANN M. Learning the number of neurons in deep networks[C]∥ In Proceedings of the 30th International Conference on Neural Information Processing Systems (NIPS′16). Red Hook,NY,USA:Curran Associates Inc,2016:2270-2278.
[5] DING W,LIU C,LI Y,et al. A survey of binary convolutional neural networks[J]. Acta Aeronautica et Astronautica Sinica,2021,42(6):192-206.
[6] RASTEGARI M,ORDONEZ V,REDMON J,et al. Xnor-net:Imagenet classification using binary convolutional neural networks[C]∥European Conference on Computer Vision. Amsterdam,Netherlands:Springer,Cham,2016:525-542.
[7] CHEN W,WILSON J T,TYREE S,et al. Compressing neural networks with the hashing trick[C]∥International Conference on Machine Learning. Lille,France:JMLR,2015:2285-2294.
[8] HAN S,MAO H,DALLY W J. Deep compression:Compressing deep neural networks with pruning,trained quantization and huffman coding[J/OL]. arXiv preprint arXiv:1510.00149,2015.
[9] COURBARIAUX M,BENGIO Y,DAVID J P. BinaryConnect:Training deep neural networks with binary weights during propagations[C]∥International Conference on Neural Information Processing Systems. Montreal,Canada:MIT Press,2015:3123–3131.
[10] QIN H,GONG R,LIU X,et al. Forward and backward information retention for accurate binary neural networks[C]∥2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle,WA,USA:IEEE,2020:2247-2256.
[11] HINTON G,VINYALS O,DEAN J. Distilling the Knowledge in a Neural Network[J]. Computer Science,2015,14(7):38-39.
[12] IANDOLA F N,HAN S,MOSKEWICZ M W,et al. SqueezeNet:AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size[J/OL]. arXiv preprint arXiv:1602.07360,2016.
[13] HOWARD A G,ZHU M,CHEN B,et al. Mobilenets:Efficient convolutional neural networks for mobile vision applications[J/OL]. arXiv preprint arXiv:1704.04861,2017.
[14] SANDLER M,HOWARD A,ZHU M,et al. MobileNetV2:Inverted residuals and linear bottlenecks[C]∥ 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City,UT,USA:IEEE,2018:4510-4520.
[15] HOWARD A,SANDLER M,CHEN B,et al. Searching for MobileNetV3[C]∥2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul,Korea(South):IEEE,2020:1314-1324.
[16] LIN D,TALATHI S,ANNAPUREDDY S. Fixed point quantization of deep convolutional networks[C]∥International conference on machine learning. New York,USA:PMLR,2016:2849-2858.
[17] JACOB B,KLIGYS S,CHEN B,et al. Quantization and training of neural networks for efficient integer-arithmetic-only inference[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City,UT,USA:IEEE,2018:2704-2713.
[18] ZHOU A,YAO A,GUO Y,et al. Incremental network quantization:Towards lossless cnns with low-precision weights[J/OL]. arXiv preprint arXiv:1702.03044,2017.
[19] HUBARA I,COURBARIAUX M,SOUDRY D,et al. Quantized neural networks:Training neural networks with low precision weights and activations[J]. The Journal of Machine Learning Research,2017,18(1):6869-6898.
[20] LENG C,DOU Z,LI H,et al. Extremely low bit neural network:Squeeze the last bit out with admm[C]∥Thirty-Second AAAI Conference on Artificial Intelligence. Louisiana,USA:AAAI,2018:3466-3473.
[21] LI F,ZHANG B,LIU B. Ternary weight networks[J/OL]. arXiv preprint arXiv:1605.04711,2016.
[22] MELLEMPUDI N,KUNDU A,MUDIGERE D,et al. Ternary neural networks with fine-grained quantization[J/OL]. arXiv preprint arXiv:1705.01462,2017.
[23] HE K,ZHANG X,REN S,et al. Deep residual learning for image recognition[C]∥2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas,NV,USA:IEEE,2016:770-778.
[24] KRIZHEVSKY A,HINTON G. Learning multiple layers of features from tiny images[R]. Toronto:University of Toronto,2009.
[25] DENG J,DONG W,SOCHER R,et al. ImageNet:A large-scale hierarchical image database[C]∥2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami,FL,USA:IEEE,2009:248-255.
Abstract: To port convolutional neural networks to lightweight devices, this paper proposes an incremental fixed-point quantization algorithm for neural networks that supports integer-only inference (IOIQ). The floating-point model is compressed by converting the network weights and features from floating-point data to data with an integer shared exponent (INT-SE). During pseudo-quantization training, IOIQ adopts an incremental quantization strategy that gradually quantizes and iteratively updates the floating-point data, which avoids the large accuracy loss of one-shot quantization. To solve the data overflow problem during inference, the optimal truncation point of each layer's output features is determined from the differences among the shared exponents of the quantized data in each layer, and a hardware implementation scheme for the quantized model on the inference side is given. A model quantized by IOIQ contains no floating-point data during inference; all operations are integer operations, so the model is easy to deploy at the edge. Experimental results show that at 8-bit precision the top-1 accuracy of ResNet50 quantized by IOIQ decreases by 0.2% on the CIFAR dataset and by 0.58% on the ImageNet dataset, outperforming efficient integer-arithmetic-only inference (IAO) quantization and incremental network quantization (INQ). The method is of practical value for deployment on edge devices.
Key words: neural network; quantization; integer-only; shared exponent; incremental quantization; optimal truncation point