楊亮東 楊志霞
CLC number: TP181
Document code: A
Abstract: Aiming at the problem that the computation scale of Robust Nonnegative Matrix Factorization (RNMF) keeps growing as the number of training samples increases, an incremental RNMF algorithm with sparseness constraints was proposed. Firstly, RNMF was performed on the initial data. Then, the factorization result was reused in the subsequent iterative computation. Finally, incremental learning was combined with a sparseness constraint imposed on the coefficient matrix, which made the objective function value decrease faster during the iterative solution. As a result, the computational cost was reduced and the sparseness of the factorized data was improved. In the numerical experiments, the proposed algorithm was compared with the RNMF algorithm and the RNMF with Sparseness Constraints (RNMFSC) algorithm. The experimental results on the ORL and YALE face databases show that the proposed algorithm is superior to the other two algorithms in terms of running time and sparseness of the factorized data, and also achieves a better clustering effect; in particular, on the YALE face database the clustering accuracy of the proposed algorithm reaches 91.67% when the number of clusters is 3.
Key words: incremental learning; Nonnegative Matrix Factorization (NMF); sparseness constraint; clustering; face recognition
0 Introduction
Image processing has been widely applied across many industries. It involves fields such as machine learning, pattern recognition and data mining, is closely related to the cognitive principles of the human brain, and has attracted extensive attention from researchers at home and abroad. Subspace dimensionality reduction is one of the most common approaches in image processing: the original high-dimensional data (including vector, matrix and tensor data) are mapped into a low-dimensional subspace for processing, which effectively avoids the difficulties caused by the curse of dimensionality [1]. To process the information contained in a data matrix for dimensionality reduction, the matrix is usually factorized; typical matrix factorization methods include Principal Component Analysis [2], Linear Discriminant Analysis [3], Independent Component Analysis [4] and Singular Value Decomposition [5]. These methods generally apply a linear transformation or factorization to the original data matrix under certain constraints, thereby reducing its dimensionality; what they have in common, however, is that negative values may appear in the solution. In many applications, negative values contradict physical reality and are not interpretable.
For this reason, Nonnegative Matrix Factorization (NMF) [6] was proposed as a new matrix factorization method; it was first presented by Lee et al. [7-8], whose work appeared in Nature in 1999. Unlike the typical matrix factorization algorithms mentioned above, NMF imposes nonnegativity constraints, which not only guarantee the interpretability of the factorization result but also give the factorized data a parts-based (local) representation. On this basis, later researchers proposed a number of improved NMF algorithms and successfully applied them to face recognition, data mining and other fields: an Incremental NMF (INMF) algorithm was proposed in [9], and graph-regularized (sparse) incremental NMF algorithms were further proposed in [10]. These improved algorithms share one common feature: their loss functions are based on the Frobenius norm or the KL divergence. In contrast, an NMF based on the L2,1 norm, the Robust Nonnegative Matrix Factorization (RNMF) algorithm, was proposed in 2011 in [11]; it immediately attracted wide attention in the research community and has led to a large body of results. To achieve specific experimental goals, some researchers added various constraints, such as sparseness, manifold structure, orthogonality and discriminability, to the basic RNMF framework, proposed several improved algorithms, applied them in various fields and obtained good experimental results: considering the geometric structure embedded in the original data, [12-13] combined manifold learning with RNMF and proposed the Graph Regularized RNMF (GRNMF) algorithm, which preserves the local geometric information carried by the data set during factorization, so that the neighborhood structure of the original samples is well represented in the low-dimensional space; the RNMF with Sparseness Constraints (RNMFSC) algorithm was proposed in [14-15], which imposes a sparseness constraint on the coefficient matrix so that the factorized face images achieve a higher recognition rate.
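For concreteness, the two kinds of loss function contrasted above can be written as follows, using the common notation X ≈ WH with nonnegative basis matrix W ∈ R^{m×r}, coefficient matrix H ∈ R^{r×n} and the n samples stored as the columns of X (this is only a brief recap of the formulations in [7] and [11]):

\min_{W \ge 0,\, H \ge 0} \|X - WH\|_F^2 = \sum_{i=1}^{m} \sum_{j=1}^{n} (X - WH)_{ij}^{2}  (standard NMF loss [7])

\min_{W \ge 0,\, H \ge 0} \|X - WH\|_{2,1} = \sum_{j=1}^{n} \sqrt{\sum_{i=1}^{m} (X - WH)_{ij}^{2}}  (RNMF loss [11])

Because the L2,1 norm sums the per-sample residual norms without squaring them, a few heavily corrupted samples contribute far less to the objective than under the Frobenius norm, which is the source of the robustness of RNMF.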
With respect to the computation time of RNMF and the sparseness of the factorized data, when samples arrive continuously, applying RNMF directly leads to an ever-growing computation scale. This paper therefore combines RNMF with incremental learning and further adds an L2,1-norm sparseness constraint, yielding the Incremental Robust Nonnegative Matrix Factorization with Sparseness Constraints (IRNMFSC) algorithm. The proposed algorithm not only effectively improves the sparseness of the factorized data and obtains a better parts-based representation of the images, but also, through incremental learning, makes full use of the factorization result obtained on the initial data in the previous step, avoiding repeated computation, reducing the running time and accelerating the convergence of the iterative optimization, so that the algorithm also achieves good clustering accuracy. Finally, simulation results on several databases show that the proposed IRNMFSC algorithm outperforms both the RNMF and the RNMFSC algorithms.
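As an informal illustration of the incremental idea described above, the following Python sketch performs RNMF on the initial batch with the reweighted multiplicative updates of [11] and then, when a new sample arrives, reuses the learned basis matrix and the stored coefficients, updating only the basis matrix and the coefficient column of the new sample (cf. INMF [9]). It is only a sketch: the exact IRNMFSC update rules, including the L2,1-norm sparseness term on the coefficient matrix, are derived in the following sections, and the function and variable names used here (rnmf_updates, incremental_step) are illustrative rather than part of the paper.

import numpy as np

def rnmf_updates(X, W, H, n_iter=100, eps=1e-10):
    # Multiplicative updates for min ||X - W H||_{2,1} s.t. W, H >= 0, following
    # the reweighting scheme of [11]: each sample (column of X) is weighted by
    # the inverse L2 norm of its current reconstruction residual.
    for _ in range(n_iter):
        d = 1.0 / np.maximum(np.linalg.norm(X - W @ H, axis=0), eps)
        XD, HD = X * d, H * d                      # same as X @ diag(d), H @ diag(d)
        H *= (W.T @ XD) / np.maximum(W.T @ W @ HD, eps)
        HD = H * d
        W *= (XD @ H.T) / np.maximum(W @ HD @ H.T, eps)
    return W, H

def incremental_step(X, W, H, x_new, n_iter=30, eps=1e-10):
    # When a new sample arrives, append one coefficient column and update only
    # W and that column; the previously learned coefficients are kept fixed and
    # the previous W is reused as a warm start, so far fewer iterations suffice.
    x_new = x_new.reshape(-1, 1)
    X = np.hstack([X, x_new])
    H = np.hstack([H, np.full((W.shape[1], 1), 1e-2)])
    for _ in range(n_iter):
        # update of the newest coefficient column only (its residual weight
        # cancels in the multiplicative update for a single column)
        H[:, -1:] *= (W.T @ x_new) / np.maximum(W.T @ W @ H[:, -1:], eps)
        # reweighted update of the basis matrix over all samples seen so far
        d = 1.0 / np.maximum(np.linalg.norm(X - W @ H, axis=0), eps)
        XD, HD = X * d, H * d
        W *= (XD @ H.T) / np.maximum(W @ HD @ H.T, eps)
    return X, W, H

# Usage sketch: factorize an initial batch of 20 samples, then add 3 new ones.
rng = np.random.default_rng(0)
X = rng.random((64, 20))
W, H = rnmf_updates(X, rng.random((64, 5)), rng.random((5, 20)))
for x in rng.random((3, 64)):
    X, W, H = incremental_step(X, W, H, x)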
4 Conclusion
This paper proposed the Incremental Robust Nonnegative Matrix Factorization with Sparseness Constraints (IRNMFSC) algorithm and gave the corresponding iterative update rules and algorithm steps; the algorithm maintains a good convergence rate as the number of newly added samples increases. Numerical experiments on the YALE and ORL face data show that the proposed algorithm converges faster during the iterations and finds better solutions: it not only shortens the running time but also makes the factorized data sparser, yielding basis images with stronger parts-based expressiveness and relatively high clustering accuracy. However, the algorithm still has shortcomings: for example, there is as yet no unified standard for choosing the initial values of the iteration or the sparseness parameter. These problems will be the focus of future work.
References
[1] PARDALOS P M. Approximate Dynamic Programming: Solving the Curses of Dimensionality[M]. New York: Wiley, 2009: 39-50.
[2] SHI W Y, GUO Y F, XUE X Y. A kernel principal component analysis algorithm for solving large scale data sets problem[J]. Journal of Software, 2009, 20(8): 2153-2159. (in Chinese)
[3] ZHANG Y P, DOU R R, ZHAO S, et al. Regularized linear discriminant analysis for face recognition based on ensemble learning[J]. Computer Engineering, 2010, 36(14): 144-146. (in Chinese)
[4] KOLDOVSKY Z, TICHAVSKY P. Efficient variant of algorithm FastICA for independent component analysis attaining the Cramér-Rao lower bound[C]// Proceedings of the 2005 IEEE/SP Workshop on Statistical Signal Processing. Piscataway, NJ: IEEE, 2005: 1090-1095.
[5] GOLUB G H, REINSCH C. Singular Value Decomposition and Least Squares Solutions[M]. Berlin: Springer, 1971: 403-420.
[6] PAATERO P, TAPPER U. Positive matrix factorization: a nonnegative factor model with optimal utilization of error estimates of data values[J]. Environmetrics, 1994, 5(2): 111-126.
[7] LEE D D, SEUNG H S. Learning the parts of objects by nonnegative matrix factorization[J]. Nature, 1999, 401(6755): 788-791.
[8] LEE D D, SEUNG H S. Algorithms for nonnegative matrix factorization[J]. Advances in Neural Information Processing Systems, 2001, 13(6): 556-562.
[9] BUCAK S S, GUNSEL B. Incremental subspace learning via nonnegative matrix factorization[J]. Pattern Recognition, 2009, 42(5):788-797.
[10] YU Z Z, LIU Y H, LI B, et al. Incremental graph regularized nonnegative matrix factorization for face recognition[J]. Journal of Applied Mathematics, 2014, 2014(1): Article ID 928051.
[11] KONG D, HUANG H. Robust nonnegative matrix factorization using L21-norm[C]// Proceedings of the 20th ACM International Conference on Information and Knowledge Management. New York: ACM, 2011: 673-682.
[12] HUANG S, WANG H, LI T, et al. Robust graph regularized nonnegative matrix factorization for clustering[J]. ACM Transactions on Knowledge Discovery from Data, 2017, 11(3): Article No. 33.
[13] DAI L Y, FENG C M, LIU J X, et al. Robust nonnegative matrix factorization via joint graph Laplacian and discriminative information for identifying differentially expressed genes[J]. Complexity, 2017, 2017(40): Article ID 4216797.
[14] YANG S, HOU C, ZHANG C, et al. Robust nonnegative matrix factorization via joint sparse and graph regularization[C]// Proceedings of the 2013 International Joint Conference on Neural Networks. Piscataway, NJ: IEEE, 2014:541-559.
[15] DU S, SHI Y, WANG W. L(3/2) sparsity constrained graph nonnegative matrix factorization for image representation[C]// Proceedings of the 26th Chinese Control and Decision Conference. Piscataway, NJ: IEEE, 2014:2962-2965.
[16] CAI D, HE X, WU X, et al. Nonnegative matrix factorization on manifold[C]// Proceedings of the 8th IEEE International Conference on Data Mining. Piscataway, NJ: IEEE, 2008: 63-72.
[17] XU W, LIU X, GONG Y. Document clustering based on nonnegative matrix factorization[C]// Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. New York: ACM, 2003:267-273.
[18] WANG J T, CAO Y D, SUN F M. Incremental learning algorithm based on graph regularized nonnegative matrix factorization with sparseness constraints[J]. Journal of Computer Applications, 2017, 37(4): 1071-1074. (in Chinese)