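LAI Ji, SHENG Hong-lei
Abstract: Link prediction is an important research direction in data mining for complex networks, yet the relationship between the structure of a complex network and the performance of prediction methods has received little attention. From the perspective of clustering analysis, this paper examines how complex network structure affects the performance of six existing similarity-based link prediction methods, analyzing comparative experiments on synthetic and real complex networks. The results show that as clustering increases, the prediction accuracy of all six methods improves greatly. For sparse complex networks with low clustering, the superposed random walk (SRW) gives the best prediction performance, whereas for dense complex networks with high clustering, the resource allocation index (RA) performs best. Therefore, different link prediction methods should be adopted for different types of complex networks.
Key words: complex network; link prediction; clustering analysis; similarity measure
CLC number: TP319    Document code: A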
Research on Link Prediction Performance of Complex
Networks Based on Clustering Analysis
LAI Ji1, SHENG Hong-lei2
(1. State Grid Jibei Information & Telecommunication Company, Beijing 100053, China;
2. Nari Group Co., Ltd (State Grid Electric Power Research Institute), Nanjing, Jiangsu 210000, China)
Abstract: Link prediction is an important research direction in data mining for complex networks. However, the relationship between the structure of complex networks and the performance of prediction methods has received little attention. This paper examines, from the perspective of clustering analysis, how complex network structure affects the performance of six existing similarity-based link prediction methods, using comparative experiments on synthetic and real complex networks. The results show that the prediction accuracy of all six methods improves greatly as clustering increases. For sparse complex networks with low clustering, SRW performs best, while for dense complex networks with high clustering, RA performs best. Therefore, different methods should be adopted for link prediction in different types of complex networks.
Key words: complex network; link prediction; clustering analysis; similarity measure
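
As an illustration of two quantities named in the abstract, the following minimal sketch computes the resource allocation (RA) score for unconnected node pairs and an average clustering value for a small graph. It assumes the standard RA definition (the sum of 1/degree over the common neighbors of two nodes) and assumes that "clustering" can be measured by the average local clustering coefficient; the paper's own measure may differ. The toy edge list and all function names are hypothetical and are not taken from the paper's experiments.

```python
from itertools import combinations


def build_graph(edges):
    """Return an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def resource_allocation(adj, x, y):
    """RA score: sum of 1/degree(z) over the common neighbors z of x and y."""
    return sum(1.0 / len(adj[z]) for z in adj[x] & adj[y])


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges that exist between pairs of this node's neighbors.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


# Hypothetical toy network (assumption for illustration, not data from the paper).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
adj = build_graph(edges)

# Score every currently unconnected pair; higher scores suggest likelier links.
ra_scores = {
    (x, y): resource_allocation(adj, x, y)
    for x, y in combinations(sorted(adj), 2)
    if y not in adj[x]
}
ranked = sorted(ra_scores.items(), key=lambda item: item[1], reverse=True)
```

In this sketch, `ranked` lists candidate links from most to least likely under RA, and `average_clustering(adj)` gives one possible number for placing a network on the low-clustering to high-clustering scale that the abstract refers to.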