JIN Yu
CLC number: TN911-34; U665    Document code: A    Article ID: 1004-373X(2019)01-0024-04
Abstract: To keep pace with a changing environment and with advances in science and technology, the real-time and heterogeneous characteristics of network data are studied. The network data detection and storage system is optimized in combination with the cloud computing method, a new design scheme for a big data storage system is proposed, and a big data storage and processing model based on cloud computing is designed. The cloud computing method is used to analyze the characteristics of network data. Extension of the data model, storage, and parallel processing are studied in order to optimize the data storage system and improve its practical value. System performance was tested experimentally. The results show that the system outperforms the traditional information storage system and effectively achieves the design goal.
Keywords: big data compatibility storage; relational data model; big data storage model; cloud computing; information storage; parallel processing
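With the rapid development of modern technology, the need to make massive data compatible and to store it quickly and effectively has drawn growing attention. Traditional databases struggle to meet the demands of precise filtering, compatibility, and mass storage and retrieval for big data, so data storage systems have difficulty transmitting and retrieving data quickly and efficiently [1]. For these reasons, this paper uses the cloud computing method to redesign the big data compatibility storage system and to renew the methods for extending and storing big data, with the goal of optimizing and refining the data compatibility relations and the data storage model. This makes it possible to filter data effectively, improve data compatibility, simplify system operation, and increase storage capacity and transmission accuracy [2]. To test the effectiveness and practicality of the method, several simulation experiments were designed in which the system was compared against a traditional data storage system. The results confirm that the big data compatibility storage system based on the cloud computing environment resolves the compatibility and scalability problems of the relational data model, improves the efficiency of queries over massive data, and addresses the current shortage of compatible storage capacity for big data [3].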
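Compared with raw data processing methods, cloud computing handles unstructured data in a way that has structural characteristics. These characteristics resemble the logical form of tuples in the relational data model, and a correspondence exists between the information features, so data feature types can be used to give an extended description of data features [4]. Because the data types of structured data are difficult to guarantee, such data cannot simply be mapped onto relational data. A data abstraction method is therefore used to extend and convert structured data features so that the information can be stored and compressed, and a tree structure is designed for the information extension and transmission method, as shown in Figure 1.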
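The idea of describing loosely typed data through a tree and then relating it to tuple-like rows can be illustrated with a minimal sketch. The paper does not give the concrete structure of Figure 1, so the record layout, the `flatten` function, and the `(path, type, value)` row format below are all assumptions made for illustration only.

```python
from typing import Any, Dict, Iterator, Tuple


def flatten(record: Dict[str, Any], prefix: str = "") -> Iterator[Tuple[str, str, Any]]:
    """Walk a nested record depth-first and yield (path, type_name, value) rows."""
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Inner node of the tree: descend into the child attributes.
            yield from flatten(value, path)
        else:
            # Leaf node: keep the inferred type next to the value.
            yield (path, type(value).__name__, value)


# Hypothetical nested record standing in for a piece of loosely typed data.
record = {"id": 7, "sensor": {"name": "t1", "reading": {"value": 21.5, "unit": "C"}}}
rows = list(flatten(record))
```

Each leaf becomes one row whose path encodes its position in the tree, so the type of every attribute is stated explicitly before the data is handed to a relational-style store.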
In Figure 1, the processing of data features and of the data extension subsystem is concentrated in the data storage module, so that structured data, unstructured data, and the sub-data features produced by extension and conversion can all be described [5]. Large volumes of data are compressed and stored in either row-oriented or column-oriented form. Once the data has been extended and classified, it is further split by data type according to its features. When the extension range of the data exceeds a set threshold, the extended virtual-class data is treated as ordinary subclass information for storage [6]. To present the inheritance relations of the extended virtual-class data more clearly, so that data can be stored by category and data features queried quickly and accurately, the methods for converting relational and non-relational data and for splitting extended types are studied in depth, with the aim of improving big data management and analysis performance [7]. Based on this approach, a system for splitting extended relational data types is designed, as shown in Figure 2.
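The difference between row-oriented and column-oriented storage, and why the column form can compress well, can be sketched as follows. The paper does not name a compression scheme, so run-length encoding is used here purely as a stand-in; the function names and sample rows are hypothetical, and the sketch assumes every row has the same keys.

```python
from itertools import groupby
from typing import Any, Dict, List, Tuple


def to_columns(rows: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot row-oriented records into one list per column (all rows share keys)."""
    if not rows:
        return {}
    return {name: [row[name] for row in rows] for name in rows[0]}


def run_length_encode(values: List[Any]) -> List[Tuple[Any, int]]:
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Hypothetical row-oriented input.
rows = [
    {"region": "east", "status": "ok"},
    {"region": "east", "status": "ok"},
    {"region": "west", "status": "ok"},
]
columns = to_columns(rows)
encoded = {name: run_length_encode(col) for name, col in columns.items()}
```

Grouping values of the same attribute together tends to produce long runs of repeated values, which is what makes a simple scheme like this effective on column data.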
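Designing the big data extension storage system requires organizing the information in the system's underlying database, in order to improve the compatibility of data in the system and to avoid problems such as information corruption during storage [8]. The system should also provide users, efficiently and quickly, with a consistent and extensible data access interface, so that they can obtain image and text information from the data sources quickly and accurately [9]. With these goals in mind, the design of the cloud-computing-based data storage management and access module is shown in Figure 3.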
As shown in Figure 3, users access the database in a unified way through the data management entry processing system and use the unified big data access interface to retrieve and store the data they need. The data can then be cached in the data relation mapping layer and stored permanently in the bottom layer of the database [10]. In addition, the data storage system typically uses the characteristics of non-relational data to classify cached data by feature, which enables data extension and conversion and optimizes the client's data communication mode [11]. When the data volume is relatively large, a cluster approach is needed in the database caching system to optimize and store temporary data, so that large volumes of data can be converted and made compatible [12]. Combining these ideas, a data storage optimization system based on the cloud computing environment is designed; the design flow is shown in Figure 4.
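The access path described here, a single interface in front of a cache layer and a persistent bottom layer, can be sketched as a small write-through cache. This is an illustration under stated assumptions rather than the paper's implementation: the `UnifiedStore` class, its method names, and the in-memory dictionaries standing in for the mapping-layer cache and the underlying database are all hypothetical.

```python
from typing import Any, Dict, Optional


class UnifiedStore:
    """Single access interface over a cache layer and a persistent layer."""

    def __init__(self) -> None:
        self._cache: Dict[str, Any] = {}    # stands in for the mapping-layer cache
        self._backing: Dict[str, Any] = {}  # stands in for the bottom-layer database
        self.backing_reads = 0              # counts reads that missed the cache

    def put(self, key: str, value: Any) -> None:
        # Write-through: persist first, then refresh the cached copy.
        self._backing[key] = value
        self._cache[key] = value

    def get(self, key: str) -> Optional[Any]:
        if key in self._cache:
            return self._cache[key]
        # Cache miss: fall back to the persistent layer and repopulate the cache.
        self.backing_reads += 1
        value = self._backing.get(key)
        if value is not None:
            self._cache[key] = value
        return value

    def evict(self, key: str) -> None:
        # Drop only the cached copy; the persistent copy is untouched.
        self._cache.pop(key, None)


store = UnifiedStore()
store.put("doc-1", {"title": "report"})
first = store.get("doc-1")    # served from the cache
store.evict("doc-1")
second = store.get("doc-1")   # served from the persistent layer, then re-cached
```

Callers only ever see `put` and `get`, so the cache could later be replaced by a clustered cache without changing client code.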
To improve real-time processing efficiency when large volumes of data are handled in parallel, the classification features in the large-scale parallel data processing system are extracted and studied on the basis of the preceding content. The aim is for the system to obtain pending collection task data promptly at run time and to insert each task accurately into the collection queue, so that data tasks are partitioned accurately and scheduled and executed quickly [13]. After task data collection is complete, the required data is transmitted and stored. Because the data scale is relatively large, tasks usually need to be processed in parallel, with the monitoring system and the data listener module supervising and managing parallel task processing; finally, the storage results are analyzed and checked. During parallel processing, as soon as a device fault is detected, the system issues an alarm notification so that the abnormal condition can be examined promptly, avoiding data distortion and anomalies during storage [14]. On this basis, the parallel data processing system is designed; its design flow is shown in Figure 5.
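The flow of running collection tasks in parallel while a monitor records faults can be sketched with Python's standard `concurrent.futures` module. The paper does not specify the scheduler or the alarm mechanism, so the `run_parallel` function, the task callables, and the use of a plain list as the alarm record are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Tuple


def run_parallel(
    tasks: Dict[int, Callable[[], str]], workers: int = 4
) -> Tuple[Dict[int, str], List[int]]:
    """Run tasks concurrently; return results by task ID and the IDs that failed."""
    results: Dict[int, str] = {}
    alarms: List[int] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(fn): task_id for task_id, fn in tasks.items()}
        for future in as_completed(futures):
            task_id = futures[future]
            try:
                results[task_id] = future.result()
            except Exception:
                # A failed task raises an alarm instead of stopping the others.
                alarms.append(task_id)
    return results, sorted(alarms)


def store_ok() -> str:
    return "stored"


def store_broken() -> str:
    # Simulates a device fault during storage.
    raise IOError("device fault")


results, alarms = run_parallel({1: store_ok, 2: store_broken, 3: store_ok})
```

Catching the exception per task keeps one faulty device from blocking the remaining tasks, while the list of failed IDs gives the monitoring side something concrete to act on.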
Parallel data processing exhibits a clear hierarchical relationship. To better make complex and diverse data compatible and store it, the pending collection tasks in the system database must be extracted at regular intervals, and the alarm system decides whether a running data collection task should be terminated [15]. When a task needs to be stopped, the parallel tasks are checked and filtered by task ID. To remove abnormal tasks from the collection queue accurately, the parallel data processing system needs to be optimized. Building on the preceding content, the data compatibility storage update system is refined so that parallel tasks are executed effectively while tasks are processed, data features are updated and queried promptly, and the integrity of stored data is preserved. The optimized structure of the data compatibility storage update system is shown in Figure 6.
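Removing abnormal tasks from the collection queue by task ID can be sketched as below. The queue representation (a `deque` of `(task_id, payload)` pairs) and the `remove_tasks` function are hypothetical; the paper describes the behavior only at the level of Figure 6.

```python
from collections import deque
from typing import Deque, Iterable, List, Tuple

Task = Tuple[int, str]  # (task ID, payload); an assumed representation


def remove_tasks(queue: Deque[Task], abnormal_ids: Iterable[int]) -> List[Task]:
    """Remove every queued task whose ID is flagged; keep the rest in order."""
    abnormal = set(abnormal_ids)
    removed = [task for task in queue if task[0] in abnormal]
    kept = [task for task in queue if task[0] not in abnormal]
    # Rebuild the queue in place so other holders of the reference see the change.
    queue.clear()
    queue.extend(kept)
    return removed


collection_queue: Deque[Task] = deque([(1, "collect-a"), (2, "collect-b"), (3, "collect-c")])
removed = remove_tasks(collection_queue, [2])
```

Filtering by ID rather than by position means the removal stays correct even if tasks were enqueued or dequeued between the alarm and the cleanup.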
To test the performance of the big data compatibility storage system, several simulation experiments were run in comparison with the traditional method. To ensure the data storage effect, distorted and invalid stored data in the database was examined first; the results are shown in Figure 7.
Figure 7 shows that as the data volume in the storage system grows, data distortion in the traditional storage system increases markedly and with a steep slope. This indicates that the traditional system has difficulty accommodating large volumes of data and cannot meet the current requirement for effective storage of large data volumes. By contrast, the distortion-rate curve of the improved storage system rises relatively gently, which confirms that this system stores massive data better. The results show that the distortion rate of stored data in the big data compatibility storage system based on the cloud computing environment stays below 50%, a clear advantage over the traditional method.
To further test the practical value of the system, the traditional data compatibility storage system and the cloud-computing-based one were tested experimentally under identical conditions. The data compatibility and storage performance of the two methods was measured first, and the measurements were compared to obtain the information shown in Table 1.
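Table 1 shows that the amount of data the traditional method can store is less than 1/2 of that of the proposed method, and that the big data compatibility storage system based on the cloud computing environment clearly outperforms the traditional method in parallel data processing, data storage, and data processing time. The data compatibility of the system was then tested further: under identical conditions, the results for the traditional method and the proposed method were plotted as a line chart, as shown in Figure 8.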
The results in Figure 8 show that the data processing system designed in this paper achieves better data compatibility, with compatibility for big data reaching 100%. It carries out data collection, processing, and storage effectively, fully meets the current design requirements for big data storage, and is suitable for practical use.
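This paper analyzes the problems in current networked systems for storing and managing massive data. To better meet the development requirements of network information technology, a big data compatibility storage system based on the cloud computing environment is designed, achieving the design goal of collecting, processing, and compatibly storing large-scale information resources on the network. Experiments confirm that the system offers large storage capacity, strong data compatibility, and efficient, accurate information processing. It improves data collection and processing efficiency, makes up for the shortcomings of traditional methods, and is suitable for practical information storage work.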
References
[1] TU Junying, LI Zhimin. Design of unstructured large data storage system based on cloud computing [J]. Modern electronics technique, 2018, 41(1): 173-177.
[2] TANG Yihao. The design of campus network cloud storage open access platform: based on cloud computing and big data technology [J]. Journal of Neijiang Normal University, 2017, 32(6): 43-48.
[3] LUO Xian, ZHA Zhiyong, XU Huan, et al. Design of large data automatic classification and processing system based on cloud computing [J]. Computer measurement & control, 2017, 25(10): 278-280.
[4] ZOU Hua. Optimization design for structure of big data distribution regularity in cloud computing environment [J]. Modern electronics technique, 2016, 39(8): 18-20.
[5] WANG Xueli, HU Bo. Research and design of bank information model based on cloud computing [J]. Yinshan Academic Journal (natural science edition), 2017, 31(1): 61-63.
[6] ZHANG Lu, SHANG Yanling. Design and implementation of resource scheduling system in cloud computing environment [J]. Computer measurement & control, 2017, 25(1): 131-134.
[7] JIANG Mingyue. Design and optimization of large data streaming system based on cloud computing platform [J]. Modern electronics technique, 2016, 39(2): 28-32.
[8] ZHU Yadong, GAO Cuifang. Big data optimization clustering algorithm based on PSO in cloud computing environment [J]. Computer technology and development, 2016, 26(9): 178-182.
[9] GAO Xincheng, WANG Yan, LI Chunsheng, et al. The research of cloud service platform for higher education based on the big data [J]. Journal of Suihua University, 2017, 37(8): 146-150.
[10] ZHANG Jie, XUE Shengjun. Application of the services of meteorological big data in cloud computing [J]. Journal of Anhui Agricultural Sciences, 2016, 42(5): 298-301.
[11] SHEN Zhuo. Study on early warning system of coal mine accidents based on cloud computing and big data crunching platform [J]. China coal, 2017, 43(12): 109-114.
[12] ZHANG Xiangrui, XIANG Hua, DONG Xiongbao. Design of manufacturing project management system for cloud computing big data center [J]. Modern electronics technique, 2017, 40(12): 46-48.
[13] LI Hongyan. Discussion on data security in big data cloud computing environment [J]. China computer & communication (theoretical edition), 2017(3): 201-202.
[14] SU Shupeng. The design and application of big data solution based on Hadoop [J]. Journal of Hechi University, 2017, 37(2): 89-93.
[15] MAO Haibo. Probe: OA resource data exchange mode based on cloud computing [J]. Journal of Ningbo University (natural science & engineering edition), 2016, 79(1): 28-32.