
WCDMA Network Management Platform: Fault Analysis and Handling

2014-03-24 03:14:13, Zhao Jinchuang
Nanbeiqiao (南北桥), 2014, Issue 2
Keywords: network management; hard disk; link

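[Abstract] This paper describes the hardware and operating system functions of the WCDMA network management platform built by Ericsson, together with the procedures used to handle its faults. It focuses on hardware faults and on zpool faults.

[Keywords] SUN M5000; SUN x4600 M2; SUN stk2540; zpool; cpu

CLC number: G4 Document code: A DOI:10.3969/j.issn.1672-0407.2014.02.160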

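The WCDMA network management platform is a specialized platform for traffic analysis, data monitoring, and data collection across the province's WCDMA network. Because of these specialized requirements, it runs on a dedicated network management system platform. The underlying hardware consists of midrange servers, storage arrays, and a tape library, connected through DAS (direct-attached storage) for data transfer, which meets the needs of the service systems.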

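Main System Components and Functions

System Components

Main Hardware

One SUN M5000 midrange server, one SUN x4600 M2 server, and six SUN stk2540 storage arrays.

Connections

The SUN M5000 is directly attached to 2 SUN stk2540 arrays, and the SUN x4600 M2 is directly attached to 4 SUN stk2540 arrays.

Storage Configuration and RAID Levels

Each SUN stk2540 holds 12 disks of 300 GB. On the arrays attached to the SUN M5000, 5 disks in each array are configured as RAID 5 with 2 hot spare disks, and 4 volumes are created in total. On the arrays attached to the SUN x4600 M2, the 12 disks in each array are configured as RAID 0 and presented as 12 volumes, giving 48 volumes across the 4 arrays. On the SUN x4600 host, these 48 volumes are placed under ZFS in the zpool eniq_sp_1, and the corresponding volumes of each pair of SUN stk2540 arrays are combined as RAID 1 (that is, a mirror).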
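The volume and mirror counts for the x4600 M2 side can be checked with a few lines of Python. This is only a sketch under stated assumptions: the array names are hypothetical, and it assumes that each RAID 0 volume corresponds to one nominal 300 GB disk, which the text implies but does not state outright.

```python
# Sketch of the x4600 M2 storage layout described above.
# The array names are hypothetical; one volume per 300 GB disk is an assumption.
ARRAYS = ["stk2540-1", "stk2540-2", "stk2540-3", "stk2540-4"]
VOLUMES_PER_ARRAY = 12   # 12 RAID 0 volumes per array (from the text)
DISK_GB = 300            # nominal disk size (from the text)


def mirror_pairs(arrays, volumes_per_array):
    """Pair volume i of one array with volume i of its partner array."""
    pairs = []
    for first, second in zip(arrays[0::2], arrays[1::2]):
        for vol in range(volumes_per_array):
            pairs.append(((first, vol), (second, vol)))
    return pairs


pairs = mirror_pairs(ARRAYS, VOLUMES_PER_ARRAY)
total_volumes = len(ARRAYS) * VOLUMES_PER_ARRAY
# A two-way mirror holds one copy's worth of data per pair.
nominal_gb = len(pairs) * DISK_GB

print(total_volumes, len(pairs), nominal_gb)
```

The result is a nominal figure before formatting and file system overhead, so it is higher than the usable space the host actually reports.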

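Main Functions

With the components described above, the SUN M5000 and SUN x4600 M2 hosts and the SUN stk2540 storage arrays form two hardware platform systems.

The SUN M5000 is directly attached to 2 SUN stk2540 arrays and is configured with 130 GB of physical memory, 8 CPUs (64 virtual CPUs) clocked at 2.4 GHz, and 4 TB of disk space, which gives the services good performance and ample room for data.

The SUN x4600 M2 is directly attached to 4 SUN stk2540 arrays and is configured with 160 GB of physical memory, 8 CPUs (32 virtual CPUs) clocked at 2.6 GHz, and 6 TB of disk space, which likewise gives the services good performance and ample room for data.

Together, the two hardware platforms keep the WCDMA network management platform running normally.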

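WCDMA Network Management Platform Fault Analysis and Handling

This section analyzes several typical cases that I encountered in day-to-day work:

Case 1: zpool resilvering in a loop

Fault description:

On the host, the zpool was in the DEGRADED state, resilvered repeatedly, and reported errors. All disks on one of the links were in the REMOVED state. On the storage array, one virtual disk was marked as failed, although the corresponding physical disk was healthy.

Fault analysis and handling: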

#: zpool status -v
  pool: eniq_sp_1
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.SUN.com/msg/ZFS-8000-8A
  scan: resilvered 41.2M in 6h41m with 0 errors on Fri Feb 14 19:13:07 2014
config:

        NAME             STATE     READ WRITE CKSUM
        eniq_sp_1        DEGRADED     5     0     0
          mirror-0       DEGRADED     0     0     0
            c1t0d0       ONLINE       0     0     0
            replacing-1  UNAVAIL      9   273     2  insufficient replicas
              c4t0d0/old FAULTED      0     0     0  corrupted data
              c4t0d0     REMOVED      0     0     0
          mirror-1       DEGRADED     0     0     0
            c1t0d1       ONLINE       0     0     0
            c4t0d1       REMOVED      0     0     0
          mirror-2       DEGRADED     0     0     0
            c1t0d2       ONLINE       0     0     0
            c4t0d2       REMOVED      0     0     0
          mirror-3       DEGRADED     0     0     0
            c1t0d3       ONLINE       0     0     0
            c4t0d3       REMOVED      0     0     0
          mirror-4       DEGRADED     0     0     0
            c1t0d4       ONLINE       0     0     0
            c4t0d4       FAULTED      0     0     0  too many errors
          mirror-5       DEGRADED     0     0     0
            c1t0d5       ONLINE       0     0     0
            c4t0d5       REMOVED      0     0     0
          mirror-6       DEGRADED     5     0     0
            c1t0d6       ONLINE       5     0     0
            c4t0d6       REMOVED      0     0     0
          mirror-7       DEGRADED     0     0     0
            c1t0d7       ONLINE       0     0     0
            c4t0d7       REMOVED      0     0     0
          mirror-8       DEGRADED     0     0     0
            c1t0d8       ONLINE       0     0     0
            c4t0d8       REMOVED      0     0     0
          mirror-9       DEGRADED     0     0     0
            c1t0d9       ONLINE       0     0     0
            c4t0d9       REMOVED      0     0     0
          mirror-10      DEGRADED     0     0     0
            c1t0d10      ONLINE       0     0     0
            c4t0d10      REMOVED      0     0     0
          mirror-11      DEGRADED     0     0     0
            c1t0d11      ONLINE       0     0     0
            c4t0d11      REMOVED      0     0     0

errors: Permanent errors have been detected in the following files:

        /eniq/database/dwh_main_dbspace/dbspace_dir_9/main_9
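One way to read a listing like this quickly is to pull out every leaf device that is not ONLINE and see which controllers they sit on. The following Python sketch does that on a shortened sample shaped like the status output above. The helper names `unhealthy_devices` and `controllers` are illustrative and are not part of any ZFS tooling.

```python
import re

# Shortened sample in the shape of the `zpool status -v` output above.
SAMPLE = """\
  pool: eniq_sp_1
 state: DEGRADED
config:

        NAME           STATE     READ WRITE CKSUM
        eniq_sp_1      DEGRADED     5     0     0
          mirror-0     DEGRADED     0     0     0
            c1t0d0     ONLINE       0     0     0
            c4t0d0     REMOVED      0     0     0
          mirror-4     DEGRADED     0     0     0
            c1t0d4     ONLINE       0     0     0
            c4t0d4     FAULTED      0     0     0  too many errors
"""

# Leaf devices use Solaris cXtYdZ names; pool and mirror rows do not match.
DEVICE_ROW = re.compile(r"^\s*(c\d+t\d+d\d+(?:/old)?)\s+([A-Z]+)\b")


def unhealthy_devices(status_text):
    """Return {device: state} for every leaf device that is not ONLINE."""
    result = {}
    for line in status_text.splitlines():
        match = DEVICE_ROW.match(line)
        if match and match.group(2) != "ONLINE":
            result[match.group(1)] = match.group(2)
    return result


def controllers(devices):
    """Return the set of controller prefixes (for example 'c4') in use."""
    return {re.match(r"c\d+", name).group(0) for name in devices}


bad = unhealthy_devices(SAMPLE)
print(bad)
print(controllers(bad))
```

When every unhealthy device shares one controller prefix, as with c4 here, a shared component on that path is a more likely suspect than a dozen simultaneous disk failures. That only narrows the search; the status LEDs on the card still have to confirm it.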

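Based on the errors above and the status LEDs on the host's Fibre Channel card, the fault was traced to the host's Fibre Channel card.

Re-create the volume on the array.

Stop the services on the host and shut the host down so that the Fibre Channel card can be replaced.

cucc-eniq01(root) #:sync;sync;sync;init 5

Replace the Fibre Channel card and start the host again.

->start /SYS

cucc-eniq01(root) #: devfsadm

After these steps the host recognizes the disks on the new link, but their logical device names have changed and the zpool status cannot be displayed. The host must be rebooted at this point so that the zpool can recover automatically.

cucc-eniq01(root) #:sync;sync;sync;init 6

After the reboot the zpool status can be displayed again. The disks on the old link have been replaced in the pool by the disks on the new link, except for 2 disks (c4t0d0 and c4t0d4) that were not replaced automatically. These must be replaced manually with the following commands.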

cucc-eniq01(root) #: zpool detach eniq_sp_1 c4t0d4

cucc-eniq01(root) #: zpool attach eniq_sp_1 c1t0d4 c10t0d4

cucc-eniq01(root) #: zpool detach eniq_sp_1 c4t0d0

cucc-eniq01(root) #: zpool attach eniq_sp_1 c1t0d0 c10t0d0

cucc-eniq01(root) #: zpool detach eniq_sp_1 c4t0d0
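The detach and attach pairs above follow a fixed pattern: drop the stale member, then attach the device on the new controller to the healthy half of the mirror. A minimal Python sketch of that pattern is shown below. It only builds command strings for review and never runs them. The helper names are hypothetical, and the controller numbers c1, c4, and c10 are taken from the session above.

```python
def renamed(device, old_ctrl, new_ctrl):
    """Map a cXtYdZ device name from one controller number to another."""
    if not device.startswith(old_ctrl + "t"):
        raise ValueError(f"{device} is not on controller {old_ctrl}")
    return new_ctrl + device[len(old_ctrl):]


def swap_commands(pool, healthy, stale, replacement):
    """Build the detach/attach pair that swaps one mirror member."""
    return [
        f"zpool detach {pool} {stale}",
        f"zpool attach {pool} {healthy} {replacement}",
    ]


# The two devices that were not replaced automatically in this case.
for stale in ["c4t0d4", "c4t0d0"]:
    healthy = renamed(stale, "c4", "c1")        # surviving mirror half
    replacement = renamed(stale, "c4", "c10")   # same target on the new link
    for command in swap_commands("eniq_sp_1", healthy, stale, replacement):
        print(command)
```

Printing the commands first gives a chance to compare each device name against the current `zpool status` output before anything is changed.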

cucc-eniq01(root) #: zpool status

  pool: eniq_sp_1
 state: ONLINE
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Feb 19 01:05:25 2014
    20.4G scanned out of 4.66T at 255M/s, 5h17m to go
    1.76G resilvered, 0.43% done
config:

        NAME          STATE     READ WRITE CKSUM
        eniq_sp_1     ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c1t0d0    ONLINE       0     0     0
            c10t0d0   ONLINE       0     0     0  (resilvering)
          mirror-1    ONLINE       0     0     0
            c1t0d1    ONLINE       0     0     0
            c10t0d1   ONLINE       0     0     0  (resilvering)
          mirror-2    ONLINE       0     0     0
            c1t0d2    ONLINE       0     0     0
            c10t0d2   ONLINE       0     0     0  (resilvering)
          mirror-3    ONLINE       0     0     0
            c1t0d3    ONLINE       0     0     0
            c10t0d3   ONLINE       0     0     0  (resilvering)
          mirror-4    ONLINE       0     0     0
            c1t0d4    ONLINE       0     0     0
            c10t0d4   ONLINE       0     0     0  (resilvering)
          mirror-5    ONLINE       0     0     0
            c1t0d5    ONLINE       0     0     0
            c10t0d5   ONLINE       0     0     0  (resilvering)
          mirror-6    ONLINE       0     0     0
            c1t0d6    ONLINE       0     0     0
            c10t0d6   ONLINE       0     0     0  (resilvering)
          mirror-7    ONLINE       0     0     0
            c1t0d7    ONLINE       0     0     0
            c10t0d7   ONLINE       0     0     0  (resilvering)
          mirror-8    ONLINE       0     0     0
            c1t0d8    ONLINE       0     0     0
            c10t0d8   ONLINE       0     0     0  (resilvering)
          mirror-9    ONLINE       0     0     0
            c1t0d9    ONLINE       0     0     0
            c10t0d9   ONLINE       0     0     0  (resilvering)
          mirror-10   ONLINE       0     0     0
            c1t0d10   ONLINE       0     0     0
            c10t0d10  ONLINE       0     0     0  (resilvering)
          mirror-11   ONLINE       0     0     0
            c1t0d11   ONLINE       0     0     0
            c10t0d11  ONLINE       0     0     0  (resilvering)
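The "to go" estimate in the scan line can be reproduced from the three figures printed next to it. The Python sketch below assumes that the K, M, G, and T suffixes are binary multiples (powers of 1024) and that the scan rate stays constant. Both are simplifications, and the function names are illustrative.

```python
import re

# Assumption: the size suffixes are binary multiples.
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}

SCAN_LINE = re.compile(
    r"([\d.]+[KMGT]) scanned out of ([\d.]+[KMGT]) at ([\d.]+[KMGT])/s"
)


def to_bytes(text):
    """Convert a figure such as '20.4G' into a byte count."""
    match = re.fullmatch(r"([\d.]+)([KMGT])", text)
    return float(match.group(1)) * UNITS[match.group(2)]


def remaining_seconds(line):
    """Estimate the time left as (total - scanned) / rate."""
    match = SCAN_LINE.search(line)
    scanned, total, rate = (to_bytes(group) for group in match.groups())
    return (total - scanned) / rate


line = "20.4G scanned out of 4.66T at 255M/s, 5h17m to go"
seconds = remaining_seconds(line)
hours, minutes = divmod(int(seconds) // 60, 60)
print(f"{hours}h{minutes:02d}m")
```

The estimate comes out within a minute of the 5h17m that zpool reports. The small difference is expected, because the printed figures are rounded.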

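After these steps the zpool returned to a normal running state.

Case 2: SUN M5000 CPU board hardware fault

Fault description:

The alarm LED on the SUN M5000 lit up, but the services kept running normally.

Fault analysis and handling:

Log in to the XSCF card and check the hardware information.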

XSCF> showstatus
    MBU_B Status:Normal;
*       CPUM#1-CHIP#0 Status:Degraded;
*       CPUM#1-CHIP#1 Status:Faulted;
XSCF> showhardconf
SPARC Enterprise M5000;
    + Serial:BEF0908D65; Operator_Panel_Switch:Locked;
    + Power_Supply_System:Single; SCF-ID:XSCF#0;
    + System_Power:On; System_Phase:Cabinet Power On;
    Domain#0 Domain_Status:Running;
    MBU_B Status:Normal; Ver:0201h; Serial:BE09071A62 ;
        + FRU-Part-Number:CF00541-0478 07 /541-0478-07 ;
        + Memory_Size:128 GB;
        CPUM#0-CHIP#0 Status:Normal; Ver:0401h; Serial:PP084200N1 ;
            + FRU-Part-Number:CA06761-D202 D0 /375-3568-04 ;
            + Freq:2.400 GHz; Type:32;
            + Core:4; Strand:2;
        CPUM#0-CHIP#1 Status:Normal; Ver:0401h; Serial:PP084200N1 ;
            + FRU-Part-Number:CA06761-D202 D0 /375-3568-04 ;
            + Freq:2.400 GHz; Type:32;
            + Core:4; Strand:2;
*       CPUM#1-CHIP#0 Status:Degraded; Ver:0401h; Serial:PP090603DL ;
            + FRU-Part-Number:CA06761-D202 E0 /375-3568-05 ;
            + Freq:2.400 GHz; Type:32;
            + Core:4; Strand:2;
*       CPUM#1-CHIP#1 Status:Faulted; Ver:0401h; Serial:PP090603DL ;
            + FRU-Part-Number:CA06761-D202 E0 /375-3568-05 ;
            + Freq:2.400 GHz; Type:32;
            + Core:4; Strand:2;

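From this information, the host's CPU board CPUM#1 was judged to be faulty: both of its chips are flagged, one as Degraded and one as Faulted, while CPUM#0 is Normal.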
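The same conclusion can be drawn mechanically from the `showstatus` lines by collecting every unit whose status is not Normal and grouping the units by the module they belong to. The Python sketch below assumes that the module name is the part before the first hyphen, as in CPUM#1-CHIP#0. The function names are illustrative and are not XSCF commands.

```python
import re

# Sample in the shape of the `showstatus` output shown earlier.
SAMPLE = """\
    MBU_B Status:Normal;
*       CPUM#1-CHIP#0 Status:Degraded;
*       CPUM#1-CHIP#1 Status:Faulted;
"""

# An optional leading '*' marks a unit that XSCF has flagged.
ROW = re.compile(r"^\*?\s*(\S+)\s+Status:(\w+);")


def failed_units(text):
    """Return {unit: status} for every unit that is not Normal."""
    result = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match and match.group(2) != "Normal":
            result[match.group(1)] = match.group(2)
    return result


def parent_modules(units):
    """Reduce unit names such as 'CPUM#1-CHIP#0' to their module name."""
    return sorted({name.split("-")[0] for name in units})


failed = failed_units(SAMPLE)
print(failed)
print(parent_modules(failed))
```

If all flagged chips reduce to a single module, that module is the replacement candidate, which here is CPUM#1.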

Stop the services and shut down the host operating system.

wcuccmas1o{root} # sync

wcuccmas1o{root} # sync

wcuccmas1o{root} # init 5

XSCF> showdomainstatus -a
DID         Domain Status
00          Powered Off
01          -
02          -
03          -

Confirm that the host system is shut down, then unplug the host's power cables.

XSCF> showdomainstatus -a
DID         Domain Status
00          Powered Off
01          -
02          -
03          -
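The check before unplugging the power cables amounts to confirming that every configured domain reports Powered Off. The Python sketch below expresses that rule. It assumes that a status of "-" means the domain is not configured, and the function names are illustrative.

```python
# Sample in the shape of the `showdomainstatus -a` output above.
SAMPLE = """\
DID         Domain Status
00          Powered Off
01          -
02          -
03          -
"""


def domain_states(text):
    """Return {domain_id: status} from the table, skipping the header row."""
    states = {}
    for line in text.splitlines()[1:]:
        parts = line.split(None, 1)
        if len(parts) == 2:
            states[parts[0]] = parts[1].strip()
    return states


def safe_to_unplug(states):
    """True only if at least one domain exists and all are Powered Off."""
    # Assumption: '-' marks a domain that is not configured.
    configured = {did: s for did, s in states.items() if s != "-"}
    return bool(configured) and all(
        status == "Powered Off" for status in configured.values()
    )


states = domain_states(SAMPLE)
print(states)
print(safe_to_unplug(states))
```

Requiring at least one configured domain makes an empty or unparsable listing count as "not safe" rather than passing by default.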

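Replace CPUM#1 according to the manual and the accompanying figure.

After the replacement, power the system on for a hardware check and then start the host operating system.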

XSCF> showstatus
No failures found in System Initialization.
XSCF> showhardconf
SPARC Enterprise M5000;
    + Serial:BEF0908D65; Operator_Panel_Switch:Locked;
    + Power_Supply_System:Single; SCF-ID:XSCF#0;
    + System_Power:Off; System_Phase:Cabinet Power Off;
    Domain#0 Domain_Status:Powered Off;
    MBU_B Status:Normal; Ver:0201h; Serial:BE09071A62 ;
        + FRU-Part-Number:CF00541-0478 07 /541-0478-07 ;
        + Memory_Size:128 GB;
        CPUM#0-CHIP#0 Status:Normal; Ver:0401h; Serial:PP084200N1 ;
            + FRU-Part-Number:CA06761-D202 D0 /375-3568-04 ;
            + Freq:2.400 GHz; Type:32;
            + Core:4; Strand:2;
        CPUM#0-CHIP#1 Status:Normal; Ver:0401h; Serial:PP084200N1 ;
            + FRU-Part-Number:CA06761-D202 D0 /375-3568-04 ;
            + Freq:2.400 GHz; Type:32;
            + Core:4; Strand:2;
        CPUM#1-CHIP#0 Status:Normal; Ver:0401h; Serial:PP084402BJ ;
            + FRU-Part-Number:CA06761-D202 D0 /375-3568-04 ;
            + Freq:2.400 GHz; Type:32;
            + Core:4; Strand:2;
        CPUM#1-CHIP#1 Status:Normal; Ver:0401h; Serial:PP084402BJ ;
            + FRU-Part-Number:CA06761-D202 D0 /375-3568-04 ;
            + Freq:2.400 GHz; Type:32;
            + Core:4; Strand:2;
XSCF> poweron -d 0
DomainIDs to power on:00
Continue? [y|n] :y
00 :Powering on

*Note*
 This command only issues the instruction to power-on.
 The result of the instruction can be checked by the "showlogs power".

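This completes the handling of the SUN M5000 CPU board fault.

Conclusion

The WCDMA network management platform is currently running stably. Over time, however, the number of service indicators that the platform has to handle keeps growing, so the demands on the hardware platform and on storage space will keep rising. The whole platform will therefore continue to be upgraded as business needs require.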
