By 五花肉
From Drums and Horns to a Stampede of Ten Thousand Codes: Encoding and the Standardization of Chinese Character Information Transmission
"Each officer has six flags of his own, each with a staff 8.34 meters long and a cloth 5 meters long. When the enemy reaches the bank of the moat, the defenders beat the drum three times and raise one flag. When the enemy has climbed halfway up the rampart, the defenders keep beating the drum. At night, the defenders replace the flags with torches, equal in number to the flags. If the enemy retreats, the defenders raise the same number of flags but do not beat the drum."
(Mozi 《墨子》, Book 15)
"One stands for the horizontal stroke; two and three stand for the vertical stroke; four and five stand for the left-falling stroke; six stands for the dot and the right-falling stroke; seven stands for the cross; and eight and nine stand for the left and right hooks."
(Hu Shi, preface to 《四角號碼檢字法》, The Four-Corner Method for Looking Up Characters)
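The digit assignments in the quoted preface amount to a small lookup table. The sketch below transcribes only what the quotation states; the name `STROKE_BY_DIGIT` and the helper `describe_code` are hypothetical, digit 0 is left out because the quotation does not mention it, and the table should not be read as the final published four-corner scheme, which may differ.

```python
# Digit-to-stroke table transcribed from the preface quoted above.
# Assumption: 8 and 9 are both labeled "left or right hook", because the
# quotation does not say which digit goes with which hook.
STROKE_BY_DIGIT = {
    "1": "horizontal stroke",
    "2": "vertical stroke",
    "3": "vertical stroke",
    "4": "left-falling stroke",
    "5": "left-falling stroke",
    "6": "dot or right-falling stroke",
    "7": "cross",
    "8": "left or right hook",
    "9": "left or right hook",
}


def describe_code(code):
    """Return the stroke type named by each digit of a code string."""
    strokes = []
    for digit in code:
        if digit not in STROKE_BY_DIGIT:
            # The quotation gives no meaning for this symbol (for example "0").
            raise ValueError(f"no stroke type quoted for {digit!r}")
        strokes.append(STROKE_BY_DIGIT[digit])
    return strokes
```

For example, `describe_code("17")` reads the two digits as a horizontal stroke followed by a cross.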
From oracle bone script, bronze inscriptions, seal script and clerical script to regular script, and on to the Chinese characters encoded in the computers of the information age, Chinese characters were born with Chinese civilization and have flourished alongside it. Beyond traditional means of recording and dissemination such as handwriting and printing on paper, the Chinese people have also, by standardizing character forms, developed a way of transmitting Chinese character information in which text is the content and encoding is the carrier.
Beacon smoke and banners, drums and horns: these phrases are commonly used to stand for warfare. For centuries they were the army's usual means of exchanging intelligence and relaying orders, and they were also the first sprouts of using encoding to transmit information. Although ancient Chinese military strategists set standards for the use of these signals, the information they carried could never travel beyond the range of human sight and hearing.
It was not until 1925, after telegraph codes had been introduced and put to use in modern China, that Wang Yunwu of Shanghai built on them to develop the four-corner code, which could also be used to look up characters, and the most rudimentary form of Chinese character encoding was born. Although this code produces too many duplicates (different characters sharing the same code) to serve as a computer input code, the idea behind it was epoch-making: by combining certain features of a character with an ordered set of symbols, Chinese characters can be given a defined order and handled systematically. This was the prototype of Chinese character information processing. The telegraph code and the four-corner code became the two major achievements of the time in digitizing and standardizing the characters used in Chinese society.
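The duplicate-code problem can be sketched in a few lines. In the toy example below, every entry in `TOY_CODEBOOK` is a placeholder with a made-up code, not a real four-corner value, and the function names are hypothetical. It is only meant to show why a code shared by several characters works for dictionary lookup (the reader scans a short list) but cannot pick a single character on its own.

```python
from collections import defaultdict

# Hypothetical codebook: placeholder "characters" A-D with invented codes.
TOY_CODEBOOK = {"A": "1234", "B": "1234", "C": "5678", "D": "1234"}


def build_index(codebook):
    """Group characters by code, as a dictionary index would."""
    index = defaultdict(list)
    for char, code in codebook.items():
        index[code].append(char)
    return dict(index)


def duplicate_codes(index):
    """Return only the codes that are shared by more than one character."""
    return {code: chars for code, chars in index.items() if len(chars) > 1}
```

With this data, looking up `"1234"` yields three candidates, so an input method built on such a code would need an extra selection step, while `"5678"` resolves to a single character.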
In the 1980s, with the release of the national standard Code of Chinese Graphic Character Set for Information Interchange, Primary Set (GB 2312-80), the Chinese language entered the information age. In just 30 years, China produced more than a thousand Chinese character encoding schemes and dozens of input methods, a stampede of ten thousand codes. In recent years, unified through standardization, these have converged into the mainstream input methods: phonetic (sound-based) codes, shape-based codes, and handwriting and voice input. The "Chinese character information processing and printing revolution" became China's major engineering achievement of the 20th century second only to "Two Bombs, One Satellite."
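As a small illustration of what a coded character set provides, the sketch below uses Python's built-in `gb2312` codec to turn a character into its two-byte form and then into a row and cell position in the character table. The helper name `gb2312_row_cell` is hypothetical, and the subtraction of 0xA0 assumes the common two-byte (EUC-CN) byte layout that the codec emits, not anything stated in the article.

```python
def gb2312_row_cell(char):
    """Return the (row, cell) table position of one GB 2312 character.

    Assumes the two-byte EUC-CN form produced by Python's "gb2312" codec,
    where each byte is the row or cell number plus 0xA0.
    """
    raw = char.encode("gb2312")  # raises UnicodeEncodeError if not in the set
    if len(raw) != 2:
        # ASCII characters encode to a single byte and have no table position.
        raise ValueError(f"{char!r} is not a two-byte GB 2312 character")
    return raw[0] - 0xA0, raw[1] - 0xA0
```

For instance, `"啊".encode("gb2312")` gives the bytes B0 A1, which `gb2312_row_cell("啊")` maps to row 16, cell 1. Unlike the four-corner code, each character in the set has exactly one code, which is what makes the set usable for information interchange.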
Today, with the arrival of the big data era and breakthroughs in speech recognition, Chinese character information processing has reached another peak. Chinese speech recognition is widely used on smartphone platforms such as iOS and Android, Chinese domain names are increasingly common, and Chinese characters and Chinese-language culture are gaining standing in the "global village." In the future, with the rejuvenation of the Chinese nation, Chinese characters will surely let Chinese civilization shine even more brightly in the information society.
(Supporting organization: Shanghai Institute of Quality and Standardization)