Graduate student: 黃文谷
Wen-ku Huang
Thesis title: 以C程式碼相似度比對法加強資訊安全於一企業之資料交換平台
Applying C Source Similarity Judgment to Enhance Information Security on An Enterprise Data Exchange Platform
Advisor: 楊鎮華
Stephen Yang
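Oral examination committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Executive Master of Computer Science & Information Engineering
Academic year of graduation: 100 (ROC calendar)
Language: Chinese
Number of pages: 50
Chinese keywords: 分類特徵取樣, Jaccard係數, FTP代理程式
Foreign-language keywords: FTP Proxy, Jaccard Coefficient, Classification Feature Sampling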
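  • In recent years, with the chip design industry booming and external competition intense, companies in the field have moved beyond their early hardware-centric way of operating and adopted a business model in which software and hardware complement each other.
    This concept has become the market mainstream. For example, developers of TV and mobile phone chips have invested substantial resources in developing related application software, and application software has in turn come to have an important influence on the success of production and sales.
    People working in the chip design industry often have to go through repeated cycles of debugging and new feature development on the relevant code. Code is easily plagiarized or leaked during this process, so enterprises strictly control the source code of their application software to prevent plagiarism and leaks. A data exchange platform that is highly secure, convenient, and able to meet enterprise needs is therefore indispensable.
    Preventing data leakage and tampering has been a very popular information topic in recent years. The underlying idea is an information security approach that judges the risk level of data by comparing the similarity of document contents. This study carries that idea over to an enterprise's internal data exchange platform.
    This study focuses on C source code, using classification feature sampling of C code combined with a weighted Jaccard coefficient as the basis for similarity judgment, and uses an FTP proxy to implement a data exchange platform.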


    After a period of rapid growth, the IC design industry has become intensely competitive. Companies in the field have therefore moved away from their early hardware-oriented operating model toward one that combines software with hardware. This application-oriented approach is widely adopted and has become the mainstream of the IC design industry: developers of TV and smartphone chips, for example, have devoted considerable resources to developing related application software. Application software in turn has a major influence on the success of production and sales.
    Chip designers work in teams that repeatedly develop, retrieve, and debug code. This process is vulnerable to plagiarism and leakage. To prevent such incidents, enterprises place their application source code under strict control. A data exchange platform that is highly secure, convenient, and suited to enterprise needs is therefore essential.
    In recent years, the prevention of data leakage and tampering has been a popular topic in enterprise information management. One such mechanism compares the similarity of document contents to determine the risk level of the data. In this study, we apply this concept to an enterprise's internal data exchange platform.
    The study focuses on C source code. It uses classification feature sampling of C source code, combined with a weighted Jaccard coefficient, as the basis for similarity judgment, and implements the data exchange platform with an FTP proxy.
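
The combination of per-category feature sets with a weighted Jaccard coefficient can be illustrated with a minimal Python sketch. The abstract does not state the thesis's actual feature categories, weights, or combining formula, so everything here is an assumption made for illustration: the three categories (identifiers, string literals, numeric literals), the values in `CATEGORY_WEIGHTS`, the regex-based tokenization, and the choice to average the per-category Jaccard coefficients by weight.

```python
import re

# Hypothetical categories and weights, chosen only for illustration;
# the thesis's real categories and weight values are not known here.
CATEGORY_WEIGHTS = {"identifier": 0.5, "string": 0.3, "number": 0.2}

# A small subset of C keywords, excluded from the identifier category.
C_KEYWORDS = {
    "break", "case", "char", "const", "continue", "do", "double", "else",
    "float", "for", "if", "int", "long", "return", "short", "sizeof",
    "static", "struct", "switch", "unsigned", "void", "while",
}

STRING_LITERAL = re.compile(r'"(?:\\.|[^"\\])*"')


def extract_features(source):
    """Split C source text into one set of tokens per category."""
    strings = set(STRING_LITERAL.findall(source))
    # Blank out string literals so their contents are not counted again.
    code = STRING_LITERAL.sub(" ", source)
    identifiers = set(re.findall(r"\b[A-Za-z_]\w*\b", code)) - C_KEYWORDS
    numbers = set(re.findall(r"\b\d+\b", code))
    return {"identifier": identifiers, "string": strings, "number": numbers}


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weighted_similarity(source_a, source_b, weights=CATEGORY_WEIGHTS):
    """Weighted average of per-category Jaccard coefficients, in [0, 1]."""
    features_a = extract_features(source_a)
    features_b = extract_features(source_b)
    score = 0.0
    weight_sum = 0.0
    for category, weight in weights.items():
        set_a, set_b = features_a[category], features_b[category]
        if not set_a and not set_b:
            continue  # category absent from both files: carries no evidence
        score += weight * jaccard(set_a, set_b)
        weight_sum += weight
    return score / weight_sum if weight_sum else 0.0
```

Because the features are sets of tokens, two files that differ only in the whitespace between tokens yield the same sets and score 1.0 under this sketch. A platform of the kind described could compare an outgoing file's score against a threshold to assign a risk level; the threshold would also be a tunable assumption.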

    Table of Contents
    Abstract (Chinese)
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1. Introduction
      1.1 Research Background
      1.2 Research Motivation
      1.3 Research Problem
      1.4 Solution
    Chapter 2. Related Work
      2.1 Document Similarity Calculation Methods
      2.2 Document Vector Similarity
    Chapter 3. Research Method
      3.1 Hash Algorithm
      3.2 Classified Weighted Similarity Calculation
      3.3 Combining Weights with the Similarity Coefficient
    Chapter 4. System Implementation
      4.1 System Flow
      4.2 Development Environment
      4.3 Implementation of Sampling and Similarity Comparison
      4.4 System Demonstration
    Chapter 5. Experimental Results
      5.1 Paragraph Reformatting
      5.2 Dilution with Added Content
      5.3 File Splitting
    Chapter 6. Conclusion
    References

