
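Graduate student: Kun-hsien Lin (林坤賢)
Thesis title: Cloud-R: an R biostatistical computation and graphics environment in the cloud
Original title: Cloud-R:以R軟體與雲端技術為基礎的生物統計應用網站
Advisor: Sun-chong Wang (王孫崇)
Oral defense committee:
Degree: Master
Department: College of Health Sciences and Technology (生醫理工學院) - Graduate Institute of Systems Biology and Bioinformatics
Academic year of graduation: 98 (ROC calendar, 2009-2010)
Language: Chinese
Number of pages: 59
Keywords (Chinese): 雲端計算, 統計分析, R軟體, web service
Keywords (English): cloud computing, statistical analysis, R software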
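  • R is an open-source programming language that performs exceptionally well in statistical and graphical computation [1,2] and is now widely used in statistics and bioinformatics. Bioconductor is a collection of packages designed specifically for the R environment to analyze microarray and genomic data [3]. The publicly released R software is distributed as a standalone installation that runs on Microsoft Windows, Linux, and Mac OS X. As genomic data sets grow ever larger, the memory and storage capacity required by standalone R become increasingly important. Cloud computing, relying on the spread of the Internet and faster connection speeds, shifts software services toward web interfaces to lighten the load on the user's own computer, while remote data centers provide stable service [4]. Cloud-R is such a web service: it provides a platform for using all of R's functionality online. The most basic requirement for users of the Cloud-R platform is prior experience with the R language. To give the platform sufficient computing resources, Cloud-R provides a quick and simple channel through which users can contribute idle computers. Contributors can give or withdraw these resources at will, according to their own usage needs. Contributors connect through the nws package, an add-on installed on top of R; the functions provided by nws make it easy for computers to cooperate on a computation [5]. The Cloud-R platform can be accessed at the following URL: http://epigenomics.ncu.edu.tw/Cloud-R/.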


    R is an open-source programming language for statistical and graphical computation that is popular among statisticians and bioinformaticians [1,2]. In particular, an R project called Bioconductor is dedicated to the analysis and comprehension of genomic data using R [3]. R is available as standalone software that runs on Windows, Mac OS X, and Linux machines. As the volume of genomic data continues to grow rapidly, a high-capacity R environment is needed. Cloud computing, aided by wider Internet penetration and faster Web communication, represents a paradigm shift in computation toward deploying applications in remote data centers [4]. We propose to let users run their R programs through web browsers. Cloud-R is a Web server that provides R utilities over the Internet. A basic requirement of Cloud-R development is that the user experience of Cloud-R be identical to that of regular R. More importantly, toward the goal of virtually limitless computational resources for R, Cloud-R allows users to freely contribute their computers to Cloud-R. Users can contribute and withdraw their hardware at any time. A contributor's computer connects via nws, an R package for coordinating distributed programs [5]. Cloud-R can be accessed at this URL: http://epigenomics.ncu.edu.tw/Cloud-R/.

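    Table of Contents
    Chinese Abstract
    Abstract
    Table of Contents
    List of Figures
    Chapter 1 Introduction
      1-1 Research background
        1-1-1 The meaning of free software
        1-1-2 Development history of the R statistical software
        1-1-3 Historical background of large-scale computing
        1-1-4 The rise of web services
        1-1-5 A concrete picture of cloud computing
        1-1-6 Trends in the demand for biological analysis
      1-2 Research motivation for Cloud-R
    Chapter 2 Design and methods of Cloud-R
      2-1 Basic architecture of Cloud-R
      2-2 Platform, technologies, and system architecture design
      4-2 Software specifications
      4-3 Implementation
        4-3-1 Implementing R script input
        4-3-2 Implementing the call to R
        4-3-3 Implementing output of R results
        4-3-4 Implementing NetWorkSpaces Server startup and node monitoring
        4-3-5 Implementing the Cloud-R cloud computing virtualization mechanism
    Chapter 3 Cloud-R operating procedure and data analysis
      3-1 Overview of Cloud-R features
      3-2 Using Cloud-R features
      3-3 List of node contributors
      3-4 Cloud-R performance evaluation
    Chapter 4 Conclusions and possible future research directions
      4-1 Summary
      4-2 Possible future research directions
    Chapter 5 Appendices
      4-3 Appendix 1
      4-4 Appendix 2
    Chapter 6 References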

    [1] R Development Core Team, R: A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria, 2009.
    [2] J.M. Chambers, Software for Data Analysis: Programming with R, 2nd, Aug, Springer, New York, 2009.
    [3] Bioconductor. From: http://www.bioconductor.org/
    [4] B. Hayes, “Cloud computing”, Communications of the ACM, Vol. 51, pp.9-11, 2008.
    [5] R.D. Bjornson, N.J. Carriero, M.H. Schultz, P.M. Shields, S.B. Weston, “NetWorkSpace: a coordination system for high-productivity environments”, Int J Parallel Prog, Vol. 37, pp.106-125, 2009.
    [6] GNU. From: http://www.gnu.org/
    [7] The Comprehensive R Archive Network. From: http://cran.r-project.org/
    [8] Globus. From: http://www.globus.org/
    [9] J. Fox, “Aspects of the social organization and trajectory of the R project”, The R Journal, Vol. 1/2, pp.5-13, December 2009.
    [10] R. Gentleman, V. J. Carey, D. M. Bates, B. Bolstad, M. Dettling, S. Dudoit, B. Ellis, L. Gautier, Y. Ge, et al., “Bioconductor: Open software development for computational biology and bioinformatics”, Genome Biology, Vol. 5, pp.80, 2004.
    [11] J. Knaus, C. Porzelius, H. Binder, G. Schwarzer, “Easier parallel computing in R with snowfall and sfCluster”, The R Journal, Vol. 1/1, pp.54-59, 2009.
    [12] M. Schmidberger, M. Morgan, D. Eddelbuettel, H. Yu, L. Tierney, U. Mansmann, “State of the art in parallel computing with R”, J Stat Software, Vol. 31(1), pp.1-27, 2009.
    [13] G. Vera, R.C. Jansen, R.L. Suppi, “R/parallel – speeding up bioinformatics analysis with R”, BMC Bioinformatics, Vol. 9(1), pp.390, 2008.
    [14] J. Hill, M. Hambley, T. Forster, M. Mewissen, T.M. Sloan, F. Scharinger, A. Trew, P. Ghazal, “SPRINT: A new parallel framework for R”, BMC Bioinformatics, Vol. 9, pp.558, 2008.
    [15] High-Performance and Parallel Computing with R. From: http://cran.r-project.org/web/views/HighPerformanceComputing.html
    [16] Ministry of Education Campus Free Software Application Consulting Center (教育部校園自由軟體應用諮詢中心), How the open source movement is changing our education. From: http://ossacc.moe.edu.tw/modules/tadnews/index.php?nsn=997
    [17] Open Source Initiative, Open Source. From: http://www.opensource.org/
    [18] Open Software Foundation. From: http://www.opensoftwarefoundry.org/en-us/
    [19] Openfoundry. From: http://www.openfoundry.org/
    [20] Industrial Development Bureau, Ministry of Economic Affairs, Free Software Industry Application Promotion Project (經濟部工業局自由軟體產業應用推動計畫), 2009 Open Source Innovative Application Development Contest. From: http://www.oss.org.tw/contest/index.html
    [21] R. Buyya, S. Venugopal, “A Gentle Introduction to Grid Computing and Technologies”, CSI Communications, Vol. 29(1), pp.9-19, 2005.
    [22] University of California, SETI@home. From: http://setiathome.berkeley.edu/
    [23] Evolutionary-research, evolution@home. From: http://evolutionary-research.net/
    [24] Software & Information Industry Association (SIIA), “Strategic Backgrounder: Software as a Service”, Feb 2001.
    [25] Salesforce. From: http://www.salesforce.com/
    [26] Business Next (數位時代) editorial department, “Cloud computing + Mobile = a key report on future life”, Business Next, No. 187, 2009.
    [27] Message Passing Interface Forum, “MPI: A Message-Passing Interface Standard”, Nov 15, 2003.
    [28] L. Ferreira, V. Berstis, J. Armstrong, M. Kendzierski, A. Neukoetter, M. Takagi, R. Bing-Wo, A. Amir, R. Murakawa, O. Hernandez, J. Magowan, N. Bieberstein, “Introduction to Grid Computing with Globus”, IBM Redbooks, Sep 2003.
    [29] S. Ghemawat, H. Gobioff, S.T. Leung, “The Google File System”, Oct 2003.
    [30] Business Next (數位時代), “The cloud computing storm arrives: Google and Microsoft raise their stakes; IBM, HP, and Cisco rush in”, No. 173, Oct 2008.
    [31] J. Banfield, “Rweb: Web-based statistical analysis”, J Stat Software, Vol. 4(01), pp.03, 1999.
    [32] J. Horner, rapache: Web application development with R and Apache, 2009. From: http://biostat.mc.vanderbilt.edu/rapache/
    [33] T. A. Short, Rpad, 2006. From: http://rpad.googlecode.com/svn-history/r76/Rpad_homepage/index.html
    [34] php. From: http://php.net/index.php
    [35] G. Amdahl, “Validity of the single processor approach to achieving large-scale computing capability”, AFIPS Conference Proceedings, Vol. 30, pp.483-485, NJ, USA, 1967.
    [36] R:Evolution Computing with support, contributions from Pfizer and Inc. nws: R functions for NetWorkSpaces and Sleigh. R package version 1.7.0.0. From: http://nws-r.sourceforge.net/
    [37] I. Foster, Y. Zhao, I. Raicu, S. Lu, “Cloud Computing and Grid Computing 360-Degree Compared”, Grid Computing Environments Workshop, Nov 2008.
    [38] HDFS. From: http://hadoop.apache.org/hdfs/l
    [39] D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff, D. Zagorodnov, “The Eucalyptus Open-source Cloud-computing System”, 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, pp.124-131, Shanghai, China, May 2009.
    [40] Wikipedia, Bioinformatics (生物資訊). From: http://upload.wikimedia.org/wikipedia/commons/4/43/Genome_viewer_screenshot_small.png
    [41] Center for Bioinformatics, BLAST. From: http://www.cbi.pku.edu.cn/docs/faq/BlastSpecifics.html#q3
    [42] Jeff Banfield, Rweb. From: http://bayes.math.montana.edu/Rweb
    [43] Karim Chine, “Biocep, Towards a Federative, Collaborative, User-Centric, Grid-Enabled and Cloud-Ready Computational Open Platform”, Fourth IEEE International Conference on eScience, pp.321-322, Indiana, USA, Dec 2008.
    [44] Karim Chine, Biocep. From: http://biocep-distrib.r-forge.r-project.org/
    [45] Gartner, Seven cloud-computing security risks. From: http://www.gartner.com/technology/
