| Graduate student: | 林坤賢 Kun-hsien Lin |
|---|---|
| Thesis title: | Cloud-R:以R軟體與雲端技術為基礎的生物統計應用網站 Cloud-R: an R biostatistical computation and graphics environment in the cloud |
| Advisor: | 王孫崇 Sun-chong Wang |
| Oral defense committee: | |
| Degree: | Master |
| Department: | 生醫理工學院 - 系統生物與生物資訊研究所 Graduate Institute of Systems Biology and Bioinformatics |
| Academic year of graduation: | 98 |
| Language: | Chinese |
| Number of pages: | 59 |
| Keywords (Chinese): | cloud computing, statistical analysis, R software, web service |
| Keywords (English): | cloud computing, statistical analysis, R software |
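
R is an open-source programming language that performs exceptionally well in statistical and graphical computation [1,2] and is now widely used in statistics and bioinformatics. Bioconductor is a collection of packages designed within the R environment specifically for analyzing microarray and genomic data [3]. The publicly released R software is distributed as a standalone installation and can be installed on Microsoft Windows, Linux, and MacOS X. As genomic data sets grow ever larger, the memory and storage capacity that standalone R demands of a computer become increasingly important. Cloud computing, building on the spread of the Internet and faster connection speeds, shifts software services toward web interfaces to lighten the load on the user's computer, and delivers stable service from remote data centers [4]. Cloud-R is such a web service: it provides a platform for using all the functionality of R online. The most basic requirement for users of the Cloud-R platform is prior experience with the R language. To give Cloud-R ample computing resources, the platform offers a quick and simple channel through which users can contribute idle computing resources. Users can grant or withdraw the contributed resources at will, arranging them according to their own needs. Contributors connect through the nws package, an extension installed on top of R; the functions that nws provides make it easy for computers to compute cooperatively [5]. The Cloud-R platform can be accessed at: http://epigenomics.ncu.edu.tw/Cloud-R/.
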
R is an open source programming language for statistical and graphical computation that is popular among statisticians and bioinformaticians [1,2]. In particular, an R project called Bioconductor is dedicated to the analysis and comprehension of genomic data using R [3]. R is available as standalone software that runs on Windows, MacOS X, and Linux machines. As the volume of genomic data continues to explode, a high-capacity R environment is needed. Cloud computing, aided by wider Internet penetration and faster Web communication, represents a paradigm shift in computation toward deployment of applications to remote data centers [4]. We propose to let users run their R programs through web browsers. Cloud-R is a Web server that provides R utilities over the Internet. A basic requirement of Cloud-R development is that the user experience of Cloud-R be identical to that of regular R. More importantly, toward the goal of virtually limitless computational resources for R, Cloud-R allows users to freely contribute their computers to Cloud-R. Users can contribute and withdraw their hardware at any time. A contributor's computer connects via nws, an R package for coordinated programming [5]. Cloud-R can be accessed at this URL: http://epigenomics.ncu.edu.tw/Cloud-R/.

[1] R Development Core Team, R: A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria, 2009.
[2] J.M. Chambers, Software for Data Analysis: Programming with R, 2nd, Aug, Springer, New York, 2009.
[3] Bioconductor. From: http://www.bioconductor.org/
[4] B. Hayes, “Cloud computing”, Communications of the ACM, Vol. 51, pp.9-11, 2008.
[5] R.D. Bjornson, N.J. Carriero, M.H. Schultz, P.M. Shields, S.B. Weston, “NetWorkSpace: a coordination system for high-productivity environments”, Int J Parallel Prog, Vol. 37, pp.106-125, 2009.
[6] GNU. From: http://www.gnu.org/
[7] The Comprehensive R Archive Network. From: http://cran.r-project.org/
[8] Globus. From: http://www.globus.org/
[9] J. Fox, “Aspects of the social organization and trajectory of the R project”, The R Journal, Vol. 1/2, pp.5-13, December 2009.
[10] R. Gentleman, V. J. Carey, D. M. Bates, B. Bolstad, M. Dettling, S. Dudoit, B. Ellis, L. Gautier, Y. Ge, et al., “Bioconductor: Open software development for computational biology and bioinformatics”, Genome Biology, Vol. 5, pp.80, 2004.
[11] J. Knaus, C. Porzelius, H. Binder, G. Schwarzer, “Easier parallel computing in R with snowfall and sfCluster”, The R Journal, Vol. 1/1, pp.54-59, 2009.
[12] M. Schmidberger, M. Morgan, D. Eddelbuettel, H. Yu, L. Tierney, U. Mansmann, “State of the art in parallel computing with R”, J Stat Software, Vol. 31(1), pp.1-27, 2009.
[13] G. Vera, R.C. Jansen, R.L. Suppi, “R/parallel – speeding up bioinformatics analysis with R”, BMC Bioinformatics, Vol. 9(1), pp.390, 2008.
[14] J. Hill, M. Hambley, T. Forster, M. Mewissen, T.M. Sloan, F. Scharinger, A. Trew, P. Ghazal, “SPRINT: A new parallel framework for R”, BMC Bioinformatics, Vol. 9, pp.558, 2008.
[15] High-Performance and Parallel Computing with R. From: http://cran.r-project.org/web/views/HighPerformanceComputing.html
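[16] Ministry of Education Campus Free Software Application Consulting Center, How the open source movement is changing our education (in Chinese). From: http://ossacc.moe.edu.tw/modules/tadnews/index.php?nsn=997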
[17] Open Source Initiative, Open Source. From: http://www.opensource.org/
[18] Open Software Foundation. From: http://www.opensoftwarefoundry.org/en-us/
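[19] Openfoundry. From: http://www.openfoundry.org/
[20] Industrial Development Bureau, Ministry of Economic Affairs, Free Software Industry Application Promotion Project, 2009 Open Source Innovative Application Development Contest (in Chinese). From: http://www.oss.org.tw/contest/index.html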
[21] R. Buyya, S. Venugopal, “A Gentle Introduction to Grid Computing and Technologies”, CSI Communications, Vol.29(1), pp.9-19, 2005
[22] University of California, SETI@home. From: http://setiathome.berkeley.edu/
[23] Evolutionary-research, evolution@home. From: http://evolutionary-research.net/
[24] Software & Information Industry Association (SIIA), “Strategic Backgrounder: Software as a Service”, Feb 2001.
[25] Salesforce. From: http://www.salesforce.com/
[26] Business Next (數位時代) Editorial Department, “Cloud computing + Mobile = a key report on future life” (in Chinese), Business Next, No. 187, 2009.
[27] Message Passing Interface Forum, “MPI: A Message-Passing Interface Standard”, Nov 15, 2003.
[28] L. Ferreira, V. Berstis, J. Armstrong, M. Kendzierski, A. Neukoetter, M. Takagi, R. Bing-Wo, A. Amir, R. Murakawa, O. Hernandez, J. Magowan, N. Bieberstein, “Introduction to Grid Computing with Globus”, IBM Redbooks, Sep 2003.
[29] S Ghemawat, H. Gobioff, S.T. Leung, “The Google File System”, Oct 2003.
[30] Business Next (數位時代), “The cloud computing storm arrives: Google and Microsoft raise their stakes; IBM, HP, and Cisco rush in” (in Chinese), No. 173, Oct 2008.
[31] J. Banfield, “Rweb: Web-based statistical analysis”, J Stat Software, Vol. 4(01), pp.03, 1999.
[32] J. Horner, rapache: Web application development with R and Apache, 2009. From: http://biostat.mc.vanderbilt.edu/rapache/
[33] T. A. Short, Rpad, 2006. From: http://rpad.googlecode.com/svn-history/r76/Rpad_homepage/index.html
[34] php. From: http://php.net/index.php
[35] G. Amdahl, “Validity of the single processor approach to achieving large-scale computing capability”, AFIPS Conference Proceedings, Vol. 30, pp.483-485, NJ, USA, 1967.
[36] REvolution Computing with support and contributions from Pfizer Inc., nws: R functions for NetWorkSpaces and Sleigh. R package version 1.7.0.0. From: http://nws-r.sourceforge.net/
[37] I. Foster, Y. Zhao, I. Raicu, S. Lu, “Cloud Computing and Grid Computing 360-Degree Compared”, Grid Computing Environments Workshop, Nov 2008.
[38] HDFS. From: http://hadoop.apache.org/hdfs/l
[39] D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff, D. Zagorodnov, “The Eucalyptus Open-source Cloud-computing System”, 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, pp.124-131, Shanghai, China, May 2009.
[40] Wikipedia, Bioinformatics (in Chinese). From: http://upload.wikimedia.org/wikipedia/commons/4/43/Genome_viewer_screenshot_small.png
[41] Center for Bioinformatics, BLAST. From: http://www.cbi.pku.edu.cn/docs/faq/BlastSpecifics.html#q3
[42] Jeff Banfield, Rweb. From: http://bayes.math.montana.edu/Rweb
[43] Karim Chine, “Biocep, Towards a Federative, Collaborative, User-Centric, Grid-Enabled and Cloud-Ready Computational Open Platform”, Fourth IEEE International Conference on eScience, pp.321-322, Indiana, USA, Dec 2008.
[44] Karim Chine, Biocep. From: http://biocep-distrib.r-forge.r-project.org/
[45] Gartner, Seven cloud-computing security risks. From: http://www.gartner.com/technology/