| Graduate student: | 蘇政嘉 Zheng-Jia Su |
|---|---|
| Thesis title: | 植基於Openstack之高可用性軟體定義叢集與虛擬機器保護機制 Providing High Availability for Virtual Machines on Software-Defined Clusters over Openstack Platforms |
| Advisor: | 王尉任 Wei-Jen Wang |
| Oral examination committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Year of publication: | 2016 |
| Academic year of graduation: | 104 |
| Language: | Chinese |
| Number of pages: | 53 |
| Keywords (Chinese): | 雲端運算 、Openstack 、高可用性 、虛擬機器 、軟體定義叢集 |
| Keywords (English): | Cloud computing, Openstack, High availability, Virtual machine, Software-defined cluster |
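Cloud computing has grown increasingly popular in recent years. Many enterprises use cloud computing technology as the infrastructure for their own systems: by renting virtual machines they improve system scalability and reduce data center operation and maintenance costs, and the pay-for-what-you-use model avoids wasting hardware resources. OpenStack is currently a widely discussed cloud system software, and the project has officially disclosed several hundred use cases.
As more people run their services on OpenStack, system reliability has become an important issue. If a system failure stops virtual machines and interrupts service, it may cause very large financial losses. This study therefore proposes the concept of software-definable high-availability clusters, which lets cloud system administrators plan several different clusters according to their own environment, so that a virtual machine that needs failover is recovered automatically within the cluster it belongs to. To support the high-availability cluster function, this study proposes a failure detection and recovery mechanism. The mechanism is implemented on OpenStack, so the resulting system inherits OpenStack's existing functions. The failure detection and recovery mechanism we developed monitors the system services on compute nodes, the hardware of the nodes themselves, and operating system failures. When a failure occurs, the mechanism automatically performs repair and failover, so that virtual machines return to normal operation within a short time and without human intervention. With the high-availability software-defined cluster function provided by this study, system administrators gain more flexibility in allocating computing resources: they can allocate resources to different user groups in units of software-defined clusters and provide HA functionality to the virtual machines on those clusters.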
Cloud computing technologies play important roles in both academia and industry. Server virtualization is one of the critical cloud computing technologies; it provides different kinds of virtual machines to end users as their computing resources. Many enterprises have already adopted this technology because it makes a computing system scalable and flexible. OpenStack is a software package for building a cloud computing platform. As of June 2016, about 200 user stories had been published on the official OpenStack website.
The availability of virtual machines on an OpenStack platform is a major issue. When a virtual machine goes down, the services or systems built on that virtual machine also go down, which may cause a great loss to the enterprise. Therefore, we propose a software-defined high-availability cluster function for the OpenStack platform. The administrator can divide the computing pool into multiple software-defined clusters according to the datacenter environment, and the virtual machines in each cluster are protected. When one of the hosts in a cluster fails, the affected virtual machines are restarted on another functional host in the same cluster. To support this, we provide a failure detection and recovery mechanism that detects failures and recovers virtual machines automatically. The detection mechanism covers OpenStack nova service failure, libvirt failure, host operating system failure, and hardware failure, all of which can cause a virtual machine to go down. Using an OpenStack platform with our proposed function, the administrator can manage computing resources flexibly: the administrator can assign clusters to different users or departments and provide a virtual machine protection service through the detection and recovery mechanism.
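The cluster-scoped failover described in the abstract can be illustrated with a small Python sketch. It is a minimal illustration under stated assumptions, not the thesis implementation: the `Host` fields, the layer-by-layer check order in `classify`, the least-loaded placement rule, and every function name are hypothetical, and no OpenStack API is called. It only shows the idea that a failed host's virtual machines are reassigned to a healthy host in the same software-defined cluster.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Failure(Enum):
    """The four failure types named in the abstract, plus a healthy state."""
    NONE = "none"
    NOVA_SERVICE = "nova service failure"
    LIBVIRT = "libvirt failure"
    HOST_OS = "host operating system failure"
    HARDWARE = "hardware failure"


@dataclass
class Host:
    name: str
    cluster: str  # name of the software-defined cluster (hypothetical field)
    # Hypothetical probe results; a real detector would collect these itself.
    power_on: bool = True
    os_responding: bool = True
    libvirt_ok: bool = True
    nova_ok: bool = True
    vms: List[str] = field(default_factory=list)


def classify(host: Host) -> Failure:
    """Check from the lowest layer upward (an assumed order)."""
    if not host.power_on:
        return Failure.HARDWARE
    if not host.os_responding:
        return Failure.HOST_OS
    if not host.libvirt_ok:
        return Failure.LIBVIRT
    if not host.nova_ok:
        return Failure.NOVA_SERVICE
    return Failure.NONE


def pick_target(failed: Host, hosts: List[Host]) -> Optional[Host]:
    """Pick a healthy host in the same cluster, preferring the least loaded."""
    candidates = [
        h for h in hosts
        if h.cluster == failed.cluster
        and h.name != failed.name
        and classify(h) is Failure.NONE
    ]
    return min(candidates, key=lambda h: len(h.vms), default=None)


def plan_failover(hosts: List[Host]) -> Dict[str, str]:
    """Return {vm_name: target_host_name} for VMs on failed hosts."""
    plan: Dict[str, str] = {}
    for host in hosts:
        if classify(host) is Failure.NONE:
            continue
        for vm in list(host.vms):  # copy, because host.vms is modified below
            target = pick_target(host, hosts)
            if target is None:
                break  # no healthy host left in this cluster
            host.vms.remove(vm)
            target.vms.append(vm)
            plan[vm] = target.name
    return plan


if __name__ == "__main__":
    pool = [
        Host("a1", "cluster-A", os_responding=False, vms=["vm1", "vm2"]),
        Host("a2", "cluster-A"),
        Host("b1", "cluster-B"),
    ]
    print(plan_failover(pool))
```

In this sketch a virtual machine is never moved across cluster boundaries: if no healthy host remains in its cluster, it is simply left out of the plan, which is one possible reading of "recovered within the cluster it belongs to".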