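Author: Lyu Shao-Ming (呂紹銘)
Thesis title: A Fault-Tolerant Kernel-Virtual-Machine Architecture for Network Services based on Continuous Checkpointing (基於KVM虛擬機器的記憶體層級同步之網路服務容錯架構)
Advisor: Wei-Jen Wang (王尉任)
Oral examination committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering
Year of publication: 2015
Academic year of graduation: 103 (ROC calendar)
Language: Chinese
Number of pages: 39
Keywords (translated from Chinese): KVM, fault-tolerant system, high availability, virtual machine, network service
Keywords (English): fault tolerance, virtual machine, network service, high availability, KVM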
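  • With the widespread use of virtualization technology, network services in the cloud have increasingly moved to virtualized resources such as virtual machines. Virtualization, however, still cannot eliminate single points of failure in cloud services: for example, the failure of a physical machine affects every virtual machine running on it. Improving the availability of virtual machines has therefore become a pressing problem. Automatic VM fault tolerance relies on a backup virtual machine: the state of the primary virtual machine (the service provider) is continuously synchronized to a remote physical machine, so that when the primary virtual machine stops serving for any reason, the backup virtual machine can immediately take over and the cloud network service hosted on the primary is not interrupted. According to our study, however, existing KVM-based open-source fault-tolerant systems such as Kemari and Micro-Checkpointing cannot keep network services running smoothly. The Parallel and Distributed Computing Laboratory of the Graduate Institute of Computer Science and Information Engineering at National Central University therefore proposed M-FTVM, a fault-tolerant architecture aimed at network services that reduces the impact of the fault-tolerance system on cloud network services. This work builds on that architecture and improves its performance. In our experiments, the improved system achieves four times the performance of the original, and compared with other KVM-based open-source fault-tolerant systems it delivers a substantial improvement in network service quality.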


    With the widespread use of virtualization technology, many network services on the cloud now use virtual machines as their computing resources. Although virtualization provides many desirable features to cloud platforms, such as good manageability and server consolidation, it still faces the single-point-of-failure problem. For example, a physical machine failure brings down all the virtual machines running on it. Automatic fault tolerance for VMs is one way to solve this problem: a backup virtual machine is kept synchronized with the protected virtual machine and takes over its role when the protected virtual machine goes down. Based on our study, the existing open-source fault-tolerant VM solutions, Kemari and Micro-Checkpointing, do not work smoothly when hosting a network service. We even found that a Micro-Checkpointing fault-tolerant VM crashes very often. Therefore, we have proposed a novel design of a fault-tolerant virtual machine based on KVM, namely M-FTVM. We have also implemented a prototype of the proposed fault-tolerant VM and continue to improve its performance. This paper focuses on the techniques of performance improvement for M-FTVM. We used the DVD-Store benchmark to evaluate the performance of M-FTVM. The experimental results show that the latest M-FTVM is about four times as fast as the original version, about three times as fast as Micro-Checkpointing, and about seven times as fast as Kemari, when measured in operations per minute.
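The primary/backup synchronization described in the abstract can be illustrated with a toy model. This is only a minimal sketch of the general idea of checkpoint-based replication, not the M-FTVM implementation: the names `ToyVM`, `checkpoint`, and `failover` are hypothetical, memory is modeled as a plain dictionary, and real systems track dirty pages inside the hypervisor and transfer them over a network.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVM:
    """Hypothetical stand-in for a VM: memory is a page-number -> bytes map."""
    memory: dict = field(default_factory=dict)
    dirty: set = field(default_factory=set)

    def write(self, page: int, data: bytes) -> None:
        self.memory[page] = data
        self.dirty.add(page)  # remember which pages changed since the last checkpoint


def checkpoint(primary: ToyVM, backup: ToyVM) -> int:
    """Copy only the pages dirtied since the previous checkpoint; return how many were sent."""
    sent = len(primary.dirty)
    for page in primary.dirty:
        backup.memory[page] = primary.memory[page]
    primary.dirty.clear()
    return sent


def failover(backup: ToyVM) -> ToyVM:
    """The backup takes over from the state of the last completed checkpoint."""
    return ToyVM(memory=dict(backup.memory))


primary, backup = ToyVM(), ToyVM()
primary.write(0, b"request-1")
primary.write(1, b"request-2")
pages_sent = checkpoint(primary, backup)  # both dirty pages are synchronized
primary.write(1, b"request-3")            # written after the checkpoint, never synchronized
recovered = failover(backup)              # assume the primary fails at this point
```

In this sketch the write made after the last checkpoint is lost on failover, which is why the frequency and duration of checkpoints matter for both consistency and performance.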

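    Abstract (Chinese)
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1-1 Research Background
      1-2 Research Motivation
      1-3 Contributions
      1-4 Thesis Organization
    Chapter 2 Related Work
      2-1 Background
        2-1-1 Kernel-based Virtual Machine
        2-1-2 QEMU
        2-1-3 Continuous Checkpointing
        2-1-4 Lock-stepping
      2-2 KVM Fault-Tolerant Systems and VMware
        2-2-1 Kemari
        2-2-2 Micro-Checkpointing
        2-2-3 Other Research
    Chapter 3 System Architecture
      3-1 Main Architecture
      3-2 System Workflow
      3-3 Self-checkpoint & Active-checkpoint
    Chapter 4 Performance Improvement
      4-1 Performance Analysis
      4-2 Reducing the Frequency of Active-checkpoint
      4-3 Reducing the Duration of Each Active-checkpoint
    Chapter 5 Experimental Results
      5-1 Experimental Environment and Setup
      5-2 DVD Store Benchmark Results and Analysis
        5-2-1 Operations per Minute
        5-2-2 Response Time
        5-2-3 Bandwidth Consumed per Operation
    Chapter 6 Conclusions and Future Work
    Chapter 7 References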
