Graduate student: 黃柏儒
Po-Ru Huang
Thesis title: 針對Linux Guest OS虛擬機器容錯系統之VMM層級時間飄移改善機制
A VMM-Level Time Drift Reduction Mechanism for Linux Guest OS on Fault-Tolerant QEMU/KVM Virtual Machine System
Advisor: 王尉任
Wei-Jen Wang
Oral examination committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering
Publication year: 2023
Academic year of graduation: 111 (ROC calendar)
Language: Chinese
Number of pages: 39
Keywords: QEMU-KVM, virtual machine, Linux guest, fault-tolerant system, time drift
  • With the rapid development of cloud infrastructure, self-hosted virtual machines can make full use of hardware resources and reduce data center maintenance costs. However, virtual machines are more prone to failure than physical machines. To improve virtual machine availability, we previously proposed NCU MFTVM 4, a virtual machine fault tolerance system in which a backup machine resumes execution from the pre-failure state after a virtual machine fails. To keep the backup machine's state up to date, the virtual machine must be briefly paused each time its state is saved, and each pause introduces a small amount of time drift. Once the accumulated drift grows large enough, it affects the correctness of applications running inside the virtual machine. We found that traditional time synchronization methods are not suitable for our fault tolerance system, so we propose a method that synchronizes virtual machine time at the virtual machine monitor (VMM) layer. Our experiments show that this method has little impact on system performance, requires no additional packages to be installed, and removes the concern that the time synchronization function might fail, resulting in a more robust virtual machine fault tolerance system.
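    The way repeated checkpoint pauses add up to a noticeable drift can be illustrated with a toy model. The sketch below is not the thesis's implementation: the function name, the per-checkpoint run and pause durations, and the "set the guest clock to the host clock on resume" correction are all assumptions made for illustration, and the model simply assumes that the guest clock stops while the virtual machine is paused.

```python
def accumulated_drift_ns(epochs, run_ns, pause_ns, resync_on_resume):
    """Return how far the guest clock lags the host clock after `epochs` checkpoints.

    Toy model (assumption): the guest clock does not advance while the VM is paused.
    """
    host_ns = 0
    guest_ns = 0
    for _ in range(epochs):
        # Normal execution: host and guest clocks advance together.
        host_ns += run_ns
        guest_ns += run_ns
        # Checkpoint pause: only the host clock advances.
        host_ns += pause_ns
        if resync_on_resume:
            # Hypothetical VMM-level correction: on resume, the guest clock
            # is set to the host clock, so no lag carries over.
            guest_ns = host_ns
    return host_ns - guest_ns


if __name__ == "__main__":
    # Assumed values for illustration only:
    # 10 ms of execution and a 2 ms pause per checkpoint, 1000 checkpoints.
    for resync in (False, True):
        drift = accumulated_drift_ns(1000, 10_000_000, 2_000_000, resync)
        print(f"resync_on_resume={resync}: guest lags host by {drift / 1e6:.1f} ms")
```

    In this model the lag without correction grows linearly with the number of checkpoints (pause length times checkpoint count), which is why a drift that is negligible per pause can eventually affect time-sensitive applications.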

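    Abstract (Chinese)
    Abstract (English)
    Table of Contents
    List of Tables
    List of Figures
    1. Introduction
      1-1 Research Background
      1-2 Research Motivation
      1-3 Contributions
      1-4 Thesis Organization
    2. Background Knowledge
      2-1 Introduction to NCU MFTVM
      2-2 Problem Definition
      2-3 Time Catch-Up Methods
      2-4 VM-exit
      2-5 Related Work
    3. Achieving Virtual Machine Time Synchronization at the VMM Level
      3-1 Introduction to Kvmclock
      3-2 Main Architecture
      3-3 Operation Flow
    4. Experimental Analysis
      4-1 Accuracy after VMM-Level Time Synchronization
      4-2 Average Time Cost of VMM-Level Time Synchronization
    5. Conclusion and Future Work
    References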

