Graduate student: 傅玄堯 (FU-XUAN, YAO)
Thesis title: 驗證ML-based model在七台主機用於預測虛擬機開機時間的準確率
Verify the accuracy of the ML-based model for predicting virtual machine boot time on seven hosts
Advisors: 王尉任 (Wang, Wei-Jen)
梁德容
Oral defense committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering
Year of publication: 2022
Academic year of graduation: 111 (ROC calendar)
Language: Chinese
Number of pages: 71
Chinese keywords: 虛擬機, 開機時間預測, 機器學習模型, 雲端運算
English keywords: VM, boot time prediction, ML model, cloud computing


    In recent years, cloud computing has become increasingly popular and mature. At the same time, cloud service downtime has drawn growing attention, and the cost it causes has risen year by year. Virtual machines (VMs) are the foundation of most cloud services, and recovering a cloud system requires restarting VMs. However, the time needed to restart a VM varies from one situation to another. The more accurately VM boot time can be predicted, the easier it is to find the VM placement that takes the least time to bring services back up, which shortens recovery time and, in turn, cloud service downtime.
    There has been little research on VM boot time because it is usually assumed to be constant, but previous studies show that this is not the case. Lee proposed five models for predicting VM boot time and evaluated them in an environment of four hosts, in which the VMs ran no background programs that increase host CPU load. The results show that the Random Forest (RF) model has the highest accuracy, but the amount of data it requires grows exponentially with the number of hosts, so it is recommended for small-scale cloud environments.
    However, Lee did not verify two things: whether the ML-based model maintains its accuracy when the number of hosts increases, and whether it maintains its accuracy when host CPU load increases. This study addresses both questions. The results show that the accuracy of the RF model does not decrease when the number of hosts increases, because the model can adapt to a more complex environment, but that its accuracy decreases significantly when host CPU load increases. In addition, collecting the data needed by ML-based models is very time-consuming. This study therefore suggests using YLL's rule-based model in cloud environments with 10 or more hosts. Its advantage is that it needs only a small amount of data, so the collection time is very short compared with ML-based models.
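    The exponential growth in required data can be illustrated with a small counting sketch. The abstract does not say how the training dataset is parameterized, so the code below assumes, purely for illustration, that each host may boot between 0 and `max_vms_per_host` VMs at once and that every combination across hosts needs to be sampled. The function names and the per-host limit are hypothetical.

```python
from itertools import product
from typing import List, Tuple


def count_configurations(num_hosts: int, max_vms_per_host: int) -> int:
    """Number of distinct per-host VM-count combinations (assumed model).

    Each host independently takes one of (max_vms_per_host + 1) values,
    so the count grows exponentially with the number of hosts.
    """
    return (max_vms_per_host + 1) ** num_hosts


def enumerate_configurations(num_hosts: int, max_vms_per_host: int) -> List[Tuple[int, ...]]:
    """List every combination explicitly; only practical for small inputs."""
    return list(product(range(max_vms_per_host + 1), repeat=num_hosts))
```

    Under this assumption, with at most 3 concurrent VMs per host, `count_configurations(4, 3)` gives 256 combinations for four hosts and `count_configurations(7, 3)` gives 16384 for seven hosts, which suggests why data collection becomes costly as hosts are added.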
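    The idea of using boot time predictions to choose a VM placement can be sketched as follows. This is not the thesis's method: `toy_predict_boot_time` is a hypothetical stand-in for a trained model with made-up coefficients, and the sketch assumes that hosts boot their VMs in parallel, so recovery time is the slowest host's predicted time.

```python
from itertools import product
from typing import Callable, Optional, Sequence, Tuple


def toy_predict_boot_time(vms_on_host: int, cpu_load: float) -> float:
    """Hypothetical predictor: seconds for one host to boot its VMs.

    The coefficients are invented for illustration only.
    """
    if vms_on_host == 0:
        return 0.0
    return 20.0 + 5.0 * (vms_on_host - 1) + 30.0 * cpu_load


def best_placement(
    num_vms: int,
    host_loads: Sequence[float],
    predict: Callable[[int, float], float],
) -> Optional[Tuple[float, Tuple[int, ...]]]:
    """Brute-force search for the per-host VM counts with the lowest
    predicted recovery time (the maximum over hosts)."""
    best: Optional[Tuple[float, Tuple[int, ...]]] = None
    num_hosts = len(host_loads)
    # Try every assignment of VMs to hosts; feasible only for small inputs.
    for assignment in product(range(num_hosts), repeat=num_vms):
        counts = tuple(assignment.count(h) for h in range(num_hosts))
        recovery = max(predict(c, load) for c, load in zip(counts, host_loads))
        if best is None or recovery < best[0]:
            best = (recovery, counts)
    return best
```

    With three idle hosts, `best_placement(3, [0.0, 0.0, 0.0], toy_predict_boot_time)` spreads the VMs one per host. With one idle and one fully loaded host, `best_placement(2, [0.0, 1.0], toy_predict_boot_time)` puts both VMs on the idle host, since the toy predictor penalizes CPU load more than co-location.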

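    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1-1 Research Background
      1-2 Research Motivation
      1-3 Problem Definition
      1-4 Contributions
      1-5 Thesis Organization
    Chapter 2 Background Knowledge
      2-1 OpenStack
        2-1-1 Creating a VM image
        2-1-2 VM boot process
    Chapter 3 Related Work
      3-1 Rule-based models
      3-2 ML-based models
    Chapter 4 Data Collection and Processing
      4-1 Problem Definition
      4-2 Experimental Environment
      4-3 Collecting the VM boot time dataset
    Chapter 5 Experimental Results and Discussion
      5-1 Rule-based models: experimental results
      5-2 ML-based models: experimental results
      5-3 Discussion of experimental results
    Chapter 6 Conclusion and Future Work
    Chapter 7 References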

    [1] The top 3 public cloud providers compare. https://www.futuriom.com/articles/news/the-top3-public-cloud-providers-compared/2022/02 (accessed Aug. 12, 2022).
    [2] Gartner. "Top Five IaaS Providers Account for Over 80% of Total Market."
    https://www.gartner.com/en/newsroom/press-releases/2022-06-02-gartner-says-worldwideiaas-public-cloud-services-market-grew-41-percent-in-2021 (accessed Aug. 12, 2022).
    [3] Cloud Tech. "Downtime impact worsening as industry fails to curb outages."
    https://www.cloudcomputing-news.net/news/2022/jun/09/downtime-impact-worsening-as-industry-fails-to-curb-outages/ (accessed Aug. 15, 2022).
    [4] Mlytics. "What is downtime (cloud service)?" https://learning.mlytics.com/originserver/what-is-downtime-cloud-service/ (accessed Aug. 15, 2022).
    [5] Theregister. "IT downtime not itself going down, power failures most common cause"
    https://www.theregister.com/2022/06/08/it_outages_power/ (accessed Aug. 15, 2022).
    [6] A Cloud Guru. "Cloud comparison: AWS EC2 vs Azure Virtual Machines vs Google Compute Engine"
    https://acloudguru.com/blog/engineering/cloud-comparison-aws-ec2-vs-azure-virtual-machines-vs-google-compute-engine (accessed Aug. 18, 2022).
    [7] VMware. "Why use containers vs. VMs? | VMware"
    https://www.vmware.com/topics/glossary/content/vms-vs-containers.html (accessed Aug. 29, 2022).
    [8] Fiixsoftware. "Making sense of maintenance metrics: System availability"
    https://www.fiixsoftware.com/blog/how-do-maintainability-and-reliability-affect-availability/
    (accessed Aug. 15, 2022).
    [9] Endo, P.T., Rodrigues, M., Gonçalves, G.E. et al. High availability in clouds: systematic
    review and research challenges. J Cloud Comp 5, 16 (2016).
    https://doi.org/10.1186/s13677-016-0066-8. (accessed Aug. 15, 2022).
    [10] NetApp. "Azure High Availability: Basic Concepts and a Checklist"
    https://cloud.netapp.com/blog/azure-high-availability-basic-concepts-and-a-checklist
    (accessed Aug. 15, 2022).
    [11] W. Li and A. Kanso, "Comparing Containers versus Virtual Machines for Achieving High
    Availability," in 2015 IEEE International Conference on Cloud Engineering (IC2E), Tempe,
    AZ, USA, 2015 pp. 353-358.
    doi: 10.1109/IC2E.2015.79
    [12] Costache, S., Parlavantzas, N., Morin, C., Kortas, S. (2013). On the Use of a Proportional-Share Market for Application SLO Support in Clouds. In: Wolf, F., Mohr, B., an Mey, D.
    (eds) Euro-Par 2013 Parallel Processing. Euro-Par 2013. Lecture Notes in Computer Science,
    vol 8097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40047-6_35
    [13] M. A. Rodriguez and R. Buyya, "Deadline Based Resource Provisioning and Scheduling
    Algorithm for Scientific Workflows on Clouds," in IEEE Transactions on Cloud Computing,
    vol. 2, no. 2, pp. 222-235, 1 April-June 2014, doi: 10.1109/TCC.2014.2314655.
    [14] A. Singh. "Cloudsim Tutorials." https://www.cloudsimtutorials.online/
    [15] Teixeira Sá, T., Calheiros, R., Gomes, D. (2014). CloudReports: An Extensible Simulation
    Tool for Energy-Aware Cloud Computing Environments. In: Mahmood, Z. (eds) Cloud
    Computing. Computer Communications and Networks. Springer, Cham.
    https://doi.org/10.1007/978-3-319-10530-7_6
    [16] T. L. Nguyen and A. Lebre, "Virtual machine boot time model," in 2017 25th Euromicro
    International Conference on Parallel, Distributed and Network-based Processing (PDP), St.
    Petersburg, Russia, 2017: IEEE, pp. 430-437.
    [17] S. Taherizadeh and V. Stankovski, "Dynamic multi-level auto-scaling rules for containerized
    applications," The Computer Journal, vol. 62, no. 2, pp. 174-197, 2019.
    [18] Yen-Lin Lee et al. “An efficient and detector-fault-resilient mechanism of fault detection and
    recovery for cloud computing system”, June 2022, doctoral dissertation, National Central
    University.
    [19] Cloud-init. "What is cloud-init?"
    https://cloudinit.readthedocs.io/en/latest/topics/datasources/openstack.html (accessed Aug. 17,
    2022).
    [20] OpenStack. "Installation Guide Overview" https://docs.openstack.org/install-guide/overview.html (accessed Aug. 17, 2022).
    [21] OpenStack. "Neutron 20.10.dev483 Introduction"
    https://docs.openstack.org/neutron/latest/admin/intro.html (accessed Aug. 17, 2022).
    [22] OpenStack. "nova 25.1.0.dev221 OpenStack Compute (nova)"
    https://docs.openstack.org/nova/latest/?_ga=2.187545133.481415422.1660724220-1316198396.1660724220 (accessed Aug. 17, 2022).
    [23] OpenStack. "cinder 20.1.0.dev380 OpenStack Block Storage (Cinder) documentation"
    https://docs.openstack.org/cinder/latest/?_ga=2.125227919.481415422.1660724220-1316198396.1660724220 (accessed Aug. 17, 2022).
    [24] OpenStack. "horizon 22.2.1.dev13 Horizon: The OpenStack Dashboard Project"
    https://docs.openstack.org/horizon/latest/?_ga=2.46201506.481415422.1660724220-1316198396.1660724220 (accessed Aug. 17, 2022).
    [25] Ncu-software-research-center. "NCU-HASS Installation guide (ver. 5.1)"
    https://github.com/Ncu-software-research-center/NCU-HASS/blob/master/doc/NCUHASS%20Installation%20guide.md (accessed Aug. 17, 2022).
    [26] OpenStack. "Host networking" https://docs.openstack.org/install-guide/environment-networking.html (accessed Aug. 29, 2022).
    [27] OpenStack. "Provider network" https://docs.openstack.org/newton/install-guide-rdo/launch-instance-networks-provider.html (accessed Aug. 29, 2022).
    [28] OpenStack. "Message queue for Ubuntu" https://docs.openstack.org/install-guide/environment-messaging-ubuntu.html (accessed Aug. 29, 2022).
    [29] OpenStack. "Memcached for Ubuntu" https://docs.openstack.org/install-guide/environment-memcached-ubuntu.html (accessed Aug. 29, 2022).
    [30] OpenStack. "Etcd for Ubuntu" https://docs.openstack.org/install-guide/environment-etcd-ubuntu.html (accessed Aug. 29, 2022).
    [31] OpenStack. "Example: Ubuntu image" https://docs.openstack.org/image-guide/ubuntu-image.html (accessed Aug. 29, 2022).
    [32] OpenStack. "Compute schedulers"
    https://docs.openstack.org/nova/latest/admin/scheduling.html (accessed Aug. 29, 2022).
    [33] K. Razavi and T. Kielmann, "Scalable virtual machine deployment using VM image caches,"
    in 2013 SC - International Conference for High Performance Computing, Networking,
    Storage and Analysis, Denver, CO, USA, 2013 pp. 1-12.
    doi: 10.1145/2503210.2503274
    [34] Saurabh, N., Kimovski, D., Ostermann, S., Prodan, R. (2017). VM Image Repository and
    Distribution Models for Federated Clouds: State of the Art, Possible Directions and Open
    Issues. In: , et al. Euro-Par 2016: Parallel Processing Workshops. Euro-Par 2016. Lecture
    Notes in Computer Science, vol 10104. Springer, Cham.
    https://doi.org/10.1007/978-3-319-58943-5_21
    [35] OpenStack. "Launch an instance from a volume"
    https://docs.openstack.org/nova/queens/user/launch-instance-from-volume.html (accessed Aug. 30, 2022).
    [36] V. Nitu et al., "Swift birth and quick death: Enabling fast parallel guest boot and destruction
    in the xen hypervisor," ACM SIGPLAN Notices, vol. 52, no. 7, pp. 1-14, 2017.
    [37] Y. Govindaraju, H. A. Duran-Limon, and E. Mezura-Montes, "A regression tree predictive
    model for virtual machine startup time in IaaS clouds," Cluster Computing, vol. 24, no. 2, pp.
    1217-1233, 2021.
    [38] M. Bolte, M. Sievers, G. Birkenheuer, O. Niehorster and A. Brinkmann, "Non-intrusive
    virtualization management using libvirt," in 2010 Design, Automation & Test in Europe
    Conference & Exhibition (DATE 2010), Dresden, 2010 pp. 574-579.
    doi: 10.1109/DATE.2010.5457142
    [39] S.Crago, K.Dunn, P.Eads, L.Hochstein, D.I.Kang, M.Kang, D.Modium, K.Singh, J.Suh,
    J.P.Walters, Heterogeneous cloud computing, Proc. - IEEE Int. Conf. Clust. Comput. ICCC.
    (2011) 378–385. https://doi.org/10.1109/CLUSTER.2011.49.
    [40] John, Wolfgang & Sargor, Chandramouli & Szabo, Robert & Awan, Ahsan Javed & Padala,
    Chakri & Drake, Edvard & Julien, Martin & Opsenica, Miljenko. (2020). The Future of
    Cloud Computing: Highly Distributed with Heterogenous Hardware. Ericsson Review
    (English Edition).
    [41] J. Lee, G. Jo, W. Jung, H. Kim, J. Kim, Y.-J. Lee, J. Park, Chapter 2 - SnuCL: A unified
    OpenCL framework for heterogeneous clusters, Editor(s): Hamid Sarbazi-Azad, In Emerging
    Trends in Computer Science and Applied Computing, Advances in GPU Research and
    Practice, Morgan Kaufmann, 2017, Pages 23-55, ISBN 9780128037386,
    https://doi.org/10.1016/B978-0-12-803738-6.00002-1.
    (https://www.sciencedirect.com/science/article/pii/B9780128037386000021)
