| Graduate student: | 傅玄堯 FU-XUAN, YAO |
|---|---|
| Thesis title: | 驗證ML-based model在七台主機用於預測虛擬機開機時間的準確率 Verify the accuracy of the ML-based model for predicting virtual machine boot time on seven hosts |
| Advisors: | 王尉任 Wang, Wei-Jen; 梁德容 |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Year of publication: | 2022 |
| Academic year of graduation: | 111 |
| Language: | Chinese |
| Number of pages: | 71 |
| Chinese keywords: | virtual machine, boot time prediction, machine learning model, cloud computing |
| Foreign-language keywords: | VM, boot time prediction, ML model, cloud computing |
Cloud computing has become increasingly popular and mature in recent years. At the same time, downtime of cloud services has drawn growing attention, and the cost it causes has risen year by year. Virtual machines (VMs) are the foundation of most cloud services, and VMs must be restarted during cloud system recovery management. The time needed to restart a VM, however, varies with the situation. The more accurately VM boot time can be predicted, the easier it is to find the VM placement that starts the service in the least time, which shortens recovery time and, in turn, cloud service downtime.
VM boot time has rarely been studied because it is usually assumed to be constant, but previous studies show that this is not the case. Lee proposed five models for predicting VM boot time and evaluated them in an environment of four hosts, with no programs running in the VM background that increase host CPU load. The results showed that the Random Forest (RF) model had the highest accuracy, but the amount of data it requires grows exponentially with the number of hosts, so it is recommended for small-scale cloud environments.
However, Lee did not verify two things: whether the ML-based model maintains its accuracy when the number of hosts increases, and whether it maintains its accuracy when host CPU load increases. This study examines both questions. The results show that RF model accuracy does not decrease when the number of hosts increases, because the model can adapt to a more complex environment; when host CPU load increases, however, RF model accuracy decreases significantly. In addition, collecting the data for ML-based models has a high time cost. This study therefore suggests using YLL's rule-based model in cloud environments with 10 or more hosts. Its advantage is that it needs only a small amount of data, which takes very little time to collect compared with ML-based models.
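The link the abstract draws between boot time prediction and VM placement can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `predict_boot_time` is a hypothetical stand-in predictor (boot time rising linearly with the number of VMs booting on the same host), not the RF model or YLL's rule-based model, and the host names and timing constants are invented. The sketch tries every VM-to-host assignment and keeps the one whose slowest VM finishes booting soonest.

```python
from itertools import product


def predict_boot_time(concurrent_vms, base=20.0, per_vm=5.0):
    """Hypothetical predictor: seconds to boot a VM on a host that is
    booting `concurrent_vms` VMs at once. Not the thesis model."""
    return base + per_vm * (concurrent_vms - 1)


def best_placement(num_vms, hosts, predictor=predict_boot_time):
    """Try every VM-to-host assignment and return the assignment with
    the smallest predicted recovery time (slowest VM on any host)."""
    best, best_time = None, float("inf")
    for assignment in product(hosts, repeat=num_vms):
        # Count how many VMs each used host would boot concurrently.
        counts = {h: assignment.count(h) for h in set(assignment)}
        recovery = max(predictor(c) for c in counts.values())
        if recovery < best_time:  # strict '<' keeps the first best found
            best, best_time = assignment, recovery
    return best, best_time


placement, seconds = best_placement(3, ["host1", "host2", "host3"])
print(placement, seconds)
```

With this toy predictor the search spreads the three VMs over the three hosts. The exhaustive loop visits `len(hosts) ** num_vms` assignments, so it is only practical for small environments; a more accurate predictor changes which placement wins, not how the search works.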
[1] The top 3 public cloud providers compare. https://www.futuriom.com/articles/news/the-top3-public-cloud-providers-compared/2022/02 (accessed Aug. 12, 2022).
[2] Gartner. "Top Five IaaS Providers Account for Over 80% of Total Market." https://www.gartner.com/en/newsroom/press-releases/2022-06-02-gartner-says-worldwide-iaas-public-cloud-services-market-grew-41-percent-in-2021 (accessed Aug. 12, 2022).
[3] Cloud Tech. "Downtime impact worsening as industry fails to curb outages." https://www.cloudcomputing-news.net/news/2022/jun/09/downtime-impact-worsening-as-industry-fails-to-curb-outages/ (accessed Aug. 15, 2022).
[4] Mlytics. "What is downtime (cloud service)?" https://learning.mlytics.com/origin-server/what-is-downtime-cloud-service/ (accessed Aug. 15, 2022).
[5] Theregister. "IT downtime not itself going down, power failures most common cause" https://www.theregister.com/2022/06/08/it_outages_power/ (accessed Aug. 15, 2022).
[6] A Cloud Guru. "Cloud comparison: AWS EC2 vs Azure Virtual Machines vs Google Compute Engine" https://acloudguru.com/blog/engineering/cloud-comparison-aws-ec2-vs-azure-virtual-machines-vs-google-compute-engine (accessed Aug. 18, 2022).
[7] VMware. "Why use containers vs. VMs? | VMware" https://www.vmware.com/topics/glossary/content/vms-vs-containers.html (accessed Aug. 29, 2022).
[8] Fiixsoftware. "Making sense of maintenance metrics: System availability" https://www.fiixsoftware.com/blog/how-do-maintainability-and-reliability-affect-availability/ (accessed Aug. 15, 2022).
[9] Endo, P.T., Rodrigues, M., Gonçalves, G.E. et al. High availability in clouds: systematic review and research challenges. J Cloud Comp 5, 16 (2016). https://doi.org/10.1186/s13677-016-0066-8 (accessed Aug. 15, 2022).
[10] NetApp. "Azure High Availability: Basic Concepts and a Checklist" https://cloud.netapp.com/blog/azure-high-availability-basic-concepts-and-a-checklist (accessed Aug. 15, 2022).
[11] W. Li and A. Kanso, "Comparing Containers versus Virtual Machines for Achieving High Availability," in 2015 IEEE International Conference on Cloud Engineering (IC2E), Tempe, AZ, USA, 2015, pp. 353-358, doi: 10.1109/IC2E.2015.79.
[12] Costache, S., Parlavantzas, N., Morin, C., Kortas, S. (2013). On the Use of a Proportional-Share Market for Application SLO Support in Clouds. In: Wolf, F., Mohr, B., an Mey, D. (eds) Euro-Par 2013 Parallel Processing. Euro-Par 2013. Lecture Notes in Computer Science, vol 8097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40047-6_35
[13] M. A. Rodriguez and R. Buyya, "Deadline Based Resource Provisioning and Scheduling Algorithm for Scientific Workflows on Clouds," in IEEE Transactions on Cloud Computing, vol. 2, no. 2, pp. 222-235, 1 April-June 2014, doi: 10.1109/TCC.2014.2314655.
[14] A. Singh. "Cloudsim Tutorials." https://www.cloudsimtutorials.online/
[15] Teixeira Sá, T., Calheiros, R., Gomes, D. (2014). CloudReports: An Extensible Simulation Tool for Energy-Aware Cloud Computing Environments. In: Mahmood, Z. (eds) Cloud Computing. Computer Communications and Networks. Springer, Cham. https://doi.org/10.1007/978-3-319-10530-7_6
[16] T. L. Nguyen and A. Lebre, "Virtual machine boot time model," in 2017 25th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP), St. Petersburg, Russia, 2017: IEEE, pp. 430-437.
[17] S. Taherizadeh and V. Stankovski, "Dynamic multi-level auto-scaling rules for containerized applications," The Computer Journal, vol. 62, no. 2, pp. 174-197, 2019.
[18] Yen-Lin Lee et al. "An efficient and detector-fault-resilient mechanism of fault detection and recovery for cloud computing system", June 2022, doctoral dissertation, National Central University.
[19] Cloud-init. "What is cloud-init?" https://cloudinit.readthedocs.io/en/latest/topics/datasources/openstack.html (accessed Aug. 17, 2022).
[20] OpenStack. "Installation Guide Overview" https://docs.openstack.org/install-guide/overview.html (accessed Aug. 17, 2022).
[21] OpenStack. "Neutron 20.10.dev483 Introduction" https://docs.openstack.org/neutron/latest/admin/intro.html (accessed Aug. 17, 2022).
[22] OpenStack. "nova 25.1.0.dev221 OpenStack Compute (nova)" https://docs.openstack.org/nova/latest/?_ga=2.187545133.481415422.1660724220-1316198396.1660724220 (accessed Aug. 17, 2022).
[23] OpenStack. "cinder 20.1.0.dev380 OpenStack Block Storage (Cinder) documentation" https://docs.openstack.org/cinder/latest/?_ga=2.125227919.481415422.1660724220-1316198396.1660724220 (accessed Aug. 17, 2022).
[24] OpenStack. "horizon 22.2.1.dev13 Horizon: The OpenStack Dashboard Project" https://docs.openstack.org/horizon/latest/?_ga=2.46201506.481415422.1660724220-1316198396.1660724220 (accessed Aug. 17, 2022).
[25] Ncu-software-research-center. "NCU-HASS Installation guide (ver. 5.1)" https://github.com/Ncu-software-research-center/NCU-HASS/blob/master/doc/NCU-HASS%20Installation%20guide.md (accessed Aug. 17, 2022).
[26] OpenStack. "Host networking" https://docs.openstack.org/install-guide/environment-networking.html (accessed Aug. 29, 2022).
[27] OpenStack. "Provider network" https://docs.openstack.org/newton/install-guide-rdo/launch-instance-networks-provider.html (accessed Aug. 29, 2022).
[28] OpenStack. "Message queue for Ubuntu" https://docs.openstack.org/install-guide/environment-messaging-ubuntu.html (accessed Aug. 29, 2022).
[29] OpenStack. "Memcached for Ubuntu" https://docs.openstack.org/install-guide/environment-memcached-ubuntu.html (accessed Aug. 29, 2022).
[30] OpenStack. "Etcd for Ubuntu" https://docs.openstack.org/install-guide/environment-etcd-ubuntu.html (accessed Aug. 29, 2022).
[31] OpenStack. "Example: Ubuntu image" https://docs.openstack.org/image-guide/ubuntu-image.html (accessed Aug. 29, 2022).
[32] OpenStack. "Compute schedulers" https://docs.openstack.org/nova/latest/admin/scheduling.html (accessed Aug. 29, 2022).
[33] K. Razavi and T. Kielmann, "Scalable virtual machine deployment using VM image caches," in 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, Denver, CO, USA, 2013, pp. 1-12, doi: 10.1145/2503210.2503274.
[34] Saurabh, N., Kimovski, D., Ostermann, S., Prodan, R. (2017). VM Image Repository and Distribution Models for Federated Clouds: State of the Art, Possible Directions and Open Issues. In: , et al. Euro-Par 2016: Parallel Processing Workshops. Euro-Par 2016. Lecture Notes in Computer Science, vol 10104. Springer, Cham. https://doi.org/10.1007/978-3-319-58943-5_21
[35] OpenStack. "Launch an instance from a volume" https://docs.openstack.org/nova/queens/user/launch-instance-from-volume.html (accessed Aug. 30, 2022).
[36] V. Nitu et al., "Swift birth and quick death: Enabling fast parallel guest boot and destruction in the xen hypervisor," ACM SIGPLAN Notices, vol. 52, no. 7, pp. 1-14, 2017.
[37] Y. Govindaraju, H. A. Duran-Limon, and E. Mezura-Montes, "A regression tree predictive model for virtual machine startup time in IaaS clouds," Cluster Computing, vol. 24, no. 2, pp. 1217-1233, 2021.
[38] M. Bolte, M. Sievers, G. Birkenheuer, O. Niehorster and A. Brinkmann, "Non-intrusive virtualization management using libvirt," in 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010), Dresden, 2010, pp. 574-579, doi: 10.1109/DATE.2010.5457142.
[39] S. Crago, K. Dunn, P. Eads, L. Hochstein, D.I. Kang, M. Kang, D. Modium, K. Singh, J. Suh, J.P. Walters, Heterogeneous cloud computing, Proc. - IEEE Int. Conf. Clust. Comput. ICCC. (2011) 378–385. https://doi.org/10.1109/CLUSTER.2011.49.
[40] John, Wolfgang & Sargor, Chandramouli & Szabo, Robert & Awan, Ahsan Javed & Padala, Chakri & Drake, Edvard & Julien, Martin & Opsenica, Miljenko. (2020). The Future of Cloud Computing: Highly Distributed with Heterogenous Hardware. Ericsson Review (English Edition).
[41] J. Lee, G. Jo, W. Jung, H. Kim, J. Kim, Y.-J. Lee, J. Park, Chapter 2 - SnuCL: A unified OpenCL framework for heterogeneous clusters, Editor(s): Hamid Sarbazi-Azad, In Emerging Trends in Computer Science and Applied Computing, Advances in GPU Research and Practice, Morgan Kaufmann, 2017, Pages 23-55, ISBN 9780128037386, https://doi.org/10.1016/B978-0-12-803738-6.00002-1. (https://www.sciencedirect.com/science/article/pii/B9780128037386000021)