| Graduate student: | 陳樂川 Yao-Chuan Chen |
|---|---|
| Thesis title: | 基於 FaaS 技術之 ML 工作流系統 (ML Workflow System based on FaaS technology) |
| Advisor: | 王尉任 Wei-Jen Wang |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Year of publication: | 2023 |
| Academic year of graduation: | 111 |
| Language: | Chinese |
| Number of pages: | 44 |
| Keywords (Chinese): | MLOps, Kubernetes, FaaS, ML Workflow |
| Keywords (English): | MLOps, Kubernetes, Serverless, ML Workflow |
In this era of big data, extracting useful information from massive datasets has become a
challenge for many enterprises. Many have therefore invested in artificial intelligence
(machine and deep learning) to process large volumes of data and create new business value.
However, developing ML (Machine Learning) models is a complex process that involves
specialists from many fields and many environment configurations. As a result, ML
development teams incur high communication costs, which reduces the actual benefit the
models bring to the enterprise. In recent years the concept of MLOps, that is, DevOps
applied to machine learning, has emerged with the aim of reducing labor costs and shortening
the development life cycle. Many MLOps platforms now exist; they use containerization to
package the steps of an ML process and rely on container orchestration tools such as
Kubernetes to manage tasks. However, ML development sometimes requires resources outside
the cluster, and existing platforms do not provide a way to integrate such external
resources. This study therefore designs an ML Workflow system based on FaaS technology.
Users define their own ML Workflow through a workflow platform, and each step is packaged as
a FaaS function. Internal and external resources are deployed on Kubernetes as
event-triggered functions that the system can invoke, so that users can ultimately create
reusable ML Workflows and ML models.
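
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: each workflow step is registered as a named function, and a workflow is an ordered list of step names invoked with an event payload. Everything here is an assumption made for illustration: `FunctionGateway`, `Workflow`, the step names, and the toy "training" logic are hypothetical and are not taken from the thesis. An in-memory registry stands in for a FaaS gateway, whereas in a real deployment each invocation would presumably be a network call to a function running inside or outside the Kubernetes cluster.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# A handler stands in for one deployed, event-triggered function (hypothetical).
Event = Dict[str, Any]
Handler = Callable[[Event], Event]


@dataclass
class FunctionGateway:
    """Toy in-memory stand-in for a FaaS gateway (assumption, not a real API)."""
    handlers: Dict[str, Handler] = field(default_factory=dict)

    def deploy(self, name: str, handler: Handler) -> None:
        # Register a function under a name the workflow can refer to.
        self.handlers[name] = handler

    def invoke(self, name: str, event: Event) -> Event:
        # Trigger the named function with an event payload.
        if name not in self.handlers:
            raise KeyError(f"function not deployed: {name}")
        return self.handlers[name](event)


@dataclass
class Workflow:
    """A user-defined, reusable sequence of function names."""
    name: str
    steps: List[str]

    def run(self, gateway: FunctionGateway, event: Event) -> Event:
        # Pass the output event of each step to the next step.
        for step in self.steps:
            event = gateway.invoke(step, event)
        return event


def preprocess(event: Event) -> Event:
    # In-cluster step: drop missing values.
    cleaned = [x for x in event["data"] if x is not None]
    return {**event, "data": cleaned}


def train_external(event: Event) -> Event:
    # Stand-in for a step backed by a resource outside the cluster.
    # The "model" is just the mean of the data, purely for illustration.
    data = event["data"]
    return {**event, "model": {"mean": sum(data) / len(data)}}


def evaluate(event: Event) -> Event:
    # Mean absolute error of the toy model on the training data.
    mean = event["model"]["mean"]
    data = event["data"]
    mae = sum(abs(x - mean) for x in data) / len(data)
    return {**event, "mae": mae}


gateway = FunctionGateway()
gateway.deploy("preprocess", preprocess)
gateway.deploy("train-external", train_external)
gateway.deploy("evaluate", evaluate)

workflow = Workflow("demo", ["preprocess", "train-external", "evaluate"])
result = workflow.run(gateway, {"data": [1.0, None, 2.0, 3.0]})
```

Because the workflow only refers to steps by name, the same definition can be run again with a different event, or a step can be redeployed to a different backend without changing the workflow itself. That is the kind of reuse the abstract describes, although the actual system may achieve it differently.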