Graduate student: Shih-Yang Chen (陳詩陽)
Thesis title: AIOps for IT Ticket Classification: A Comparative Study of Hybrid NLP Models and Strategies for Handling Imbalanced Data
Advisor: Huey-Wen Chou (周惠文)
Oral defense committee:
Degree: Master
Department: College of Management - Department of Information Management
Year of publication: 2025
Academic year of graduation: 113 (ROC calendar)
Language: Chinese
Number of pages: 75
Keywords: AIOps, IT Ticket, Data Augmentation, ITSM, DevOps
  • As demand for automation in IT Service Management (ITSM) grows, AIOps has become a key technology for reducing system downtime and shortening incident resolution time. However, current IT ticket classification still relies heavily on manual interpretation, and the uneven distribution of ticket categories often lowers classification efficiency and accuracy. To address this gap, this study uses two public IT ticket datasets to investigate how well hybrid natural language processing (NLP) models combined with data balancing techniques perform on multi-class classification tasks.
    The study uses stratified 5-fold cross-validation to build and compare three mainstream Transformer models (BERT, RoBERTa, and DeBERTa) and hybrid variants that pair each of them with a CNN, LSTM, or BiLSTM downstream architecture. It also applies random oversampling, random undersampling, and three data augmentation techniques (Word2Vec, EDA, and T5), and evaluates classification performance with metrics including Macro F1 Score and the Matthews Correlation Coefficient (MCC).
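As a rough illustration of how stratified 5-fold cross-validation and random oversampling can fit together, the sketch below uses only the Python standard library. It is not the thesis's code: the helper names `stratified_folds` and `random_oversample`, the toy labels, and the choice to oversample only the training split of each fold are all assumptions made for this example.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, class by class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin so each fold gets a near-equal share of the class.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def random_oversample(indices, labels, seed=0):
    """Duplicate minority-class samples until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for label in sorted(by_class):
        members = by_class[label]
        extra = [rng.choice(members) for _ in range(target - len(members))]
        balanced.extend(members + extra)
    return balanced


# Toy, made-up labels: 20 "network", 10 "access", 5 "hardware" tickets.
labels = ["network"] * 20 + ["access"] * 10 + ["hardware"] * 5
folds = stratified_folds(labels, k=5)

for i, test_idx in enumerate(folds):
    train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
    # Assumption: only the training split is resampled; the test fold keeps its natural skew.
    train_balanced = random_oversample(train_idx, labels, seed=i)
    print(i,
          Counter(labels[idx] for idx in train_balanced),
          Counter(labels[idx] for idx in test_idx))
```

Resampling inside each fold, rather than before splitting, keeps duplicated minority samples out of the held-out fold, which is one common way to avoid optimistic scores. Random undersampling would follow the same pattern, with `target` set to the smallest class size and samples dropped instead of duplicated.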
    Experimental results show that on the IT Ticket Classification dataset, the BERT-CNN model achieved the best overall performance (Macro F1 Score of 0.408) even without data balancing; applying data balancing strategies with the BERT-BiLSTM model improved performance by approximately a further 2%. On the IT Service Ticket Classification Dataset, the BERT-CNN model reached a Macro F1 Score of 0.866, indicating robustness and generalizability across datasets. Overall, suitably combining hybrid architectures with data balancing strategies can improve the performance and stability of multi-class classification on imbalanced IT tickets. Placing automated classification at the front end of the AIOps workflow also shows application potential: it can make incident interpretation more efficient and serve as a viable technical reference for enterprises pursuing intelligent ITSM.
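To make the reported metrics easier to read, the following standard-library sketch computes Macro F1 and the multi-class generalization of MCC from label lists. The function names and the toy predictions are assumptions for illustration only and are unrelated to the thesis's data or results.

```python
from collections import Counter
from math import sqrt


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def multiclass_mcc(y_true, y_pred):
    """Multi-class MCC computed from per-class true and predicted counts."""
    n = len(y_true)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    classes = set(true_counts) | set(pred_counts)
    cov_tp = correct * n - sum(true_counts[c] * pred_counts[c] for c in classes)
    cov_tt = n * n - sum(v * v for v in true_counts.values())
    cov_pp = n * n - sum(v * v for v in pred_counts.values())
    denom = sqrt(cov_tt * cov_pp)
    # By convention, return 0.0 when the score is undefined (e.g. one predicted class).
    return cov_tp / denom if denom else 0.0


# Toy case: a classifier that always predicts the majority class.
y_true = ["network"] * 8 + ["hardware"] * 2
y_pred = ["network"] * 10

print(round(macro_f1(y_true, y_pred), 3))  # 0.444, although accuracy is 0.8
print(multiclass_mcc(y_true, y_pred))      # 0.0
```

In this toy case accuracy is 0.8, yet Macro F1 drops to about 0.444 and MCC to 0.0 because the minority class is never predicted. That sensitivity is a typical reason to prefer these metrics on imbalanced ticket data.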

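    Abstract (Chinese)
    Abstract (English)
    Table of Contents
    List of Figures
    List of Tables
    1. Introduction
        1-1 Research Background
        1-2 Research Motivation
        1-3 Research Questions
    2. Literature Review
        2-1 Ticket Classification under AIOps
        2-2 Hybrid Models Applied to IT Ticket Classification
        2-3 Techniques for Handling Imbalanced Data
    3. Research Method
        3-1 Datasets
            3-1-1 Dataset: IT Ticket Classification
            3-1-2 Dataset: IT Service Ticket Classification Dataset
        3-2 Data Preprocessing
            3-2-1 Preprocessing Steps
            3-2-2 Preprocessing Results for IT Ticket Classification
            3-2-3 Preprocessing Results for IT Service Ticket Classification Dataset
        3-3 Stratified 5-Fold Cross-Validation
        3-4 Data Resampling Techniques
        3-5 Data Augmentation Techniques
            3-5-1 Word2Vec
            3-5-2 EDA
            3-5-3 T5
        3-6 Data Adjustment Strategy
        3-7 Prediction Models
            3-7-1 BERT
            3-7-2 RoBERTa
            3-7-3 DeBERTa
        3-8 Multi-Class Classification Task
        3-9 Model Evaluation Metrics
    4. Experimental Results and Analysis
        4-1 Experiment 1: Performance Comparison of Hybrid Models
        4-2 Experiment 2: Performance Comparison of Data Resampling and Data Augmentation Techniques
            4-2-1 Comparison of ROS and RUS Performance
            4-2-2 Performance Comparison after Word2Vec
            4-2-3 Performance Comparison after EDA
            4-2-4 Performance Comparison after T5
    5. Conclusions and Recommendations
        5-1 Research Conclusions
        5-2 Academic Contributions
        5-3 Research Limitations
        5-4 Future Research Directions
    References
    Appendix 1: Ablation Experiment Results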

