Graduate student: 游家齊
Chia-Chi Yu
Thesis title: 透過再預訓練BERT結合適配器與情感分析 預測線上平台遊戲產品存活年限
Predicting the Lifespan of Online Platform Games through BERT Re-Pretraining with Adapters and Sentiment Analysis
Advisor: 葉英傑
Ying-Chieh Yeh
Oral examination committee:
Degree: Master
Department: College of Management - Graduate Institute of Industrial Management
Year of publication: 2025
Academic year of graduation: 113
Language: Chinese
Number of pages: 53
Chinese keywords (translated): game lifespan prediction, lightweight BERT model, MLM pretraining, review analysis, model finetuning
Foreign-language keywords: Game Lifespan Prediction, BERT Model, MLM Pretraining, User Review Analysis, Model Finetuning
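  • As the digital game industry booms, large numbers of games are continually listed and
    delisted, and game lifespan has become an important reference for players' in-game
    spending and for publishers' operating decisions. However, most current platforms do not
    disclose information on a game's continuation risk, so players may invest in games that
    are about to be delisted, which wastes resources and erodes trust. This study therefore
    aims to build a model architecture that predicts how many years a game will survive,
    helping users judge a game's likely lifespan and make better purchasing decisions.
    This study uses the Steam platform as its data source. A web crawler collected the
    listing and delisting years of delisted games, the number of years each game survived
    was computed as the label, and game review text was used for predictive modeling. The
    training pipeline consists of Masked Language Modeling (MLM) pretraining of a base BERT
    model, followed in order by Adapter modules, sentiment prompts (Prefix), and
    full-parameter finetuning (Finetune), yielding a prediction model strengthened stage by
    stage. A separate validation experiment simulates the case in which the model has access
    only to historical data and tests its ability to predict games delisted in the future.
    The experimental results show that performance improves progressively as modules are
    added; the model combining Adapter and Prefix with full finetuning performs best on
    accuracy, mean absolute error (MAE), and ±1-year and ±2-year accuracy. In addition, the
    cross-period validation shows that a model trained only on data from before 2023
    produces predictions close to those of a control group matched in data volume,
    indicating that the model generalizes well and can be applied in practice to predict
    game lifespan and assess risk.
    Future work will build a query interface in which users enter a game ID and receive a
    predicted lifespan interval and the probability of falling within it, toward practical
    deployment.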


    With the rapid growth of the digital gaming industry, a vast number of games are
    being released and subsequently delisted. As a result, the lifespan of games has become
    a critical factor for players' spending decisions and developers' operational planning.
    However, current gaming platforms rarely disclose information regarding a game's
    potential longevity, which may lead players to invest in games that are likely to be
    delisted soon, resulting in wasted resources and diminished trust. This study aims to
    construct a predictive model for estimating a game's survival duration, thereby assisting
    users in making more informed decisions.
    Using Steam as the data source, this study collected the release and delisting years of
    delisted games via web scraping and calculated each game's survival duration in years as
    the classification label. With user review texts as input, a series of models was trained
    to predict the number of years a game would remain available. The modeling process
    includes: (1) Masked Language Modeling (MLM) pretraining using a BERT-based
    architecture, followed by (2) the integration of Adapter modules, (3) the inclusion of
    sentiment-aware prefixes, and (4) full-parameter finetuning to progressively enhance
    performance. A validation experiment was also designed to simulate the real-world
    scenario in which only historical data is available, to assess the model’s generalization
    to future game delistings.
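The label construction and the historical-data validation described above can be sketched as follows. This is a minimal illustration, not the thesis's code: the `Game` record, its field names, the sample values, and the choice to split on delisting year with a 2023 cutoff are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Game:
    app_id: int        # hypothetical identifier field
    release_year: int
    delist_year: int


def lifespan_label(game: Game) -> int:
    """Survival duration in whole years: delisting year minus release year."""
    return game.delist_year - game.release_year


def cross_period_split(games, cutoff_year=2023):
    """Assumed split rule: train on games delisted before the cutoff year,
    evaluate on games delisted in or after it."""
    train = [g for g in games if g.delist_year < cutoff_year]
    test = [g for g in games if g.delist_year >= cutoff_year]
    return train, test


# Made-up sample records, for illustration only.
games = [Game(1, 2015, 2019), Game(2, 2020, 2023), Game(3, 2018, 2024)]
labels = [lifespan_label(g) for g in games]
train, test = cross_period_split(games)
```

Splitting by time rather than at random keeps every evaluation game later than every training game, which is what lets the experiment mimic forecasting future delistings.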
    Experimental results show that model performance improved progressively as each
    component was added. The final model, which combines Adapters, Prefix, and full
    finetuning, achieved the best results in accuracy, mean absolute error (MAE), and
    ±1-year and ±2-year prediction accuracy. Furthermore, in the cross-period validation
    experiment, the model trained solely on data prior to 2023 performed comparably to a
    baseline model trained on a random subset of the full dataset. This indicates strong
    generalization ability and practical potential for forecasting future game lifespans.
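The four reported metrics can be computed from integer lifespan classes as in the sketch below. The function name, the dictionary keys, and the sample values are illustrative assumptions; the thesis's exact metric definitions are not given in this record.

```python
def evaluate(y_true, y_pred):
    """Exact accuracy, MAE, and tolerance accuracy for integer year labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    n = len(y_true)
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return {
        "accuracy": sum(e == 0 for e in errors) / n,   # exact class match
        "mae": sum(errors) / n,                        # mean absolute error in years
        "within_1": sum(e <= 1 for e in errors) / n,   # ±1-year accuracy
        "within_2": sum(e <= 2 for e in errors) / n,   # ±2-year accuracy
    }


# Made-up predictions, for illustration only.
metrics = evaluate([1, 3, 5, 2], [1, 4, 2, 2])
```

The tolerance metrics credit a prediction that misses by a year or two, which plain accuracy does not, so they are the more informative view when the classes are ordered.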
    Future work will involve developing a user-facing prediction interface in which users
    enter a game ID and receive a predicted lifespan range along with the probability that
    the game's lifespan falls within that range, supporting practical deployment of the
    model.

    Abstract (Chinese)
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Research Background and Motivation
      1.2 Research Challenges
      1.3 Research Objectives
      1.4 Research Methods
    Chapter 2 Literature Review
      2.1 Data Processing, Analysis, and Challenges
      2.2 Sentiment and Text Analysis
      2.3 Structured Data
      2.4 Applications of Adapters
      2.5 Loss Functions and Representation of Results
    Chapter 3 Methodology
      3.1 Data Collection and Label Construction
        3.1.1 MongoDB
      3.2 Data Preprocessing and Semantic Pretraining
        3.2.1 Masked Language Modeling (MLM) Pretraining
      3.3 Model Architecture Design and Feature Extraction
        3.3.1 MLM Pretraining Stage
        3.3.2 Lifespan Classification Stage
        3.3.3 Feature Augmentation and Fusion
        3.3.4 Sentiment Analysis and Feature Fusion Strategy
      3.4 Finetuning Strategy
      3.5 Evaluation Methods and Performance Metrics
      3.6 Model Validation Design and Justification
    Chapter 4 Experiments
      4.1 Experimental Environment and Development Tools
      4.2 Data Preprocessing Methods and Dataset Contents
      4.3 Experimental Design
        4.3.1 Validation Procedure Design
        4.3.2 Main Model Design
      4.4 Experimental Results and Model Comparison
        4.4.1 Results of the Generalization Validation Experiment
        4.4.2 Main Model Results
      4.5 Comparison with Other Studies
    Chapter 5 Conclusion
      5.1 Conclusions
      5.2 Future Work
    References

    Araci, D. (2019). FinBERT: Financial sentiment analysis with pre-trained language models. arXiv
    preprint arXiv:1908.10063.
    Borbora, Z., Srivastava, J., Hsu, K.-W., & Williams, D. (2011). Churn prediction in MMORPGs
    using player motivation theories and an ensemble approach. 2011 IEEE third
    international conference on privacy, security, risk and trust and 2011 IEEE third
    international conference on social computing,
    Cai, B., Pellegrini, F., Pang, M., de Moor, C., Shen, C., Charu, V., & Tian, L. (2023).
    Bootstrapping the cross-validation estimate. arXiv preprint arXiv:2307.00260.
    Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep
    bidirectional transformers for language understanding. Proceedings of the 2019
    conference of the North American chapter of the association for computational linguistics:
    human language technologies, volume 1 (long and short papers),
    Elalem, Y. K., Maier, S., & Seifert, R. W. (2023). A machine learning-based framework for
    forecasting sales of new products with short life cycles using deep neural networks.
    International Journal of Forecasting, 39(4), 1874-1894.
    Fernández Galeote, D., & Hamari, J. (2021). Game-based climate change engagement: analyzing
    the potential of entertainment and serious games. Proceedings of the ACM on
    Human-Computer Interaction, 5(CHI PLAY), 1-21.
    Ferreira, C., & Gonçalves, G. (2022). Remaining Useful Life prediction and challenges: A
    literature review on the use of Machine Learning Methods. Journal of manufacturing
    systems, 63(May), 550-562.
    Fu, H., Gong, M., Wang, C., Batmanghelich, K., & Tao, D. (2018). Deep ordinal regression
    network for monocular depth estimation. Proceedings of the IEEE conference on
    computer vision and pattern recognition.
    Gorishniy, Y., Rubachev, I., & Babenko, A. (2022). On embeddings for numerical features in
    tabular deep learning. Advances in Neural Information Processing Systems, 35,
    24991-25004.
    Goulet-Pelletier, J.-C., & Cousineau, D. (2018). A review of effect sizes and their confidence
    intervals, Part I: The Cohen's d family. The Quantitative Methods for Psychology, 14(4),
    242-265.
    Guo, J., Zhang, Z., Xu, L., Chen, B., & Chen, E. (2021). Adaptive adapters: An efficient way to
    incorporate BERT into neural machine translation. IEEE/ACM Transactions on Audio,
    Speech, and Language Processing, 29, 1740-1751.
    Hadiji, F., Sifa, R., Drachen, A., Thurau, C., Kersting, K., & Bauckhage, C. (2014). Predicting
    player churn in the wild. 2014 IEEE conference on computational intelligence and games,
    Houlsby, N., Giurgiu, A., Jastrzebski, S., Morrone, B., De Laroussilhe, Q., Gesmundo, A.,
    Attariyan, M., & Gelly, S. (2019). Parameter-efficient transfer learning for NLP.
    International conference on machine learning,
    Kohavi, R., & Provost, F. (1998). Confusion matrix. Machine learning, 30(2-3), 271-274.
    Li, X. L., & Liang, P. (2021). Prefix-tuning: Optimizing continuous prompts for generation.
    arXiv preprint arXiv:2101.00190.
    Liu, Q., Wang, H., Wangjiu, C., Awang, T., Yang, M., Qiongda, P., Yang, X., Pan, H., & Wang,
    F. (2024). An artificial intelligence-based bone age assessment model for Han and
    Tibetan children. Frontiers in Physiology, 15, 1329145.
    Marchand, A. (2016). The power of an installed base to combat lifecycle decline: The case of
    video games. International Journal of Research in Marketing, 33(1), 140-154.
    Okoh, C., Roy, R., Mehnen, J., & Redding, L. (2014). Overview of remaining useful life
    prediction techniques in through-life engineering services. Procedia Cirp, 16, 158-163.
    Pedregosa, F., Bach, F., & Gramfort, A. (2017). On the consistency of ordinal regression
    methods. Journal of Machine Learning Research, 18(55), 1-35.
    Sifa, R., Hadiji, F., Runge, J., Drachen, A., Kersting, K., & Bauckhage, C. (2015). Predicting
    purchase decisions in mobile free-to-play games. proceedings of the AAAI conference on
    artificial intelligence and interactive digital entertainment,
    Taşcı, B., Omar, A., & Ayvaz, S. (2023). Remaining useful lifetime prediction for predictive
    maintenance in manufacturing. Computers & Industrial Engineering, 184, 109566.
    Wang, A., Singh, A., Michael, J., Hill, F., Levy, O., & Bowman, S. R. (2018). GLUE: A
    multi-task benchmark and analysis platform for natural language understanding. arXiv
    preprint arXiv:1804.07461.
