Predicting the Lifespan of Online Platform Games through BERT Re-Pretraining with Adapters and Sentiment Analysis


Graduate student:	Chia-Chi Yu (游家齊)
Thesis title:	Predicting the Lifespan of Online Platform Games through BERT Re-Pretraining with Adapters and Sentiment Analysis (透過再預訓練BERT結合適配器與情感分析預測線上平台遊戲產品存活年限)
Advisor:	Ying-Chieh Yeh (葉英傑)
Oral examination committee:
Degree:	Master
Department:	College of Management - Graduate Institute of Industrial Management
Year of publication:	2025
Academic year of graduation:	113
Language:	Chinese
Number of pages:	53
Keywords (Chinese):	Game lifespan prediction, lightweight BERT model, MLM pretraining, review analysis, model fine-tuning
Keywords (English):	Game Lifespan Prediction, BERT Model, MLM Pretraining, User Review Analysis, Model Finetuning

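As the digital game industry booms, large numbers of games are continually released and delisted, and game lifespan has become an important reference for players' in-game spending and publishers' operating decisions. However, most current platforms do not disclose information on a game's continuation risk, so players may invest in games that are about to be delisted, wasting resources and eroding trust. This study therefore aims to build a model architecture that predicts a game's survival years, helping users judge a game's likely lifespan and improving the quality of their purchasing decisions.
This study uses the Steam platform as its data source. A web crawler collects the release and delisting years of delisted games, the number of years each game survived is computed as the label, and game review text is used for predictive modeling. The training pipeline consists of Masked Language Modeling (MLM) pretraining of a base BERT model, followed in sequence by Adapter modules, sentiment prompts (Prefix), and full-parameter fine-tuning (Finetune), building a progressively strengthened prediction model. A validation experiment is also designed to simulate how well the model predicts games delisted in the future when it has access only to historical data.
Experimental results show that performance improves progressively with the module design. The model that combines Adapter and Prefix with full fine-tuning performs best on accuracy, mean absolute error (MAE), and ±1-year and ±2-year accuracy. In addition, cross-period validation shows that a model trained only on data from before 2023 produces predictions close to those of a control group matched in data volume, indicating that the model generalizes well and can be applied in practice to predict game lifespan and assess risk.
Future work will build a query interface so that users can enter a game ID and obtain the predicted lifespan range and the probability of falling within it, achieving the goal of practical adoption.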

With the rapid growth of the digital gaming industry, a vast number of games are
being released and subsequently delisted. As a result, the lifespan of games has become
a critical factor for players' spending decisions and developers' operational planning.
However, current gaming platforms rarely disclose information regarding a game's
potential longevity, which may lead players to invest in games that are likely to be
delisted soon, resulting in wasted resources and diminished trust. This study aims to
construct a predictive model for estimating a game's survival duration, thereby assisting
users in making more informed decisions.
Using Steam as the data source, this study collected the release and delisting years of
delisted games via web scraping and computed each game's survival duration in years as
the classification label. Using these labels together with user review texts, a series of
models was trained to predict the number of years a game would remain available. The modeling process
includes: (1) Masked Language Modeling (MLM) pretraining using a BERT-based
architecture, followed by (2) the integration of Adapter modules, (3) the inclusion of
sentiment-aware prefixes, and (4) full-parameter finetuning to progressively enhance
performance. A validation experiment was also designed to simulate the real-world
scenario in which only historical data is available, to assess the model’s generalization
to future game delistings.
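
The label construction and the historical-data-only validation setup described above can be illustrated with a minimal sketch. It is not the thesis's code: the `GameRecord` fields, the helper names, and the choice to split on delisting year are assumptions made for illustration. The abstract states only that the label is derived from release and delisting years and that one model was trained on data prior to 2023.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GameRecord:
    """Hypothetical record for one delisted game (field names are assumptions)."""
    app_id: int
    release_year: int
    delist_year: int


def survival_years(game: GameRecord) -> int:
    """Label: whole years between release and delisting."""
    if game.delist_year < game.release_year:
        raise ValueError(f"game {game.app_id}: delisted before release")
    return game.delist_year - game.release_year


def temporal_split(
    games: List[GameRecord], cutoff_year: int = 2023
) -> Tuple[List[GameRecord], List[GameRecord]]:
    """Train on games delisted before the cutoff; hold out the rest.

    Splitting on delisting year is an assumption of this sketch.
    """
    train = [g for g in games if g.delist_year < cutoff_year]
    held_out = [g for g in games if g.delist_year >= cutoff_year]
    return train, held_out


games = [
    GameRecord(app_id=1, release_year=2015, delist_year=2019),
    GameRecord(app_id=2, release_year=2020, delist_year=2024),
    GameRecord(app_id=3, release_year=2018, delist_year=2022),
]
labels = [survival_years(g) for g in games]
train, held_out = temporal_split(games)
```

Holding out the most recent delistings, instead of sampling at random, keeps future information out of training, which is the scenario the validation experiment is meant to simulate.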
Experimental results show that model performance improved progressively with each
added component. The final model, which combines Adapters, Prefix, and full
finetuning, achieved the best results in accuracy, mean absolute error (MAE),
and ±1-year / ±2-year prediction accuracy. Furthermore, in the cross-period validation
experiment, the model trained solely on data prior to 2023 achieved comparable
performance to a baseline model trained on a random subset of the full dataset. This
indicates strong generalization ability and practical potential for forecasting future
game lifespans.
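
The metrics named above can be written out as a short sketch, assuming that each class label is a whole number of survival years and that ±k-year accuracy counts a prediction as correct when it is within k years of the true label. The function names are illustrative, and the sample values are invented purely to exercise the functions; they are not results from the thesis.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of predictions that match the true survival years exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Average absolute gap, in years, between prediction and truth."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def within_k_accuracy(y_true: Sequence[int], y_pred: Sequence[int], k: int) -> float:
    """Share of predictions within k years of the truth (the ±k-year accuracy)."""
    return sum(abs(t - p) <= k for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented toy values, used only to show how the functions are called.
y_true = [1, 3, 5, 2]
y_pred = [1, 4, 2, 2]

exact = accuracy(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
within_1 = within_k_accuracy(y_true, y_pred, k=1)
within_2 = within_k_accuracy(y_true, y_pred, k=2)
```

Because the classes are ordered, MAE and the ±k-year accuracies distinguish a prediction that is one year off from one that is five years off, which exact-match accuracy alone does not.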
Future work will develop a user-facing prediction interface that lets users enter a
game ID and receive a predicted lifespan range along with the probability that the
game's lifespan falls within that range, supporting practical deployment of the
model in real-world scenarios.
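
One way such an interface could turn classifier outputs into a range and a probability is sketched below. It is a hypothetical illustration, not the planned implementation. It assumes the classifier emits one logit per survival-year class, with class index equal to years, and reports a window of ±`tolerance` years around the most likely class. The names `softmax`, `lifespan_range`, and `tolerance` are introduced here for illustration.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def lifespan_range(probs: Sequence[float], tolerance: int = 1) -> Tuple[int, int, float]:
    """Return (low_year, high_year, probability mass inside that window).

    Assumes class index i means "survives i years"; the window is centred
    on the most probable class and clipped to the valid class range.
    """
    best = max(range(len(probs)), key=lambda i: probs[i])
    low = max(0, best - tolerance)
    high = min(len(probs) - 1, best + tolerance)
    return low, high, sum(probs[low:high + 1])


# Invented probabilities for a five-class example (0 to 4 years).
example_probs = [0.10, 0.20, 0.50, 0.15, 0.05]
low, high, mass = lifespan_range(example_probs, tolerance=1)
```

For the invented example, the window covers years 1 through 3 and holds the probability mass of those three classes. A wider `tolerance` trades a less specific range for higher reported confidence.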

Abstract (Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1 Research Background and Motivation
2 Research Challenges
3 Research Objectives
4 Research Methods
Chapter 2 Literature Review
1 Data Processing Analysis and Challenges
2 Sentiment and Text Analysis
3 Structured Data
4 Applications of Adapters
5 Loss Functions and Representation of Results
Chapter 3 Methodology
1 Data Collection and Label Construction
1.1 MongoDB
2 Data Preprocessing and Semantic Pretraining
2.1 Masked Language Modeling (MLM) Pretraining
3 Model Architecture Design and Feature Extraction
3.1 MLM Pretraining Stage
3.2 Lifespan Classification Stage
3.3 Feature Expansion and Fusion
3.4 Sentiment Analysis and Feature Fusion Strategy
4 Fine-Tuning Training Strategy
5 Evaluation Methods and Performance Metrics
6 Model Validation Design and Justification
Chapter 4 Experiments
1 Experimental Environment and Development Tools
2 Data Preprocessing Methods and Dataset Contents
3 Experimental Design
3.1 Validation Procedure Design
3.2 Main Model Design
4 Experimental Results and Model Comparison
4.1 Experimental Results for Validating Model Generalization
4.2 Main Model Experimental Results
5 Comparison with Other Studies
Chapter 5 Conclusion
1 Research Conclusions
2 Future Work
References

                                

