
Author: Yi-Shiang Lai (賴議翔)
Thesis Title: An Audio Call Classification System Based on Fine-Tuned BERT
Advisor: Min-Te Sun (孫敏德)
Committee:
Degree: Doctor
Department: College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering
Publication Year: 2021
Graduation Academic Year: 109
Language: English
Number of Pages: 38
Chinese Keywords: BERT, 遷移學習 (transfer learning), 通話分類 (call classification)
Foreign Keywords: BERT, Transfer learning, Call classification
    Chinese Abstract (translated):
    A telemarketing company relies heavily on its telemarketers making a
    large number of calls to promote the company's products. To prioritize
    potential customers with higher purchase intent and to review the
    performance of its telemarketers, a mechanism that can objectively
    determine which promotion stage a sales call currently belongs to is
    very important to a telemarketing company. In this thesis, we design
    an audio call classification system based on fine-tuned BERT that
    automatically classifies each telemarketer's call into the appropriate
    stage. The proposed system consists of five components: data
    collection, data pre-processing, pre-trained model fine-tuning,
    call-level classification, and a web service. In data collection,
    audio calls are converted into the corresponding transcripts via Kaldi
    speech recognition. In data pre-processing, transcripts go through
    stopword removal, segmentation, and manual labeling. In pre-trained
    model fine-tuning, four BERT-based pre-trained models are adapted via
    transfer learning to obtain segment-level classification models. In
    call-level classification, a rule-based method is applied to the
    segments of a call to obtain the classification result (stage) of the
    call. Finally, we provide a web service so that the company can use
    our system easily. Extensive experiments show that the proposed system
    achieves a Macro F1 score of 97% on call-level classification, 13%
    higher than TextCNN.
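The pre-processing steps mentioned above (stopword removal and splitting transcripts into segments before manual labeling) can be sketched as follows. This is a minimal illustration, not the thesis code: the stopword list, the punctuation-based segmentation rule, the length cap, and all function names are assumptions.

```python
# Minimal sketch of the data pre-processing stage: remove stopwords from
# a transcript, then split it into segments. The stopword list and the
# punctuation-based segmentation rule are illustrative assumptions; the
# thesis uses its own stopword list, and labels are assigned manually
# to the resulting segments afterward.
import re

STOPWORDS = {"嗯", "喔", "啊", "那個", "然後"}  # hypothetical examples

def remove_stopwords(text: str, stopwords=STOPWORDS) -> str:
    """Delete every stopword occurrence from the transcript text."""
    for w in sorted(stopwords, key=len, reverse=True):  # longest first
        text = text.replace(w, "")
    return text

def split_into_segments(text: str, max_len: int = 100) -> list[str]:
    """Split a transcript on sentence-ending punctuation, then cap each
    segment at max_len characters (BERT inputs are length-limited)."""
    sentences = [s for s in re.split(r"[。!?!?]", text) if s.strip()]
    segments = []
    for s in sentences:
        for i in range(0, len(s), max_len):
            segments.append(s[i:i + max_len])
    return segments

transcript = "然後您好!那個我們公司推出新產品。嗯請問您有興趣嗎?"
clean = remove_stopwords(transcript)
segments = split_into_segments(clean)  # 3 short segments
```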


    A telemarketing company relies heavily on its telemarketers to make numerous
    calls to customers in order to promote the company's products. To prioritize
    the potential customers and evaluate the performance of telemarketers, an
    objective mechanism to identify which stage of promotion a call belongs to is
    crucial to a telemarketing company. In this thesis, we design an audio call
    classification system based on fine-tuned BERT [1] to automatically classify
    each telemarketer's call to an appropriate stage. The five components of the
    proposed system are data collection, data pre-processing, pre-trained model
    fine-tuning, call-level classification, and the web service. In data
    collection, the audio calls are converted into the corresponding transcripts
    via Kaldi speech recognition. In data pre-processing, transcripts are
    processed to remove stopwords, split into segments, and assign labels
    manually. In pre-trained model fine-tuning, four BERT-based models are
    retrained to obtain segment-level classification models. In call-level
    classification, a rule-based method is performed to obtain the call-level
    classification (i.e., stage) of a call from the classification results of the
    corresponding segments of the call. Finally, a web service is provided to
    allow the company to access the system easily. Extensive experiments show
    that the proposed system reaches a 97% Macro F1 score for the call-level
    classification, 13% higher than TextCNN.
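The abstract's rule-based call-level step aggregates the segment-level predictions of the fine-tuned model into one stage per call. The exact rule is not reproduced here; the sketch below assumes a majority vote with a share threshold below which a call is left unclassified (the table of contents mentions tuning a misclassification threshold), so borderline calls could be routed to manual review. All names are illustrative.

```python
# Sketch of a rule-based call-level classifier: fold the per-segment
# stage labels predicted by the fine-tuned BERT model into a single
# stage for the whole call via majority vote. Returns None when the
# winning stage's share of segments falls below the threshold.
from collections import Counter

def classify_call(segment_labels: list[str], threshold: float = 0.5):
    """Return the majority stage of a call, or None if no stage
    reaches the given share of the call's segments."""
    if not segment_labels:
        return None
    stage, votes = Counter(segment_labels).most_common(1)[0]
    return stage if votes / len(segment_labels) >= threshold else None

# Example: a call whose segments were mostly classified as "closing".
labels = ["greeting", "closing", "closing", "closing"]
result = classify_call(labels)  # "closing" (3/4 of segments agree)
```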

    Contents
    1 Introduction
    2 Related Work
      2.1 Machine learning approaches
      2.2 Deep learning approaches
    3 Preliminary
      3.1 Kaldi
      3.2 Bidirectional Encoder Representations from Transformers (BERT)
        3.2.1 Background of BERT
        3.2.2 Transformer Encoder
        3.2.3 Pre-training tasks
        3.2.4 Improved BERT
      3.3 Flask
    4 Design
      4.1 Data Collection
      4.2 Data Pre-processing
      4.3 Pre-trained Model Fine-Tuning
      4.4 Call-level Classification
      4.5 Web Service
    5 Performance
      5.1 Experimental Environment
      5.2 Dataset Description
      5.3 Evaluation Metrics
      5.4 Experiment Results and Analysis
        5.4.1 Segment-level Evaluation Result
        5.4.2 Misclassification Threshold Tuning
        5.4.3 Call-level Evaluation Result
    6 Conclusions
    References
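The headline result above is reported as a Macro F1 score (Section 5.3, Evaluation Metrics). For reference, Macro F1 averages the per-class F1 scores with equal weight, so rare stages count as much as common ones. A small self-contained sketch (not the thesis code):

```python
# Macro F1: compute precision, recall, and F1 for each class, then take
# the unweighted mean over classes. Equal class weighting keeps the
# metric from being dominated by frequent stages in imbalanced call data.
def macro_f1(y_true: list, y_pred: list) -> float:
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

score = macro_f1(["a", "a", "b"], ["a", "b", "b"])  # mean of per-class F1
```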

    [1] BERT. https://github.com/google-research/bert.
    [2] Cocolong. https://www.cocolong.com.tw/zh-tw.
    [3] Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, and
    Guoping Hu. Pre-training with whole word masking for Chinese BERT, 2019.
    [4] Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, and Guoping Hu.
    Revisiting pre-trained models for chinese natural language processing. Findings of
    the Association for Computational Linguistics: EMNLP 2020, 2020.
    [5] Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer
    Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. RoBERTa: A robustly
    optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692, 2019.
    [6] Y. Zhao, Y. Qian, and C. Li. Improved kNN text classification algorithm with
    MapReduce implementation. In 2017 4th International Conference on Systems and
    Informatics (ICSAI), pages 1417–1422, 2017.
    [7] S. Wei, J. Guo, Z. Yu, P. Chen, and Y. Xian. The instructional design of Chinese
    text classification based on SVM. In 2013 25th Chinese Control and Decision
    Conference (CCDC), pages 5114–5117, 2013.
    [8] Q. Jiang, W. Wang, X. Han, S. Zhang, X. Wang, and C. Wang. Deep feature
    weighting in Naive Bayes for Chinese text classification. In 2016 4th International
    Conference on Cloud Computing and Intelligence Systems (CCIS), pages 160–164,
    2016.
    [9] Yoon Kim. Convolutional neural networks for sentence classification. In
    Proceedings of the 2014 Conference on Empirical Methods in Natural Language
    Processing (EMNLP), pages 1746–1751, Doha, Qatar, October 2014. Association for
    Computational Linguistics.
    [10] Junmei Zhong and William Li. Predicting customer call intent by analyzing phone
    call transcripts based on CNN for multi-class classification. CoRR, abs/1907.03715,
    2019.
    [11] Changshun Du and Lei Huang. Text classification research with attention-based
    recurrent neural networks. International Journal of Computers Communications &
    Control, 13:50, February 2018.
    [12] Xuewei Li and Hongyun Ning. Chinese text classification based on hybrid model
    of CNN and LSTM. In Proceedings of the 3rd International Conference on Data
    Science and Information Technology, DSIT 2020, pages 129–134, New York, NY,
    USA, 2020. Association for Computing Machinery.
    [13] Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Lukas Burget, Ondrej Glembek,
    Nagendra Goel, Mirko Hannemann, Petr Motlicek, Yanmin Qian, Petr Schwarz,
    Jan Silovsky, Georg Stemmer, and Karel Vesely. The Kaldi speech recognition
    toolkit. 2011.
    [14] Apache License v2.0. https://www.apache.org/licenses/LICENSE-2.0.
    [15] OpenFst library. http://www.openfst.org/twiki/bin/view/FST/WebHome.
    [16] Basic Linear Algebra Subprograms (BLAS). http://www.netlib.org/blas/.
    [17] Linear Algebra PACKage (LAPACK). http://www.netlib.org/lapack//.
    [18] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones,
    Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need.
    CoRR, abs/1706.03762, 2017.
    [19] Guillaume Lample and Alexis Conneau. Cross-lingual language model pretraining,
    2019.
    [20] Flask. https://github.com/pallets/flask.
    [21] Werkzeug. https://github.com/pallets/werkzeug.
    [22] Jinja. https://github.com/pallets/jinja.
    [23] Stopwords. https://www.overcoded.net/stop-words-lists-removal-195521/.
    [24] PyTorch. https://pytorch.org/.
    [25] Chi Sun, Xipeng Qiu, Yige Xu, and Xuanjing Huang. How to fine-tune BERT for
    text classification? CoRR, abs/1905.05583, 2019.
