Author: Zheng-Hao Chen (陳正浩)
Thesis Title: A Large-scale Comparison of Customized Feature Encodings under Semi-supervised Learning
Advisor: Hung-Hsuan Chen (陳弘軒)
Committee Members:
Degree: Master
Department: Graduate Institute of Software Engineering, College of Information and Electrical Engineering
Year of Publication: 2024
Graduation Academic Year: 112
Language: Chinese
Pages: 37
Chinese Keywords: 自監督學習、對比學習、表格資料、自定義編碼
English Keywords: Self-supervised learning, Contrastive learning, tabular data, custom encoding
Views: 16 | Downloads: 0
Abstract:

    Custom encoding methods can effectively enhance the performance of deep learning models on supervised tasks, but their effectiveness in self-supervised contrastive learning has yet to be validated at scale. This thesis designs and implements a flexible custom feature-encoding framework that allows researchers to compare different encoding methods on self-supervised tasks at large scale. In addition, we propose a new encoding method and explore its potential and application value across various datasets.
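    The self-supervised contrastive setting studied here follows SCARF, which builds a positive pair for each row by corrupting a random subset of its features with values drawn from the empirical marginal distribution of the same feature across the batch. A minimal sketch of that augmentation step (function and parameter names are illustrative, not taken from the thesis):

    ```python
    import numpy as np

    def scarf_corrupt(batch, corruption_rate=0.6, rng=None):
        """SCARF-style augmentation: for each row, replace a random subset of
        features with values taken from the same feature in other rows of the
        batch, i.e. samples from the empirical marginal distribution.
        The default corruption rate here is illustrative."""
        rng = rng or np.random.default_rng(0)
        n, d = batch.shape
        mask = rng.random((n, d)) < corruption_rate   # which cells to corrupt
        donor_rows = rng.integers(0, n, size=(n, d))  # random donor row per cell
        donors = batch[donor_rows, np.arange(d)]      # marginal samples, column-wise
        return np.where(mask, donors, batch)
    ```

    In SCARF, the original row and its corrupted view form a positive pair for a contrastive (InfoNCE-style) loss; the corruption rate is a key hyperparameter of the augmentation.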

    Table of Contents

    Abstract (Chinese)
    Abstract
    Acknowledgements
    Table of Contents
    Symbols and Definitions
    1. Introduction
    2. Related Work
       2.1 SCARF
           2.1.1 Data Augmentation by Feature Corruption
           2.1.2 The SCARF Method
       2.2 Numerical Feature Encodings
           2.2.1 Piecewise Linear Encoding
           2.2.2 Piecewise Linear Encoding Methods
           2.2.3 Periodic Activation Functions
           2.2.4 Periodic Activation Function Methods
    3. Framework Design
       3.1 Design Philosophy
       3.2 Framework Architecture
           3.2.1 Core Components
           3.2.2 Interaction between Modules
       3.3 Application Programming Interface (API)
    4. Model and Methods
       4.1 Overall Model Architecture
       4.2 Standard Deviation Encoding
       4.3 Custom Encoder Design
           4.3.1 Basic Training
           4.3.2 Quick Experiments
           4.3.3 Custom Models
    5. Experimental Results and Analysis
       5.1 Environment, Hyperparameters, and Settings
       5.2 Datasets
       5.3 Results
           5.3.1 Binary Classification
           5.3.2 Multi-class Classification
           5.3.3 High-dimensional Feature Classification
           5.3.4 Large-dataset Classification
       5.4 Discussion
    6. Conclusion
       6.1 Conclusions
       6.2 Future Work
    References
    Appendix A: Experiment Code
    Appendix B: Datasets
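    Sections 2.2.1 and 2.2.3 above cover the two families of numerical-feature encodings the thesis compares: piecewise linear encoding and periodic (sin/cos) encodings. A rough sketch of both, following their general published form; the bin boundaries and frequencies below are placeholders, not values from the thesis:

    ```python
    import numpy as np

    def piecewise_linear_encode(x, bins):
        """Piecewise linear encoding: given boundaries b_0 < ... < b_T, a scalar
        maps to a T-dim vector where fully passed bins are 1, the bin containing
        x holds its fractional position within that bin, and later bins are 0."""
        x = np.asarray(x, dtype=float)
        out = np.empty((len(x), len(bins) - 1))
        for t in range(len(bins) - 1):
            out[:, t] = np.clip((x - bins[t]) / (bins[t + 1] - bins[t]), 0.0, 1.0)
        return out

    def periodic_encode(x, frequencies):
        """Periodic encoding: project each scalar onto sin/cos at several
        frequencies so the downstream network can capture high-frequency
        structure in the feature."""
        v = 2.0 * np.pi * np.outer(np.asarray(x, dtype=float), frequencies)
        return np.concatenate([np.sin(v), np.cos(v)], axis=1)
    ```

    In practice the bin boundaries are typically chosen from quantiles of the training data, and the frequencies are either fixed or learned; either encoder can be plugged into the framework's encoding stage in place of raw numeric inputs.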

