Graduate student: 林子淨
Zih-Jing Lin
Thesis title: 錯標邏輯斯迴歸之D-最適設計
D-Optimal Designs for Mislabelled Logistic Regression
Advisor: 黃世豪
Shih-Hao Huang
Oral defense committee:
Degree: Master
Department: College of Science - Graduate Institute of Statistics
Year of publication: 2025
Academic year of graduation: 113 (ROC calendar)
Language: Chinese
Number of pages: 57
Keywords: Logistic Regression, Mislabelled Model, Optimal Design, Randomized Exchange Algorithm, Test Error
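  • In practice, binary response data often contain mislabelled observations, for example false negatives and false positives in medical tests, or misclassified responses caused by randomized response techniques in sensitive questionnaires. In such applications, a logistic regression model that accounts for mislabelling helps build more accurate statistical inference. This thesis studies the D-optimal design problem for the mislabelled logistic regression model. With a single explanatory variable, we find that its D-optimal design is, like that of standard logistic regression, an equally weighted two-point design, but the two support points are not symmetric. With multiple explanatory variables, we extend the randomized exchange algorithm to obtain D-optimal designs for the mislabelled logistic regression model, and we discuss how the results resemble and differ from those of standard logistic regression.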


    In practical applications, binary response data often contain misclassification errors, such as false positives and false negatives in medical testing, or response misclassification caused by randomized response techniques in sensitive questionnaires. Incorporating misclassification into logistic regression models for such problems can therefore lead to more suitable statistical inferences. In this thesis, we study D-optimal designs for mislabelled logistic regression. When there is only one explanatory variable, we find that there exists a D-optimal design having two support points with equal weights, as in standard logistic regression, but these two points are not symmetric. In the case of multiple explanatory variables, we adapt the randomized exchange algorithm to obtain a D-optimal design for the mislabelled logistic regression and discuss the similarities and differences compared to the results from the standard logistic regression.
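
The single-variable finding can be illustrated numerically. The sketch below is not taken from the thesis. It assumes a misclassification model in which the observed response equals 1 with probability pi = a0 + (1 - a0 - a1) * p, where p is the usual logistic probability and a0, a1 are known false-positive and false-negative rates. It works on the linear-predictor scale eta = b0 + b1 * x and restricts the search to equally weighted two-point designs on a grid. The names weight and best_two_point_design, and the grid settings, are hypothetical.

```python
import math


def weight(eta, a0=0.0, a1=0.0):
    """Fisher-information weight at linear predictor eta.

    Assumed model (not confirmed by the source):
    P(observed Y = 1) = a0 + (1 - a0 - a1) * expit(eta), with a0, a1 known.
    """
    p = 1.0 / (1.0 + math.exp(-eta))
    pi = a0 + (1.0 - a0 - a1) * p
    dpi = (1.0 - a0 - a1) * p * (1.0 - p)  # derivative of pi w.r.t. eta
    return dpi * dpi / (pi * (1.0 - pi))


def best_two_point_design(a0=0.0, a1=0.0, lo=-6.0, hi=6.0, step=0.02):
    """Grid search for the equally weighted two-point design maximizing det(M).

    For support points z1 < z2 with weights 1/2 each and regressors (1, z),
    det(M) = 0.25 * w(z1) * w(z2) * (z2 - z1)**2.
    Returns (det, z1, z2).
    """
    n = int(round((hi - lo) / step)) + 1
    grid = [lo + i * step for i in range(n)]
    w = [weight(z, a0, a1) for z in grid]
    best = (-1.0, None, None)
    for i in range(n):
        for j in range(i + 1, n):
            det = 0.25 * w[i] * w[j] * (grid[j] - grid[i]) ** 2
            if det > best[0]:
                best = (det, grid[i], grid[j])
    return best


if __name__ == "__main__":
    # No mislabelling versus a hypothetical 10% false-positive rate.
    print(best_two_point_design(0.0, 0.0))
    print(best_two_point_design(0.1, 0.0))
```

Under these assumptions, the call with a0 = a1 = 0 recovers support points close to the classical symmetric pair near -1.54 and 1.54 on the linear-predictor scale, while a nonzero a0 with a1 = 0 moves the two points so that they are no longer symmetric about zero. The grid search only compares two-point equal-weight candidates; it does not by itself verify D-optimality over all designs.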

    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Table of Contents
    1. Introduction
    2. Preliminaries
        2.1 Logistic regression model
        2.2 Mislabelled logistic regression model
        2.3 Unrelated-question RR model
        2.4 Introduction to the concept of D-optimal design
    3. D-Optimal Designs
        3.1 D-optimal designs with a single explanatory variable
            3.1.1 Conjecture on equally weighted two-point designs
            3.1.2 Numerical results for D-optimal designs based on Conjecture 1
        3.2 D-optimal designs with multiple explanatory variables
            3.2.1 Essentially complete class with multiple explanatory variables
            3.2.2 Discretization of the design space
            3.2.3 Introduction to D-efficiency and the efficiency lower bound
            3.2.4 Overview of the REX algorithm
            3.2.5 Approximate D-optimal designs without mislabelling
            3.2.6 Approximate D-optimal designs with mislabelling
    4. Conclusion
    Appendix A. Theoretical proofs and derivations
        A.1 Derivation of Equation (3.1)
        A.2 Proof of Lemma 1
        A.3 Proof of Theorem 3
        A.4 Proof of Corollary 1
    Appendix B. Details of the REX algorithm
        B.1 LBE weight update
        B.2 Selection of design points for the randomized exchange
        B.3 Weight update in the randomized exchange
    Appendix C. Comparison of efficiency thresholds
    Appendix D. D-optimal designs on asymmetric ranges
        D.1 Four-point equally weighted D-optimal designs
        D.2 Four-point unequally weighted D-optimal designs
        D.3 Three-point equally weighted D-optimal designs
    References

