
Graduate student: 侯思綺 (Szu-Chi Hou)
Thesis title: 適用教育資料中成績模糊化與其他特徵面向之關聯規則挖掘方法設計
Adapting Association Data Mining Method With Fuzzy Grade Values in Multi-Feature Educational Data
Advisor: 蔡孟峰 (Meng-Feng Tsai)
Oral examination committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering
Year of publication: 2025
Academic year of graduation: 113
Language: Chinese
Number of pages: 36
Keywords: Institutional Research, Association Rule Mining, FP-Growth, Fuzzy Set
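  • Many features of educational data, such as grades and teaching satisfaction, are by definition hard to quantify precisely with a single, definite value. Cutting them arbitrarily into hard partitions may cause some information to be handled incorrectly or lost, so that their fuzzy nature is not taken into account when association rules are subsequently mined. This study therefore aims to design an association rule mining method suited to educational data analysis that fuzzifies the grade feature in order to reflect the semantic levels of student performance more faithfully.
    Because the original FP-Growth algorithm cannot handle fuzzy information directly, this study extends its architecture and proposes a method that combines fuzzy logic with frequent pattern mining and can process fuzzy feature data and discrete feature data at the same time. Besides achieving the effect of conventional frequent pattern set mining, it is also expected to obtain more representative association rules.
    This study uses the National Central University institutional data warehouse as its data source. The Fuzzy C-Means method first divides the graduation grade average field into three semantic groups, "low", "medium", and "high", as fuzzy category items, and computes the membership degree of each grade in each of the three categories as the basis for accumulation in the FP-Tree of the FP-Growth algorithm. A fuzzy category merging mechanism then adds intermediate fuzzy categories such as "medium-high" and "medium-low", and the final semantic label representing the grade field is obtained from the weighted average of the accumulated membership degrees of the original fuzzy categories. The corresponding frequent pattern sets are generated from this label and association rules are mined.
    The experimental results show that, by exploiting the fuzzy nature of the grade feature, this method finds rules that are relatively rare but potentially meaningful and that the ordinary FP-Growth algorithm might overlook. While retaining the ability to mine association rules over ordinary crisp features, it also provides additional information on the fuzzy features related to those rules, which demonstrates the potential of this research for future application in educational data mining.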


    Many educational data features, such as grades and teaching satisfaction, are inherently difficult to quantify accurately with a single, crisp value. If they are divided arbitrarily into hard intervals, some information may be mishandled or lost, and their fuzzy characteristics are then ignored when association rules are later mined. This study therefore aims to design an association rule mining method suited to educational data analysis that applies fuzzy processing to the grade feature in order to reflect the semantic level of student performance more realistically.
    Since the original FP-Growth algorithm cannot process fuzzy information directly, this study extends its framework and proposes a method that integrates fuzzy logic with frequent pattern mining. The method processes fuzzy feature data and crisp feature data simultaneously. In addition to matching the results of conventional frequent itemset mining, it is intended to mine more representative association rules.
    This study uses the National Central University data warehouse as its data source. First, the Fuzzy C-Means method divides the grade field into three semantic groups, "low", "medium", and "high", which serve as fuzzy items, and the membership degree of each grade in each of the three items is calculated for later use in FP-Tree accumulation within the FP-Growth algorithm. Next, a fuzzy item merging mechanism adds intermediate fuzzy items such as "medium-high" and "medium-low", and the final semantic label representing the grade field is obtained from the weighted average of the accumulated memberships of the original fuzzy items. The corresponding frequent itemsets are then generated and association rules are mined.
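    The membership step described above can be illustrated with a minimal sketch of the standard Fuzzy C-Means updates on one-dimensional grades. The abstract does not give the thesis's parameters or code, so the fuzzifier `m = 2`, the quantile-based initialization, the function names, and the sample grades are all assumptions made for illustration.

```python
import numpy as np

LABELS = ("low", "medium", "high")


def memberships(x, centers, m=2.0):
    """Return u[i, k], the FCM membership of point i in cluster k."""
    # Distance of every point to every center; the epsilon avoids division by zero.
    dist = np.abs(x[:, None] - centers[None, :]) + 1e-12
    # ratio[i, k, j] = d_ik / d_ij, so u_ik = 1 / sum_j ratio[i, k, j] ** (2 / (m - 1)).
    ratio = dist[:, :, None] / dist[:, None, :]
    return 1.0 / np.sum(ratio ** (2.0 / (m - 1.0)), axis=2)


def fuzzy_c_means_1d(values, n_clusters=3, m=2.0, n_iter=200, tol=1e-8):
    """Run Fuzzy C-Means on 1-D data; return (ascending centers, memberships)."""
    x = np.asarray(values, dtype=float)
    # Assumed deterministic start: initial centers at evenly spaced quantiles.
    centers = np.quantile(x, np.linspace(0.0, 1.0, n_clusters + 2)[1:-1])
    for _ in range(n_iter):
        weights = memberships(x, centers, m) ** m
        # Each new center is the mean of the data weighted by u ** m.
        new_centers = (weights * x[:, None]).sum(axis=0) / weights.sum(axis=0)
        shift = np.max(np.abs(new_centers - centers))
        centers = new_centers
        if shift < tol:
            break
    # Sort so that column 0 is "low", column 1 "medium", column 2 "high".
    centers = np.sort(centers)
    return centers, memberships(x, centers, m)


# Made-up grade averages, for illustration only.
grades = [55, 58, 62, 70, 72, 75, 85, 88, 92]
centers, u = fuzzy_c_means_1d(grades)
for grade, row in zip(grades, u):
    print(grade, {label: round(float(v), 3) for label, v in zip(LABELS, row)})
```

    Each row of `u` sums to 1, so a grade near a boundary keeps partial membership in two neighbouring groups instead of being forced into one interval.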
    The experimental results show that, by exploiting the fuzzy characteristics of the grades, the method finds relatively rare but potentially meaningful rules that the original FP-Growth algorithm may ignore. While retaining the ability to mine association rules over ordinary crisp features, it also provides additional information about the fuzzy features related to each rule. This indicates the potential of this research for future application in the field of educational data mining.
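    The accumulation and merging steps mentioned in the abstract can be sketched roughly as follows. The abstract does not state the exact merging rule, so this is only one plausible reading: the 0-to-2 positions of the three original categories, the equal-width cut points, the feature names such as `dept=CS`, and the membership values are all hypothetical. A plain scan over records stands in for FP-Tree accumulation to keep the example short.

```python
LEVELS = ("low", "medium", "high")
# Assumed positions of the three original categories on a 0-to-2 scale.
POSITION = {"low": 0.0, "medium": 1.0, "high": 2.0}
# Assumed equal-width cut points; each entry is (upper bound, label).
CUTS = ((0.4, "low"), (0.8, "medium-low"), (1.2, "medium"), (1.6, "medium-high"))


def accumulate(records, pattern):
    """Count the records containing `pattern` and sum their grade memberships."""
    count = 0
    totals = {level: 0.0 for level in LEVELS}
    for crisp_items, grade_membership in records:
        if pattern <= crisp_items:  # subset test on the crisp items
            count += 1
            for level in LEVELS:
                totals[level] += grade_membership[level]
    return count, totals


def merged_label(totals):
    """Map accumulated memberships to one of five labels via their weighted average."""
    total = sum(totals.values())
    if total == 0:
        return None
    score = sum(POSITION[level] * totals[level] for level in LEVELS) / total
    for upper, label in CUTS:
        if score < upper:
            return label
    return "high"


# Hypothetical records: (crisp items, grade memberships from the fuzzification step).
records = [
    (frozenset({"dept=CS", "scholarship=yes"}), {"low": 0.05, "medium": 0.25, "high": 0.70}),
    (frozenset({"dept=CS", "scholarship=yes"}), {"low": 0.10, "medium": 0.60, "high": 0.30}),
    (frozenset({"dept=CS", "scholarship=no"}), {"low": 0.70, "medium": 0.25, "high": 0.05}),
]

pattern = frozenset({"dept=CS", "scholarship=yes"})
count, totals = accumulate(records, pattern)
print(count, totals, merged_label(totals))
```

    In this reading, the crisp pattern keeps its ordinary support count, while the grade label attached to it reflects the accumulated memberships of all matching records and can therefore fall between two of the original categories.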

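    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    1. Introduction
        1-1. Research Background and Motivation
        1-2. Research Objectives
    2. Literature Review
        2-1. Fuzzy Theory
        2-2. Data Warehousing
        2-3. Data Clustering
        2-4. Data Mining
    3. Research Method
        3-1. System Architecture and Workflow
        3-2. Data Preprocessing
        3-3. Association Rule Mining
    4. Experiments
        4-1. Experimental Environment
        4-2. Dataset Selection
        4-3. Data Preprocessing
        4-4. Algorithm Input Parameter Settings
        4-5. Experimental Results and Comparison
            4-5-1. Number of Association Rules and Execution Time
            4-5-2. Differences Among Association Rules
    5. Conclusion and Future Work
        5-1. Conclusion
        5-2. Future Work
    6. References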

