Short-Time Fourier Transform with Multiple Window Lengths based on Local Frequencies

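Graduate student:	Chiao-You Lai (賴喬祐)
Thesis title:	Short-Time Fourier Transform with Multiple Window Lengths based on Local Frequencies (多窗格長度之短時傅立葉變換–依局部頻率自適應)
Advisor:	Hung-Hsuan Chen (陳弘軒)
Oral defense committee:
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering
Year of publication:	2025
Academic year of graduation:	113 (ROC calendar)
Language:	Chinese
Number of pages:	45
Keywords (Chinese):	傅立葉變換 (Fourier transform), 頻譜圖 (spectrogram)
Keywords (English):	Fourier transform, Spectrogram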
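The short-time Fourier transform (STFT) is a tool for time-frequency
analysis that depicts how data change in both the frequency domain and
the time domain. Its core idea is to cut a long signal into several
shorter, equal-length segments according to a preset window size and to
apply the Fourier transform to each segment, which compensates for the
traditional Fourier transform's inability to describe changes over time.
However, the window size determines the trade-off between frequency
resolution and time resolution, and a fixed window size cannot always
accommodate the different frequency variations in the data. This thesis
proposes an adaptive window adjustment method that adjusts the window
size according to changes in the data, so that each data segment gets
the frequency and time resolution best suited to it. This addresses the
fixed window's inability to follow the frequency content of the data. We
compare and evaluate the adaptive-window STFT against the fixed-window
STFT on four datasets. The experimental results show that our method
maintains more robust performance.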

The Short-Time Fourier Transform (STFT) is a widely used tool for
time-frequency analysis that captures variations in both the frequency
and time domains of a signal. The core concept is to segment a
long-duration signal into multiple shorter, equal-length segments using a
predefined window size, then apply the Fourier transform to each segment.
This approach addresses a limitation of the traditional Fourier
transform, which lacks time localization. However, the choice of window
size determines the trade-off between frequency and time resolution, and
a fixed window size may not suit signals with varying frequency
characteristics. This paper proposes an adaptive window adjustment method
that dynamically modifies the window size based on changes in the data,
allowing for the most appropriate frequency and time resolution in each
segment. To evaluate the effectiveness of the proposed method, we compare
the adaptive-window STFT with the traditional fixed-window STFT across
four different datasets. Experimental results demonstrate that our method
delivers more robust performance.
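
The fixed-window STFT and the resolution trade-off described in the abstract can be illustrated with a minimal sketch. This is not the thesis's adaptive algorithm; it is only the conventional baseline, and the function name `stft`, the Hann window, the 50% hop, the 8000 Hz sample rate, and the two-tone test signal are all assumptions chosen for illustration.

```python
import numpy as np


def stft(signal, window_length, hop_length):
    """Fixed-window STFT: slice the signal into equal-length frames,
    taper each frame with a Hann window, and take the FFT of each frame."""
    window = np.hanning(window_length)
    n_frames = 1 + (len(signal) - window_length) // hop_length
    frames = np.stack([
        signal[i * hop_length: i * hop_length + window_length] * window
        for i in range(n_frames)
    ])
    # rfft keeps only non-negative frequencies: window_length // 2 + 1 bins.
    return np.fft.rfft(frames, axis=1)


# Illustrative test signal (assumed): 440 Hz for the first half second,
# then 1320 Hz for the second half second.
sample_rate = 8000  # Hz
t = np.arange(sample_rate) / sample_rate
signal = np.where(t < 0.5,
                  np.sin(2 * np.pi * 440 * t),
                  np.sin(2 * np.pi * 1320 * t))

for window_length in (128, 1024):
    hop_length = window_length // 2
    spectrogram = np.abs(stft(signal, window_length, hop_length))
    freq_step = sample_rate / window_length  # spacing between frequency bins, Hz
    time_step = hop_length / sample_rate     # spacing between frames, seconds
    print(f"window={window_length}: shape={spectrogram.shape}, "
          f"freq step={freq_step:.2f} Hz, time step={time_step * 1000:.1f} ms")
```

Under these assumptions, the short window yields many frames with coarse frequency bins, while the long window yields few frames with fine frequency bins. A single fixed window length therefore cannot give both fine time resolution and fine frequency resolution, which is the limitation the adaptive method in the abstract is meant to address.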

Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
1. Introduction
2. Related Work
2.1 Traditional Short-Time Fourier Transform
2.2 Short-Time Fourier Transform with Adjusted Windows
2.3 Short-Time Fourier Transform with Adjusted Hop Size
2.4 Applications of Spectrograms in Different Fields
3. Model and Methods
3.1 Notation and Definitions
3.2 Overall Architecture
3.3 Algorithm Details
3.3.1 Window Fine-Tuning
3.3.2 Spectrogram Output Adjustment
4. Experimental Design and Result Analysis
4.1 Datasets and Spectrogram Generation
4.2 Experimental Environment and Setup
4.3 Experimental Results and Analysis
5. Summary
5.1 Conclusions
5.2 Future Work
References
Appendix A: Experiment Code

