Recurrent Neural Networks for Multimedia Signals | National Central University Thesis and Dissertation System


Graduate student:	童武弘 Vu Hong Toan
Thesis title:	Recurrent Neural Networks for Multimedia Signals (遞迴神經網路於多媒體信號處理之研究)
Advisor:	王家慶 Jia-Ching Wang
Oral defense committee:
Degree:	Doctor
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering
Year of publication:	2019
Academic year of graduation:	108 (ROC calendar)
Language:	English
Number of pages:	74
Keywords:	control gate-based recurrent neural network, self-gated recurrent neural network, human activity recognition, environmental sound recognition, driver drowsiness detection, single image dehazing

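Deep learning (DL) has become the algorithm of choice for information processing problems because it achieves state-of-the-art results in many fields. Before DL emerged, researchers had to find features by hand, which often consumed considerable human effort and required domain knowledge. Given enough data and high-performance computing hardware, a DL model can learn rich representations from data to satisfy given conditions or to make decisions or predictions end to end. Although DL models come in many varieties, we are especially interested in developing recurrent neural networks (RNNs), a class of neural networks (NNs) that can solve real-world problems. Our goal is not only high accuracy; we also carefully evaluate most of our designs in terms of model complexity, resource consumption, and similar aspects so that the systems are applicable in practice.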

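RNNs are well suited to time-dependent signals. An RNN receives input signals in temporal order, and the network's hidden state accumulates information and is updated step by step; an RNN can therefore learn the sequential dynamics of the input signal effectively and can make a decision at the current time or predict future variables. However, because the hidden state is updated at every time step and the weights are reused at every time step, RNN models may suffer from vanishing or exploding gradients during training, which makes long-term dependencies hard to learn and lowers RNN performance in many cases. We therefore propose new RNN structures that do not have the gradient problem and are very effective for our target problems.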

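In this dissertation, we develop several RNN models to solve real-world problems involving different multimedia signals. The signals are of many types, including time-series signals collected from integrated sensors, audio signals, images, and videos. We study four kinds of signals in particular. First, we introduce two new RNN structures for human activity recognition on wearable devices. Because the target devices have limited power, computation, and memory, well-known RNN structures such as long short-term memory (LSTM) and the gated recurrent unit (GRU) are not suitable; we therefore propose the control gate-based recurrent neural network (CGRNN) and the self-gated recurrent neural network (SGRNN), where the former uses only one additional gate and the latter uses none. Compared with LSTM and GRU, the two new models achieve competitive accuracy while consuming far fewer resources. Second, we propose RNN models for environmental sound recognition in real life. We run experiments on the datasets of the DCASE 2016 challenge, and our results outperform the baselines. Third, we introduce a DL model for real-time driver drowsiness detection (DDD). The model consists of a convolutional neural network (CNN), a convolutional version of CGRNN (ConvCGRNN), and a voting layer. The CNN extracts relevant facial representations from the driver's whole face and feeds them to the ConvCGRNN, which learns temporal dependencies before the voting layer makes the final prediction. The system not only detects driver drowsiness with notable accuracy, but its high processing speed also allows real-time operation. Finally, we develop a DL model called the encoder-recurrent decoder network (ERDN) to address single image dehazing. The ERDN model has an encoder-decoder architecture. On the one hand, we propose the residual efficient spatial pyramid (rESP) module, an extension of the ESP module, to build the encoder, which extracts features from hazy images at multiple levels; on the other hand, we adopt a convolutional recurrent neural network (ConvRNN), specifically ConvCGRNN, as the main component of the decoder, because this architecture can sequentially aggregate the encoded features from high levels to low levels to recover a clear image. We confirm the effectiveness and execution efficiency of ERDN on the RESIDE-Standard dataset.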

Deep learning (DL) has become the first choice of algorithm for information processing problems because it achieves state-of-the-art results in many areas. Before the emergence of DL, researchers had to design features manually, which requires a lot of human labor and domain knowledge. With enough data and high-performance computing devices, a DL model can learn a rich representation from data to satisfy given constraints or to make decisions or predictions in an end-to-end way. Despite the variety of DL models, we are particularly interested in developing recurrent neural networks (RNNs), a class of neural networks (NNs), to solve real problems. The goal is not only high accuracy; other aspects such as model complexity and resource consumption are also carefully considered in most of our designs to make the systems applicable in practice.

RNNs are well suited to time-dependent signals. An RNN receives input signals sequentially through time, while its hidden states accumulate information and update themselves time step by time step. An RNN is therefore good at learning the sequential dynamics of input signals, so it can make decisions at the present time or predict future variables. However, because the hidden states are updated and the recurrent weights are reused at every time step, RNN models can suffer from vanishing or exploding gradients during training, which makes it difficult for them to learn long-term dependencies and decreases their performance in many tasks. Hence, we propose new RNN structures that do not have the gradient problem and are effective and efficient on our target problems.
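The gradient problem described above can be illustrated with a toy calculation. The sketch below is not taken from the dissertation: it assumes a scalar recurrence h_t = f(w * h_{t-1} + u * x_t) with hypothetical weights `w` and `u`, and multiplies the per-step derivatives to obtain the sensitivity of the last hidden state to the first one. The function name `gradient_factor` and all numeric values are illustrative assumptions.

```python
import math


def gradient_factor(w, u, inputs, linear=False, h0=0.0):
    """Return d h_T / d h_0 for the scalar recurrence h_t = f(w*h_{t-1} + u*x_t)."""
    h = h0
    grad = 1.0
    for x in inputs:
        pre = w * h + u * x
        if linear:
            h = pre               # identity activation, derivative is 1
            local = 1.0
        else:
            h = math.tanh(pre)    # tanh activation, derivative is 1 - tanh(pre)^2
            local = 1.0 - h * h
        grad *= w * local         # chain rule: one factor per time step
    return grad


# Hypothetical 50-step input sequence.
inputs = [0.1] * 50
vanishing = gradient_factor(0.5, 1.0, inputs)               # |w| < 1 with tanh
exploding = gradient_factor(1.5, 1.0, inputs, linear=True)  # |w| > 1, no squashing
print(f"tanh, w=0.5:   {vanishing:.3e}")
print(f"linear, w=1.5: {exploding:.3e}")
```

Because the same recurrent weight enters the product once per time step, the factor shrinks toward zero when each step contributes less than 1 and grows rapidly when each step contributes more than 1. Gated structures are a common way to soften this effect.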

In this dissertation, we develop RNN models to address real problems involving different multimedia signals. The signals are of various types, including time-series signals collected from integrated sensors, audio signals, images, and videos. First, we introduce two new RNN structures for human activity recognition on wearable devices. Because the target devices have limited resources, including power, memory, and computational capacity, well-known RNN structures such as long short-term memory (LSTM) and the gated recurrent unit (GRU) are not well suited. We propose the control gate-based recurrent neural network (CGRNN) and the self-gated recurrent neural network (SGRNN), which employ only one additional gate and no additional gate, respectively. The two new models achieve competitive accuracy with much lower resource consumption than LSTM and GRU. Second, we introduce RNN models for environmental sound recognition in real life. We conduct experiments on datasets of the DCASE 2016 challenge; our results outperform the baselines. Third, we introduce a DL model for real-time driver drowsiness detection (DDD). The model consists of a convolutional neural network (CNN), a convolutional version of CGRNN (ConvCGRNN), and a voting layer. The CNN extracts relevant facial representations from global faces; these are fed to the ConvCGRNN, which learns temporal dependencies before the voting layer makes the final predictions. The system not only achieves high accuracy in detecting driver drowsiness but also runs in real time with a high processing speed. Lastly, we tackle single image dehazing by developing a DL model called the encoder-recurrent decoder network (ERDN). The ERDN model has an encoder-decoder architecture. On the one hand, we propose the residual efficient spatial pyramid (rESP) module, an extension of the efficient spatial pyramid (ESP) module, to construct the encoder. The encoder can thus effectively process hazy images at any resolution to extract relevant features at multiple contextual levels. On the other hand, we introduce the use of a convolutional recurrent neural network (ConvRNN), specifically ConvCGRNN, as the main component of the decoder, which sequentially aggregates the encoded features from high levels to low levels to recover clear images. The proposed ERDN demonstrates its effectiveness and efficiency on the RESIDE-Standard dataset.
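To give a rough sense of why fewer gates mean lower resource consumption, the sketch below counts trainable parameters per recurrent layer. The LSTM and GRU counts use their standard formulations (four and three weight sets, each with an input matrix, a recurrent matrix, and one bias vector). The abstract does not give the CGRNN or SGRNN equations, so the entries with two and one weight sets are only assumptions inferred from "one additional gate" and "no additional gate"; the input and hidden sizes are likewise hypothetical.

```python
def recurrent_params(input_size, hidden_size, weight_sets):
    """Count parameters when each weight set holds an input matrix,
    a recurrent matrix, and one bias vector."""
    per_set = hidden_size * input_size + hidden_size * hidden_size + hidden_size
    return weight_sets * per_set


# Number of weight sets per cell type. The last two rows are assumptions,
# not figures taken from the dissertation.
WEIGHT_SETS = {
    "LSTM": 4,
    "GRU": 3,
    "one additional gate (assumed)": 2,
    "no additional gate (assumed)": 1,
}

# Hypothetical sizes: 6 sensor channels in, 64 hidden units.
INPUT_SIZE, HIDDEN_SIZE = 6, 64
counts = {
    name: recurrent_params(INPUT_SIZE, HIDDEN_SIZE, sets)
    for name, sets in WEIGHT_SETS.items()
}
for name, total in counts.items():
    print(f"{name}: {total} parameters")
```

Under these assumptions the parameter count, and with it the memory footprint and the multiply-accumulate work per time step, scales linearly with the number of weight sets.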

Introduction
1 Overview
2 Contributions and Outline
3 List of Relevant Publications
Human Activity Recognition on Wearable Devices
1 Introduction
2 Related Works
3 Human Activity Recognition System
3.1 General Architecture
3.2 Datasets
4 Control Gate-based Recurrent Neural Networks
4.1 CGRNN Structure
4.2 Experiments
5 Self-Gated Recurrent Neural Networks
5.1 SGRNN Structure
5.2 Gradient Analysis
5.3 Experiments
6 Conclusions
Environmental Sound Recognition in Real Life
1 Introduction
2 Proposed Systems
3 Experiments on Acoustic Scene Classification
3.1 Dataset
3.2 Feature Extraction and Training Details
3.3 Results
4 Experiments on Sound Event Detection
4.1 Dataset
4.2 Feature Extraction and Training Details
4.3 Results
5 Conclusions
Real-time Driver Drowsiness Detection in Videos
1 Introduction
2 Our Approach
2.1 Preprocessing
2.2 Deep Neural Network Model
3 Experiments
3.1 Dataset
3.2 Training Details
3.3 Experimental Results
4 Conclusions
Encoder-Recurrent Decoder Network for Single Image Dehazing
1 Introduction
2 The Proposed Method
3 Experiments
3.1 Dataset
3.2 Training Details
3.3 Evaluation Results
3.4 Ablation Study
4 Conclusions
Conclusions
Bibliography

                                

