[7] Wang, S. General constructive representations for continuous piecewise-linear functions. IEEE Trans. Circuits Syst. I Regul. Pap. 51, 1889–1896 (2004). This paper considers a general constructive method for representing an arbitrary PWL function, in which the significant differences and connections between different representation models are rigorously discussed. Many theoretical analyses of deep PWLNNs adopt the theorems and lemmas proposed here.
[8] Wang, S. & Sun, X. Generalization of hinging hyperplanes. IEEE Trans. Inf. Theory 51, 4425–4431 (2005). This paper presents the idea of inserting multiple linear functions into the hinge, with formal proofs of the universal representation ability for continuous PWL functions; the connection with maxout in deep PWLNNs can be traced back to this generalization.
[9] Xu, J., Huang, X. & Wang, S. Adaptive hinging hyperplanes and its applications in dynamic system identification. Automatica 45, 2325–2332 (2009).
[10] Tao, Q. et al. Learning with continuous piecewise linear decision trees. Expert Syst. Appl. 168, 114214 (2021).
[11] Tao, Q. et al. Toward deep adaptive hinging hyperplanes. IEEE Trans. Neural Netw. Learn. Syst. (2022).
[12] Chien, M.-J. Piecewise-linear theory and computation of solutions of homeomorphic resistive networks. IEEE Trans. Circuits Syst. 24, 118–127 (1977).
[13] Pucar, P. & Sjöberg, J. On the hinge-finding algorithm for hinging hyperplanes. IEEE Trans. Inf. Theory 44, 3310–3319 (1998).
[14] Huang, X., Xu, J. & Wang, S. in Proc. American Control Conf. 4431–4936 (IEEE, 2010). This paper proposes a gradient descent learning algorithm for PWLNNs, in which both domain partitions and parameter optimization are elucidated.
[15] Hush, D. & Horne, B. Efficient algorithms for function approximation with piecewise linear sigmoidal networks. IEEE Trans. Neural Netw. 9, 1129–1141 (1998).
[16] LeCun, Y. et al. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998). This work formally introduces the basic learning framework for generic deep learning, including deep PWLNNs.
[17] He, K., Zhang, X., Ren, S. & Sun, J. in Proc. IEEE Int. Conf. Computer Vision 1026–1034 (IEEE, 2015). This paper presents modified optimization strategies for PWL-DNNs and a novel PWL activation function (PReLU), enabling PWL-DNNs to be trained at considerable depth.
[18] Tao, Q., Xu, J., Suykens, J. A. K. & Wang, S. in Proc. IEEE Conf. Decision and Control 1482–1487 (IEEE, 2018).
[19] Wang, G., Giannakis, G. B. & Chen, J. Learning ReLU networks on linearly separable data: algorithm, optimality, and generalization. IEEE Trans. Signal Process. 67, 2357–2370 (2019).
[20] Tsay, C., Kronqvist, J., Thebelt, A. & Misener, R. Partition-based formulations for mixed-integer optimization of trained ReLU neural networks. Adv. Neural Inf. Process. Syst. 34, 2993–3003 (2021).
[21] Nair, V. & Hinton, G. in Proc. Int. Conf. Machine Learning (eds Fürnkranz, J. & Joachims, T.) 807–814 (2010). This paper initiates the prevalence and state-of-the-art performance of PWL-DNNs and establishes ReLU, now the most popular PWL activation function.
[22] Glorot, X., Bordes, A. & Bengio, Y. Deep sparse rectifier neural networks. PMLR 15, 315–323 (2011).
[23] Lin, J. N. & Unbehauen, R. Canonical piecewise-linear networks. IEEE Trans. Neural Netw. 6, 43–50 (1995). This work depicts the network topology of generalized canonical piecewise-linear representations and also discusses the idea of introducing general PWL activation functions for deep PWLNNs, though without numerical evaluations.
[24] Pascanu, R., Montufar, G. & Bengio, Y. in Adv. Neural Inf. Process. Syst. 2924–2932 (NIPS, 2014). This paper presents the novel perspective of measuring the capacity of deep PWLNNs by the number of linear sub-regions, showing how to exploit the locally linear property with mathematical proofs and intuitive visualizations.
[25] Bemporad, A., Borrelli, F. & Morari, M. Piecewise linear optimal controllers for hybrid systems. Proc. Am. Control Conf. 2, 1190–1194 (2000). This work introduces the characteristics of PWL functions in control systems and the applications of PWL non-linearity.
[26] Goodfellow, I., Warde-Farley, D., Mirza, M., Courville, A. & Bengio, Y. in Proc. Int. Conf. Machine Learning Vol. 28 (eds Dasgupta, S. & McAllester, D.) 1319–1327 (PMLR, 2013). This paper proposes a flexible PWL activation function (maxout) for deep PWLNNs, of which ReLU can be regarded as a special case; analyses of the universal approximation ability and of the relations to shallow-architectured PWLNNs are given.
[27] Yarotsky, D. Error bounds for approximations with deep ReLU networks. Neural Netw. 94, 103–114 (2017).
[28] Bunel, R., Turkaslan, I., Torr, P. H. S., Kohli, P. & Mudigonda, P. K. in Adv. Neural Inf. Process. Syst. Vol. 31 (eds Bengio, S. et al.) 4795–4804 (2018).
[29] Jia, J., Cao, X., Wang, B. & Gong, N. Z. in Proc. Int. Conf. Learning Representations (ICLR, 2020).