A Combination of Multi-state Activation Functions, Mean-normalisation and Singular Value Decomposition for learning Deep Neural Networks
In this paper, we propose Multi-state Activation Functions (MSAFs) for Deep Neural Networks (DNNs). These multi-state functions perform additional classification on top of the 2-state Logistic function. Discussion of the MSAFs reveals that these activation functions have the potential to alter the parameter distribution of DNN models, improve model performance and reduce model size. In addition, an extension of the XOR problem illustrates how neural networks with multi-state functions facilitate pattern classification. Furthermore, based on running-average mean-normalisation rules, we implement a combination of mean-normalised optimisation with the MSAFs and Singular Value Decomposition (SVD). Experimental results on TIMIT show that DNN-based acoustic models can be improved by applying the MSAFs: the models obtain better phone error rates when the Logistic function is replaced with the multi-state functions. Further experiments on large vocabulary continuous speech recognition tasks show that the MSAFs and mean-normalised Stochastic Gradient Descent (MN-SGD) give DNNs better recognition performance than the conventional Logistic function and SGD learning method. Beyond this, the combination of the MSAFs, the SVD method and MN-SGD shrinks the number of DNN parameters to approximately 44% of the original, leading to a considerable increase in decoding speed and decrease in model size without any loss of recognition performance.
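As a rough illustration of how SVD can reduce the parameter count of a DNN layer, the sketch below factorises a weight matrix into two thinner matrices using a truncated SVD. It is a generic low-rank factorisation written with NumPy, not the procedure used in the paper: the function name `low_rank_factorise`, the layer size and the rank `k` are all hypothetical choices made only for this example.

```python
import numpy as np


def low_rank_factorise(W, k):
    """Approximate W (m x n) by A @ B with A (m x k) and B (k x n).

    Generic truncated-SVD factorisation; the name and interface are
    illustrative assumptions, not taken from the paper.
    """
    # Thin SVD: W = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep the k largest singular values and fold them into the left factor.
    A = U[:, :k] * s[:k]
    B = Vt[:k, :]
    return A, B


# Hypothetical layer: 512 x 512 weights, truncated to rank 128.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512))
A, B = low_rank_factorise(W, 128)

# One m x n layer becomes two layers holding k * (m + n) weights in total.
ratio = (A.size + B.size) / W.size
print(f"parameters kept: {ratio:.0%}")
```

Replacing one layer by the pair `A`, `B` saves parameters only when `k` is smaller than `m * n / (m + n)`. How much accuracy survives depends on how quickly the singular values of the trained weights decay, which is a property of the trained model and not of this sketch.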
2015 International Joint Conference on Neural Networks (IJCNN)
Distributed Computing not elsewhere classified