Multiclass voice commands classification with multiple binary convolution neural networks

Jarosław Szkoła

University of Rzeszow


Abstract

In machine learning, obtaining a good model usually requires training the network on a large data set. Training is often a long process, and any change to the input data set requires re-training the entire network. Extending an existing model with new output (decision) classes is particularly problematic, because the entire model must be re-trained on all of the data. To improve this process, a new neural network architecture is proposed that allows existing models to be extended with new classes without re-training the entire network; in addition, the time needed to train a sub-model is much shorter than the time needed to re-train the entire network. The presented architecture is designed for data with at least two decision classes.
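The idea of extending a classifier by adding one binary sub-model per class can be illustrated with a minimal sketch. It is not the author's implementation. The `BinaryEnsemble` class, the toy scorers standing in for trained binary convolutional networks, and the "most confident sub-model wins" voting rule are all assumptions made for illustration, since the abstract does not specify the voting mechanism.

```python
from typing import Callable, Dict, Sequence

# Hypothetical stand-in for a trained binary CNN: maps a feature vector
# to a confidence in [0, 1] that the input belongs to "its" class.
BinaryModel = Callable[[Sequence[float]], float]


class BinaryEnsemble:
    """One independent binary model per class, combined by a simple vote."""

    def __init__(self) -> None:
        self.models: Dict[str, BinaryModel] = {}

    def add_class(self, label: str, model: BinaryModel) -> None:
        # Extending the classifier only registers a new sub-model;
        # the existing sub-models are left untouched (no re-training).
        self.models[label] = model

    def predict(self, features: Sequence[float]) -> str:
        if len(self.models) < 2:
            raise ValueError("at least two decision classes are required")
        # Assumed voting rule: pick the class whose model is most confident.
        scores = {label: m(features) for label, m in self.models.items()}
        return max(scores, key=scores.get)


def make_toy_model(index: int) -> BinaryModel:
    # Toy scorer (illustration only): confidence is one feature, clipped to [0, 1].
    def model(features: Sequence[float]) -> float:
        return min(1.0, max(0.0, features[index]))
    return model


ensemble = BinaryEnsemble()
ensemble.add_class("yes", make_toy_model(0))
ensemble.add_class("no", make_toy_model(1))

sample = [0.2, 0.7, 0.95]
before = ensemble.predict(sample)   # decided among "yes" and "no" only

# A new command is added later by training and registering one more sub-model.
ensemble.add_class("stop", make_toy_model(2))
after = ensemble.predict(sample)    # "stop" now takes part in the vote
```

In this sketch, adding the "stop" class changes only the set of registered sub-models, which reflects the abstract's claim that new classes can be added without re-training the whole network.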


Keywords:

multiclass convolution neural networks, voting decision mechanism, voice commands classification, multiclass classifier, sound wave processing and classification



Published
2022-11-03

Szkoła, J. (2022). Multiclass voice commands classification with multiple binary convolution neural networks. Technical Sciences, 25, 149–170. https://doi.org/10.31648/ts.8098




License

Copyright (c) 2022 Jarosław Szkoła

This work is licensed under a Creative Commons Attribution 4.0 International License.




