In:
Journal of the Acoustical Society of America, Acoustical Society of America (ASA), Vol. 140, No. 4_Supplement (2016-10-01), p. 3404-3404
Abstract:
In this study, we propose a polyphonic sound event detection method based on a hybrid system of a convolutional bidirectional long short-term memory recurrent neural network and a hidden Markov model (CBLSTM-HMM). Inspired by the state-of-the-art approach to integrating neural networks with HMMs in speech recognition, the proposed method uses the CBLSTM to estimate the HMM state output probabilities, which makes it possible to model sequential data while handling variation in event duration. The hybrid system can detect the segment of each sound event without post-processing, such as smoothing detection results over multiple frames, which frame-wise detection methods usually require. Moreover, it extends easily to multi-label classification, which enables polyphonic sound event detection. We conduct experimental evaluations on the DCASE 2016 Task 2 dataset to compare the proposed method with conventional methods, such as non-negative matrix factorization (NMF) and a standard BLSTM-RNN, and we also investigate the effect of network structure on detection performance.
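The decoding idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a two-state (inactive/active) HMM per event class, a fixed self-transition probability, and a list of per-frame network posteriors supplied by the caller. The function names, the parameters `prior_active` and `p_stay`, and the event labels are all hypothetical. Following the common hybrid recipe from speech recognition, each posterior is divided by the state prior to obtain a scaled likelihood, and Viterbi decoding then returns event segments directly, with no separate smoothing step.

```python
import math


def viterbi_active_segments(posteriors, prior_active=0.5, p_stay=0.9):
    """Decode one event class with a two-state HMM (0 = inactive, 1 = active).

    posteriors: per-frame P(active | frame), e.g. sigmoid outputs of a network.
    prior_active, p_stay: assumed values for illustration, not from the paper.
    Returns a list of (start_frame, end_frame) pairs, end exclusive.
    """
    if not posteriors:
        return []
    eps = 1e-10
    log_prior = [math.log(1.0 - prior_active), math.log(prior_active)]
    # log_trans[i][j] = log P(state j at frame t | state i at frame t-1)
    log_trans = [
        [math.log(p_stay), math.log(1.0 - p_stay)],
        [math.log(1.0 - p_stay), math.log(p_stay)],
    ]

    def log_scaled_likelihood(p_active):
        # posterior / prior is proportional to the HMM state output probability
        post = [1.0 - p_active, p_active]
        return [math.log(max(post[s], eps)) - log_prior[s] for s in (0, 1)]

    emis = log_scaled_likelihood(posteriors[0])
    score = [log_prior[s] + emis[s] for s in (0, 1)]
    backptr = []
    for p in posteriors[1:]:
        emis = log_scaled_likelihood(p)
        new_score, ptr = [], []
        for s in (0, 1):
            best_prev = max((0, 1), key=lambda r: score[r] + log_trans[r][s])
            ptr.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s] + emis[s])
        score = new_score
        backptr.append(ptr)

    # Backtrace the best state sequence.
    state = max((0, 1), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()

    # Collapse runs of the active state into segments.
    segments, start = [], None
    for t, s in enumerate(path):
        if s == 1 and start is None:
            start = t
        elif s == 0 and start is not None:
            segments.append((start, t))
            start = None
    if start is not None:
        segments.append((start, len(path)))
    return segments


def detect_polyphonic(posteriors_per_event, **kwargs):
    """Multi-label case: decode each event class independently, so events may overlap."""
    return {
        name: viterbi_active_segments(p, **kwargs)
        for name, p in posteriors_per_event.items()
    }


if __name__ == "__main__":
    # Hypothetical posteriors for two overlapping event classes.
    frames = {
        "door_slam": [0.05, 0.05, 0.95, 0.2, 0.95, 0.95, 0.05, 0.05],
        "speech": [0.05, 0.95, 0.95, 0.95],
    }
    print(detect_polyphonic(frames))
```

In this sketch the single low-posterior frame inside the first event does not split the segment, because leaving and re-entering the active state costs more than one weak frame. That is the sense in which HMM decoding can stand in for frame-wise smoothing.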
Type of Medium:
Online Resource
ISSN:
0001-4966, 1520-8524
Language:
English
Publisher:
Acoustical Society of America (ASA)
Publication Date:
2016
ZDB ID:
1461063-2