In:
The Journal of the Acoustical Society of America, Acoustical Society of America (ASA), Vol. 125, No. 4_Supplement (2009-04-01), p. 2730-2730
Abstract:
Non-speech audio event detection (AED) could be used for low-cost, spatially diffuse surveillance applications, e.g., monitoring of vehicle activity in a national park, or of footsteps in a hallway. Experiments have shown that non-speech AED benefits from dynamic inference strategies such as the hidden Markov model (HMM), but that the acoustic features useful for non-speech events may not be the same as those useful for speech. One possible solution is a tandem HMM: an HMM whose observation vector is constructed from the output of an instantaneous discriminative classifier, e.g., a neural network. The use of tandem HMMs for non-speech AED is hindered, however, by the relatively small size of most non-speech-audio training corpora. This talk will demonstrate that tandem HMMs can be trained to detect non-speech audio events using a novel form of regularized training: Baum–Welch back-propagation (as proposed by Bengio et al.), using the conjugate-gradient adaptive form of the Baum–Welch auxiliary function (as proposed by Lee et al., and as commonly used in maximum a posteriori HMM adaptation).
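Illustrative note (not part of the abstract): the tandem structure described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' system. The one-layer softmax classifier, its weights, the toy two-dimensional frames, and the Gaussian HMM parameters are all hypothetical, and the sketch only decodes with fixed parameters; it does not implement Baum–Welch back-propagation or the regularized training mentioned in the abstract.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def tandem_features(frames, weights, biases):
    # Hypothetical one-layer frame classifier; its log posteriors
    # serve as the observation vectors passed to the HMM.
    feats = []
    for frame in frames:
        scores = [sum(w * x for w, x in zip(row, frame)) + b
                  for row, b in zip(weights, biases)]
        feats.append([math.log(p) for p in softmax(scores)])
    return feats

def log_gauss_diag(x, mean, var):
    # Log density of a diagonal-covariance Gaussian.
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def viterbi(obs, log_init, log_trans, means, variances):
    # Most likely state sequence for an HMM with Gaussian emissions.
    n = len(log_init)
    delta = [log_init[s] + log_gauss_diag(obs[0], means[s], variances[s])
             for s in range(n)]
    back = []
    for x in obs[1:]:
        prev = delta
        ptr = [max(range(n), key=lambda i: prev[i] + log_trans[i][j])
               for j in range(n)]
        delta = [prev[ptr[j]] + log_trans[ptr[j]][j]
                 + log_gauss_diag(x, means[j], variances[j])
                 for j in range(n)]
        back.append(ptr)
    state = max(range(n), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

# Toy data (all values assumed): class/state 0 = background, 1 = event.
frames = [[-1.0, -1.0], [-1.0, -0.5], [1.0, 1.0], [1.2, 0.8], [-1.0, -1.0]]
weights = [[-2.0, -2.0], [2.0, 2.0]]
biases = [0.0, 0.0]

obs = tandem_features(frames, weights, biases)
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
means = [[0.0, -7.0], [-7.0, 0.0]]
variances = [[4.0, 4.0], [4.0, 4.0]]

path = viterbi(obs, log_init, log_trans, means, variances)
print(path)
```

The classifier makes an instantaneous, per-frame decision, while the HMM supplies the dynamic inference across frames; in a trained tandem system both sets of parameters would be estimated from data rather than fixed by hand as they are here.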
Type of Medium:
Online Resource
ISSN:
0001-4966, 1520-8524
Language:
English
Publisher:
Acoustical Society of America (ASA)
Publication Date:
2009
ZDB-ID:
1461063-2