In:
Journal of the Acoustical Society of America, Acoustical Society of America (ASA), Vol. 140, No. 4_Supplement (2016-10-01), p. 3450-3450
Abstract:
Non-negative matrix factorization (NMF) factorizes a non-negative matrix into the product of two non-negative matrices. In acoustics, a multichannel extension of NMF has been proposed that exploits spatial information for sound source separation. The separation performance of conventional multichannel NMF depends on its initial values because the optimization can converge to local minima. This paper proposes an initialization based on binary-masking sound source separation, in which the time-frequency masks are calculated from the time difference of arrival of each source. The proposed method calculates the initial spatial correlation matrices from the sources separated by binary masking. Music separation experiments confirmed that the proposed method separated sources better than the conventional method. In addition, we evaluated the binary-masking initialization on automatic speech recognition (ASR) tasks in noisy environments. The ASR experiments confirmed that appropriate initial values are effective: initializations starting from previously well-estimated spatial correlation matrices achieved better ASR performance than random initializations.
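As an illustration of the initialization described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a two-microphone STFT, candidate time differences of arrival (TDOAs) that are already known, a nearest-phase rule for assigning each time-frequency bin to a source, and covariance estimates that are diagonally regularized and normalized to unit trace. The function names `tdoa_binary_masks` and `init_spatial_correlation` and all parameters are hypothetical.

```python
import numpy as np


def tdoa_binary_masks(X, freqs, tdoas):
    """Assign each time-frequency bin to the source whose TDOA best explains it.

    X     : complex STFT, shape (2, F, T) (assumption: two microphones)
    freqs : bin centre frequencies in Hz, shape (F,)
    tdoas : delay of channel 1 relative to channel 0 per source, in seconds, shape (K,)
    Returns boolean masks of shape (K, F, T), exactly one True per bin.
    """
    tdoas = np.asarray(tdoas, dtype=float)
    # Observed inter-channel phase, reduced to a unit-modulus complex number.
    cross = X[0] * np.conj(X[1])
    observed = cross / np.maximum(np.abs(cross), 1e-12)
    # Phase expected for each candidate delay: if X1 = X0 * exp(-j 2 pi f tau),
    # then X0 * conj(X1) has phase +2 pi f tau.
    expected = np.exp(2j * np.pi * freqs[None, :, None] * tdoas[:, None, None])
    distance = np.abs(observed[None] - expected)       # (K, F, T)
    winner = np.argmin(distance, axis=0)               # (F, T)
    return winner[None] == np.arange(len(tdoas))[:, None, None]


def init_spatial_correlation(X, masks, eps=1e-6):
    """Initial spatial correlation matrices from the masked (separated) sources.

    Returns an array of shape (K, F, M, M): one Hermitian matrix per source
    and frequency bin, regularized by eps * I and scaled to unit trace
    (both choices are assumptions of this sketch).
    """
    M, F, _ = X.shape
    K = masks.shape[0]
    H = np.zeros((K, F, M, M), dtype=complex)
    for k in range(K):
        Xk = X * masks[k][None]                         # keep only source k's bins
        # Sum over time of x x^H for every frequency bin.
        R = np.einsum("mft,nft->fmn", Xk, np.conj(Xk))
        R = R + eps * np.eye(M)[None]
        trace = np.trace(R, axis1=1, axis2=2).real
        H[k] = R / trace[:, None, None]
    return H
```

The resulting matrices would take the place of a random initialization of the spatial correlation matrices before the multichannel NMF updates begin.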
Type of Medium:
Online Resource
ISSN:
0001-4966, 1520-8524
Language:
English
Publisher:
Acoustical Society of America (ASA)
Publication Date:
2016
ZDB-ID:
1461063-2