GLORIA

GEOMAR Library Ocean Research Information Access

  • 1
    Online Resource
    Acoustical Society of America (ASA) ; 2006
    In: The Journal of the Acoustical Society of America, Acoustical Society of America (ASA), Vol. 120, No. 5_Supplement (2006-11-01), p. 3378-3379
    Abstract: In this paper, we introduce a spoken language interface for music information retrieval. In response to voice commands, the system searches for a song through an internet music shop or a “playlist” stored on the local PC; the system then plays it. To cope with the almost unlimited size of the vocabulary, we implemented a remote server program with which users can customize their recognition grammar and dictionary. When a user selects favorite artists, the server program automatically generates a minimal set of recognition grammars and a dictionary. The system then sends them to the interface program. As a result, the vocabulary is, on average, less than 1000 words for each user. To perform a field test of the system, we implemented a speech collection capability, whereby speech utterances are compressed in Free Lossless Audio Codec (FLAC) format and are sent back to the server program with dialogue logs. Currently, the system is available to the public for experimental use. More than 100 users are involved in field testing. In our presentation, we will report details of the system and the results of field tests, which include motorcycle environments. a) Currently at Graduate School of Computer and Information Sciences, Hosei University, Tokyo 184-8584, Japan.
    Type of Medium: Online Resource
    ISSN: 0001-4966 , 1520-8524
    Language: English
    Publisher: Acoustical Society of America (ASA)
    Publication Date: 2006
    ZDB ID: 1461063-2
  • 2
    Online Resource
    Acoustical Society of America (ASA) ; 2012
    In: The Journal of the Acoustical Society of America, Acoustical Society of America (ASA), Vol. 131, No. 4_Supplement (2012-04-01), p. 3235-3235
    Abstract: One of the main applications of Blind Source Separation (BSS) is to improve the performance of Automatic Speech Recognition (ASR) systems. However, the conventional BSS algorithm has been applied to speech signals only as a pre-processing step. In this paper, a closely coupled framework between an FDICA-based BSS algorithm and a speech recognition system is proposed. In the source separation step, a confidence score of the separation accuracy for each frequency bin is first estimated. Subsequently, by employing a multi-band speech recognition system, the acoustic likelihood is calculated from the estimated BSS confidence scores and the Mel-scale filter bank energy. Our proposed method can therefore reduce the ASR errors that, in the conventional approach, are caused by separation errors in BSS and permutation errors in ICA. Experimental results showed that our proposed method improved the word accuracy of ASR by approximately 10%.
    Type of Medium: Online Resource
    ISSN: 0001-4966 , 1520-8524
    Language: English
    Publisher: Acoustical Society of America (ASA)
    Publication Date: 2012
    ZDB ID: 1461063-2
  • 3
    Online Resource
    Acoustical Society of America (ASA) ; 2016
    In: Journal of the Acoustical Society of America, Acoustical Society of America (ASA), Vol. 140, No. 4_Supplement (2016-10-01), p. 3110-3110
    Abstract: This paper presents a method to classify a situation of being crowded with people using environmental sounds that are collected by smartphones. The final goal of the research is to estimate “crowd-density” using only environmental sounds collected by smartphones. Advantages of the approach are that (1) acoustic signals can be collected and processed at low cost, and (2) because many people carry smartphones, crowd-density can be obtained not only from many places, but also at any time. As the first step, in this paper, we tried to classify “a situation of being crowded with people.” We collected environmental sounds using smartphones in both residential and downtown areas. The total duration of the collected data is 77,900 seconds. The sound of “crowded with people” is defined as “buzz-buzz,” where more than one person talks at the same time. Two kinds of classifiers were trained on the basis of the GMM-UBM (Gaussian Mixture Model and Universal Background Model) method. One was trained with acoustic features that are generally used in speech recognition, and the other was trained with an additional sound power parameter. Experimental results showed that the sound power parameter improves the F-measure from 0.58 to 0.60.
    Type of Medium: Online Resource
    ISSN: 0001-4966 , 1520-8524
    Language: English
    Publisher: Acoustical Society of America (ASA)
    Publication Date: 2016
    ZDB ID: 1461063-2