In:
ACM Transactions on Human-Robot Interaction, Association for Computing Machinery (ACM)
Abstract:
Social robots in the home will need to solve audio identification problems to better interact with their users. This paper focuses on the classification between a) natural conversation that includes at least one co-located user and b) media that is playing from electronic sources and does not require a social response, such as television shows. This classification can help social robots detect a user’s social presence using sound. Social robots that are able to solve this problem can apply this information to assist them in making decisions, such as determining when and how to appropriately engage human users. We compiled a dataset from a variety of acoustic environments which contained either natural or media audio, including audio that we recorded in our own homes. Using this dataset, we performed an experimental evaluation on a range of traditional machine learning classifiers, and assessed the classifiers’ abilities to generalize to new recordings, acoustic conditions, and environments. We conclude that a C-Support Vector Classification (SVC) algorithm outperformed other classifiers. Finally, we present a classification pipeline that in-home robots can utilize, and discuss the timing and size of the trained classifiers, as well as privacy and ethics considerations.
Type of Medium:
Online Resource
ISSN:
2573-9522
Language:
English
Publisher:
Association for Computing Machinery (ACM)
Publication Date:
2023
ZDB ID:
2946140-6