Speaker diarization is the task of determining who speaks when in an audio recording. Psychotherapy research often relies on labor-intensive manual diarization. Unsupervised methods are available but yield higher error rates. We present a method for supervised speaker diarization based on random forests. It can be considered a compromise between the commonly used labor-intensive manual coding and fully automated procedures. The method is validated using the EMRAI synthetic speech corpus and is made publicly available. It yields low diarization error rates (M: 5.61%, SD: 2.19). Supervised speaker diarization is a promising method for psychotherapy research and similar fields.
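
To make the reported metric concrete, the following is a minimal sketch of a frame-level diarization error rate, assuming the recording has already been cut into equal-length frames and that predicted speaker labels have already been mapped onto the reference labels. The function name `frame_level_der`, the label values, and the example data are hypothetical illustrations and are not taken from the published tool, which may compute the metric differently (for example, on time segments with a tolerance collar).

```python
from typing import Optional, Sequence


def frame_level_der(
    reference: Sequence[Optional[str]],
    hypothesis: Sequence[Optional[str]],
) -> float:
    """Return (missed + false alarm + confusion) / reference speech frames.

    Each element is a speaker label for one frame, or None for non-speech.
    Assumes one speaker per frame and labels that are already aligned.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")

    missed = false_alarm = confusion = speech = 0
    for ref, hyp in zip(reference, hypothesis):
        if ref is not None:
            speech += 1
            if hyp is None:
                missed += 1        # speech labeled as silence
            elif hyp != ref:
                confusion += 1     # speech attributed to the wrong speaker
        elif hyp is not None:
            false_alarm += 1       # silence labeled as speech

    if speech == 0:
        raise ValueError("reference contains no speech frames")
    return (missed + false_alarm + confusion) / speech


# Hypothetical ten-frame session: "T" = therapist, "P" = patient.
reference = ["T", "T", "T", None, "P", "P", "P", "P", None, "T"]
hypothesis = ["T", "T", "P", None, "P", "P", None, "P", "T", "T"]

print(f"DER: {frame_level_der(reference, hypothesis):.1%}")
```

In this toy example one frame is confused, one is missed, and one is a false alarm, against eight reference speech frames.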

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7399377
http://dx.doi.org/10.3389/fpsyg.2020.01726
