PMID: 38269778 | DOI: 10.3233/SHTI230940
Vargas CP, Gaiera A, Brandán A, Renato A, Benitez S, Luna D (Health Informatics Department, Hospital Italiano de Buenos Aires, Argentina). Automatic Speech Recognition System to Record Progress Notes in a Mobile EHR: A Pilot Study. Stud Health Technol Inform. 2024 Jan 25;310:124-128.
Abstract: Creating notes in the EHR is one of the most problematic tasks for health professionals; the main challenges are the time it consumes and the quality of the resulting records. Automatic speech recognition technologies aim to ease clinical documentation for users and optimize their workflow. In our hospital, we internally developed an automatic speech recognition (ASR) system to record progress notes in a mobile EHR. The objective of this article is to describe the pilot study carried out to evaluate the implementation of ASR for recording progress notes in a mobile EHR application. The specialty that used ASR the most was Home Medicine. The lack of access to a computer at the time of care and the need to write short, quick progress notes were the main reasons users gave for adopting the system.
Keywords: speech recognition software; electronic health records; mobile applications.
MeSH terms: Humans; Pilot Projects; Speech Recognition Software; Documentation; Health Personnel; Hospitals.
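As a rough illustration of the transcription step such a system performs (not the hospital's internally developed ASR, which the record does not describe in code), the sketch below transcribes a dictated note with the open-source Whisper package; the file name, model size, and language setting are assumptions.

```python
# Hedged sketch: generic speech-to-text for a dictated progress note.
# NOT the internally developed hospital system described above; this only
# illustrates the transcription step with the open-source Whisper package.
# Assumes `pip install openai-whisper` and a recording "progress_note.wav".
import whisper

model = whisper.load_model("base")  # small general-purpose model
# language="es" is an assumption based on the Spanish-speaking pilot site.
result = model.transcribe("progress_note.wav", language="es")
print(result["text"])  # draft note text, to be reviewed by the clinician
```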
PMID: 39721154 | DOI: 10.1016/j.cognition.2024.106049
Ferrari A (ambra.ferrari@mpi.nl), Hagoort P (Max Planck Institute for Psycholinguistics, Nijmegen; Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, The Netherlands). Beat gestures and prosodic prominence interactively influence language comprehension. Cognition. 2024 Dec 24;256:106049 (epub ahead of print).
Abstract: Face-to-face communication is not only about 'what' is said but also 'how' it is said, both in speech and in bodily signals. Beat gestures are rhythmic hand movements that typically accompany prosodic prominence in conversation, yet it is still unclear how they influence language comprehension. On one hand, beat gestures may share the focus-marking function of prosodic prominence: they would drive attention towards the concurrent speech and highlight its content. On the other hand, beat gestures may trigger inferences of high speaker confidence, generating the expectation that the sentence content is correct and thereby eliciting commitment to the truth of the statement. This study directly disentangled the two hypotheses by evaluating additive and interactive effects of prosodic prominence and beat gestures on language comprehension. Participants watched videos of a speaker uttering sentences and judged whether each sentence was true or false. Sentences sometimes contained a world-knowledge violation that may go unnoticed (a 'semantic illusion'). Combining beat gestures with prosodic prominence led to a higher degree of semantic illusion, making more world-knowledge violations go unnoticed during comprehension. These results challenge current theories proposing that beat gestures are visual focus markers. On the contrary, they suggest that beat gestures automatically trigger inferences of high speaker confidence and thereby elicit commitment to the truth of the statement, in line with Grice's cooperative principle in conversation. More broadly, the findings highlight the influence of metacognition on language comprehension in face-to-face communication.
Keywords: beat gestures; language comprehension; metacognition; multimodal communication; pragmatics; prosody.
Declaration of competing interest: the authors declare no competing interests. Copyright © 2024 The Authors. Published by Elsevier B.V.
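The design contrasts an additive account (each cue independently shifts truth judgments) with an interactive one (a super-additive rise in missed violations when the cues co-occur). A minimal sketch of that statistical contrast on simulated data, using a logistic model with an interaction term; the variable names and effect sizes are illustrative assumptions, not the authors' analysis code.

```python
# Hedged sketch of the additive-vs-interactive contrast described above,
# on simulated per-trial data (variable names are assumptions).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 400
prominence = rng.integers(0, 2, n)  # 1 = prosodically prominent
gesture = rng.integers(0, 2, n)     # 1 = beat gesture present
# Simulated illusion probability with a super-additive bump when both cues co-occur.
p = 0.15 + 0.05 * prominence + 0.05 * gesture + 0.20 * prominence * gesture
missed = rng.binomial(1, p)         # 1 = world-knowledge violation went unnoticed
df = pd.DataFrame({"prominence": prominence, "gesture": gesture, "missed": missed})

# The prominence:gesture coefficient tests whether combining the two cues
# raises the odds of a semantic illusion beyond their separate (additive) effects.
model = smf.logit("missed ~ prominence * gesture", data=df).fit()
print(model.summary())
```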
PMID: 39717516 | PMCID: PMC11664072 | DOI: 10.1016/j.fsisyn.2024.100563
Stoykova R (University of Groningen, Netherlands), Porter K (Norwegian University of Science and Technology, Gjøvik, Norway), Beka T (Norwegian Police IT-unit, Oslo, Norway). The AI Act in a law enforcement context: The case of automatic speech recognition for transcribing investigative interviews. Forensic Sci Int Synerg. 2024;9:100563. Epub 2024 Dec 5.
Abstract: Law enforcement agencies manually transcribe thousands of investigative interviews per year across different crimes. To automate this transcription and improve its efficiency, applied research is exploring artificial intelligence models, including automatic speech recognition (ASR) and natural language processing. While AI models can improve efficiency in criminal investigations, their successful implementation requires evaluating legal and technical risks. This paper explores the legal and technical challenges of applying ASR models to investigative interviews under the European Union Artificial Intelligence Act (AIA). The AIA provisions are discussed in light of domain-specific studies of interviews in the Norwegian police, best practices, and empirical analyses of speech recognition, in order to give law enforcement a practical code of conduct on the techno-legal requirements for adopting such models and to flag grey areas for further research.
Keywords: Artificial Intelligence Act (AI Act); automatic speech recognition; general-purpose AI models; investigative interviews; law enforcement.
Declaration of competing interest: Thomas Beka is employed by the Norwegian Police IT-unit; the other authors declare no competing interests.
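The empirical ASR analyses this paper draws on typically score a system against a verbatim human transcript with word error rate (WER): the word-level edit distance between hypothesis and reference, divided by the reference length. A self-contained sketch of that metric (illustrative only, not code from the paper):

```python
# Word error rate: (substitutions + deletions + insertions) / reference words,
# computed with a standard Levenshtein edit-distance table over word tokens.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i          # i deletions to reach an empty hypothesis
    for j in range(len(hyp) + 1):
        d[0][j] = j          # j insertions from an empty reference
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# Toy example: one deletion and one substitution against a 6-word reference.
print(wer("the suspect left the house quickly",
          "the suspect left house quietly"))  # -> 2/6 ≈ 0.33
```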
PMID: 39698099 | PMCID: PMC11652838 | DOI: 10.1016/j.heliyon.2024.e40611
Khan S (Quaid-i-Azam University, Islamabad, Pakistan; University of Warwick, Coventry, UK), Abbasi RA, Sindhu MA, Khattak AS (Quaid-i-Azam University, Islamabad, Pakistan), Arafat S (King Abdulaziz University, Jeddah, Saudi Arabia), Daud A (Rabdan Academy, Abu Dhabi, United Arab Emirates), Mushtaq M (Forman Christian College (A Chartered University), Lahore, Pakistan). Predicting the victims of hate speech on microblogging platforms. Heliyon. 2024 Dec 15;10(23):e40611. Epub 2024 Nov 26.
Abstract: Hate speech is a major problem on microblogging platforms, and its automatic detection is a growing research area. Most existing work analyzes the content of social media posts; our study shifts the focus to predicting which users are likely to become targets of hate speech. This paper proposes a novel Hate-speech Target Prediction Framework (HTPK) and introduces a new Hate Speech Target Dataset (HSTD) containing tweets labeled as targets or non-targets of hate speech. Using a combination of Term Frequency-Inverse Document Frequency (TF-IDF), n-grams, and part-of-speech (PoS) tags, we tested various machine learning algorithms; the Naïve Bayes (NB) classifier performed best, with an accuracy of 93%, significantly outperforming the others. This research identifies the optimal combination of features for predicting hate speech targets and compares various machine learning algorithms, providing a foundation for more proactive hate speech mitigation on social media platforms.
Keywords: hate speech; machine learning; prediction; social media; Twitter.
Declaration of competing interest: the authors declare no competing interests.
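The feature/classifier combination the abstract reports (TF-IDF over word n-grams feeding a Naïve Bayes classifier) maps directly onto a standard scikit-learn pipeline. A minimal sketch on toy data, assuming binary target/non-target labels; it is not the authors' HTPK implementation and omits the PoS-tag features:

```python
# Hedged sketch of the TF-IDF + n-gram + Naive Bayes setup described above.
# Toy labels: 1 = user is a likely hate-speech target, 0 = not. Real work
# would use the HSTD tweets and add PoS-tag features, omitted here.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["you people ruin everything", "great game last night",
         "nobody wants you here", "lovely weather today"]
labels = [1, 0, 1, 0]

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # TF-IDF over unigrams + bigrams
    MultinomialNB(),                      # the classifier the paper found best
)
model.fit(texts, labels)
print(model.predict(["you should leave this platform"]))
```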
PMID: 39690813 | DOI: 10.1080/14015439.2024.2431331
Valente ARS (University of Aveiro; Polytechnic of Leiria, Portugal), Oliveira C, Albuquerque L, Teixeira A (Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Portugal), Barbosa PA (Speech Prosody Studies Group, Department of Linguistics, State University of Campinas, Brazil). Prosodic changes with age: a longitudinal study with three public figures in European Portuguese. Logoped Phoniatr Vocol. 2024 Dec 17:1-10 (epub ahead of print).
Abstract: The analysis of acoustic parameters contributes to characterising how human communication develops over the lifetime. The present paper analyses suprasegmental features of European Portuguese in longitudinal conversational speech samples of three male public figures, recorded in uncontrolled environments at ages approximately 30 years apart. Twenty prosodic features covering intonation, intensity, rhythm, and pause measures were extracted semi-automatically from 360 speech intervals (3-4 interviews per speaker x 30 speech intervals x 3 speakers), each lasting between 3 and 6 s. Twelve prosodic parameters showed significant age effects in at least one speaker. Group mean comparisons revealed significant differences between the youngest (50 years) and the oldest (80 years) age groups in seven parameters. The results point to a lower and less variable f0, a higher f0 minimum, wider f0 peaks, more vocal effort and more variable global intensity, slower speech and articulation rates, and more frequent and longer pauses at older ages. This longitudinal study can contribute to the characterisation of normal vocal ageing and is relevant to human-machine communication, speech recognition systems, applied linguistics, and the design of communication strategies for older adults.
Keywords: prosody; acoustic phonetics; longitudinal analysis; vocal ageing.
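Several of the measures above (f0 statistics, global intensity variability) can be extracted with Praat-style tooling. A minimal sketch, assuming the praat-parselmouth package is installed and "interview.wav" is one 3-6 s conversational speech interval; it is not the authors' semi-automatic pipeline and covers only a few of the twenty features:

```python
# Hedged sketch: a few of the prosodic measures described above, extracted
# with praat-parselmouth. Assumes a file "interview.wav"; not the authors'
# actual extraction pipeline.
import numpy as np
import parselmouth

snd = parselmouth.Sound("interview.wav")

# Fundamental frequency (f0) statistics, in Hz.
pitch = snd.to_pitch()
f0 = pitch.selected_array["frequency"]
f0 = f0[f0 > 0]  # drop unvoiced frames, which Praat reports as 0
print("f0 median:", np.median(f0))
print("f0 minimum:", f0.min(), "f0 range:", f0.max() - f0.min())

# Global intensity variability (dB) as a rough vocal-effort-related measure.
intensity = snd.to_intensity()
print("intensity SD:", np.std(intensity.values))
```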
PMID: 39687201 | PMCID: PMC11649003 | DOI: 10.1109/taffc.2024.3395117
Hinduja S, Darzi A, Cohn JF (Department of Psychology, University of Pittsburgh, PA, USA), Ertugrul IO (Utrecht University, The Netherlands), Provenza N, Gadot R, Sheth SA (Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA), Storch EA, Goodman WK (Menninger Department of Psychiatry and Behavioral Science, Baylor College of Medicine, Houston, TX, USA). Multimodal Prediction of Obsessive-Compulsive Disorder and Comorbid Depression Severity and Energy Delivered by Deep Brain Electrodes. IEEE Trans Affect Comput. 2024 Oct-Dec;15(4):2025-2041. Epub 2024 Apr 30.
Abstract: To develop reliable, valid, and efficient measures of obsessive-compulsive disorder (OCD) severity, comorbid depression severity, and total electrical energy delivered (TEED) by deep brain stimulation (DBS), we trained and compared random forest regression models in a clinical trial of participants receiving DBS for refractory OCD. Six participants were recorded during open-ended interviews at pre- and post-surgery baselines and then at 3-month intervals following DBS activation. Ground-truth severity was assessed by clinical interview and self-report. Visual and auditory modalities included facial action units, head and facial landmarks, speech behavior and content, and voice acoustics. Mixed-effects random forest regression with Shapley feature reduction strongly predicted severity of OCD, comorbid depression, and TEED (intraclass correlation, ICC = 0.83, 0.87, and 0.81, respectively). When random effects were omitted from the regression, predictive power decreased to moderate for OCD and comorbid depression severity and remained comparable for TEED (ICC = 0.60, 0.68, and 0.83, respectively). Multimodal measures of behavior outperformed those from single modalities, and feature selection achieved large reductions in feature count with corresponding gains in prediction. The approach could contribute to closed-loop DBS that automatically titrates stimulation based on affect measures.
Keywords: obsessive-compulsive disorder (OCD); deep brain stimulation (DBS); depression; mixed effects; multimodal machine learning; Shapley feature reduction.
29, no. 6, pp. 669–677, 2015.PMC469815225922420To D, Sharma B, Karnik N, Joyce C, Dligach D, and Afshar M, “Validation of an alcohol misuse classifier in hospitalized patients,” Alcohol, vol. 84, pp. 49–55, 2020.PMC710125931574300Hoogendoorn M, Berger T, Schulz A, Stolz T, and Szolovits P, “Predicting social anxiety treatment outcome based on therapeutic email conversations,” IEEE J. Biomed. Health Inform, vol. 21, no. 5, pp. 1449–1459, Sep. 2017.PMC561366927542187Patel R. et al., “Mood instability is a common feature of mental health disorders and is associated with poor clinical outcomes,” BMJ Open, vol. 5, no. 5, 2015, Art. no. e007504.PMC445275425998036Banerjee T. et al., “Predicting mood disorder symptoms with remotely collected videos using an interpretable multimodal dynamic attention fusion network,” 2021, arXiv:2109.03029.Stratou G, Scherer S, Gratch J, and Morency L-P, “Automatic nonverbal behavior indicators of depression and PTSD: The effect of gender,” J. Multimodal User Interfaces, vol. 9, pp. 17–29, 2015.pp. 11–18, 2014.Huang Z, Epps J, Joachim D, and Chen M, “Depression detection from short utterances via diverse smartphones in natural environmental conditions,” in Proc. 19th Annu. Conf. Int. Speech Commun. Assoc., 2018, pp. 3393–3397.Espinola CW, Gomes JC, Pereira JMS, and Santos WPD, “Detection of major depressive disorder, bipolar disorder, schizophrenia and generalized anxiety disorder using vocal acoustic analysis and machine learning,” Res. Biomed. Eng, vol. 38, pp. 813–829, 2022.Schultebraucks K, Yadav V, Shalev AY, Bonanno GA, and Galatzer-Levy IR, “Deep learning-based classification of posttraumatic stress disorder and depression following trauma utilizing visual and auditory markers of arousal and mood,” Psychol. Med, vol. 52, no. 5, pp. 957–967, 2022.32744201Marmar CR et al., “Speech-based markers for posttraumatic stress disorder in US veterans,” Depression Anxiety, vol. 36, no. 7, pp. 607–616, Jul. 2019. [Online]. Available: https://onlinelibrary.wiley.com/doi/10.1002/da.22890 10.1002/da.22890PMC660285431006959Joshi J. et al., “Multimodal assistive technologies for depression diagnosis and monitoring,” J. Multimodal User Interfaces, vol. 7, pp. 217–228, 2013.Yang L, Jiang D, He L, Pei E, Oveneke MC, and Sahli H, “Decision tree based depression classification from audio video and language information,” in Proc. 6th Int. Workshop Audio/Vis. Emotion Challenge, 2016, pp. 89–96.Sardari S, Nakisa B, Rastgoo MN, and Eklund P, “Audio based depression detection using convolutional autoencoder,” Expert Syst. Appl, vol. 189, 2022, Art. no. 116076.Zhang Z, Lin W, Liu M, and Mahmoud M, “Multimodal deep learning framework for mental disorder recognition,” in Proc. IEEE 15th Int. Conf. Autom. Face Gesture Recognit., 2020, pp. 344–350.Lipovetsky S and Conklin M, “Analysis of regression in game theory approach,” Appl. Stochastic Models Bus. Ind, vol. 17, no. 4, pp. 319–330, 2001.Zhou Y, Yao X, Han W, Wang Y, Li Z, and Li Y, “Distinguishing apathy and depression in older adults with mild cognitive impairment using text, audio, and video based on multiclass classification and shapely additive explanations,” Int. J. Geriatr. Psychiatry, vol. 37, no. 11, 2022.36284449Hox JJ, Multilevel Analysis: Techniques and Applications. New York, NY, USA: Routledge, 2010.Tabachnick BG and Fidell LS, Multilevel Liniear Model, 5th ed., 2007, sec. 15, pp. 781–857.Beck AT, Steer RA, and Brown G, Manual for the Beck Depression Inventory-II. 
San Antonio, TX, USA: Psychological Corporation, 1996.Storch EA, Rasmussen SA, Price LH, Larson MJ, Murphy TK, and Goodman WK, “Development and psychometric evaluation of the Yale–Brown obsessive-compulsive scale—Second edition,” Psychol. Assessment, vol. 22, no. 2, pp. 223–232, 2010.20528050McAuley MD, “Incorrect calculation of total electrical energy delivered by a deep brain stimulator,” Brain Stimulation, vol. 13, no. 5, pp. 1414–1415, Sep. 2020.32745654Jeni LA, Cohn JF, and Kanade T, “Dense 3D face alignment from 2D video for real-time use,” Image Vis. Comput, vol. 58, pp. 13–24, 2017.PMC593171329731533Zhang Z. et al., “Multimodal spontaneous emotion corpus for human behavior analysis,” in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., 2016, pp. 3438–3446.Girard JM, Chu W-S, Jeni LA, and Cohn JF, “Sayette group formation task (GFT) spontaneous facial expression database,” in Proc. 12th IEEE Int. Conf. Autom. Face Gesture Recognit., 2017, pp. 581–588.PMC587602529606916Christ M, Braun N, Neuffer J, and Kempa-Liehr AW, “Time series feature extraction on basis of scalable hypothesis tests (Tsfresh – A Python package),” Neurocomputing, vol. 307, pp. 72–77, 2018.Dewi C, Chen R-C, Jiang X, and Yu H, “Adjusting eye aspect ratio for strong eye blink detection based on facial landmarks,” PeerJ Comput. Sci, vol. 8, 2022, Art. no. e943.PMC904433735494836TranscribeMe! - Fast & accurate human transcription services. [Online]. Available: https://www.transcribeme.com/McAuliffe M, Socolof M, Mihuc S, Wagner M, and Sonderegger M, “Montreal forced aligner: Trainable text-speech alignment Using kaldi,” in Proc. 18th Annu. Conf. Int. Speech Commun. Assoc., 2017, pp. 498–502.Eyben F, Wöllmer M, and Schuller B, “OpenSMILE: The Munich versatile and fast open-source audio feature extractor,” in Proc. 18th ACM Int. Conf. Multimedia, New York, NY, USA, 2010, pp. 1459–1462. [Online]. Available: 10.1145/1873951.187424610.1145/1873951.1874246Degottex G, Kane J, Drugman T, Raitio T, and Scherer S, “COVAREP—A collaborative voice analysis repository for speech technologies,” in Proc. IEEE Int. Conf. Acoust. Speech Signal Process., 2014, pp. 960–964.Eyben F. et al., “The Geneva minimalistic acoustic parameter set (GeMAPS) for voice research and affective computing,” IEEE Trans. Affective Comput, vol. 7, no. 2, pp. 190–202, Second Quarter 2016.Tasnim M and Novikova J, “Cost-effective models for detecting depression from speech,” in Proc. IEEE 21st Int. Conf. Mach. Learn. Appl., 2022, pp. 1687–1694.Low DM, Bentley KH, and Ghosh SS, “Automated assessment of psychiatric disorders using speech: A systematic review,” Laryngoscope Invest. Otolaryngol, vol. 5, no. 1, pp. 96–116, 2020. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1002/lio2.35410.1002/lio2.354PMC704265732128436Cummins N, Vlasenko B, Sagha H, and Schuller B, “Enhancing speech-based depression detection through gender dependent vowel-level formant features,” in Proc. Int. Conf. Artif. Intell. Med., ten Teije A, Popow C, Holmes JH, and Sacchi L, Eds., Springer, 2017, pp. 209–214.Rouast PV, Adam MTP, and Chiong R, “Deep learning for human affect recognition: Insights and new developments,” IEEE Trans. Affective Comput, vol. 12, no. 2, pp. 524–543, Second Quarter 2021.Haider F, Pollak S, Albert P, and Luz S, “Emotion recognition in low-resource settings: An evaluation of automatic feature selection methods,” Comput. Speech Lang, vol. 65, 2021, Art. no. 101119. [Online]. 
Available: https://www.sciencedirect.com/science/article/pii/S0885230820300528Pennebaker JW, Boyd RL, Jordan K, and Blackburn K, “The development and psychometric properties of LIWC2015,” 2015.Tausczik YR and Pennebaker JW, “The psychological meaning of words: LIWC and computerized text analysis methods,” J. Lang. Social Psychol, vol. 29, no. 1, pp. 24–54, 2010.Wu H, Nonparametric Regression Methods for Longitudinal Data Analysis [Mixed-Effects Modeling Approaches]. Hoboken, NJ, USA: Wiley-Interscience, 2006.Hajjem A, Bellavance F, and Larocque D, “Mixed effects regression trees for clustered data,” Statist. Probability Lett, vol. 81, no. 4, pp. 451–459, Apr. 2011.Shapley LS, “A value for n-person games,” in Classics in Game Theory, vol. 69. Princeton, NJ, USA: Princeton Univ. Press, 1997.Lundberg SM and Lee S-I, “A unified approach to interpreting model predictions,” in Proc. Int. Conf. Neural Inf. Process. Syst., 2017, pp. 4765–4774.Ribeiro MT, Singh S, and Guestrin C, ““Why should I trust you?” Explaining the predictions of any classifier,” in Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining, 2016, pp. 1135–1144.Kim I, Lee S, Kim Y, Namkoong H, andS. Kim, “A probabilistic model for pathway-guided gene set selection,” in Proc. IEEE Int. Conf. Bioinf. Biomed., 2021, pp. 2733–2740.Strobl C, Boulesteix A-L, Kneib T, Augustin T, and Zeileis A, “Conditional variable importance for random forests,” BMC Bioinf., vol. 9, no. 1, Dec. 2008, Art. no. 307. [Online]. Available: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-30710.1186/1471-2105-9-307PMC249163518620558Wilcoxon F, “Individual comparisons by ranking methods,” Biometrics Bull., vol. 1, no. 6, Dec. 1945, Art. no. 80. [Online]. Available: https://www.jstor.org/stable/10.2307/3001968?origin=crossref10.2307/3001968?origin=crossrefAlghowinem S, Gedeon T, Goecke R, Cohn JF, and Parker G, “Interpretation of depression detection models via feature selection methods,” IEEE Trans. Affective Comput, vol. 14, no. 1,pp. 133–152, First Quarter 2023.PMC1001957836938342He L, Jiang D, Yang L, Pei E, Wu P, and Sahli H, “Multimodal affective dimension prediction using deep bidirectional long short-term memory recurrent neural networks,” in Proc. 5th Int. WorkshopAudio/Vis. Emotion Challenge, New York, NY, USA, 2015, pp. 73–80. [Online]. Available: 10.1145/2808196.281164110.1145/2808196.2811641Rosenthal R, “Conducting judgment studies,” in The New Handbook of Methods in Nonverbal Behavior Research, Harrigan J, Rosenthal R, and Scherer K Eds., London, U.K.: Oxford Univ. Press, Mar. 2008, pp. 199–234. [Online]. Available: https://academic.oup.com/book/25991/chapter/193835959Macdonald AJD and Fugard AJB, “Routine mental health outcome measurement in the UK,” Int. Rev. Psychiatry, vol. 27, no. 4, pp. 306–319, 2015.25832566Brain Behavior Quantification & Synchronization Workshop, 2023. Accessed: May 21, 2023. [Online]. Available: https://event.roseliassociates.com/bbqs-workshopProvenza NR et al., “The case for adaptive neuromodulation to treat severe intractable mental disorders,” Front. Neurosci,vol. 13, Feb. 2019, Art. no. 152.PMC641277930890909Hofman JM et al., “Integrating explanation and prediction in computational social science,” Nature, vol. 595, pp. 181–188, 2021.34194044
trying2...
Automatic Speech Recognition System to Record Progress Notes in a Mobile EHR: A Pilot Study.

Creating notes in the EHR is one of the most burdensome tasks for health professionals; the main challenges are the time it consumes and the quality of the resulting records. Automatic speech recognition (ASR) technologies aim to ease clinical documentation and streamline users' workflow. In our hospital, we internally developed an ASR system to record progress notes in a mobile EHR. The objective of this article is to describe the pilot study carried out to evaluate the implementation of this ASR system in a mobile EHR application. The specialty that used ASR the most was Home Medicine. The lack of access to a computer at the point of care and the need to write short, quick progress notes were the main reasons users adopted the system.
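
The article does not detail the system's internals, so the fragment below is a purely illustrative sketch of the dictation-to-note flow it describes: recorded audio goes to an ASR service, and the reviewed transcript is posted to an EHR endpoint. The URLs, JSON fields, and helper functions are hypothetical stand-ins, not the hospital's actual API.

    # Hypothetical sketch only: the service URLs and JSON fields below are
    # invented stand-ins, not the hospital's real ASR or EHR interfaces.
    import requests

    ASR_URL = "https://hospital.example/asr/transcribe"      # placeholder
    EHR_URL = "https://hospital.example/ehr/progress-notes"  # placeholder

    def transcribe_audio(wav_bytes: bytes) -> str:
        """Send a dictated recording to the ASR service, return plain text."""
        resp = requests.post(ASR_URL, data=wav_bytes,
                             headers={"Content-Type": "audio/wav"})
        resp.raise_for_status()
        return resp.json()["transcript"]

    def post_progress_note(patient_id: str, text: str) -> None:
        """Attach the transcript to the patient's record as a progress note."""
        resp = requests.post(EHR_URL, json={"patient_id": patient_id,
                                            "text": text})
        resp.raise_for_status()

    with open("dictation.wav", "rb") as f:
        draft = transcribe_audio(f.read())
    # In practice the clinician reviews and edits the draft before signing.
    post_progress_note("12345", draft)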

Source: http://dx.doi.org/10.3233/SHTI230940

Publication Analysis

Top Keywords

automatic speech: 12
speech recognition: 12
record progress: 12
progress notes: 12
notes mobile: 12
mobile ehr: 12
recognition system: 8
pilot study: 8
asr record: 8
system record: 4

Similar Publications

Beat gestures and prosodic prominence interactively influence language comprehension.

Cognition

December 2024

Max Planck Institute for Psycholinguistics, Wundtlaan 1, 6525 XD Nijmegen, The Netherlands; Radboud University Nijmegen, Donders Institute for Brain, Cognition and Behaviour, 6525 EN Nijmegen, The Netherlands.

Face-to-face communication is not only about 'what' is said but also 'how' it is said, both in speech and bodily signals. Beat gestures are rhythmic hand movements that typically accompany prosodic prominence in conversation. Yet, it is still unclear how beat gestures influence language comprehension.


The AI Act in a law enforcement context: The case of automatic speech recognition for transcribing investigative interviews.

Forensic Sci Int Synerg

December 2024

Norwegian Police IT-unit, Fridtjof Nansens vei 14, 0031 Oslo, Norway.

Law enforcement agencies manually transcribe thousands of investigative interviews per year across different crimes. To automate this work and improve its efficiency, applied research is exploring artificial intelligence models, including Automatic Speech Recognition (ASR) and Natural Language Processing. While AI models can improve efficiency in criminal investigations, implementing them successfully requires evaluating the legal and technical risks.
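
As a concrete illustration of the kind of pipeline such research evaluates, the sketch below transcribes an interview recording with the open-source openai-whisper package. This is an assumption chosen for illustration; the paper does not say which ASR system was tested, and the file name is hypothetical.

    # Illustrative only (pip install openai-whisper); not the system
    # evaluated in the cited paper.
    import whisper

    model = whisper.load_model("base")          # small general-purpose model
    result = model.transcribe("interview.wav")  # hypothetical recording

    print(result["text"])                       # full transcript
    for seg in result["segments"]:              # timestamped segments
        print(f"{seg['start']:7.2f}s  {seg['text']}")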


Hate speech constitutes a major problem on microblogging platforms, with automatic detection being a growing research area. Most existing works focus on analyzing the content of social media posts. Our study shifts focus to predicting which users are likely to become targets of hate speech.


Purpose: The analysis of acoustic parameters contributes to the characterisation of human communication development throughout the lifetime. The present paper intends to analyse suprasegmental features of European Portuguese in longitudinal conversational speech samples of three male public figures in uncontrolled environments across different ages, approximately 30 years apart.

Participants And Methods: Twenty prosodic features covering intonation, intensity, rhythm, and pause measures were extracted semi-automatically from 360 speech intervals (3-4 interviews from each speaker x 30 speech intervals x 3 speakers), each lasting between 3 and 6 s.
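
For readers unfamiliar with this kind of semi-automatic extraction, the sketch below computes measures from two of the named categories (intonation and intensity) with Parselmouth, the Python interface to Praat. It is not the authors' script; the file name and the specific statistics are illustrative assumptions.

    # Illustrative sketch (pip install praat-parselmouth); not the
    # authors' actual feature-extraction pipeline.
    import numpy as np
    import parselmouth

    snd = parselmouth.Sound("interval.wav")  # hypothetical 3-6 s interval

    pitch = snd.to_pitch()
    f0 = pitch.selected_array["frequency"]
    f0 = f0[f0 > 0]                          # keep voiced frames only

    intensity = snd.to_intensity()

    print("mean F0 (Hz):", np.mean(f0))      # intonation level
    print("F0 range (Hz):", np.ptp(f0))      # intonation span
    print("mean intensity (dB):", np.mean(intensity.values))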


To develop reliable, valid, and efficient measures of obsessive-compulsive disorder (OCD) severity, comorbid depression severity, and total electrical energy delivered (TEED) by deep brain stimulation (DBS), we trained and compared random forest regression models in a clinical trial of participants receiving DBS for refractory OCD. Six participants were recorded during open-ended interviews at pre- and post-surgery baselines and then at 3-month intervals following DBS activation. Ground-truth severity was assessed by clinical interview and self-report.
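
The abstract names the model family but not the code; below is a minimal sketch of that modeling step with scikit-learn, using synthetic stand-in features and severity scores rather than the study's data or validation scheme.

    # Minimal sketch with synthetic data; not the study's features, scores,
    # or validation scheme.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(120, 20))    # stand-in behavioral features per visit
    y = rng.uniform(0, 40, size=120)  # stand-in severity scores (Y-BOCS-like)

    model = RandomForestRegressor(n_estimators=500, random_state=0)
    scores = cross_val_score(model, X, y, cv=5,
                             scoring="neg_mean_absolute_error")
    print("MAE per fold:", -scores)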

