PMID 38269778
Vargas CP, Gaiera A, Brandán A, Renato A, Benitez S, Luna D. Automatic Speech Recognition System to Record Progress Notes in a Mobile EHR: A Pilot Study. Stud Health Technol Inform. 2024 Jan 25;310:124-128. doi: 10.3233/SHTI230940.

Abstract: Creating notes in the electronic health record (EHR) is one of the most burdensome tasks for health professionals; the main challenges are the time the task consumes and the quality of the resulting records. Automatic speech recognition technologies aim to ease clinical documentation for users and streamline their workflow. At our hospital, we internally developed an automatic speech recognition (ASR) system for recording progress notes in a mobile EHR. The objective of this article is to describe the pilot study carried out to evaluate the implementation of ASR for recording progress notes in a mobile EHR application. Home Medicine was the specialty that used ASR the most; the lack of access to a computer at the point of care and the need to write short, quick progress notes were the main reasons users adopted the system.

Affiliations: all authors, Health Informatics Department, Hospital Italiano de Buenos Aires, Argentina.
MeSH terms: Humans; Pilot Projects; Speech Recognition Software; Documentation; Health Personnel; Hospitals.
Keywords: speech recognition software; electronic health records; mobile applications.
PubMed search for "automatic speech" (expanded by PubMed to the [MeSH] and [All Fields] variants of "automatic" AND "speech"): 3,654 results; first 5 PMIDs retrieved: 39721154, 39717516, 39698099, 39690813, 39687201.
PMID 39721154
Ferrari A, Hagoort P. Beat gestures and prosodic prominence interactively influence language comprehension. Cognition. 2024;256:106049. doi: 10.1016/j.cognition.2024.106049. Epub 2024 Dec 24, ahead of print.

Abstract: Face-to-face communication is not only about 'what' is said but also 'how' it is said, both in speech and in bodily signals. Beat gestures are rhythmic hand movements that typically accompany prosodic prominence in conversation, yet it is still unclear how beat gestures influence language comprehension. On the one hand, beat gestures may share the same focus-marking function as prosodic prominence: they would drive attention towards the concurrent speech and highlight its content. On the other hand, beat gestures may trigger inferences of high speaker confidence, generate the expectation that the sentence content is correct, and thereby elicit commitment to the truth of the statement. This study directly disentangled the two hypotheses by evaluating additive and interactive effects of prosodic prominence and beat gestures on language comprehension. Participants watched videos of a speaker uttering sentences and judged whether each sentence was true or false. Sentences sometimes contained a world-knowledge violation that may go unnoticed (a 'semantic illusion'). Combining beat gestures with prosodic prominence led to a higher degree of semantic illusion, making more world-knowledge violations go unnoticed during language comprehension. These results challenge current theories proposing that beat gestures are visual focus markers. To the contrary, they suggest that beat gestures automatically trigger inferences of high speaker confidence and thereby elicit commitment to the truth of the statement, in line with Grice's cooperative principle in conversation. More broadly, our findings also highlight the influence of metacognition on language comprehension in face-to-face communication.

Affiliations: both authors, Max Planck Institute for Psycholinguistics, Nijmegen, and Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, The Netherlands. Corresponding author: ambra.ferrari@mpi.nl.
Keywords: beat gestures; language comprehension; metacognition; multimodal communication; pragmatics; prosody.
Competing interests: the authors declare none.

PMID 39717516
Stoykova R, Porter K, Beka T. The AI Act in a law enforcement context: The case of automatic speech recognition for transcribing investigative interviews. Forensic Sci Int Synerg. 2024;9:100563. doi: 10.1016/j.fsisyn.2024.100563. Epub 2024 Dec 5. PMCID: PMC11664072.

Abstract: Law enforcement agencies manually transcribe thousands of investigative interviews per year in relation to different crimes. To automate and improve the efficiency of transcribing such interviews, applied research is exploring artificial intelligence models, including automatic speech recognition (ASR) and natural language processing. While AI models can improve efficiency in criminal investigations, their successful implementation requires evaluation of legal and technical risks. This paper explores the legal and technical challenges of applying ASR models to investigative interviews in the context of the European Union Artificial Intelligence Act (AIA). The AIA's provisions are discussed in the light of domain-specific studies of interviews in the Norwegian police, best practices, and empirical analyses of speech recognition, in order to provide law enforcement with a practical code of conduct on the techno-legal requirements for adopting such models in their work, and to flag grey areas for further research.

Affiliations: Stoykova R, University of Groningen, Netherlands; Porter K, Norwegian University of Science and Technology, Gjøvik, Norway; Beka T, Norwegian Police IT-unit, Oslo, Norway.
Keywords: Artificial Intelligence Act (AI Act); automatic speech recognition; general-purpose AI models; investigative interviews; law enforcement.
Competing interests: Thomas Beka is employed by the Norwegian Police IT-unit; the other authors declare none.
PMID 39698099
Khan S, Abbasi RA, Sindhu MA, Arafat S, et al. Predicting the victims of hate speech on microblogging platforms. Heliyon. 2024 Dec 15;10(23):e40611. doi: 10.1016/j.heliyon.2024.e40611.

Abstract: Hate speech constitutes a major problem on microblogging platforms, and its automatic detection is a growing research area. Most existing work focuses on analyzing the content of social media posts; our study shifts the focus to predicting which users are likely to become targets of hate speech. This paper proposes a novel Hate-speech Target Prediction Framework (HTPK) and introduces a new Hate Speech Target Dataset (HSTD), which contains tweets labeled as targets and non-targets of hate speech. Using a combination of term frequency-inverse document frequency (TF-IDF), n-grams, and part-of-speech (PoS) tags as features, we tested various machine learning algorithms; the Naïve Bayes (NB) classifier performed best, with an accuracy of 93%, significantly outperforming the other algorithms. This research identifies the optimal combination of features for predicting hate speech targets and compares various machine learning algorithms, providing a foundation for more proactive hate speech mitigation on social media platforms.

Affiliations: Khan S, Quaid-i-Azam University, Islamabad, Pakistan, and University of Warwick, Coventry, UK; Abbasi RA and Sindhu MA, Quaid-i-Azam University, Islamabad, Pakistan; Arafat S, King Abdulaziz University, Jeddah, Saudi Arabia.
Khattak Akmal Saeed AS Department of Computer Science, Quaid-i-Azam University, Islamabad, Pakistan. Daud Ali A Faculty of Resilience, Rabdan Academy, Abu Dhabi, United Arab Emirates. Mushtaq Mubashar M Department of Computer Science, Forman Christian College (A Chartered University), Lahore, Pakistan. eng Journal Article 2024 11 26 England Heliyon 101672560 2405-8440 Hate speech Machine learning Prediction Social media Twitter The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. 2024 5 16 2024 11 5 2024 11 20 2024 12 19 6 23 2024 12 19 6 22 2024 12 19 5 19 2024 11 26 epublish 39698099 PMC11652838 10.1016/j.heliyon.2024.e40611 S2405-8440(24)16642-X Ibrahim Y.M., Essameldin R., Saad S.M. Social media forensics: an adaptive cyberbullying-related hate speech detection approach based on neural networks with uncertainty. IEEE Access. 2024;12:59474–59484. doi: 10.1109/ACCESS.2024.3393295. 10.1109/ACCESS.2024.3393295 Chadha K., Steiner L., Vitak J., Ashktorab Z. Women's responses to online harassment. Int. J. Commun. 2020;14(1):239–257. Chetty N., Alathur S. Hate speech review in the context of online social networks. Aggress. Violent Behav. 2018;40:108–118. doi: 10.1016/j.avb.2018.05.003. 10.1016/j.avb.2018.05.003 Son L.H., Kumar A., Sangwan S.R., Arora A., Nayyar A., Abdel-Basset M. Sarcasm detection using soft attention-based bidirectional long short-term memory model with convolution network. IEEE Access. 2019;7:23319–23328. doi: 10.1109/ACCESS.2019.2899260. 10.1109/ACCESS.2019.2899260 Bouazizi M., Ohtsuki T. 2016 IEEE Global Communications Conference (GLOBECOM) IEEE; USA: 2016. Sentiment analysis in Twitter: from classification to quantification of sentiments within tweets; pp. 1–6. 10.1109/GLOCOM.2016.7842262 Djuric N., Zhou J., Morris R., Grbovic M., Radosavljevic V., Bhamidipati N. Hate speech detection with comment embeddings. 
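The abstract above describes a TF-IDF/N-gram/PoS feature pipeline topped by a Naïve Bayes classifier. Neither the paper's code nor the HSTD data appears in this record, so the following is only a minimal illustration of the core idea: a Laplace-smoothed multinomial Naïve Bayes over bag-of-words counts. The tokenizer, documents, and "target"/"non-target" labels are invented toys, and the TF-IDF weighting and PoS features of the actual framework are omitted.

```python
import math
from collections import Counter

def tokenize(text):
    # unigram bag-of-words; the paper also uses n-grams and PoS tags,
    # which this toy sketch omits
    return text.lower().split()

class MultinomialNB:
    """Laplace-smoothed multinomial Naive Bayes over token counts."""

    def fit(self, docs, labels, alpha=1.0):
        self.alpha = alpha
        self.classes = sorted(set(labels))
        total = len(labels)
        self.priors = {c: math.log(labels.count(c) / total) for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for doc, y in zip(docs, labels):
            self.counts[y].update(tokenize(doc))
        self.vocab = set().union(*self.counts.values())
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def predict(self, doc):
        def log_posterior(c):
            denom = self.totals[c] + self.alpha * len(self.vocab)
            return self.priors[c] + sum(
                math.log((self.counts[c][tok] + self.alpha) / denom)
                for tok in tokenize(doc))
        return max(self.classes, key=log_posterior)

# invented toy data: "target" = user addressed with hateful language
docs = ["you are awful and everyone hates you",
        "great game last night congrats",
        "nobody wants you here awful person",
        "lovely weather for the match today"]
labels = ["target", "non-target", "target", "non-target"]
clf = MultinomialNB().fit(docs, labels)
print(clf.predict("you awful person"))  # prints "target"
```

With smoothing parameter alpha, unseen tokens contribute a small uniform probability instead of zeroing out a class, which is why NB remains usable on the short, sparse texts typical of microblogs.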
Valente AR, Oliveira C, Albuquerque L, Teixeira A, Barbosa PA. Prosodic changes with age: a longitudinal study with three public figures in European Portuguese. Logoped Phoniatr Vocol. 2024 Dec 17:1-10. doi: 10.1080/14015439.2024.2431331. Online ahead of print. PMID: 39690813.

Abstract: The analysis of acoustic parameters contributes to characterising the development of human communication across the lifespan. This paper analyses suprasegmental features of European Portuguese in longitudinal conversational speech samples of three male public figures, recorded in uncontrolled environments at ages approximately 30 years apart. Twenty prosodic features covering intonation, intensity, rhythm, and pause measures were extracted semi-automatically from 360 speech intervals (3-4 interviews per speaker x 30 speech intervals x 3 speakers), each lasting 3 to 6 s.
Twelve prosodic parameters showed significant age effects in at least one speaker. Group mean comparisons revealed significant differences between the youngest (ca. 50 years) and the oldest (ca. 80 years) age groups in seven parameters. The analysis points to a lower and less variable fo, a higher fo minimum, wider fo peaks, more vocal effort and more variable global intensity, slower speech and articulation rates, and more frequent and longer pauses at older ages. This longitudinal study can contribute to characterising normal vocal ageing, with relevance for human-machine communication, speech recognition systems, applied linguistics, and communication strategies with older adults.

Affiliations: Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Portugal; Department of Electronics, Telecommunications and Informatics, University of Aveiro, Portugal; School of Health Sciences, Polytechnic of Leiria, Portugal; School of Health Sciences, University of Aveiro, Portugal; Center for Health Technology and Services Research, University of Aveiro, Portugal; Speech Prosody Studies Group, Department of Linguistics, State University of Campinas, Brazil.
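Fundamental frequency (fo) measures dominate the feature set this abstract describes. The study's own semi-automatic extraction pipeline is not reproduced in this record; purely as an illustration of how an fo estimate can be obtained from a short voiced interval, here is a minimal difference-function sketch (the idea behind YIN-style estimators) run on a synthetic tone. The sample rate, search band, threshold, and signal are all assumptions, not the paper's settings.

```python
import math

def estimate_f0(samples, sr, fmin=50.0, fmax=400.0, thresh=1e-3):
    """Return an fo estimate in Hz via a normalised difference function."""
    lo = int(sr // fmax)                      # shortest candidate period (samples)
    hi = min(int(sr // fmin), len(samples) - 1)
    best, best_d = lo, float("inf")
    for lag in range(lo, hi + 1):
        # mean squared difference between the signal and its lagged copy
        d = sum((samples[i] - samples[i + lag]) ** 2
                for i in range(len(samples) - lag)) / (len(samples) - lag)
        if d < best_d:
            best, best_d = lag, d
        if d < thresh:                        # near-perfect period match: stop early
            best = lag                        # (taking the shortest such lag avoids
            break                             #  octave errors at 2x, 3x the period)
    return sr / best

# synthetic 200 Hz tone at a 16 kHz sample rate (assumed values)
sr = 16000
tone = [math.sin(2 * math.pi * 200 * i / sr) for i in range(1024)]
print(estimate_f0(tone, sr))  # prints 200.0
```

Real conversational recordings in uncontrolled environments need voicing detection, windowing, and interpolation on top of this; dedicated tools handle those steps far more robustly than this sketch.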
Keywords: Prosody; acoustic phonetics; longitudinal analysis; vocal ageing.

Hinduja S, Darzi A, Ertugrul IO, Provenza N, Gadot R, Storch EA, Sheth SA, Goodman WK, Cohn JF. Multimodal Prediction of Obsessive-Compulsive Disorder and Comorbid Depression Severity and Energy Delivered by Deep Brain Electrodes. IEEE Trans Affect Comput. 2024;15(4):2025-2041. doi: 10.1109/taffc.2024.3395117. Epub 2024 Apr 30. PMID: 39687201; PMCID: PMC11649003.

Abstract: To develop reliable, valid, and efficient measures of obsessive-compulsive disorder (OCD) severity, comorbid depression severity, and total electrical energy delivered (TEED) by deep brain stimulation (DBS), we trained and compared random forest regression models in a clinical trial of participants receiving DBS for refractory OCD. Six participants were recorded during open-ended interviews at pre- and post-surgery baselines and then at 3-month intervals following DBS activation. Ground-truth severity was assessed by clinical interview and self-report. Visual and auditory modalities included facial action units, head and facial landmarks, speech behavior and content, and voice acoustics. Mixed-effects random forest regression with Shapley feature reduction strongly predicted severity of OCD, severity of comorbid depression, and TEED (intraclass correlation, ICC = 0.83, 0.87, and 0.81, respectively). When random effects were omitted from the regression, predictive power decreased to moderate for OCD and comorbid depression severity and remained comparable for TEED (ICC = 0.60, 0.68, and 0.83, respectively). Multimodal measures of behavior outperformed those from single modalities, and feature selection yielded large reductions in feature count with corresponding gains in prediction accuracy. The approach could contribute to closed-loop DBS that automatically titrates stimulation based on affect measures.

Affiliations: Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA; Department of Information and Computing Sciences, Utrecht University, Utrecht, The Netherlands; Department of Neurosurgery, Baylor College of Medicine, Houston, TX, USA; Menninger Department of Psychiatry and Behavioral Science, Baylor College of Medicine, Houston, TX, USA.

Keywords: Obsessive-compulsive disorder (OCD); deep brain stimulation (DBS); depression; mixed-effects; multimodal machine learning; Shapley feature reduction.
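The abstract reports agreement between model predictions and ground truth as intraclass correlation (ICC) without stating the variant in this record. Purely as an illustration of the metric, the sketch below computes the Shrout-Fleiss consistency ICC(3,1), treating predicted and observed severity as two "raters"; the severity numbers are toys, not the study's data.

```python
def icc_3_1(pred, truth):
    """Shrout-Fleiss consistency ICC(3,1), treating pred/truth as k = 2 raters."""
    k, n = 2, len(pred)
    rows = list(zip(pred, truth))
    grand = (sum(pred) + sum(truth)) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(pred) / n, sum(truth) / n]
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between-subject
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between-rater
    ms_rows = ss_rows / (n - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

# toy severity scores (not the study's data)
observed = [10, 18, 25, 31, 40, 12]
predicted = [11, 16, 27, 30, 42, 14]
print(round(icc_3_1(predicted, observed), 2))  # prints 0.99
```

ICC values near 1 indicate that prediction errors are small relative to between-subject variability, which is the sense in which the paper's 0.8+ values count as strong agreement.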
Huang Z, Epps J, Joachim D, and Chen M, “Depression detection from short utterances via diverse smartphones in natural environmental conditions,” in Proc. 19th Annu. Conf. Int. Speech Commun. Assoc., 2018, pp. 3393–3397. Espinola CW, Gomes JC, Pereira JMS, and Santos WPD, “Detection of major depressive disorder, bipolar disorder, schizophrenia and generalized anxiety disorder using vocal acoustic analysis and machine learning,” Res. Biomed. Eng, vol. 38, pp. 813–829, 2022. Schultebraucks K, Yadav V, Shalev AY, Bonanno GA, and Galatzer-Levy IR, “Deep learning-based classification of posttraumatic stress disorder and depression following trauma utilizing visual and auditory markers of arousal and mood,” Psychol. Med, vol. 52, no. 5, pp. 957–967, 2022. 32744201 Marmar CR
et al., “Speech-based markers for posttraumatic stress disorder in US veterans,” Depression Anxiety, vol. 36, no. 7, pp. 607–616, Jul. 2019. [Online]. Available: https://onlinelibrary.wiley.com/doi/10.1002/da.22890
10.1002/da.22890 PMC6602854 31006959 Joshi J. et al., “Multimodal assistive technologies for depression diagnosis and monitoring,” J. Multimodal User Interfaces, vol. 7, pp. 217–228, 2013. Yang L, Jiang D, He L, Pei E, Oveneke MC, and Sahli H, “Decision tree based depression classification from audio video and language information,” in Proc. 6th Int. Workshop Audio/Vis. Emotion Challenge, 2016, pp. 89–96. Sardari S, Nakisa B, Rastgoo MN, and Eklund P, “Audio based depression detection using convolutional autoencoder,” Expert Syst. Appl, vol. 189, 2022, Art. no. 116076. Zhang Z, Lin W, Liu M, and Mahmoud M, “Multimodal deep learning framework for mental disorder recognition,” in Proc. IEEE 15th Int. Conf. Autom. Face Gesture Recognit., 2020, pp. 344–350. Lipovetsky S and Conklin M, “Analysis of regression in game theory approach,” Appl. Stochastic Models Bus. Ind, vol. 17, no. 4, pp. 319–330, 2001. Zhou Y, Yao X, Han W, Wang Y, Li Z, and Li Y, “Distinguishing apathy and depression in older adults with mild cognitive impairment using text, audio, and video based on multiclass classification and shapely additive explanations,” Int. J. Geriatr. Psychiatry, vol. 37, no. 11, 2022. 36284449 Hox JJ, Multilevel Analysis: Techniques and Applications. New York, NY, USA: Routledge, 2010. Tabachnick BG and Fidell LS, Multilevel Liniear Model, 5th ed., 2007, sec. 15, pp. 781–857. Beck AT, Steer RA, and Brown G, Manual for the Beck Depression Inventory-II. San Antonio, TX, USA: Psychological Corporation, 1996. Storch EA, Rasmussen SA, Price LH, Larson MJ, Murphy TK, and Goodman WK, “Development and psychometric evaluation of the Yale–Brown obsessive-compulsive scale—Second edition,” Psychol. Assessment, vol. 22, no. 2, pp. 223–232, 2010. 20528050 McAuley MD, “Incorrect calculation of total electrical energy delivered by a deep brain stimulator,” Brain Stimulation, vol. 13, no. 5, pp. 1414–1415, Sep. 2020. 
32745654 Jeni LA, Cohn JF, and Kanade T, “Dense 3D face alignment from 2D video for real-time use,” Image Vis. Comput, vol. 58, pp. 13–24, 2017. PMC5931713 29731533 Zhang Z. et al., “Multimodal spontaneous emotion corpus for human behavior analysis,” in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., 2016, pp. 3438–3446. Girard JM, Chu W-S, Jeni LA, and Cohn JF, “Sayette group formation task (GFT) spontaneous facial expression database,” in Proc. 12th IEEE Int. Conf. Autom. Face Gesture Recognit., 2017, pp. 581–588. PMC5876025 29606916 Christ M, Braun N, Neuffer J, and Kempa-Liehr AW, “Time series feature extraction on basis of scalable hypothesis tests (Tsfresh – A Python package),” Neurocomputing, vol. 307, pp. 72–77, 2018. Dewi C, Chen R-C, Jiang X, and Yu H, “Adjusting eye aspect ratio for strong eye blink detection based on facial landmarks,” PeerJ Comput. Sci, vol. 8, 2022, Art. no. e943. PMC9044337 35494836 TranscribeMe! - Fast & accurate human transcription services. [Online]. Available: https://www.transcribeme.com/ McAuliffe M, Socolof M, Mihuc S, Wagner M, and Sonderegger M, “Montreal forced aligner: Trainable text-speech alignment Using kaldi,” in Proc. 18th Annu. Conf. Int. Speech Commun. Assoc., 2017, pp. 498–502. Eyben F, Wöllmer M, and Schuller B, “OpenSMILE: The Munich versatile and fast open-source audio feature extractor,” in Proc. 18th ACM Int. Conf. Multimedia, New York, NY, USA, 2010, pp. 1459–1462. [Online]. Available: 10.1145/1873951.1874246 10.1145/1873951.1874246 Degottex G, Kane J, Drugman T, Raitio T, and Scherer S, “COVAREP—A collaborative voice analysis repository for speech technologies,” in Proc. IEEE Int. Conf. Acoust. Speech Signal Process., 2014, pp. 960–964. Eyben F. et al., “The Geneva minimalistic acoustic parameter set (GeMAPS) for voice research and affective computing,” IEEE Trans. Affective Comput, vol. 7, no. 2, pp. 190–202, Second Quarter 2016. 
Tasnim M and Novikova J, “Cost-effective models for detecting depression from speech,” in Proc. IEEE 21st Int. Conf. Mach. Learn. Appl., 2022, pp. 1687–1694. Low DM, Bentley KH, and Ghosh SS, “Automated assessment of psychiatric disorders using speech: A systematic review,” Laryngoscope Invest. Otolaryngol, vol. 5, no. 1, pp. 96–116, 2020. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1002/lio2.354 10.1002/lio2.354 PMC7042657 32128436 Cummins N, Vlasenko B, Sagha H, and Schuller B, “Enhancing speech-based depression detection through gender dependent vowel-level formant features,” in Proc. Int. Conf. Artif. Intell. Med., ten Teije A, Popow C, Holmes JH, and Sacchi L, Eds., Springer, 2017, pp. 209–214. Rouast PV, Adam MTP, and Chiong R, “Deep learning for human affect recognition: Insights and new developments,” IEEE Trans. Affective Comput, vol. 12, no. 2, pp. 524–543, Second Quarter 2021. Haider F, Pollak S, Albert P, and Luz S, “Emotion recognition in low-resource settings: An evaluation of automatic feature selection methods,” Comput. Speech Lang, vol. 65, 2021, Art. no. 101119. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0885230820300528 Pennebaker JW, Boyd RL, Jordan K, and Blackburn K, “The development and psychometric properties of LIWC2015,” 2015. Tausczik YR and Pennebaker JW, “The psychological meaning of words: LIWC and computerized text analysis methods,” J. Lang. Social Psychol, vol. 29, no. 1, pp. 24–54, 2010. Wu H, Nonparametric Regression Methods for Longitudinal Data Analysis [Mixed-Effects Modeling Approaches]. Hoboken, NJ, USA: Wiley-Interscience, 2006. Hajjem A, Bellavance F, and Larocque D, “Mixed effects regression trees for clustered data,” Statist. Probability Lett, vol. 81, no. 4, pp. 451–459, Apr. 2011. Shapley LS, “A value for n-person games,” in Classics in Game Theory, vol. 69. Princeton, NJ, USA: Princeton Univ. Press, 1997. 
Lundberg SM and Lee S-I, “A unified approach to interpreting model predictions,” in Proc. Int. Conf. Neural Inf. Process. Syst., 2017, pp. 4765–4774. Ribeiro MT, Singh S, and Guestrin C, ““Why should I trust you?” Explaining the predictions of any classifier,” in Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining, 2016, pp. 1135–1144. Kim I, Lee S, Kim Y, Namkoong H, andS. Kim, “A probabilistic model for pathway-guided gene set selection,” in Proc. IEEE Int. Conf. Bioinf. Biomed., 2021, pp. 2733–2740. Strobl C, Boulesteix A-L, Kneib T, Augustin T, and Zeileis A, “Conditional variable importance for random forests,” BMC Bioinf., vol. 9, no. 1, Dec. 2008, Art. no. 307. [Online]. Available: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-307 10.1186/1471-2105-9-307 PMC2491635 18620558 Wilcoxon F, “Individual comparisons by ranking methods,” Biometrics Bull., vol. 1, no. 6, Dec. 1945, Art. no. 80. [Online]. Available: https://www.jstor.org/stable/10.2307/3001968?origin=crossref 10.2307/3001968?origin=crossref Alghowinem S, Gedeon T, Goecke R, Cohn JF, and Parker G, “Interpretation of depression detection models via feature selection methods,” IEEE Trans. Affective Comput, vol. 14, no. 1,pp. 133–152, First Quarter 2023. PMC10019578 36938342 He L, Jiang D, Yang L, Pei E, Wu P, and Sahli H, “Multimodal affective dimension prediction using deep bidirectional long short-term memory recurrent neural networks,” in Proc. 5th Int. WorkshopAudio/Vis. Emotion Challenge, New York, NY, USA, 2015, pp. 73–80. [Online]. Available: 10.1145/2808196.2811641 10.1145/2808196.2811641 Rosenthal R, “Conducting judgment studies,” in The New Handbook of Methods in Nonverbal Behavior Research, Harrigan J, Rosenthal R, and Scherer K Eds., London, U.K.: Oxford Univ. Press, Mar. 2008, pp. 199–234. [Online]. Available: https://academic.oup.com/book/25991/chapter/193835959 Macdonald AJD and Fugard AJB, “Routine mental health outcome measurement in the UK,” Int. Rev. 
Psychiatry, vol. 27, no. 4, pp. 306–319, 2015. 25832566 Brain Behavior Quantification & Synchronization Workshop, 2023. Accessed: May 21, 2023. [Online]. Available: https://event.roseliassociates.com/bbqs-workshop Provenza NR et al., “The case for adaptive neuromodulation to treat severe intractable mental disorders,” Front. Neurosci,vol. 13, Feb. 2019, Art. no. 152. PMC6412779 30890909 Hofman JM et al., “Integrating explanation and prediction in computational social science,” Nature, vol. 595, pp. 181–188, 2021. 34194044 trying2...
Automatic Speech Recognition System to Record Progress Notes in a Mobile EHR: A Pilot Study.
Creating notes in the EHR is one of the most burdensome tasks for health professionals; the main challenges are the time it demands and the quality of the resulting records. Automatic speech recognition technologies aim to ease clinical documentation and optimize users' workflow. At our hospital, we internally developed an automatic speech recognition (ASR) system to record progress notes in a mobile EHR. The objective of this article is to describe the pilot study carried out to evaluate the implementation of the ASR system in a mobile EHR application. The specialty that used ASR the most was Home Medicine. The main reasons users gave for adopting the system were the lack of access to a computer at the point of care and the need to write short, quick progress notes.
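The article does not publish its implementation details, but the final step of such a pipeline — assembling ASR transcript segments into a stored, timestamped progress note — can be illustrated with a purely hypothetical sketch (all names here, such as `ProgressNote` and `note_from_transcript`, are invented for illustration and are not from the study):

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class ProgressNote:
    """A minimal progress-note record, as an ASR front end might produce it."""
    patient_id: str
    author: str
    text: str
    created_at: str

def note_from_transcript(patient_id: str, author: str, segments: list[str]) -> ProgressNote:
    """Join ASR transcript segments into one timestamped note.

    Empty segments (e.g. dropped audio chunks) are skipped.
    """
    body = " ".join(s.strip() for s in segments if s.strip())
    return ProgressNote(
        patient_id=patient_id,
        author=author,
        text=body,
        created_at=datetime.now(timezone.utc).isoformat(),
    )

note = note_from_transcript("HP-001", "dr.garcia",
                            ["Patient afebrile.", "", "Wound healing well."])
print(note.text)  # Patient afebrile. Wound healing well.
```

In a real deployment the recognizer would stream segments to a function like this, and the resulting record would be posted to the EHR's documentation service.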
Similar Publications
Cognition
December 2024
Max Planck Institute for Psycholinguistics, Wundtlaan 1, 6525 XD Nijmegen, The Netherlands; Radboud University Nijmegen, Donders Institute for Brain, Cognition and Behaviour, 6525 EN Nijmegen, The Netherlands.
Face-to-face communication is not only about 'what' is said but also 'how' it is said, both in speech and bodily signals. Beat gestures are rhythmic hand movements that typically accompany prosodic prominence in conversation. Yet, it is still unclear how beat gestures influence language comprehension.
Forensic Sci Int Synerg
December 2024
Norwegian Police IT-unit, Fridtjof Nansens vei 14, 0031 Oslo, Norway.
Law enforcement agencies manually transcribe thousands of investigative interviews per year in relation to different crimes. In order to automate and improve efficiency in the transcription of such interviews, applied research explores artificial intelligence models, including Automatic Speech Recognition (ASR) and Natural Language Processing. While AI models can improve efficiency in criminal investigations, their successful implementation requires evaluation of legal and technical risks.
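The NLP step downstream of ASR in such a workflow often includes flagging or redacting sensitive spans in the transcript. As a toy, hedged illustration (real systems would use trained named-entity models, not regexes; the patterns below are invented for this sketch):

```python
import re

# Illustrative patterns for a post-ASR redaction pass over a transcript.
PATTERNS = {
    "PHONE": re.compile(r"\b\d{3}[- ]?\d{3}[- ]?\d{4}\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}

def redact(transcript: str) -> str:
    """Replace each matched span with a bracketed tag, e.g. [PHONE]."""
    for tag, pat in PATTERNS.items():
        transcript = pat.sub(f"[{tag}]", transcript)
    return transcript

print(redact("Call me at 555 123 4567 before 3/14/2024."))
# Call me at [PHONE] before [DATE].
```

A regex pass like this is cheap to audit, which matters when the legal risks mentioned above must be evaluated, but it trades recall for transparency compared with statistical NER.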
Hate speech constitutes a major problem on microblogging platforms, with automatic detection being a growing research area. Most existing works focus on analyzing the content of social media posts. Our study shifts focus to predicting which users are likely to become targets of hate speech.
View Article and Find Full Text PDF
Logoped Phoniatr Vocol
December 2024
Speech Prosody Studies Group, Dep. of Linguistics, State Univ. of Campinas, Campinas, Brazil.
Purpose: The analysis of acoustic parameters contributes to characterising the development of human communication across the lifespan. The present paper analyses suprasegmental features of European Portuguese in longitudinal conversational speech samples of three male public figures recorded in uncontrolled environments at different ages, approximately 30 years apart. Participants and Methods: Twenty prosodic features covering intonation, intensity, rhythm, and pause measures were extracted semi-automatically from 360 speech intervals (3-4 interviews from each speaker x 30 speech intervals x 3 speakers), each lasting 3 to 6 s.
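Pause measures of the kind extracted in that study can be approximated very crudely from frame-level energy. A minimal sketch, assuming a mono signal and a fixed silence threshold (real prosodic pipelines, e.g. Praat-based ones, are far more refined):

```python
import numpy as np

def pause_ratio(signal: np.ndarray, sr: int, frame_ms: float = 25.0,
                thresh: float = 0.01) -> float:
    """Fraction of fixed-length frames whose RMS energy falls below `thresh`.

    A crude silence/pause proxy: frame the signal, compute per-frame RMS,
    and count the frames that fall under the threshold.
    """
    frame_len = int(sr * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    return float((rms < thresh).mean())

# 1 s of a 220 Hz tone followed by 1 s of silence -> pause ratio of 0.5
sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
sig = np.concatenate([0.5 * np.sin(2 * np.pi * 220 * t), np.zeros(sr)])
print(round(pause_ratio(sig, sr), 2))  # 0.5
```

The threshold and frame length here are arbitrary illustration values; in practice they would be calibrated per recording condition, which matters for the uncontrolled environments the study describes.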
To develop reliable, valid, and efficient measures of obsessive-compulsive disorder (OCD) severity, comorbid depression severity, and total electrical energy delivered (TEED) by deep brain stimulation (DBS), we trained and compared random forest regression models in a clinical trial of participants receiving DBS for refractory OCD. Six participants were recorded during open-ended interviews at pre- and post-surgery baselines and then at 3-month intervals following DBS activation. Ground-truth severity was assessed by clinical interview and self-report.
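The modeling approach — random forest regression from behavioral features to a severity score — can be sketched on synthetic data (the features, targets, and split below are invented stand-ins, not the study's data):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in: 120 interview sessions x 5 behavioral features,
# with the target driven mostly by the first two features plus noise.
X = rng.normal(size=(120, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=120)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[:100], y[:100])        # train on the first 100 sessions
pred = model.predict(X[100:])      # hold out the last 20

# The two informative features should dominate the impurity importances.
print(model.feature_importances_[:2].sum() > 0.5)
```

A per-participant (grouped) split rather than this naive row split would be needed in the real setting, since sessions from the same participant are correlated.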