Fairness is a principal social value that is observable in civilisations around the world. Yet a fairness metric for digital texts that describe even a simple social interaction, e.g., 'The boy hurt the girl', has not been developed. We address this by employing word embeddings that use factors found in a new social psychology literature review on the topic. We use these factors to build fairness vectors. These vectors are used as sentence-level measures, whereby each dimension reflects a fairness component. The approach is employed to approximate human perceptions of fairness. The method leverages a pro-social bias within word embeddings, for which we obtain an F1 = 79.8 on a list of sentences using the Universal Sentence Encoder (USE). A second approach, using principal component analysis (PCA) and machine learning (ML), produces an F1 = 86.2. Repeating these tests using Sentence Bidirectional Encoder Representations from Transformers (SBERT) produces an F1 = 96.9 and F1 = 100 respectively. Improvements using subspace representations are further suggested. By proposing a first-principles approach, the paper contributes to the analysis of digital texts along an ethical dimension.
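The abstract does not spell out how a fairness vector is built, so the following is only a minimal sketch of one common way to score a sentence embedding along a single semantic direction. It assumes, hypothetically, that the direction is the normalized difference between the mean embeddings of "fair" and "unfair" anchor texts, and it uses hand-made 3-dimensional toy vectors in place of real USE or SBERT output. The function names and the "helped" example are illustrative and are not taken from the paper.

```python
import numpy as np

def unit(v):
    """Return v scaled to unit length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def fairness_axis(fair_vecs, unfair_vecs):
    """Unit direction from the mean 'unfair' embedding to the mean 'fair' one (assumed construction)."""
    return unit(np.mean(fair_vecs, axis=0) - np.mean(unfair_vecs, axis=0))

def fairness_score(sentence_vec, axis):
    """Cosine similarity between a sentence embedding and the fairness axis."""
    return float(np.dot(unit(sentence_vec), axis))

# Toy vectors standing in for sentence-encoder output (hypothetical values).
fair_vecs = np.array([[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]])
unfair_vecs = np.array([[-0.7, 0.1, 0.2], [-0.9, 0.0, 0.1]])
axis = fairness_axis(fair_vecs, unfair_vecs)

helped = np.array([0.6, 0.3, 0.1])   # stand-in for a pro-social sentence
hurt = np.array([-0.5, 0.2, 0.3])    # stand-in for 'The boy hurt the girl'

print(fairness_score(helped, axis), fairness_score(hurt, axis))
```

With one such axis per fairness component, the scores could be stacked into a vector whose dimensions each reflect one component, which is the general shape the abstract describes.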
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9549858 | PMC |
http://dx.doi.org/10.1007/s42803-022-00049-4 | DOI Listing |
ISA Trans
January 2025
State Key Laboratory of Mechanical System and Vibration, Shanghai Jiao Tong University, Shanghai 200240, China.
This paper addresses the critical challenge of interpretability in machine learning methods for machine fault diagnosis by introducing a novel ad hoc interpretable neural network structure called Sparse Temporal Logic Network (STLN). STLN conceptualizes network neurons as logical propositions and constructs formal connections between them using specified logical operators, which can be articulated and understood as a formal language called Weighted Signal Temporal Logic. The network includes a basic word network using wavelet kernels to extract intelligible features, a transformer encoder with sparse and structured neural attention to locate informative signal segments relevant to decision-making, and a logic network to synthesize a coherent language for fault explanation.
J Cheminform
January 2025
Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, University of Bonn, Friedrich-Hirzebruch-Allee 5/6, 53115, Bonn, Germany.
Analogue series (AS) are generated during compound optimization in medicinal chemistry and are the major source of structure-activity relationship (SAR) information. Pairs of active AS consisting of compounds with corresponding substituents and comparable potency progression represent SAR transfer events for the same target or across different targets. We report a new computational approach to systematically search for SAR transfer series that combines an AS alignment algorithm with context-depending similarity assessment based on vector embeddings adapted from natural language processing.
Neuroscience
January 2025
Human Communication, Learning, and Development, Faculty of Education, The University of Hong Kong, China.
The human brain possesses the ability to automatically extract statistical regularities from environmental inputs, including visual-graphic symbols and printed units. However, the specific brain regions underlying the statistical learning of these visual-graphic symbols or artificial orthography remain unclear. This study utilized functional magnetic resonance imaging (fMRI) with an artificial orthography learning paradigm to measure brain activities associated with the statistical learning of radical positional regularities embedded in pseudocharacters containing high (100%), moderate (80%), and low (60%) levels of consistency, along with a series of random abstract figures.
J Cogn
January 2025
Department of Humanities, University of Trento, via Tommaso Gar 14, 38122, Trento, Italy.
The productive use of morphological information is considered one of the possible ways in which speakers of a language understand and learn unknown words. In the present study we investigate if, and how, also adult L2 learners exploit morphological information to process unknown words by analyzing the impact of language proficiency in the processing of novel derivations. Italian L2 learners, divided into three proficiency groups, participated in a lexical decision where pseudo-words could embed existing stems (e.
JAMIA Open
February 2025
Department of Medicine, University of Wisconsin-Madison, Madison, WI 53792, United States.
Objective: To evaluate large language models (LLMs) for pre-test diagnostic probability estimation and compare their uncertainty estimation performance with a traditional machine learning classifier.
Materials and Methods: We assessed 2 instruction-tuned LLMs, Mistral-7B-Instruct and Llama3-70B-chat-hf, on predicting binary outcomes for Sepsis, Arrhythmia, and Congestive Heart Failure (CHF) using electronic health record (EHR) data from 660 patients. Three uncertainty estimation methods (Verbalized Confidence, Token Logits, and LLM Embedding+XGB) were compared against an eXtreme Gradient Boosting (XGB) classifier trained on raw EHR data.