We introduce a modern, optimized, and publicly available implementation of the sequential Information Bottleneck clustering algorithm, which strikes a highly competitive balance between clustering quality and speed. We describe a set of optimizations that make the algorithm computation more efficient, particularly for the common case of sparse data representation. The results are substantiated by an extensive evaluation that compares the algorithm to commonly used alternatives, focusing on the practically important use case of text clustering. The evaluation covers a range of publicly available benchmark datasets and a set of clustering setups employing modern word and sentence embeddings obtained by state-of-the-art neural models. The results show that in spite of using the more basic Term-Frequency representation, the proposed implementation provides a highly attractive trade-off between quality and speed that outperforms the alternatives considered. This new release facilitates the use of the algorithm in real-world applications of text clustering.
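For readers unfamiliar with the algorithm, the sketch below illustrates the core sequential step as it is commonly described for sequential Information Bottleneck clustering: each document is drawn out of its cluster and reassigned to the cluster with the smallest merge cost, a weighted Jensen-Shannon divergence between term distributions. It is a minimal illustration, not the released implementation. The names `merge_cost` and `sib_cluster` are hypothetical, the input is a dense Term-Frequency matrix rather than a sparse one, and the initial partition and visiting order are fixed for reproducibility (a practical implementation would typically randomize both and exploit sparsity). It assumes every document has at least one term and that there are at least as many documents as clusters.

```python
import math


def merge_cost(p_x, py_x, p_t, py_t):
    """Cost of merging document x into cluster t:
    (p(x) + p(t)) times the weighted Jensen-Shannon divergence
    between p(y|x) and p(y|t)."""
    total = p_x + p_t
    w_x, w_t = p_x / total, p_t / total
    js = 0.0
    for a, b in zip(py_x, py_t):
        m = w_x * a + w_t * b  # mixture distribution at this term
        if a > 0:
            js += w_x * a * math.log(a / m)
        if b > 0:
            js += w_t * b * math.log(b / m)
    return total * js


def sib_cluster(counts, k, max_iter=10):
    """Cluster rows of a dense term-count matrix into k clusters.
    Assumes len(counts) >= k and every row has a positive sum."""
    n, d = len(counts), len(counts[0])
    grand = sum(sum(row) for row in counts)

    # Deterministic round-robin start (an assumption made for this sketch).
    labels = [i % k for i in range(n)]
    cluster = [[0] * d for _ in range(k)]  # summed term counts per cluster
    size = [0] * k
    for x, t in enumerate(labels):
        size[t] += 1
        for y, c in enumerate(counts[x]):
            cluster[t][y] += c

    for _ in range(max_iter):
        moved = 0
        for x in range(n):
            old = labels[x]
            if size[old] == 1:
                continue  # never empty a cluster
            row = counts[x]
            mass_x = sum(row)
            # Draw x out of its current cluster.
            for y, c in enumerate(row):
                cluster[old][y] -= c
            size[old] -= 1
            py_x = [c / mass_x for c in row]
            # Pick the cluster with the smallest merge cost.
            best, best_cost = old, float("inf")
            for t in range(k):
                mass_t = sum(cluster[t])
                py_t = [c / mass_t for c in cluster[t]]
                cost = merge_cost(mass_x / grand, py_x, mass_t / grand, py_t)
                if cost < best_cost:
                    best, best_cost = t, cost
            # Merge x into the chosen cluster.
            for y, c in enumerate(row):
                cluster[best][y] += c
            size[best] += 1
            if best != old:
                labels[x] = best
                moved += 1
        if moved == 0:
            break  # no document changed cluster: converged
    return labels


# Toy Term-Frequency data: two groups of documents with disjoint vocabularies.
docs = [
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [6, 3, 0, 0],
    [0, 0, 5, 4],
    [0, 0, 4, 6],
    [0, 0, 3, 5],
]
labels = sib_cluster(docs, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Keeping per-cluster summed counts makes each draw and merge a simple subtraction and addition, which is the kind of bookkeeping that benefits from a sparse representation when most term counts are zero.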
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9407479 | PMC |
http://dx.doi.org/10.3390/e24081132 | DOI Listing |
Sci Rep
January 2025
School of Computer Science and Technology, Donghua University, Shanghai, 201620, China.
Extracting high-order abstract patterns from complex high-dimensional data forms the foundation of human cognitive abilities. Abstract visual reasoning involves identifying abstract patterns embedded within composite images, considered a core competency of machine intelligence. Traditional neuro-symbolic methods often infer unknown objects through data fitting, without fully exploring the abstract patterns within composite images and the sequential sensitivity of visual sequences.
Chem Sci
January 2025
Centre for Membrane Separations, Adsorption, Catalysis and Spectroscopy for Sustainable Solutions (cMACS), KU Leuven, Celestijnenlaan 200F, Post Box 2454, 3001 Leuven, Belgium.
Plastic waste conversion into valuable chemicals is a promising alternative to landfill or incineration. In particular, the chemical upcycling of polybutadiene rubber (PBR) could provide a renewable route towards highly desirable α,ω-dienes with varying chain lengths, which can find ample industrial application. While previous research has shown that the treatment of polybutadiene with a consecutive hydrogenation and ethenolysis reaction can afford long-chain α,ω-dienes, achieving precise control over the product chain length remains an important bottleneck.
Chem Sci
January 2025
J. Mike Walker '66 Department of Mechanical Engineering, Texas A&M University, College Station, TX 77843, USA.
This perspective work examines the current advancements in integrated CO2 capture and electrochemical conversion technologies, comparing the emerging methods of (1) electrochemical reactive capture (eRCC) through amine- and (bi)carbonate-mediated processes and (2) direct (flue gas) adsorptive capture and conversion (ACC) with the conventional approach of sequential carbon capture and conversion (SCCC). We initially identified and discussed a range of cell-level technological bottlenecks inherent to eRCC and ACC including, but not limited to, mass transport limitations of reactive species, limitation of dimerization, impurity effects, inadequate generation of CO2 to sustain industrially relevant current densities, and catalyst instabilities with respect to some eRCC electrolytes, amongst others. We followed this with stepwise perspectives on whether these are considered intrinsic challenges of the technologies; otherwise, recommendations were disclosed where appropriate.
Angew Chem Int Ed Engl
January 2025
Xiamen Key Laboratory of Optoelectronic Materials and Advanced Manufacturing, Institute of Luminescent Materials and Information Displays, College of Materials Science and Engineering, Huaqiao University, Xiamen, 361021, China.
The advancement of tin-based perovskite solar cells (TPSCs) has been severely hindered by the poor controllability of perovskite crystal growth and the energy level mismatch between the perovskite and fullerene-based electron transport layer (ETL). Here, we synthesized three cis-configured pyridyl-substituted fulleropyrrolidines (PPF), specifically 2-pyridyl (PPF2), 3-pyridyl (PPF3), and 4-pyridyl (PPF4), and utilized them as precursor additives to regulate the crystallization kinetics during film formation. The spatial distance between the two pyridine groups in PPF2, PPF3, and PPF4 increases sequentially, enabling PPF4 to interact with more perovskite colloidal particles.
Chem Rev
January 2025
Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, California 90089, United States.
Conventional artificial intelligence (AI) systems are facing bottlenecks due to the fundamental mismatches between AI models, which rely on parallel, in-memory, and dynamic computation, and traditional transistors, which have been designed and optimized for sequential logic operations. This calls for the development of novel computing units beyond transistors. Inspired by the high efficiency and adaptability of biological neural networks, computing systems mimicking the capabilities of biological structures are gaining more attention.