5,091 results match your criteria: "Scientific data[Journal]"

A significant challenge in computational chemistry is developing approximations that accelerate ab initio methods while preserving accuracy. Machine learning interatomic potentials (MLIPs) have emerged as a promising solution for constructing atomistic potentials that can be transferred across different molecular and crystalline systems. Most MLIPs are trained only on energies and forces in vacuum, while an improved description of the potential energy surface could be achieved by including the curvature of the potential energy surface.

The effect of work content on workload, stress, and performance has not been well addressed in the literature, owing to the lack of a comprehensive conceptualization, a clear problem definition, and relevant datasets. The gap between laboratory-simulated studies and real-life working conditions limits the generalizability of findings and hinders the development of performance management and monitoring tools. To address this gap, a data collection effort was organized that considers the unique work conditions and work content factors of a coffee shop, in order to conceptualize scenarios that better highlight their effect on human performance, resulting in the Work content Effect on BAristas (WEBA) dataset.

The process of developing new drugs is arduous and costly, particularly for targets classified as "difficult-to-drug." Macrocycles show a particular ability to modulate difficult-to-drug targets, including protein-protein interactions, while still allowing oral administration. However, the determination of membrane permeability, critical for reaching intracellular targets and for oral bioavailability, is laborious and expensive.

Multi-Planar Cervical Motion Dataset: IMU Measurements and Goniometer.

Sci Data

January 2025

Department of Anatomy and Anthropology, Faculty of Medical & Health Sciences, Tel-Aviv University, Tel-Aviv, 699780, Israel.

This data descriptor presents a comprehensive and replicable dataset and method for calculating the cervical range of motion (CROM) using quaternion-based orientation analysis of data from Delsys inertial measurement unit (IMU) sensors. The study was conducted with 14 participants and analyzed 504 cervical movements in the sagittal, frontal, and horizontal planes. The method was validated against a universal goniometer and tested for reliability and reproducibility.
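
To illustrate what a quaternion-based orientation comparison involves, the following is a minimal sketch, not the authors' published method. It assumes orientations are given as (w, x, y, z) quaternions and computes the total rotation angle between a neutral pose and a current pose; the function name and sample values are hypothetical, and separating the angle into sagittal, frontal, and horizontal components would need additional steps not shown here.

```python
import math


def quaternion_angle_deg(q_ref, q_cur):
    """Smallest rotation angle in degrees between two orientations.

    Both inputs are (w, x, y, z) quaternions; they are normalized here,
    so small deviations from unit length are tolerated.
    """
    def normalize(q):
        norm = math.sqrt(sum(c * c for c in q))
        if norm == 0.0:
            raise ValueError("zero-length quaternion")
        return tuple(c / norm for c in q)

    a = normalize(q_ref)
    b = normalize(q_cur)
    # q and -q describe the same orientation, so use the absolute dot product.
    dot = abs(sum(x * y for x, y in zip(a, b)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))


# Hypothetical sample: neutral pose versus a 90-degree rotation about the z axis.
neutral = (1.0, 0.0, 0.0, 0.0)
rotated = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
angle = quaternion_angle_deg(neutral, rotated)
```

Taking the absolute value of the dot product keeps the result in the range 0 to 180 degrees regardless of the sign convention the sensor uses for its quaternions.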

The Qinghai-Tibet Plateau (QTP) is a high mountain area prone to destructive rainstorms and the natural disasters they induce, which underscores the importance of developing precipitation intensity-duration-frequency (IDF) curves for estimating extreme precipitation characteristics. Here we introduce the Qinghai-Tibet Plateau Precipitation Intensity-Duration-Frequency Curves (QTPPIDFC) dataset, the first gridded dataset tailored for estimating extreme precipitation characteristics in the QTP. First, the generalized extreme value distribution is fitted to the annual maximum precipitation samples at 203 weather stations, and at-site IDF curves are estimated from these fits; second, principal component analysis identifies the southeast-northwest spatial pattern of the at-site IDF curves, with the first principal component explaining 96% of the variance; finally, gridded IDF curves are estimated by spatial interpolation using a random forest model with geographical and climatic variables as predictors.
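
As a rough illustration of how a fitted generalized extreme value (GEV) distribution yields one point on an IDF curve, the sketch below evaluates the standard GEV return-level formula. It is not taken from the dataset's code: the function name and parameter values are hypothetical, and it assumes the common sign convention in which a positive shape parameter xi means a heavy upper tail.

```python
import math


def gev_return_level(mu, sigma, xi, return_period_years):
    """Return level exceeded on average once every `return_period_years`.

    mu: location, sigma: scale (> 0), xi: shape (xi > 0 is heavy-tailed).
    The xi == 0 case is the Gumbel limit of the GEV distribution.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if return_period_years <= 1:
        raise ValueError("return period must exceed 1 year")
    # Annual non-exceedance probability is 1 - 1/T.
    y = -math.log(1.0 - 1.0 / return_period_years)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)
    return mu + (sigma / xi) * (y ** (-xi) - 1.0)


# Hypothetical parameters for one station and one duration (intensity in mm/h).
level_10yr = gev_return_level(mu=30.0, sigma=8.0, xi=0.1, return_period_years=10)
level_100yr = gev_return_level(mu=30.0, sigma=8.0, xi=0.1, return_period_years=100)
```

Repeating this calculation for each duration and return period, with parameters fitted per duration, traces out an at-site IDF curve.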

Eriocraniidae (Lepidoptera) are widespread leaf miners and have unique adaptability to hypoxia and low temperatures, causing covert but devastating harm to Fagales (Betulaceae and Fagaceae) plants in the Holarctic. However, the lack of a high-quality genome of this most ancient family within the angiosperm-feeding group largely limits the studies on the phylogeny and environmental adaptation of the primitive Lepidoptera. In this study, utilizing Illumina sequencing, PacBio HiFi sequencing, and Hi-C technology, we constructed a chromosome-level genome assembly of E.

The global decline in bee populations poses significant risks to agriculture, biodiversity, and environmental stability. To bridge the gap in existing data, we introduce ApisTox, a comprehensive dataset focusing on the toxicity of pesticides to honey bees (Apis mellifera). This dataset combines and leverages data from existing sources such as ECOTOX and PPDB, providing an extensive, consistent, and curated collection that surpasses the previous datasets.

The Norwegian Parliamentary Debates Dataset.

Sci Data

January 2025

Norwegian Institute of Public Health and Department of Health Management and Health Economics, University of Oslo, Oslo, 0316, Norway.

Recent advancements in computing power and machine learning techniques have facilitated the digitization of new corpora, as well as new methods for studying high-dimensional data. This has enabled empirical investigations of fundamental questions in the social sciences that were previously restricted by technical limitations or data availability. In this note, we introduce a new dataset covering debates in the Norwegian Parliament in the 1945-2024 period.

Chromosome-level genome assembly of Monolepta hieroglyphica, two-spotted leaf beetle (Coleoptera: Chrysomelidae).

Sci Data

January 2025

Institute of Biotechnology, Beijing Key Laboratory of Agricultural Gene Resources and Biotechnology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, 100097, China.

Monolepta hieroglyphica, in view of its wide-ranging host and highly polyphagous characteristics, has become an important agricultural pest in East and Southeast Asian countries. To better understand its biology and develop control strategies, we present a high-quality chromosome-level genome assembly of M. hieroglyphica, with contig N50 of 18.

Near telomere-to-telomere assembly of the Tarim pigeon (Columba livia) genome.

Sci Data

December 2024

Department of Animal Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, 200240, China.

Pigeons serve as important model animals and commercial poultry. The Tarim pigeon, as a breed of Columba livia, is a locally indigenous breed unique to China. While the genome of C.

Chromosome-scale assembly and annotation of the wild wheat relative Aegilops comosa.

Sci Data

December 2024

State Key Laboratory of Wheat Improvement, Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agriculture Sciences in Weifang, Weifang, 261325, Shandong, China.

Wild relatives of wheat are valuable sources for enhancing the genetic diversity of common wheat. Aegilops comosa, an annual diploid species with an MM genome constitution, possesses numerous agronomically valuable traits that can be exploited for wheat improvement. In this study, we report a chromosome-level genome assembly of Ae.

ITH: an open database on Italian Tenders 2016-2023.

Sci Data

December 2024

University of Turin, Computer Science Department, Turin, 10149, Italy.

Governments procure large amounts of goods and services to help them implement policies and deliver public services; in Italy, this is an essential sector, corresponding to about 12% of the gross domestic product. Data are increasingly recorded in public repositories, although they are often divided into multiple sources and not immediately available for consultation. This paper provides a description and analysis of an effort to collect and arrange a legal public administration database.

Chromosome-level genome assembly and annotation of Barbel chub Squaliobarbus curriculus.

Sci Data

December 2024

Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, Guangdong Provincial Engineering Technology Research Center for Environmentally Friendly Aquaculture, School of Life Sciences, South China Normal University, Guangzhou, 510631, China.

The barbel chub, Squaliobarbus curriculus, is an economically important freshwater fish in China. The fishery production of wild populations has declined dramatically, making the development of aquaculture an urgent need. However, the lack of a high-quality genome has impeded its artificial and genetic breeding.

Chromosome-level genome assembly of Triplophysa bombifrons using PacBio HiFi sequencing and Hi-C technologies.

Sci Data

December 2024

College of Life Science and Technology/Tarim Research Center of Rare Fishes, Tarim University, CN-0997, Alar 843300, Xinjiang Uygur Autonomous Region, Xinjiang, China.

Triplophysa bombifrons, a species of bony fish localized in China, has largely been understudied genetically, with limited data available beyond its mitochondrial genome. This study introduces a chromosome-level genome assembly for T. bombifrons, achieved through the integration of PacBio long-read sequencing and Hi-C chromatin interaction mapping.

In this paper, we describe a dataset captured with our proprietary data capture solution mounted on top of a Land Rover Defender vehicle. The captured data are real recordings of drives on various Slovak roads. The full dataset consists of almost 33 hours of driving recorded with an automotive-grade FPD-Link camera at 30 fps, together with additional sensors such as a high-precision GNSS sensor and a modem for LTE and 5G mobile data connectivity.

The northern Gulf of Mexico (nGoM) receives water from over 50 rivers which are highly influenced by humans and include the largest river in the United States, the Mississippi River. To support large-scale data-driven research centered on the dynamic river-ocean system in the region, this study consolidated hydrogeochemical river and ocean data from across the nGoM. In particular, we harmonized 35 chemical solute parameters from 54 rivers and incorporated river discharge data to derive daily solute concentration and flux estimates throughout the nGoM.
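
The simplest way to see how a flux estimate follows from concentration and discharge is the unit conversion sketched below. This is an illustrative assumption about units (mg/L and m³/s), not the study's actual estimation procedure, which is not described in this excerpt; the function name and sample values are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def daily_flux_kg(concentration_mg_per_l, discharge_m3_per_s):
    """Daily solute flux in kg/day from a daily mean concentration and discharge.

    1 mg/L equals 1 g/m^3, so concentration times discharge is in g/s.
    """
    grams_per_second = concentration_mg_per_l * discharge_m3_per_s
    return grams_per_second * SECONDS_PER_DAY / 1000.0


# Hypothetical sample: 2.5 mg/L of a solute at a discharge of 100 m^3/s.
flux = daily_flux_kg(2.5, 100.0)
```

With these units the conversion factor is 86.4, so a concentration of 1 mg/L at 1 m³/s corresponds to 86.4 kg/day.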

Comprehensive Mass Spectral Libraries of Human Thyroid Tissues and Cells.

Sci Data

December 2024

Westlake Center for Intelligent Proteomics, Westlake Laboratory of Life Sciences and Biomedicine, No. 18 Shilongshan Road, Hangzhou, 310024, China.

Thyroid nodules are a common endocrine condition with an increasing incidence over the decades. Data-independent acquisition has been widely utilized in discovery proteomics to identify disease biomarkers and therapeutic targets. To analyze the thyroid disease-related proteome in a high-throughput, reproducible and reliable manner, we introduce thyroid-specific peptide spectral libraries.

Transcriptome and translatome profiling of Col-0 and grp7grp8 under ABA treatment in Arabidopsis.

Sci Data

December 2024

Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China.

Abscisic acid (ABA) is a crucial phytohormone that regulates plant growth and stress responses. While substantial knowledge exists about transcriptional regulation, the molecular mechanisms underlying ABA-triggered translational regulation remain unclear. Recent advances in deep sequencing of ribosome footprints (Ribo-seq) enable the mapping and quantification of mRNA translation efficiency.

BMT: A Cross-Validated ThinPrep Pap Cervical Cytology Dataset for Machine Learning Model Training and Validation.

Sci Data

December 2024

Department of Pathology and Laboratory Medicine, Alpert Medical School, Brown University, Providence, RI, 02912, USA.

In the past several years, a few cervical Pap smear datasets have been published for use in clinical training. However, most publicly available datasets consist of pre-segmented single cell images, contain on-image annotations that must be manually edited out, or are prepared using the conventional Pap smear method. Multicellular liquid Pap image datasets are a more accurate reflection of current cervical screening techniques.

High-frequency precipitation (solid/liquid) isotope datasets are useful for identifying moisture sources and the various dynamical and thermodynamical processes that control precipitation formation. Here, we report three-year (2019-2021) daily rain isotope datasets (oxygen, δ¹⁸O hereafter, and hydrogen, δ²H hereafter) from three unique locations in India during the Indian Summer Monsoon (ISM). The locations are (1) Port Blair, an island situated in the Bay of Bengal (BoB); (2) Mahabaleshwar, located at the crest of the Western Ghats mountains; and (3) Tezpur, in northeast India, situated close to a dense forest.

As molecular research on hemp (Cannabis sativa L.) continues to advance, there is a growing need for more diverse genome data and more accurate genome assemblies. In this study, we report three-way assembly data for a cannabidiol (CBD)-rich cannabis variety, the 'Pink Pepper' cultivar, generated with three sequencing technologies: PacBio Single Molecule Real-Time (SMRT) sequencing, Illumina sequencing, and Oxford Nanopore Technology (ONT).

Unveiling insights from the Joint FAO/WHO Expert Committee on Food Additives (JECFA) portal.

Sci Data

December 2024

Unit of Biostatistics, Epidemiology, and Public Health, Department of Cardiac Thoracic Vascular Sciences and Public Health, University of Padova, via Loredan 18, Padova, 35131, Italy.

This study presents a method for automating the retrieval of key identifiers and links to toxicological data from the Joint FAO/WHO Expert Committee on Food Additives (JECFA) database using web scraping techniques. Although the method primarily serves as an automated indexing tool that facilitates the organization of, and access to, relevant reports, monographs, and specifications, it significantly enhances the efficiency of navigating the extensive JECFA database. Researchers can then perform more targeted and efficient searches, although additional manual steps are required to extract and structure the detailed toxicological data.

To achieve carbon neutrality, solar photovoltaic (PV) in China has undergone enormous development over the past few years. PV datasets with high accuracy and fine temporal span are crucial to assess the corresponding carbon reductions. In this study, we employed the random forest classifier to extract PV installations throughout China in 2015 and 2020 using Landsat-8 imagery in Google Earth Engine.

Chromosome-level genome assembly of the tetraploid medicinal and natural dye plant Persicaria tinctoria.

Sci Data

December 2024

Department of Economic Plants and Biotechnology, Yunnan Key Laboratory for Wild Plant Resources, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, 650201, China.

Persicaria tinctoria (2n = 40) is an important traditional medicinal plant and natural dye source within the genus Persicaria. P. tinctoria has been utilized for its antibacterial, antiviral, anti-inflammatory, and tumor treatment properties.

Chromosome-level genome assembly of the northern snakehead (Channa argus) using PacBio and Hi-C technologies.

Sci Data

December 2024

Key Laboratory of Mariculture (Ocean University of China), Ministry of Education (KLMME), Fisheries College, Ocean University of China, Qingdao, 266003, China.

The evolutionary origins of specialized organs pose significant challenges for empirical studies, as most such organs evolved millions of years ago. The Northern snakehead (Channa argus), an air-breathing fish, possesses a suprabranchial organ, a common feature of the Anabantoidei, offering a unique opportunity to investigate the function and evolutionary origins of specialized organs. In this study, a high-quality chromosome-level reference genome of C.
