Motivation: Advances in genomics have created a pressing need for accessible tools that simplify complex genetic data analysis, enabling researchers across fields to harness the power of genome-wide association studies and genomic prediction. GWAStic was developed to bridge this gap, providing an intuitive platform that combines artificial intelligence with traditional statistical methods, making sophisticated genomic analysis accessible without requiring deep expertise in statistical software.
Results: We present GWAStic, an intuitive, cross-platform desktop application designed to streamline genome-wide association studies and genomic prediction for biological and medical researchers.
Motivation: The Galaxy workflow system is an open-source platform supporting data-intensive research in life sciences, featuring a user-friendly web interface for complex analyses without extensive programming. It also offers a representational state transfer based API, enabling remote execution of specific tools. Galaxy supports similarity searches for nucleotide and amino acid sequences, with integrated tools like NCBI BLAST+ and DIAMOND.
Pangenomes are collections of annotated genome sequences of multiple individuals of a species. The structural variants uncovered by these datasets are a major asset to genetic analysis in crop plants. Here we report a pangenome of barley comprising long-read sequence assemblies of 76 wild and domesticated genomes and short-read sequence data of 1,315 genotypes.
The FAIR data principle as a commitment to support long-term research data management is widely accepted in the scientific community. However, although many established infrastructures provide comprehensive and long-term stable services and platforms, a large quantity of research data is still hidden. Currently, high-throughput plant genomics and phenomics technologies are producing research data in abundance, the storage of which is not covered by established core databases.
Background: The sequencing of whole genomes is becoming increasingly affordable. In this context, large-scale sequencing projects are generating ever larger datasets of species-specific genomic diversity. As a consequence, more and more genomic data need to be made easily accessible and analyzable to the scientific community.
With the ongoing cost decrease of genotyping and sequencing technologies, accurate and fast phenotyping remains the bottleneck in utilizing plant genetic resources for breeding and breeding research. Although cost-efficient high-throughput phenotyping platforms are emerging for specific traits and/or species, manual phenotyping is still widely used and is a time- and money-consuming step. Approaches that improve data recording, processing or handling are pivotal steps towards the efficient use of genetic resources and are demanded by the research community.
In recent years, progress in data collection in the life sciences has created increasing demand and opportunities for advanced bioinformatics. This includes data management as well as the individual data analysis and often covers the entire data life cycle. A variety of tools have been developed to store, share, or reuse the data produced in the different domains such as genotyping.
In this opinion article, we discuss the formatting of files from (plant) genotyping studies, in particular the formatting of metadata in Variant Call Format (VCF) files. The flexibility of the VCF format specification facilitates its use as a generic interchange format across domains but can lead to inconsistency between files in the presentation of metadata. To enable fully autonomous machine actionable data flow, generic elements need to be further specified.
The research data life cycle from project planning to data publishing is an integral part of current research. Until the last decade, researchers were responsible for all associated phases in addition to the actual research and were assisted only at certain points by IT or bioinformaticians. Starting with advances in sequencing, the automation of analytical methods in all life science fields, including in plant phenotyping, has led to ever-increasing amounts of ever more complex data.
Background: Linking nucleotide sequence data (NSD) to scientific publication citations can enhance understanding of NSD provenance, scientific use, and reuse in the community. By connecting publications with NSD records, NSD geographical provenance information, and author geographical information, it becomes possible to assess the contribution of NSD to infer trends in scientific knowledge gain at the global level.
Findings: We extracted and linked records from the European Nucleotide Archive to citations in open-access publications aggregated at Europe PubMed Central.
Fukushima J Med Sci
October 2021
This paper reports on the IAEA's Consultancy Meeting on "low-dose radiation for patients and population -Science, Technology and Society (STS) concepts for communication and perception among medical doctors and stakeholders-", which was held on October 21 and 22, 2020. The meeting consisted of seven presentation sessions, with a total of 27 presentations and 39 participants from seven countries. The meeting focused on various areas including environmental, food, and personal dosimetry; radiation and other secondary health effects after nuclear disasters; communication between medical professionals and patients or residents; and medical education on nuclear accidents.
The potential of big data to support businesses has been demonstrated in financial services, manufacturing, and telecommunications. Here, we report on efforts to enter a new data era in plant breeding by collecting genomic and phenotypic information from 12,858 wheat genotypes representing 6575 single-cross hybrids and 6283 inbred lines that were evaluated in six experimental series for yield in field trials encompassing ~125,000 plots. Integrating data resulted in twofold higher prediction ability compared with cases in which hybrid performance was predicted across individual experimental series.
Rye (Secale cereale L.) is an exceptionally climate-resilient cereal crop, used extensively to produce improved wheat varieties via introgressive hybridization and possessing the entire repertoire of genes necessary to enable hybrid breeding. Rye is allogamous and only recently domesticated, thus giving cultivated ryes access to a diverse and exploitable wild gene pool.
Experimental data is only useful to other researchers if it is findable, accessible, interoperable, and reusable (FAIR). The ISA-Tab framework enables scientists to publish metadata about their experiments in a plain text, machine-readable format that aims to confer that interoperability and reusability. A Python software package (isatools) is currently being developed to programmatically produce these metadata files.
Sequence assembly of large and repeat-rich plant genomes has been challenging, requiring substantial computational resources and often several complementary sequence assembly and genome mapping approaches. The recent development of fast and accurate long-read sequencing by circular consensus sequencing (CCS) on the PacBio platform may greatly increase the scope of plant pan-genome projects. Here, we compare current long-read sequencing platforms regarding their ability to rapidly generate contiguous sequence assemblies in pan-genome studies of barley (Hordeum vulgare).
This article describes some use case studies and self-assessments of FAIR status of de.NBI services to illustrate the challenges and requirements for the definition of the needs of adhering to the FAIR (findable, accessible, interoperable and reusable) data principles in a large distributed bioinformatics infrastructure. We address the challenge of heterogeneity of wet lab technologies, data, metadata, software, computational workflows and the levels of implementation and monitoring of FAIR principles within the different bioinformatics sub-disciplines joint in de.
Advances in genomics have expedited the improvement of several agriculturally important crops but similar efforts in wheat (Triticum spp.) have been more challenging. This is largely owing to the size and complexity of the wheat genome, and the lack of genome-assembly data for multiple wheat lines.
The German Network for Bioinformatics Infrastructure (de.NBI) is a national and academic infrastructure funded by the German Federal Ministry of Education and Research (BMBF). The de.
Duckweeds are small, free-floating, morphologically highly reduced organisms belonging to the monocot order Alismatales. They display the most rapid growth among flowering plants, vary ~ 14-fold in genome size and comprise five genera. Spirodela is the phylogenetically oldest genus with only two mainly asexually propagating species: S.
Background: The FAIR data principle as a commitment to support long-term research data management is widely accepted in the scientific community. Although the ELIXIR Core Data Resources and other established infrastructures provide comprehensive and long-term stable services and platforms for FAIR data management, a large quantity of research data is still hidden or at risk of getting lost. Currently, high-throughput plant genomics and phenomics technologies are producing research data in abundance, the storage of which is not covered by established core databases.