Data science and machine learning in materials science require large datasets of technologically relevant molecules or materials. Currently, publicly available molecular datasets with realistic molecular geometries and spectral properties are rare. Here we provide a diverse benchmark spectroscopy dataset, denoted OE62, of 61,489 molecules extracted from organic crystals in the Cambridge Structural Database (CSD).
Instant machine learning predictions of molecular properties are desirable for materials design, but the predictive power of the methodology is mainly tested on well-known benchmark datasets. Here, we investigate the performance of machine learning with kernel ridge regression (KRR) for the prediction of molecular orbital energies on three large datasets: the standard QM9 small organic molecules set, amino acid and dipeptide conformers, and organic crystal-forming molecules extracted from the Cambridge Structural Database. We focus on the prediction of highest occupied molecular orbital (HOMO) energies, computed at the density-functional level of theory.
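For readers unfamiliar with kernel ridge regression, the following is a minimal sketch of the general technique, not the authors' implementation. It assumes a Gaussian kernel and uses random placeholder descriptors and a synthetic target in place of real molecular representations and HOMO energies; the function names `gaussian_kernel`, `krr_fit`, and `krr_predict` and the values of `sigma` and `lam` are illustrative choices, not taken from the abstract.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian (RBF) kernel between the rows of A and the rows of B."""
    # Pairwise squared Euclidean distances, clipped at zero against round-off.
    d2 = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def krr_fit(X, y, sigma, lam):
    """Solve (K + lam * I) alpha = y for the regression weights alpha."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_new, sigma):
    """Predict targets for X_new as a kernel-weighted sum over training points."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

# Placeholder data (assumption): 200 "molecules" with 5-dimensional descriptors
# and a smooth synthetic target standing in for an orbital energy.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2

sigma, lam = 2.0, 1e-3  # hyperparameters, normally chosen by cross-validation
alpha = krr_fit(X[:150], y[:150], sigma, lam)
pred = krr_predict(X[:150], alpha, X[150:], sigma)
mae = np.abs(pred - y[150:]).mean()
print(f"held-out mean absolute error: {mae:.3f}")
```

In practice the kernel width and the regularization strength are tuned on held-out data, and the descriptor is some fixed-length representation of the molecular structure.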
Deep learning methods for the prediction of molecular excitation spectra are presented. For the example of the electronic density of states of 132k organic molecules, three different neural network architectures, multilayer perceptron (MLP), convolutional neural network (CNN), and deep tensor neural network (DTNN), are trained and assessed. The inputs for the neural networks are the coordinates and charges of the constituent atoms of each molecule.
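The abstract does not say how coordinates and charges are encoded for each network. As an illustration only, one widely used fixed-size encoding is the Coulomb matrix, with diagonal entries 0.5 Z_i^2.4 and off-diagonal entries Z_i Z_j / |R_i - R_j|, zero-padded to a common size. The sketch below assumes that encoding; the function name `coulomb_matrix` and the `size` parameter are hypothetical, and distances are taken in whatever units the coordinates are given in.

```python
import numpy as np

def coulomb_matrix(charges, coords, size):
    """Zero-padded Coulomb matrix built from nuclear charges and 3D coordinates."""
    Z = np.asarray(charges, dtype=float)
    R = np.asarray(coords, dtype=float)
    n = len(Z)
    if n > size:
        raise ValueError("molecule has more atoms than the padded matrix size")
    # Pairwise interatomic distances.
    dist = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
    # Placeholder on the diagonal to avoid dividing by zero; overwritten below.
    dist[np.diag_indices(n)] = 1.0
    M = np.outer(Z, Z) / dist
    M[np.diag_indices(n)] = 0.5 * Z ** 2.4
    padded = np.zeros((size, size))
    padded[:n, :n] = M
    return padded

# Toy example (assumption): two unit charges separated by a distance of 1.
cm = coulomb_matrix([1, 1], [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]], size=3)
print(cm)
```

A flattened matrix of this kind could serve as input to a multilayer perceptron, while architectures that operate on atoms directly can consume charges and interatomic distances without such a fixed-size encoding.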
Several physiological processes have been purported for cellular prion protein (PrP(C)). However, the physiological function of PrP(C) is still unclear and the cellular localization of PrP(C) remains a subject of debate. PrP(C) is expressed in a wide range of tissues including islets of Langerhans.
Because of their high antigen specificity and metabolic stability, genetically engineered human monoclonal antibodies are on the way to becoming one of the most promising medical diagnostics and therapeutics. In order to establish an in vitro system capable of producing such biosimilar antibodies, we used human constant chain sequences to design the novel human antibody expressing vector cassette pMAB-ABX. A bidirectional tetracycline (tet)-controllable promoter was used for harmonized expression of immunoglobulin type G (IgG) heavy and light chains.