Background: Regularized regression methods such as principal component or partial least squares regression perform well in learning tasks on high-dimensional spectral data, but cannot explicitly eliminate irrelevant features. The random forest classifier with its associated Gini feature importance, on the other hand, allows explicit feature elimination, but may not be optimally adapted to spectral data because of the topology of its constituent classification trees, which are based on orthogonal splits in feature space.
Results: We propose to combine the best of both approaches, and evaluate the joint use of recursive feature elimination, based on the Gini importance of random forests, together with regularized classification methods on spectral data sets from medical diagnostics, chemotaxonomy, biomedical analytics, and food science, as well as on synthetically modified spectral data.
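The combination described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: it assumes scikit-learn, uses synthetic data from `make_classification` in place of real spectra, and lets PCA followed by logistic regression stand in for the regularized classifier. The function name `select_then_classify` and all parameter values are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def select_then_classify(X, y, n_keep=10, n_components=5, seed=0):
    """Gini-importance RFE followed by a principal-component classifier."""
    forest = RandomForestClassifier(n_estimators=50, random_state=seed)
    # RFE refits the forest each round and drops the 5 features with the
    # lowest feature_importances_ (scikit-learn's Gini importance).
    selector = RFE(forest, n_features_to_select=n_keep, step=5)
    # Assumption: PCA + logistic regression stands in for the regularized
    # classifier; a PLS-based classifier could be swapped in instead.
    model = make_pipeline(
        selector,
        PCA(n_components=n_components),
        LogisticRegression(max_iter=1000),
    )
    model.fit(X, y)
    # Boolean mask over the original features: True = retained.
    support = model.named_steps["rfe"].support_
    return model, support


# Synthetic stand-in for a spectral data set (40 channels, 5 informative).
X, y = make_classification(
    n_samples=120, n_features=40, n_informative=5, random_state=0
)
model, support = select_then_classify(X, y, n_keep=10)
print(int(support.sum()), "features retained")
```

In practice the selection step would have to be fitted inside each cross-validation fold, as the pipeline form permits, so that the reported accuracy is not biased by selecting features on the test data.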
Stopped-flow spectroscopy is an established technique for acquiring kinetic data on dynamic processes in chemical and biochemical reactions, and Fourier transform infrared (FT-IR) techniques can provide particularly rich structural information on biological macromolecules. However, it is a considerable challenge to design an FT-IR stopped-flow system with an optical path length short enough for work with aqueous (¹H₂O) solutions. The system presented here is designed for minimal sample volumes (approximately 5 µL) and allows simultaneous FT-IR rapid-scan and VIS measurements.
We have investigated the conformational transition and aggregation process of recombinant Syrian hamster prion protein (SHaPrP90-232) by Fourier transform infrared spectroscopy, circular dichroism spectroscopy, light scattering, and electron microscopy under equilibrium and kinetic conditions. SHaPrP90-232 showed an infrared absorbance spectrum typical of proteins with a predominant alpha-helical structure both at pH 7.0 and at pH 4.