Publications by authors named "Grzymala-Busse J"

As previous research indicates, a multiple-scanning methodology for discretization of numerical datasets, based on entropy, is very competitive. Discretization is a process of converting numerical values of the data records into discrete values associated with numerical intervals defined over the domains of the data records. In multiple-scanning discretization, the last step is the merging of neighboring intervals in discretized datasets as a kind of postprocessing.
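As a rough illustration of entropy-based discretization, the sketch below picks a single cut point for one numerical attribute by minimizing the weighted class entropy of the two resulting intervals, then maps values to interval labels. It is not the multiple-scanning algorithm from the article, and it does not show the interval-merging step. The function names (`entropy`, `best_cut_point`, `discretize`) and the sample values are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_cut_point(values, labels):
    """Return the cut point that minimizes the weighted entropy of both sides.

    Candidate cut points are midpoints between consecutive distinct values.
    Returns None when the attribute has fewer than two distinct values.
    """
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best, best_score = None, float("inf")
    for lo, hi in zip(distinct, distinct[1:]):
        cut = (lo + hi) / 2
        left = [c for v, c in pairs if v < cut]
        right = [c for v, c in pairs if v >= cut]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            best, best_score = cut, score
    return best


def discretize(value, cuts):
    """Map a number to the label of the half-open interval [lo, hi) containing it."""
    bounds = [float("-inf")] + sorted(cuts) + [float("inf")]
    for lo, hi in zip(bounds, bounds[1:]):
        if lo <= value < hi:
            return f"[{lo}, {hi})"
    raise ValueError("value is not a finite number")


# Hypothetical toy attribute: four cases, two classes.
values = [1.0, 2.0, 3.0, 4.0]
labels = ["a", "a", "b", "b"]
cut = best_cut_point(values, labels)
intervals = [discretize(v, [cut]) for v in values]
```

A full discretizer would repeat the cut-point search over all attributes and over the sub-intervals it creates. A single cut is used here only to keep the example short.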


Recently, research on mining microRNA (miRNA) expression data has received a lot of attention, mainly because of the role of miRNA in gene regulation. However, this type of data, usually stored in the form of microarrays, is very specific because it contains only a small number of cases (often fewer than 100) compared with a large number of attributes (several hundred or even tens of thousands). The small number of cases available during the learning process can cause instability of the newly created classifiers.


Issues obstructing progress in data mining for improved health outcomes include data quality problems, data redundancy, data inconsistency, repeated measures, temporal (time-contextual) measures, and data volume. Related issues include theoretical and technical problems of uncertainty management, missing data and missing values, and matching appropriate data mining techniques to patient data sets. Results of data mining research in progress are reported for Duke University's perinatal database, which contains nearly a decade of clinical patient data: 71,753 database (patient) records and 4,000-5,000 variables per patient.


Normal pregnancy involves a term of 40 weeks' gestation. Problems associated with low birthweight and prematurity continue to plague childbearing families and the healthcare system because 8-12% of all newborns in the United States are delivered prior to 37 weeks' gestation. The high cost of caring for premature babies increasingly treats all pregnant women as if they are 'high risk' for preterm birth.


Objective: Develop a prototype expert system for preterm birth risk assessment of pregnant women. Normal gestation involves a term of 40 weeks, but because 8-12% of the newborns in the United States are delivered prior to 37 weeks' gestation, problems associated with prematurity continue to plague individuals, families, and the health care system.

Design: A knowledge-base development methodology used machine learning, statistical analysis, and validation techniques to analyze three large datasets (18,890 subjects and 214 variables).


Prediction of preterm birth is a poorly understood domain. The existing manual methods of assessment of preterm birth are 17%-38% accurate. The machine learning system LERS was applied to three different datasets on pregnant women.


LERS-LB (Learning from Examples using Rough Sets Lower Boundaries) is a computer program based on rough set theory for knowledge acquisition, which extracts patterns from real-world data to generate production rules for expert system development. From a LERS-LB evaluation of an SPSS-X data file containing data for recovery room patients, it was concluded that both statistical data files and existing databases can be converted to the decision-table format needed by LERS-LB, but that it is less desirable to work with statistical files than with a well-developed database. It was also concluded that a well-developed database should be chosen and checked thoroughly for accuracy and completeness before running LERS-LB, or other learning programs, to avoid problems with data errors.
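To make the decision-table idea more concrete, the sketch below computes a rough-set lower approximation: the cases that certainly belong to a concept because every case with the same attribute values has the same decision. It illustrates the general rough-set notion only and is not the LERS-LB program. The function name, attribute names, and the tiny table are all hypothetical.

```python
from collections import defaultdict


def lower_approximation(cases, attributes, decision, concept):
    """Return sorted indices of cases that certainly belong to `concept`.

    Cases are grouped into blocks by their values on `attributes`. A block
    is included only if every case in it has decision value `concept`.
    """
    blocks = defaultdict(list)
    for i, case in enumerate(cases):
        key = tuple(case[a] for a in attributes)
        blocks[key].append(i)
    result = []
    for members in blocks.values():
        if all(cases[i][decision] == concept for i in members):
            result.extend(members)
    return sorted(result)


# Hypothetical decision table: two attributes and one decision per case.
cases = [
    {"temperature": "high", "pain": "yes", "recovery": "slow"},
    {"temperature": "high", "pain": "yes", "recovery": "fast"},
    {"temperature": "low", "pain": "no", "recovery": "fast"},
    {"temperature": "high", "pain": "no", "recovery": "slow"},
]
certain_slow = lower_approximation(cases, ["temperature", "pain"], "recovery", "slow")
certain_fast = lower_approximation(cases, ["temperature", "pain"], "recovery", "fast")
```

Cases 0 and 1 share the same attribute values but have different decisions, so neither belongs to any lower approximation. Conflicts of this kind are one reason to check a data source for accuracy and completeness before learning from it.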
