Objective: With an estimated global prevalence of 5% to 21%, polycystic ovary syndrome (PCOS) is one of the most common hormonal disorders. Many factors have been found to be related to PCOS. However, most of the studies reporting them used traditional methods such as multiple logistic regression (LR). Machine learning (Mach-L) has since emerged as a newer approach that can be applied to medical research. The present study had two goals: (1) to compare the accuracy of five Mach-L techniques with that of conventional LR, and (2) to use Mach-L to predict PCOS and rank its risk factors.
Materials and Methods: In total, 170 PCOS patients and 950 control participants were included. We collected demographic, biochemical, and lifestyle information. PCOS was diagnosed using the Rotterdam criteria. Five Mach-L algorithms were used: random forest (RF), stochastic gradient boosting (SGB), multivariate adaptive regression splines (MARS), extreme gradient boosting (XGBoost), and gradient boosting with categorical features support (CatBoost). Models with lower estimation errors were considered better.
Results: By t-test, subjects with PCOS were younger and had higher glutamic oxaloacetic transaminase (GOT), glutamic pyruvic transaminase (GPT), γ-glutamyl transferase (γ-GT), triglyceride (TG), and educational levels. All five Mach-L methods had lower estimation errors than LR. The mean AUC of the Mach-L methods was 0.6669, higher than that of LR (0.5908). Finally, age, TG, GPT, white blood cell count (WBC), uric acid (UA), and platelet count (Plt) were the six most important risk factors selected by Mach-L.
Conclusion: Mach-L methods outperformed conventional LR. Age was the most important factor, followed by TG, GPT, WBC, UA, and Plt, in a cohort of Chinese women.
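The abstract compares models by AUC but does not say how the metric was computed. As a minimal sketch, assuming the standard rank-based (Mann-Whitney) definition of AUC, the comparison of a mean AUC across several models might look like the following. The function name `auc`, the model names, and all labels and scores are hypothetical toy values for illustration, not data or code from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT from the study: 1 = PCOS, 0 = control.
labels = [1, 1, 1, 0, 0, 0, 0, 0]

# Hypothetical predicted probabilities from two illustrative models.
model_scores = {
    "model_a": [0.9, 0.6, 0.4, 0.5, 0.3, 0.2, 0.2, 0.1],
    "model_b": [0.7, 0.3, 0.2, 0.6, 0.4, 0.3, 0.1, 0.1],
}

# Per-model AUC, then the average across models.
aucs = {name: auc(labels, s) for name, s in model_scores.items()}
mean_auc = sum(aucs.values()) / len(aucs)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.6669 and 0.5908 should be read.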
DOI: http://dx.doi.org/10.1016/j.tjog.2024.09.019