This paper proposes a machine learning method for identifying and predicting financial fraud among listed companies. We collected 18,060 transactions described by 363 indicators: 362 financial variables and one class variable. We then removed 9 indicators unrelated to financial fraud and handled the missing values. From the remaining 353 indicators, we selected the 13 with the greatest impact on financial fraud, using multiple feature selection models and the frequency with which each feature was selected across all algorithms. Next, we built five single classification models, namely logistic regression (LR), random forest (RF), XGBOOST, support vector machine (SVM), and decision tree (DT), and three ensemble models based on a voting classifier to predict the financial fraud records of listed companies. Model parameters were tuned with grid search, and the models were compared on several evaluation metrics. Finally, we chose the optimal single model among the five machine learning algorithms and the best ensemble model among the hybrid models. The optimal single model achieved an accuracy between 97% and 99%, and the ensemble models exceeded 99%. The best of all models was a hybrid that combines logistic regression with XGBOOST, which shows that the optimal ensemble model can efficiently predict and detect fraudulent activity in companies. In the future, this approach could both predict fraudulent behavior in company management and reduce the burden of doing so.
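Two steps in the abstract, selecting features by how often they are chosen across selectors and combining models through voting, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the selector names, feature names, probabilities, the 0.5 threshold, and the choice of soft (probability-averaging) voting are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter


def select_by_frequency(selections, k):
    """Keep the k features chosen most often across several selectors.

    selections: mapping of selector name -> iterable of feature names it kept.
    Ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter()
    for features in selections.values():
        counts.update(set(features))  # each selector votes at most once per feature
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


def soft_vote(prob_lists, threshold=0.5):
    """Average per-model fraud probabilities and threshold the mean.

    Soft voting is an assumption here; the abstract only says "voting classifier".
    """
    n_models = len(prob_lists)
    means = [sum(column) / n_models for column in zip(*prob_lists)]
    return [1 if mean >= threshold else 0 for mean in means]


# Hypothetical selector outputs (selector and feature names are made up).
selections = {
    "lasso": ["roa", "debt_ratio", "cash_flow"],
    "random_forest": ["roa", "cash_flow", "inventory_turnover"],
    "chi2": ["roa", "debt_ratio", "audit_fee"],
}
top_features = select_by_frequency(selections, k=3)

# Hypothetical fraud probabilities from two already-fitted models, four records.
lr_probs = [0.9, 0.2, 0.6, 0.4]
xgb_probs = [0.8, 0.1, 0.3, 0.7]
labels = soft_vote([lr_probs, xgb_probs])
```

In this toy setup, a feature survives only if enough selectors agree on it, and a record is flagged as fraudulent only when the averaged probability of the two models reaches the threshold.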
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9407419 | PMC |
http://dx.doi.org/10.3390/e24081157 | DOI Listing |
United States and European Union laws demand separate clinical studies in children as a condition for drugs' marketing approval. Justified by carefully framed pseudo-scientific wordings, more so the European Medicines Agency than the United States Food and Drug Administration, "Pediatric Drug Development" is probably the largest abuse in medical research in history. Preterm newborns are immature and vulnerable, but they grow.
BMC Med Inform Decis Mak
January 2025
Department of Electrical Engineering, ESAT-STADIUS, KU Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium.
Background: Waste and fraud are important problems for health insurers to deal with. With the advent of big data, these insurers are looking more and more towards data mining and machine learning methods to help in detecting waste and fraud. However, labeled data is costly and difficult to acquire as it requires expert investigators and known care providers with atypical behavior.
Food Chem
January 2025
State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, PR China; Key Laboratory of Biology and Cultivation of Herb Medicine, Ministry of Agriculture and Rural Affairs, Beijing 100700, PR China.
Ensuring food traceability is essential for maintaining safety and authenticity. Angelica sinensis (Oliv.) Diels (AS), a medicinal food prized for its rich nutritional value and tonic effects, is frequently vulnerable to geographic origin fraud.
Food Chem
December 2024
Institute of Food Science and Technology, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, PR China.
To preemptively predict unknown protein adulterants in food and curb the incidence of food fraud at its origin, data-driven models were developed using three machine learning (ML) algorithms. Among these, the random forest (RF)-based model achieved optimal performance, achieving accuracies of 96.2 %, 95.
PLoS One
January 2025
Department of Computer Science and Engineering at Hanyang University ERICA, Ansan-si, Gyeonggi-do, South Korea.
Privacy-preserving record linkage (PPRL) technology, crucial for linking records across datasets while maintaining privacy, is susceptible to graph-based re-identification attacks. These attacks compromise privacy and pose significant risks, such as identity theft and financial fraud. This study proposes a zero-relationship encoding scheme that minimizes the linkage between source and encoded records to enhance PPRL systems' resistance to re-identification attacks.