Objective: Identifying fraud in healthcare programs is crucial, as an estimated 3%-10% of the total healthcare expenditures are lost to fraudulent activities. This study presents a systematic literature review of machine learning techniques applied to fraud detection in health insurance claims. We aim to analyze the data and methodologies documented in the literature over the past two decades, providing insights into research challenges and opportunities.
Methods: We identified research studies on health insurance fraud detection using machine learning approaches from databases such as Google Scholar, Springer-Link journals, Elsevier, PubMed, Excerpta Medica Database (EMBASE), Scopus, the Association for Computing Machinery (ACM) Digital Library, and the Institute of Electrical and Electronics Engineers (IEEE) Xplore Digital Library. We included only articles that presented experimental results of machine learning-based approaches applied to healthcare claims. From the reviewed articles, 137 were selected for the final qualitative and quantitative analyses.
Results: In recent years, there has been a surge in publications centered on the use of machine learning to detect health insurance fraud. Among these studies, those focused on detecting fraud committed by healthcare providers were the most prevalent, followed by those on fraud committed by patients. These studies cover a wide variety of machine learning algorithms, spanning unsupervised methods (41 studies), supervised methods (94 studies), and hybrid approaches (12 studies). While traditional machine learning approaches remain dominant in this research area, the adoption of advanced deep learning techniques is on the rise. Considering the type of healthcare claims data used, 30 studies utilized private data sources, while the rest used publicly available datasets. Data from 16 countries were utilized, with a majority coming from the United States (96 studies), followed by China (11 studies) and Australia (5 studies).
Discussion and Conclusion: Detecting fraud in healthcare claims using machine learning presents several challenges. These include inconsistent data, absence of data standardization and integration, privacy concerns, and a limited number of labeled fraudulent cases to train models on. Future work should focus on enhancing transparency in data preparation, promoting the sharing of fraud investigation outcomes by authorities, and developing benchmark datasets to improve accessibility and comparability. Furthermore, innovative data sampling techniques, feature encoding methods for training machine learning models, and the latest advancements in deep learning can significantly advance research in health insurance fraud detection.
DOI: http://dx.doi.org/10.1016/j.artmed.2024.103061