Secondary use of health data is hampered in part by substantial semantic heterogeneity. Many efforts are under way to align local terminologies with international standards. Given growing concerns about data privacy, we focus here on machine learning methods that align biological data elements using aggregated features that could be shared as open data. We propose a three-step methodology: feature engineering, a blocking strategy, and supervised learning. The first results, although modest, are encouraging for the further development of this approach.
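
The three steps can be illustrated with a minimal Python sketch. The abstract does not specify which aggregated features, blocking rule, or classifier were used, so everything below is an assumption made for illustration: the summary statistics in `aggregate_features`, the median-ratio rule in `candidate_pairs`, the relative-difference vector in `pair_vector`, and the small logistic regression in `train_logistic` are hypothetical choices, not the authors' method.

```python
import math
import statistics
from itertools import product

# Hypothetical feature set; the abstract does not list the real features.
FEATURES = ("mean", "sd", "q1", "median", "q3")

def aggregate_features(values):
    """Step 1 (feature engineering): summarise raw results as aggregates only."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {"mean": statistics.fmean(values), "sd": statistics.stdev(values),
            "q1": q1, "median": median, "q3": q3}

def candidate_pairs(local, reference, max_ratio=3.0):
    """Step 2 (blocking): keep only pairs whose medians differ by at most max_ratio."""
    pairs = []
    for (l_name, l), (r_name, r) in product(local.items(), reference.items()):
        lo, hi = sorted((abs(l["median"]), abs(r["median"])))
        if lo > 0 and hi / lo <= max_ratio:
            pairs.append((l_name, r_name))
    return pairs

def pair_vector(l, r):
    """Relative difference of each aggregate between two data elements."""
    return [abs(l[k] - r[k]) / (abs(l[k]) + abs(r[k]) + 1e-9) for k in FEATURES]

def _sigmoid(z):
    # Clamp z so math.exp cannot overflow.
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def train_logistic(X, y, lr=0.5, epochs=500):
    """Step 3 (supervised learning): logistic regression fitted by stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = _sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict_proba(w, b, x):
    """Estimated probability that a candidate pair is a true match."""
    return _sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
```

In this sketch only the aggregates leave the source system, which is what would make the features shareable as open data. Blocking reduces the number of pairs the classifier has to score, and the classifier is trained on pairs whose match status is already known.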

Source
http://dx.doi.org/10.3233/SHTI220469