AI Article Synopsis

  • Genomic data poses ethical and privacy risks due to the sensitive nature of personal information it contains, prompting the development of a privacy-preserving method called Varlock for secure data storage.
  • Varlock masks personal alleles in genomic data using public allele frequency information and allows for encrypted sharing while preserving valuable non-sensitive traits of the DNA.
  • The method is reversible: personal alleles in specific genomic regions can be restored on demand and shared with clinics, patients, and researchers, adding a security layer for stored and shared genomic data.

Article Abstract

Background: The current and future applications of genomic data may raise ethical and privacy concerns. Because the human genome contains sensitive personal information, processing and storing this data introduces a risk of abuse by potential offenders. For this reason, we have developed a privacy-preserving method, named Varlock, that provides secure storage of sequenced genomic data. We used a public set of population allele frequencies to mask the personal alleles detected in genomic reads. Each personal allele described by the public set is masked by a population allele selected at random with probability given by its frequency. Masked alleles are preserved in an encrypted confidential file that can be shared in whole or in part using public-key cryptography.
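The masking step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the dictionary layout, the function names `mask_alleles` and `unmask_alleles`, and the choice to leave positions absent from the public set untouched are assumptions made for illustration. Encryption of the confidential record is omitted entirely.

```python
import random

def mask_alleles(personal, frequencies, rng):
    """Mask each personal allele with a population allele drawn by frequency.

    personal:    {position: allele} for one individual (hypothetical layout)
    frequencies: {position: {allele: population frequency}} (the public set)
    Returns (masked, confidential). The confidential record keeps the
    original allele at every position whose allele changed, so the masking
    can be undone later. In practice this record would be encrypted.
    """
    masked, confidential = {}, {}
    for pos, allele in personal.items():
        freqs = frequencies.get(pos)
        if freqs is None:
            # Assumption: positions not described by the public set stay as-is.
            masked[pos] = allele
            continue
        alleles = list(freqs)
        weights = [freqs[a] for a in alleles]
        # Draw one replacement allele with probability proportional to frequency.
        drawn = rng.choices(alleles, weights=weights, k=1)[0]
        masked[pos] = drawn
        if drawn != allele:
            confidential[pos] = allele
    return masked, confidential

def unmask_alleles(masked, confidential, positions=None):
    """Restore original alleles, optionally only at the selected positions."""
    restored = dict(masked)
    for pos, allele in confidential.items():
        if positions is None or pos in positions:
            restored[pos] = allele
    return restored

# Usage with made-up positions and frequencies.
example_personal = {100: "A", 200: "G", 300: "T"}
example_public = {100: {"A": 0.9, "C": 0.1}, 200: {"G": 0.5, "T": 0.5}}
example_masked, example_record = mask_alleles(
    example_personal, example_public, random.Random(42)
)
example_restored = unmask_alleles(example_masked, example_record)
```

Because the record stores only the positions that changed, restoring a subset of positions leaves the rest of the genome masked, which mirrors the idea of sharing the confidential file in part.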

Results: Our method masked the personal variants and introduced new variants that were detected in the masked personal genome. Alternative alleles with lower population frequency were masked and introduced more often. We performed a joint PCA of personal and masked VCFs, showing that VCFs in the two groups cannot be trivially mapped to each other. Moreover, the method is reversible, and personal alleles in specific genomic regions can be unmasked on demand.

Conclusion: Our method masks personal alleles within genomic reads while preserving valuable non-sensitive properties of sequenced DNA fragments for further research. Personal alleles in the desired genomic regions may be restored and shared with patients, clinics, and researchers. We suggest that the method can provide an additional security layer for storing and sharing raw aligned reads.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8487550
DOI: http://dx.doi.org/10.1186/s12864-021-07996-2

Publication Analysis

Top Keywords

personal alleles: 16
genomic data: 12
personal: 9
storage sequenced: 8
sequenced genomic: 8
public set: 8
population allele: 8
genomic reads: 8
frequency masked: 8
personal masked: 8
