Publications by Yongan Zhao



A Secure Alignment Algorithm for Mapping Short Reads to Human Genome.

J Comput Biol

June 2018

Elastic, inexpensive computing resources such as clouds have been recognized as a useful solution for analyzing massive human genomic data (e.g., data acquired with next-generation sequencers) in biomedical research.


Addressing Beacon re-identification attacks: quantification and mitigation of privacy risks.

Jean Louis Raisaro, Florian Tramèr, Zhanglong Ji, Diyue Bu, Yongan Zhao

J Am Med Inform Assoc

July 2017

The Global Alliance for Genomics and Health (GA4GH) created the Beacon Project as a means of testing the willingness of data holders to share genetic data in the simplest technical context: a query for the presence of a specified nucleotide at a given position within a chromosome. Each participating site (or "beacon") is responsible for assuring that genomic data are exposed through the Beacon service only with the permission of the individual to whom the data pertain and in accordance with the GA4GH policy and standards. While recognizing the inference risks associated with large-scale data aggregation, and the fact that some beacons contain sensitive phenotypic associations that increase privacy risk, the GA4GH adjudged the risk of re-identification based on the binary yes/no allele-presence query responses as acceptable.
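The yes/no allele-presence query described above can be illustrated with a minimal sketch. This is not the GA4GH Beacon API; the `Variants` type, the `beacon_query` function, and the toy data are hypothetical names chosen only to show the shape of the question a beacon answers.

```python
from typing import Dict, Set, Tuple

# Hypothetical in-memory beacon: maps (chromosome, position) to the set of
# alleles observed in at least one individual in the dataset.
Variants = Dict[Tuple[str, int], Set[str]]


def beacon_query(variants: Variants, chromosome: str, position: int, allele: str) -> bool:
    """Return True if any genome in the dataset carries `allele` at the site."""
    return allele.upper() in variants.get((chromosome, position), set())


# Toy data, invented for illustration only.
toy_beacon: Variants = {
    ("1", 12345): {"A", "G"},
    ("X", 500): {"T"},
}

print(beacon_query(toy_beacon, "1", 12345, "G"))  # True
print(beacon_query(toy_beacon, "1", 12345, "C"))  # False
```

The response carries a single bit per query, which is why the re-identification risk depends on how many such answers an attacker can combine.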


Secure Genomic Computation through Site-Wise Encryption.

Yongan Zhao, XiaoFeng Wang, Haixu Tang

AMIA Jt Summits Transl Sci Proc

August 2015

Commercial clouds provide on-demand IT services for big-data analysis, which have become an attractive option for users who have no access to comparable infrastructure. However, utilizing these services for human genome analysis is highly risky, as human genomic data contain identifying information about individuals and their disease susceptibility. Therefore, currently, no computation on personal human genomic data is conducted on public clouds.


A community assessment of privacy preserving techniques for human genomes.

Xiaoqian Jiang, Yongan Zhao, Xiaofeng Wang, Bradley Malin, Shuang Wang

BMC Med Inform Decis Mak

September 2015

To answer the need for the rigorous protection of biomedical data, we organized the Critical Assessment of Data Privacy and Protection initiative as a community effort to evaluate privacy-preserving dissemination techniques for biomedical data. We focused on the challenge of sharing aggregate human genomic data (e.g.


Choosing blindly but wisely: differentially private solicitation of DNA datasets for disease marker discovery.

Yongan Zhao, Xiaofeng Wang, Xiaoqian Jiang, Lucila Ohno-Machado, Haixu Tang

J Am Med Inform Assoc

January 2015

Objective: To propose a new approach to privacy-preserving data selection that helps data users access human genomic datasets efficiently without undermining patients' privacy.

Methods: Our idea is to let each data owner publish a set of differentially-private pilot data, on which a data user can test-run arbitrary association-test algorithms, including those not known to the data owner a priori. We developed a suite of new techniques, including a pilot-data generation approach that leverages the linkage disequilibrium in the human genome to preserve both the utility of the data and the privacy of the patients, and a utility evaluation method that helps the user assess the value of the real data from its pilot version with high confidence.
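To give a sense of what "differentially private" release means here, the sketch below adds Laplace noise to per-site allele counts. It is a generic textbook Laplace mechanism, not the linkage-disequilibrium-based pilot-data generation the abstract describes; the function names, the `sensitivity` default, and the example counts are assumptions for illustration, and the privacy budget is treated per count, ignoring composition across sites.

```python
import random
from typing import List, Optional


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_allele_counts(
    counts: List[int],
    epsilon: float,
    sensitivity: float = 1.0,  # assumed: one person changes a count by at most 1
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return counts perturbed with Laplace noise of scale sensitivity/epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]


# Hypothetical allele counts at three sites; smaller epsilon means more noise.
pilot = noisy_allele_counts([120, 45, 7], epsilon=0.5, rng=random.Random(42))
print(pilot)
```

A data user could run an association test on such noisy values as a rough preview, with the caveat that the approach in the abstract is designed to preserve more utility than this naive per-count perturbation.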


RAPSearch2: a fast and memory-efficient protein similarity search tool for next-generation sequencing data.

Yongan Zhao, Haixu Tang, Yuzhen Ye

Bioinformatics

January 2012

Summary: With the wide application of next-generation sequencing (NGS) techniques, fast tools for protein similarity search that scale well to large query datasets and large databases are highly desirable. In a previous work, we developed RAPSearch, an algorithm that achieved a ~20-90-fold speedup relative to BLAST while still achieving similar levels of sensitivity for short protein fragments derived from NGS data. RAPSearch, however, requires a substantial memory footprint to identify alignment seeds, due to its use of a suffix array data structure.
