Background: Global disparities in prostate cancer (PCa) incidence highlight the urgent need to identify genomic abnormalities in prostate tumors in different ethnic populations including Asian men. Objective: To systematically explore the genomic complexity and define disease-driven genetic alterations in PCa. Design, setting, and participants: The study sequenced the whole genomes and transcriptomes of paired tumor and benign tissues from 65 treatment-naive Chinese PCa patients. Subsequent targeted deep sequencing of 293 PCa-relevant genes was performed in another cohort of 145 prostate tumors. Outcome measurements and statistical analysis: The genomic alteration landscape in PCa was analyzed using an integrated computational pipeline. Relationships with PCa progression and survival were analyzed using nonparametric tests, log-rank tests, and multivariable Cox regression analyses. Results and limitations: We demonstrated an association of a high frequency of CHD1 deletion with a low rate of TMPRSS2-ERG fusion and a relatively high percentage of mutations in androgen receptor upstream activator genes in Chinese patients. We identified five putative clustered deleted tumor suppressor genes and provided experimental and clinical evidence that PCDH9, deleted or lost in approximately 23% of tumors, functions as a novel tumor suppressor gene with prognostic potential in PCa. Furthermore, axon guidance pathway genes were frequently deregulated, including gain/amplification of the PLXNA1 gene in approximately 17% of tumors. Functional and clinical data analyses showed that increased expression of PLXNA1 promoted prostate tumor growth and independently predicted prostate tumor biochemical recurrence, metastasis, and poor survival in multi-institutional cohorts of patients with PCa. A limitation of this study is that other genetic alterations were not experimentally investigated. Conclusions: There are shared and salient genetic characteristics of PCa in Chinese and Caucasian men. Novel genetic alterations in PCDH9 and PLXNA1 were associated with disease progression. Patient summary: We reported the first large-scale and comprehensive genomic data of prostate cancer from an Asian population. Identification of these genetic alterations may help advance prostate cancer diagnosis, prognosis, and treatment. We presented the first comprehensive genetic alteration landscape of prostate cancer in Chinese men and identified novel genes and progression pathways that may help advance prostate cancer diagnosis, prognosis, and personalized medicine.
Shancheng Ren 卫功宏 Liu Dongbing Liguo Wang Yong Hou Zhu Shida Peng Lihua Zhang Qin Cheng Yanbing Su Hong Zhou Xiuqing Jibin Zhang Li Fuqiang Hancheng Zheng Zhao Zhikun Changjun Yin He Zengquan Xin Gao Zhau Haiyen E. Chu Chia-Yi Wu Jason Boyang Colin Collins Volik Stas Bell Robert H. Jiaoti Huang Kui Wu Danfeng Xu Ye Dingwei Yongwei Yu Zhu Lianhui Qiao Meng Lee Hang Mao Yang Yuehong Zhu Yasheng Shi Xiaolei Rui Chen 王洋 徐卫东 Cheng Yan-Qiong 许传亮 Xu Gao Tie Zhou Bo Yang Jianguo Hou Liu Li Zhensheng Zhang 朱耀 Qin Chao Shao Pengfei Pang Jun Chung Leland W.K. 徐剑锋 Chin lee Wu Zhong Wei-De Xu Xun Yingrui Li Xiuqing Zhang Wang Jian 杨焕明 Wang Jun 黄浩杰 孙颖浩
European Urology
2018
RNA alternative splicing (AS) is an important post-transcriptional mechanism enabling single genes to produce multiple proteins. It has been well demonstrated that viruses deploy host AS machinery for viral protein production. However, knowledge of viral AS is limited to a few disease-causing viruses in model species. Here we report a novel approach to characterizing viral AS using whole-transcriptome datasets from host species. As a proof of concept, the new pipeline was applied to two insect transcriptomes (Acheta domesticus and Planococcus citri) generated in the 1,000 Insect Transcriptome Evolution (1KITE) project. Two closely related densoviruses (Acheta domesticus densovirus, AdDNV, and Planococcus citri densovirus, PcDNV; Ambidensovirus, Densovirinae, Parvoviridae) were detected and analyzed for AS patterns. The results suggested that although the two viruses shared major AS features, they also showed dramatic AS divergences. Detailed analysis of the splicing junctions showed that clusters of AS events occurred in two regions of the virus genome, demonstrating that transcriptome analysis can yield valuable insights into viral splicing. When applied to large-scale transcriptomics projects with diverse taxonomic sampling, our new method is expected to rapidly expand our knowledge of RNA splicing mechanisms for a wide range of viruses.
Zhou Chengran Liu Shanlin Song Wenhui Luo Shiqi Meng Guanliang Yang Chentao Yang Hua Ma Jinmin Wang Liang Gao Shan Wang Jian 杨焕明 Zhao Yun Wang Hui Xin Zhou
Scientific Reports
2018
Quality control (QC) and preprocessing are essential steps for sequencing data analysis to ensure the accuracy of results. However, existing tools cannot provide a satisfactory solution with integrated comprehensive functions, proper architectures, and highly scalable acceleration. In this article, we demonstrate SOAPnuke as a tool with abundant functions for a "QC-Preprocess-QC" workflow and a MapReduce acceleration framework. Four modules with different preprocessing functions are designed for processing datasets from genomic, small RNA, Digital Gene Expression, and metagenomic experiments, respectively. As a workflow-like tool, SOAPnuke centralizes processing functions into 1 executable and predefines their order to avoid the necessity of reformatting different files when switching tools. Furthermore, the MapReduce framework enables high scalability by distributing all processing work across an entire compute cluster. We conducted a benchmark in which SOAPnuke and other tools were used to preprocess a ~30× NA12878 dataset published by GIAB. The standalone operation of SOAPnuke struck a balance between resource occupancy and performance. When accelerated on 16 working nodes with MapReduce, SOAPnuke achieved ~5.7 times the speed of the fastest of the other tools.
Chen Yuxin Chen Yongsheng Shi Chun Mei Huang Zhibo Zhang Yong Li Shengkang Li Yan Ye Jia Yu Chang Li Zhuo Xiuqing Zhang Wang Jian 杨焕明 Fang Lin Chen Qiang
GigaScience
2018
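The "QC-Preprocess-QC" workflow named in the SOAPnuke abstract above can be illustrated with a minimal sketch. This is not SOAPnuke's code or interface: the Read type, the function names, and the thresholds (mean Phred quality of at least 20, at most 10% N bases) are hypothetical choices made for illustration, and reads are assumed to be non-empty with Phred+33 quality strings.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Read:
    """Hypothetical in-memory read; qualities are assumed to be Phred+33."""
    name: str
    seq: str
    qual: str


def mean_quality(read: Read) -> float:
    """Average Phred score of a non-empty read."""
    return sum(ord(c) - 33 for c in read.qual) / len(read.qual)


def n_fraction(read: Read) -> float:
    """Fraction of bases called as N."""
    return read.seq.upper().count("N") / len(read.seq)


def summarize(reads: List[Read]) -> Dict[str, float]:
    """QC step: report read count and mean per-read quality."""
    if not reads:
        return {"reads": 0, "mean_quality": 0.0}
    total = sum(mean_quality(r) for r in reads)
    return {"reads": len(reads), "mean_quality": total / len(reads)}


def preprocess(reads: List[Read], min_mean_q: float = 20.0,
               max_n_frac: float = 0.1) -> List[Read]:
    """Preprocess step: drop low-quality and N-rich reads (illustrative thresholds)."""
    return [r for r in reads
            if mean_quality(r) >= min_mean_q and n_fraction(r) <= max_n_frac]


def qc_preprocess_qc(reads: List[Read]) -> Tuple[Dict[str, float], List[Read], Dict[str, float]]:
    """Run QC, filter, then QC again on the surviving reads."""
    before = summarize(reads)
    kept = preprocess(reads)
    after = summarize(kept)
    return before, kept, after


if __name__ == "__main__":
    demo = [Read("a", "ACGT", "IIII"), Read("b", "ACGT", "!!!!"), Read("c", "NNGT", "IIII")]
    before, kept, after = qc_preprocess_qc(demo)
    print(before, [r.name for r in kept], after)
```

Running both QC passes in one function mirrors the abstract's point about fixing the order of steps in a single executable, so no intermediate files need reformatting between tools.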
Increasing our understanding of Earth's biodiversity and responsibly stewarding its resources are among the most crucial scientific and social challenges of the new millennium. These challenges require fundamental new knowledge of the organization, evolution, functions, and interactions among millions of the planet's organisms. Herein, we present a perspective on the Earth BioGenome Project (EBP), a moonshot for biology that aims to sequence, catalog, and characterize the genomes of all of Earth's eukaryotic biodiversity over a period of 10 years. The outcomes of the EBP will inform a broad range of major issues facing humanity, such as the impact of climate change on biodiversity, the conservation of endangered species and ecosystems, and the preservation and enhancement of ecosystem services. We describe hurdles that the project faces, including data-sharing policies that ensure a permanent, freely available resource for future scientific discovery while respecting access and benefit sharing guidelines of the Nagoya Protocol. We also describe scientific and organizational challenges in executing such an ambitious project, and the structure proposed to achieve the project's goals. The far-reaching potential benefits of creating an open digital repository of genomic information for life on Earth can be realized only by a coordinated international effort.
Harris Lewin Gene Robinson John Kress Baker William J. Coddington Jonathan Crandall Keith A. Richard Durbin Edwards Scott V. Forest Félix Gilbert M. Thomas P. Goldstein Melissa M. Igor Grigoriev Hackett Kevin J.J. David Haussler Erich Jarvis Johnson Warren E. Aristides Patrinos Stephen Richards Castilla-Rubio Juan Carlos Van Sluys Marie-Anne Soltis Pamela S. 徐迅 杨焕明 张国捷
Proceedings of the National Academy of Sciences of the United States of America
2018
Purpose: Recent studies demonstrate that whole-genome sequencing enables detection of cryptic rearrangements in apparently balanced chromosomal rearrangements (also known as balanced chromosomal abnormalities, BCAs) previously identified by conventional cytogenetic methods. We aimed to assess our analytical tool for detecting BCAs in the 1000 Genomes Project without knowing which bands were affected. Methods: The 1000 Genomes Project provides an unprecedented integrated map of structural variants in phenotypically normal subjects, but there is no information on potential inclusion of subjects with apparent BCAs akin to those traditionally detected in diagnostic cytogenetics laboratories. We applied our analytical tool to 1,166 genomes from the 1000 Genomes Project with sufficient physical coverage (8.25-fold). Results: With this approach, we detected four reciprocal balanced translocations and four inversions, ranging in size from 57.9 kb to 13.3 Mb, all of which were confirmed by cytogenetic methods and polymerase chain reaction studies. One of these DNAs has a subtle translocation that is not readily identified by chromosome analysis because of the similarity of the banding patterns and size of exchanged segments, and another results in disruption of all transcripts of an OMIM gene. Conclusion: Our study demonstrates the extension of utilizing low-pass whole-genome sequencing for unbiased detection of BCAs including translocations and inversions previously unknown in the 1000 Genomes Project.
Dong Zirui Wang Huilin Chen Haixiao Jiang Hui Yuan Jianying Yang Zhenjun Wang Wen-Jing Xu Fengping Xiaosen Guo Cao Ye Zhu Zhenzhen Geng Chunyu Cheung Wan Chee Kwok Yvonne K. 杨焕明 Leung Tak Yeung Morton Cynthia C. Sauwai Cheung Kwongwai Choy
Genetics in Medicine
2018
Colorectal cancer is the fifth most prevalent cancer in China. Nevertheless, a large-scale characterization of the Chinese colorectal cancer mutation spectrum has not been carried out. In this study, we performed whole-exome sequencing analysis of 98 patients' tumor samples with matched normal colon tissues using the Illumina and Complete Genomics high-throughput sequencing platforms. Canonical CRC somatic gene mutations with high prevalence (>10%) were verified, including TP53, APC, KRAS, SMAD4, FBXW7 and PIK3CA. PEG3 was identified as a novel frequently mutated gene (10.6%). APC and Wnt signaling exhibited significantly lower mutation frequencies than those in TCGA data. Analysis with clinical characteristics indicates that the APC gene and Wnt signaling display lower mutation rates in lymph node-positive cancers than in lymph node-negative ones, a difference not observed in TCGA data. The APC gene and Wnt signaling are considered the key molecule and pathway for colorectal cancer initiation, and these findings greatly undermine their importance in tumor progression for Chinese patients. Taken together, the application of next-generation sequencing has led to the determination of novel somatic mutations and alternative disease mechanisms in colorectal cancer progression, which may be useful for understanding disease mechanisms and personalizing treatment for Chinese patients.
Liu Zhe Yang Chao Xiangchun Li Luo Wen Roy Bhaskar Xiong Teng Xiuqing Zhang 杨焕明 王健 Ye Zhenhao Chen Yang Song Jinghe Ma Shuai Zhou Yong Min Yang Xiaodong Fang 杜杰
Oncotarget
2018
Genomic full-length sequence of HLA-B*15:178 was identified by a group-specific sequencing approach from China.
He L. M. 杨焕明 Xu Yunping Hong W. X.
HLA
2018
Expanded carrier screening (ECS) has been demonstrated to increase the detection rate of carriers compared with traditional tests. The aim of this study was to assess the potential value of ECS for clinical application in Southern China, a region with high prevalence of thalassemia and with diverse ethnic groups, and to provide a reference for future implementations in areas with similar population characteristics. A total of 10,476 prenatal/preconception couples from 34 self-reported ethnic groups were simultaneously tested and analyzed anonymously for 11 Mendelian disorders using targeted next-generation sequencing. Overall, 27.49% of individuals without self-reported family history of disorders were found to be carriers of at least 1 of the 11 conditions, and the carrier frequency varied greatly between ethnic groups, ranging from 4.15% to 81.35%. Furthermore, 255 couples (2.43%) were identified as carrier couples at an elevated risk of having an affected baby, 65 of whom would not have been identified through the existing screening strategy, which only detects thalassemia. The modeled risk of fetuses being affected by any of the selected disorders was 531 per 100,000 (95% CI, 497–567 per 100,000). Our data demonstrate the feasibility of ECS, and provide evidence that ECS is a promising alternative to traditional one-condition screening strategies. The lessons learned from this experience should be applicable to other countries or regions with diverse ethnic groups.
Zhao Sumin Xiang Jiale Fan Chunna Asan Shang Xuan Xinhua Zhang Chen Yan Zhu Bao-Sheng Cai Wang-Wei Shaoke Chen Ren Cai Guo Xiaoling Zhang Chonglin Zhou Yuqiu Huang Shuodan Liu Yan-hui Chen Biyan Yan Shanhuo Chen Yajun Ding Hongmei Guo Fengyu Wang YaoShen Zhong Wenwei Zhu Yaping Wang Yaling Chen Chao Li Yun Huang Hui Mao Mao Yin Ye Wang Jian 杨焕明 Xiangmin Xu Sun Jun Peng Zhiyu
European Journal of Human Genetics
2018
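The headline figures in the expanded carrier screening abstract above can be checked with simple arithmetic. The sketch below uses only the counts stated in the abstract (10,476 couples, 255 carrier couples, 65 missed by thalassemia-only screening, modeled risk of 531 per 100,000); the helper names are hypothetical, and the "1 in N" conversion is just a restatement of the reported rate, not a result from the study.

```python
def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


def one_in(rate_per_100k: float) -> int:
    """Restate a rate per 100,000 as an approximate '1 in N' figure."""
    return round(100_000 / rate_per_100k)


COUPLES_TESTED = 10_476        # couples screened (from the abstract)
CARRIER_COUPLES = 255          # couples at elevated risk (from the abstract)
MISSED_BY_THAL_ONLY = 65       # not detectable by thalassemia-only screening
MODELED_RISK_PER_100K = 531    # modeled fetal risk (from the abstract)

carrier_couple_rate = percent(CARRIER_COUPLES, COUPLES_TESTED)
missed_share = percent(MISSED_BY_THAL_ONLY, CARRIER_COUPLES)
risk_one_in = one_in(MODELED_RISK_PER_100K)

if __name__ == "__main__":
    print(f"Carrier couples: {carrier_couple_rate}% of couples tested")
    print(f"Missed by thalassemia-only screening: {missed_share}% of carrier couples")
    print(f"Modeled fetal risk: about 1 in {risk_one_in}")
```

The first value reproduces the 2.43% reported in the abstract; the second shows that roughly a quarter of the at-risk couples would have been missed by the existing single-condition strategy.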
Objective: The coexistence of maternal malignancy and pregnancy has received increasing attention in noninvasive prenatal testing (NIPT) studies. Malignancy in pregnant women potentially affects the copy number variation (CNV) profile in NIPT results. Only one case of hematologic cancer has been reported in a pregnant woman from Hong Kong, and solid tumors have never been reported in pregnant Chinese women. Case report: The patients with dysgerminoma and cervical cancer showed aberrant chromosomal aneuploidies in NIPT and concordant patterns of genome disruption in tumor tissues. The genomic aberrations in the gastric cancer patient showed a copy number variation pattern similar to that of gastric cancer. Conclusion: The findings in this study and the literature review further validate the effect of maternal malignancy on the copy number variation profile in NIPT data and strengthen the possibility of detecting malignant tumors with NIPT in the future.
Xing Ji Chen Fang Zhou Yafeng Li Jia Yuan Yuying Mo Yu Liu Qiang Tseng Jen-Yu Shih-Chieh Lin Diego Shen S-H Liu Yu Ye Weiping Cheung Yuen Nei Yuen Ka Yiu Lin Siyuan Fu Hongyun Zhang Liu Na Wang Jian 杨焕明 Wang Yuying Li Shen Fan Shushu Jin Xin Mao Mao Sung Pi-Lin
Taiwanese Journal of Obstetrics and Gynecology
2018
Pangenome analyses facilitate the interpretation of genetic diversity and evolutionary history of a taxon. However, there is an urgent and unmet need to develop new tools for advanced pangenome construction and visualization, especially for metagenomic data. Here, we present an integrated pipeline, named MetaPGN, for construction and graphical visualization of pangenome networks from either microbial genomes or metagenomes. Given either isolated genomes or metagenomic assemblies coupled with a reference genome of the targeted taxon, MetaPGN generates a pangenome in a topological network, consisting of genes (nodes) and gene-gene genomic adjacencies (edges) of which biological information can be easily updated and retrieved. MetaPGN also includes a self-developed Cytoscape plugin for layout of and interaction with the resulting pangenome network, providing an intuitive and interactive interface for full exploration of genetic diversity. We demonstrate the utility of MetaPGN by constructing Escherichia coli pangenome networks from five E. coli pathogenic strains and 760 human gut microbiomes, revealing extensive genetic diversity of E. coli within both isolates and gut microbial populations. With the ability to extract and visualize gene contents and gene-gene physical adjacencies of a specific taxon from large-scale metagenomic data, MetaPGN provides advantages in expanding pangenome analysis to uncultured microbial taxa.
Peng Ye Tang Shanmei Wang Dan Zhong Huanzi Huijue Jia Cai Xianghang Zhang Zhaoxi Xiao Minfeng 杨焕明 Wang Jian Karsten Kristiansen Xu Xun Junhua Li
GigaScience
2018
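The network representation described in the MetaPGN abstract above, with genes as nodes and gene-gene genomic adjacencies as edges, can be sketched in a few lines. This is a simplified illustration and not MetaPGN's algorithm: it assumes each genome is already given as an ordered list of gene family identifiers (a hypothetical input format), it skips the reference-guided gene recruitment and clustering a real pipeline would need, and it treats adjacencies as undirected.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Tuple


def build_pangenome_network(
    genomes: Dict[str, List[str]]
) -> Tuple[Counter, Counter]:
    """Count, per gene and per adjacency, how many genomes contain it.

    `genomes` maps a genome name to its genes in genomic order
    (assumed input format, for illustration only).
    """
    node_counts: Counter = Counter()
    edge_counts: Counter = Counter()
    for gene_order in genomes.values():
        # Each genome contributes at most once to a node's count.
        node_counts.update(set(gene_order))
        # Undirected adjacency between neighbouring genes; self-pairs are skipped.
        adjacencies: set = {
            frozenset((left, right))
            for left, right in zip(gene_order, gene_order[1:])
            if left != right
        }
        edge_counts.update(adjacencies)
    return node_counts, edge_counts


def edge(a: str, b: str) -> FrozenSet[str]:
    """Helper to look up an undirected edge."""
    return frozenset((a, b))


if __name__ == "__main__":
    demo = {"strain1": ["geneA", "geneB", "geneC"],
            "strain2": ["geneA", "geneC", "geneB"]}
    nodes, edges = build_pangenome_network(demo)
    print(dict(nodes))
    print({tuple(sorted(e)): n for e, n in edges.items()})
```

Node and edge counts act as simple weights: a gene or adjacency present in every genome is shared, while one seen in a single genome points to strain-specific content or a rearrangement.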