Plant genome duplication database (PGDDB)

From PGML

URL

The Plant Genome Duplication Database portal can be accessed here.

Introduction

Gene duplication is an important source of genetic novelty (LYNCH and CONERY 2000). Whole genome duplications (WGDs) and segmental DNA duplications (SDs) are abrupt evolutionary events that occurred repeatedly in the history of flowering plants (BLANC and WOLFE 2004; SOLTIS 2005) and account for most gene redundancy. A genome doubling was generally followed by rapid and severe genome turmoil, characterized by extensive chromosomal rearrangements and massive DNA losses (PATERSON et al. 2004; WANG et al. 2005).

This database exploits the available plant genome sequences and displays the duplicated genes that survived the genome turmoil in various plant genomes, at present Arabidopsis, poplar and rice, with sorghum and papaya to follow in the near future. It aims to help reveal the extent of paralogous chromosomal segments and to clarify the evolution of genome structure. It also provides a platform for research into the evolution of duplicated genes in various plant genomes. Duplicated genes derived from large-scale duplication events provide valuable material for understanding the rules that govern gene evolution and functionalization. In addition, we include information on cross-species chromosomal homology, which will benefit comparative exploration of genomic structural and functional changes.

Chromosomal homology patterns, and specifically duplication patterns within a genome, were detected by checking gene colinearity, that is, the conservation of gene order and gene content between two chromosomal segments within a genome or between genomes. We ran ColinearScan, software that implements a previously described colinearity approach (WANG et al. 2006), to locate chromosomal homology, using parameters that were theoretically inferred and empirically curated. To date the duplicated segments revealed, we calculated the substitution rates at nonsynonymous sites (Ka) and synonymous sites (Ks), and the nucleotide transversion (changes from purines to pyrimidines or vice versa) rates at four-fold degenerate synonymous sites (4DTV). In our experience, 4DTV and Ka did not provide a resolution comparable to that of Ks. When studying the evolution of homologous genes in this database, one may focus on the paralogs on the large homologous chromosomal segments, which are more likely to have a definite origin.
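
To make the 4DTV measure concrete, the following Python sketch counts transversions at four-fold degenerate third codon positions in two aligned, gap-free coding sequences. It is only an illustration under stated assumptions: it uses the standard genetic code, returns the raw uncorrected fraction, and the function names are hypothetical. It is not the pipeline used to build this database, and whether the database applies a correction for multiple substitutions is not stated here.

```python
# Codon prefixes (first two bases) whose third position is four-fold
# degenerate in the standard genetic code: Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly.
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transversion(a, b):
    """True if a and b are a purine/pyrimidine pair (in either order)."""
    return (a in PURINES and b in PYRIMIDINES) or (a in PYRIMIDINES and b in PURINES)


def four_dtv(seq1, seq2):
    """Raw transversion fraction at four-fold degenerate sites (hypothetical helper).

    seq1 and seq2 are assumed to be in-frame, aligned coding sequences of
    equal length without gaps. Returns None if no comparable site exists.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = 0
    transversions = 0
    usable = len(seq1) - len(seq1) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        c1 = seq1[i:i + 3].upper()
        c2 = seq2[i:i + 3].upper()
        # Both codons must share the same four-fold degenerate prefix.
        if c1[:2] != c2[:2] or c1[:2] not in FOURFOLD_PREFIXES:
            continue
        # Skip ambiguous bases such as N.
        if c1[2] not in "ACGT" or c2[2] not in "ACGT":
            continue
        sites += 1
        if is_transversion(c1[2], c2[2]):
            transversions += 1
    return transversions / sites if sites else None
```

For example, `four_dtv("GCTCCAGGAATG", "GCACCGGGCATG")` compares three four-fold degenerate sites (the ATG codon is skipped), two of which differ by a transversion.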

Duplication pattern in Arabidopsis

As a model plant, Arabidopsis was the first plant selected for genome sequencing. Although it has a small genome (~125 Mbp) and was not suspected to have undergone genome doubling, its genome was proposed to have been derived from three rounds of polyploidization, referred to as alpha, beta, and gamma (BOWERS et al. 2003; SIMILLION et al. 2002). The dotplot of duplicated genes shows that massive chromosomal rearrangements and gene losses reshaped the Arabidopsis genome after the genome doublings. There are several large duplicated blocks, the largest containing 245 collinear paralogous genes. According to the distribution of the median Ks values of paralogs on each duplicated segment, the segments with Ks around 0.85 were likely produced by the alpha whole genome duplication (WGD), those with Ks around 1.8 correspond to the beta WGD, and those with Ks < 0.5 were produced by more recent segmental duplication events. The duplicated segments produced by the supposed gamma WGD could not be clearly distinguished.
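
The assignment of a duplicated segment to an event by its median Ks can be sketched in Python as follows. The peak positions (0.85 and 1.8) and the 0.5 cut-off come from the paragraph above; the nearest-peak rule and the function name are assumptions made for illustration, not the criteria actually used to annotate the database.

```python
from statistics import median

# Values quoted in the text for Arabidopsis.
RECENT_MAX_KS = 0.5
WGD_PEAKS = {"alpha": 0.85, "beta": 1.8}


def classify_segment(ks_values):
    """Label one duplicated segment from the Ks values of its paralog pairs.

    Assumption: a segment whose median Ks is not below the recent cut-off
    is assigned to whichever WGD peak is closest to that median.
    """
    m = median(ks_values)
    if m < RECENT_MAX_KS:
        return "recent segmental duplication"
    return min(WGD_PEAKS, key=lambda name: abs(WGD_PEAKS[name] - m))
```

A segment with paralog Ks values of 0.80, 0.85 and 0.90 would be labelled `alpha` under this rule. Because the gamma event has no distinct peak, the sketch makes no attempt to assign it.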

Duplication pattern in poplar

Poplar was the first tree to have its genome sequenced, owing to its relatively small genome (just over 500 Mbp). Like Arabidopsis, poplar was proposed to have undergone three rounds of genome doubling (TUSKAN et al. 2006), referred to here for simplicity as PGD1, PGD2 and PGD3, from most recent to most ancient. PGD1 can be inferred to have occurred after the Arabidopsis-poplar divergence, whereas it has not been determined whether PGD2 occurred before or after the divergence. The dotplot of duplicated genes shows that some chromosomal rearrangements and gene losses reshaped the poplar genome after the genome doublings, but to a much lesser extent than in the Arabidopsis genome. There are tens of large duplicated blocks, the largest containing 765 collinear paralogous genes. According to the distribution of the median Ks values of paralogs on each duplicated segment, the segments with Ks around 0.3 were likely produced by PGD1, and those with Ks around 1.25 correspond to PGD2. The duplicated segments produced by the supposed PGD3 could not be clearly distinguished. Because the evolutionary rate in poplar is several times lower than that in Arabidopsis and other species, a direct comparison of Ks values is not useful for determining the relationship between the whole genome duplications revealed in the poplar and Arabidopsis genomes. These findings indicate that poplar is much more conservative in both genomic structure and evolutionary rate than Arabidopsis and perhaps many other species.

Duplication pattern in rice

As a model plant for monocots, especially cereals, rice was selected for genome sequencing. In fact, the genomes of two cultivated rice subspecies, indica and japonica, were sequenced independently by different international groups (GOFF et al. 2002; INTERNATIONAL RICE GENOME SEQUENCING PROJECT 2005; YU et al. 2005). Although it has a small genome of ~430 Mbp, the rice genome was inferred to have been affected by polyploidy and other large segmental duplication events (PATERSON et al. 2004; WANG et al. 2005). The dotplot of duplicated genes shows that some chromosomal rearrangements and gene losses reshaped the rice genome after the genome doublings. There are about ten large duplicated blocks, the largest containing 251 collinear paralogous genes. According to the distribution of the median Ks values of paralogs on each duplicated segment, the segments with Ks around 0.8 were likely produced by a whole genome duplication, referred to as RGD, and those with Ks smaller than 0.5 correspond to recent segmental duplications.

Chromosomal homology between Arabidopsis and rice

Arabidopsis and rice, models for dicot and monocot plants respectively, diverged around 200 million years ago (WOLFE et al. 1989). The dotplot of homologous genes shows massive chromosomal rearrangements after their divergence, which may be partly attributed to the independent whole genome duplications and massive reshaping of genome structure in each lineage. Most homologous blocks are quite small, the largest containing only 17 collinear homologous genes. The distribution of median Ks values of Arabidopsis-rice homologs has a peak at 2.

Chromosomal homology between Arabidopsis and poplar

Arabidopsis and poplar diverged around 120 million years ago. The dotplot of homologous genes shows massive chromosomal rearrangements after their divergence, which may be partly attributed to the independent whole genome duplications and massive reshaping of genome structure in each lineage. Many homologous blocks contain tens of collinear homologous genes, the largest containing 148. The distribution of median Ks values of Arabidopsis-poplar homologs has a peak at 2.

Chromosomal homology between poplar and rice

Poplar and rice, models for dicot and monocot plants respectively, diverged around 200 million years ago (WOLFE et al. 1989). The dotplot of homologous genes shows massive chromosomal rearrangements after their divergence, which may be partly attributed to the independent whole genome duplications and massive reshaping of genome structure in each lineage. Most homologous blocks are quite small, the largest containing only 32 collinear homologous genes. The distribution of median Ks values of poplar-rice homologs has a peak at 1.7.
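
A peak in a distribution of median Ks values, such as those reported for the species pairs above, can be located with a simple histogram. The Python sketch below is a minimal illustration; the bin width of 0.1, the tie-breaking rule and the function name are assumptions, and the peaks quoted in this page were not necessarily obtained this way.

```python
import math
from collections import Counter


def ks_peak(median_ks_values, bin_width=0.1):
    """Return the centre of the most populated Ks bin (hypothetical helper).

    Ties are resolved in favour of the lower Ks bin.
    """
    if not median_ks_values:
        raise ValueError("no Ks values supplied")
    # Count how many values fall into each fixed-width bin.
    counts = Counter(math.floor(ks / bin_width) for ks in median_ks_values)
    best_bin, _ = max(counts.items(), key=lambda item: (item[1], -item[0]))
    return (best_bin + 0.5) * bin_width
```

A kernel density estimate would give a smoother result on real data; the fixed-width histogram is used here only because it keeps the example self-contained.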

References

BLANC, G., and K. H. WOLFE, 2004 Widespread paleopolyploidy in model plant species inferred from age distributions of duplicate genes. Plant Cell 16: 1667-1678.

BOWERS, J. E., B. A. CHAPMAN, J. RONG and A. H. PATERSON, 2003 Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422: 433-438.

GOFF, S. A., D. RICKE, T. H. LAN, G. PRESTING, R. WANG et al., 2002 A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science 296: 92-100.

INTERNATIONAL RICE GENOME SEQUENCING PROJECT, 2005 The map-based sequence of the rice genome. Nature 436: 793-800.

LYNCH, M., and J. S. CONERY, 2000 The evolutionary fate and consequences of duplicate genes. Science 290: 1151-1155.

PATERSON, A. H., J. E. BOWERS and B. A. CHAPMAN, 2004 Ancient polyploidization predating divergence of the cereals, and its consequences for comparative genomics. Proc Natl Acad Sci U S A 101: 9903-9908.

SIMILLION, C., K. VANDEPOELE, M. C. VAN MONTAGU, M. ZABEAU and Y. VAN DE PEER, 2002 The hidden duplication past of Arabidopsis thaliana. Proc Natl Acad Sci U S A 99: 13627-13632.

SOLTIS, P. S., 2005 Ancient and recent polyploidy in angiosperms. New Phytologist 166: 5-8.

TUSKAN, G. A., S. DIFAZIO, S. JANSSON, J. BOHLMANN, I. GRIGORIEV et al., 2006 The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science 313: 1596-1604.

WANG, X., X. SHI, B. HAO, S. GE and J. LUO, 2005 Duplication and DNA segmental loss in the rice genome: implications for diploidization. New Phytologist 165: 937-946.

WANG, X., X. SHI, Z. LI, Q. ZHU, L. KONG et al., 2006 Statistical inference of chromosomal homology based on gene colinearity and applications to Arabidopsis and rice. BMC Bioinformatics 7: 447.

WOLFE, K. H., M. GOUY, Y. W. YANG, P. M. SHARP and W. H. LI, 1989 Date of the monocot-dicot divergence estimated from chloroplast DNA sequence data. Proc Natl Acad Sci U S A 86: 6201-6205.

YU, J., J. WANG, W. LIN, S. LI, H. LI et al., 2005 The Genomes of Oryza sativa: A History of Duplications. PLoS Biology 3: e38.



-Contributed by Xiyin Wang, 16:44, 10 April 2007 (EDT)
