PGDD is a public database for identifying and cataloging plant genes in terms of intragenome or cross-genome syntenic relationships. Current work focuses on plants with available whole-genome sequences (preferably assembled pseudomolecules with ordered gene models). A detailed description of this database can be found here.
Data source
Plant genomes used in this database
| Species name | Common name | Release version | Gene number | Access |
| A. thaliana | thale cress | TAIR 7.0 (Aug. 2007) | 26784 | TAIR FTP |
| C. papaya | papaya | EVM (Jul. 2007) | 25536 | restricted |
| P. trichocarpa | poplar | JGI 1.1 (Dec. 2004) | 45554 | JGI HTTP |
| V. vinifera | grape | Genoscope (Aug. 2007) | 37829 | Genoscope HTTP |
| O. sativa (ssp. japonica) | rice | RAP 2.0 (Nov. 2007) | 30192 | RAP HTTP |
| S. bicolor * | sorghum | JGI 1.4 (Dec. 2007) | 34496 | JGI HTTP |
* Unpublished genome data; access is therefore temporarily restricted.
Methods
Syntenic blocks
We used BLASTP to search for potential anchors (E < 1e-5, top 5 matches) between every possible pair of chromosomes across the genomes. The homologous pairs are used as the input for MCscan, a novel synteny search program that combines the merits of two existing algorithms. The built-in scoring scheme for MCscan, similar to that of DAGchainer, is min{-log10(E), 50} for every matching gene pair and -1 for each 10 kb of distance between anchors. Blocks with scores > 300 were kept. The resulting syntenic chains were evaluated using the procedure from ColinearScan, with an E-value < 1e-10 as the significance cutoff.
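The scoring scheme described above can be illustrated with a short Python sketch. This is not the MCscan source code: the function names, the input layout (one E-value plus the distance to the previous anchor, in base pairs), and the choice to apply the gap penalty fractionally rather than in whole 10 kb steps are all assumptions made for illustration.

```python
import math

MAX_ANCHOR_SCORE = 50.0   # cap from the text: min{-log10(E), 50}
GAP_UNIT_BP = 10_000      # from the text: -1 for each 10 kb between anchors
MIN_BLOCK_SCORE = 300.0   # from the text: blocks with scores > 300 are kept


def anchor_score(evalue):
    """Score one matching gene pair as min(-log10(E), 50)."""
    if evalue <= 0.0:
        # BLAST reports E = 0 for very strong hits; treat it as the cap.
        return MAX_ANCHOR_SCORE
    return min(-math.log10(evalue), MAX_ANCHOR_SCORE)


def chain_score(anchors):
    """Score a chain of anchors.

    `anchors` is a list of (evalue, gap_bp) tuples, where gap_bp is the
    distance in base pairs to the previous anchor (hypothetical layout;
    the gap of the first anchor is ignored).
    """
    total = 0.0
    for index, (evalue, gap_bp) in enumerate(anchors):
        total += anchor_score(evalue)
        if index > 0:
            # Assumption: the penalty scales linearly with distance.
            total -= gap_bp / GAP_UNIT_BP
    return total


def keep_block(anchors):
    """Apply the score cutoff stated in the text."""
    return chain_score(anchors) > MIN_BLOCK_SCORE
```

For example, a chain of two anchors with E-values 1e-20 and 1e-60 that lie 20 kb apart scores 20 + 50 - 2 = 68 under these assumptions, well below the cutoff of 300.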
Ks calculations
For homologs inferred from syntenic alignments, we aligned the protein sequences of each gene pair using CLUSTALW and used the protein alignments to guide the CDS alignments with PAL2NAL. Finally, we used the Nei-Gojobori method implemented in the PAML package to calculate Ks.
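The idea of using a protein alignment to guide a CDS alignment can be sketched in a few lines of Python. This is only a minimal illustration of the general technique, not the PAL2NAL implementation: the function name `thread_codons` is hypothetical, and the sketch assumes that the CDS starts in frame and that each residue corresponds to exactly one codon (no frameshifts or introns).

```python
def thread_codons(aligned_protein, cds, gap="-"):
    """Place the codons of `cds` under the residues of an aligned protein.

    Each residue consumes the next three nucleotides; each gap column
    in the protein alignment becomes a three-character gap.
    A trailing stop codon in `cds`, if present, is simply left unused.
    """
    residue_count = sum(1 for char in aligned_protein if char != gap)
    if len(cds) < 3 * residue_count:
        raise ValueError("CDS is too short for the aligned protein")

    codons = []
    position = 0
    for char in aligned_protein:
        if char == gap:
            codons.append(gap * 3)
        else:
            codons.append(cds[position:position + 3])
            position += 3
    return "".join(codons)


# One aligned protein row with a gap between M and K.
print(thread_codons("M-K", "ATGAAA"))  # prints ATG---AAA
```

Applying the same function to both rows of a pairwise protein alignment yields two codon-aligned sequences of equal length, which is the form of input that a Ks estimator expects.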
Page last updated: Mar. 01, 2008
