Plant Dataset

Protein sequences (FASTA format)

Coding sequences (FASTA format)

Positions on the scaffolds (Scaffold, gene, start, stop)

* These datasets represent processed annotations for various plant genomes (see the "Data Source" table). Gene names for Carica, Populus and Vitis do not follow their release names; the ID conversion table is here.
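
As a rough illustration of how the position files might be read, here is a minimal sketch. It assumes each line holds four whitespace-separated fields in the order listed above (scaffold, gene, start, stop). The delimiter, the `#` comment convention, and the names `GenePosition` and `parse_positions` are assumptions made for this example, not part of the dataset documentation.

```python
from typing import Iterable, List, NamedTuple


class GenePosition(NamedTuple):
    """One gene placement (hypothetical record type for this sketch)."""
    scaffold: str
    gene: str
    start: int
    stop: int


def parse_positions(lines: Iterable[str]) -> List[GenePosition]:
    """Parse lines assumed to look like: scaffold <ws> gene <ws> start <ws> stop."""
    records = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and (assumed) comment lines.
        if not line or line.startswith("#"):
            continue
        scaffold, gene, start, stop = line.split()[:4]
        records.append(GenePosition(scaffold, gene, int(start), int(stop)))
    return records


if __name__ == "__main__":
    # Made-up identifiers, for illustration only.
    sample = ["scaffold_1\tgeneA\t100\t500\n", "scaffold_1\tgeneB\t900\t1500\n"]
    for rec in parse_positions(sample):
        print(rec.scaffold, rec.gene, rec.stop - rec.start + 1)
```

If the actual files use a different column order or delimiter, only the `split` line needs to change.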




Orthologous groups (ClusterID, list of syntenic genes)

Syntenic groups of genes

The gene groups are clustered not solely by sequence similarity, but also by synteny as determined by MCscan.
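
The group files can be loaded into a simple lookup table. The sketch below assumes one group per line, with the cluster ID first and the member genes after it, separated by tabs, spaces, or commas. That layout and the function name `parse_groups` are assumptions for illustration; check the actual file before relying on it.

```python
import re
from typing import Dict, Iterable, List


def parse_groups(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each cluster ID to its list of syntenic genes (assumed layout)."""
    groups: Dict[str, List[str]] = {}
    for line in lines:
        # Split on any run of commas or whitespace; drop empty fragments.
        fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
        if len(fields) < 2:
            continue  # blank line, or an ID with no member genes
        groups[fields[0]] = fields[1:]
    return groups


if __name__ == "__main__":
    # Made-up identifiers, for illustration only.
    sample = ["1\tgeneA\tgeneB,geneC\n", "2\tgeneD\tgeneE\n"]
    for cluster_id, genes in parse_groups(sample).items():
        print(cluster_id, len(genes), genes)
```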

Multi-alignments (Views of synteny)

Multiple alignments of gene orders

The first column is the reference genome (Vitis in this case), sliced into ~1 Mb chunks. Syntenic matches to other scaffolds are aligned in the following columns. A dot is placed where a matching gene cannot be found (gap due to gene loss).
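
To make the column layout concrete, here is a minimal sketch that reads such an alignment and counts the gaps in each column. It assumes tab-separated columns with the reference gene first and a single `.` marking a missing match. The delimiter and the names `parse_alignment` and `count_gaps` are assumptions for this example.

```python
from typing import Iterable, List, Optional

Row = List[Optional[str]]


def parse_alignment(lines: Iterable[str]) -> List[Row]:
    """Read rows of gene names; a '.' cell becomes None (no matching gene)."""
    rows: List[Row] = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue
        rows.append([None if cell == "." else cell for cell in line.split("\t")])
    return rows


def count_gaps(rows: List[Row]) -> List[int]:
    """Count, per column, how many rows have no matching gene."""
    width = max((len(row) for row in rows), default=0)
    gaps = [0] * width
    for row in rows:
        for i, cell in enumerate(row):
            if cell is None:
                gaps[i] += 1
    return gaps


if __name__ == "__main__":
    # Made-up identifiers: column 0 is the reference, columns 1-2 are matches.
    sample = ["refGene1\tmatchA1\t.\n", "refGene2\t.\t.\n"]
    print(count_gaps(parse_alignment(sample)))
```

Representing gaps as `None` rather than keeping the dot keeps them from being mistaken for gene names in later steps.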






Last update: Sep. 01, 2008 Haibao Tang