Tree Shrew (Tupaia belangeri)

Explore the Tupaia belangeri genome


Example Data Points

This release of Tupaia belangeri data is assembled into scaffolds, so there are no chromosomes available to browse.


About the Tupaia belangeri genome

Assembly

This is the first release of the low-coverage 2X assembly of the northern treeshrew (Tupaia belangeri). The genome sequencing and assembly were provided by the Broad Institute.

The N50 size is the length such that 50% of the assembled genome lies in blocks of the N50 size or longer. The N50 length is 88.86 kb for supercontigs and 2.97 kb for contigs. The total number of bases is 3.66 Gb in supercontigs and 2.14 Gb in contigs.
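The N50 definition above can be illustrated with a minimal sketch. The function name `n50` and the toy block lengths are hypothetical and are not taken from the tupBel1 assembly; the code simply applies the stated definition to a list of contig or supercontig lengths.

```python
def n50(lengths):
    """Return the N50 of a collection of block lengths.

    N50 is the length L such that blocks of length >= L together
    contain at least 50% of the total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one block length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest block down, accumulating bases until
    # half of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical block lengths (in bp), for illustration only.
toy_lengths = [10, 9, 8, 7, 6, 5, 4, 3, 2]
toy_n50 = n50(toy_lengths)
print(toy_n50)
```

For these toy lengths the total is 54 bases, and the three longest blocks (10 + 9 + 8 = 27) reach exactly half of it, so the N50 is 8.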

Annotation

Owing to the fragmentary nature of this preliminary assembly, it was necessary to arrange some scaffolds into "gene-scaffold" super-structures in order to present complete genes. There are 6153 such gene-scaffolds, comprising 1.58 Gb, with identifiers of the form "GeneScaffold_1".

Details of the gene-scaffold construction and subsequent gene-build

Mammalian Genome Project

Tupaia belangeri is one of 16 mammals that will be sequenced as part of the Mammalian Genome Project, funded by the National Institutes of Health (NIH). The species were chosen to maximise the branch length of the evolutionary tree while representing the diversity of mammalian species. Low-coverage 2X assemblies will be produced for these mammals and used in alignments for cross-species comparison. The aim is to increase our understanding of functional elements, especially in the human genome.

What's New in Ensembl 50

Tupaia belangeri News

Non-coding genes
These have been updated for most species, including an miRNA update and HGNC names where possible.
Multiple alignments
The multiple alignments are being extended with new species and 2X genomes.

General News

Canonical Transcripts
Canonical transcripts have been defined for all genes in the core databases.
SSAHA
From release 50, we no longer provide SSAHA sequence search. If you wish to run your own SSAHA sequence search, you can download the files needed to generate the search hashes from our FTP site.
Projections of gene names and GO terms
Gene names and GO terms have been projected between a variety of species, as in previous releases.


Statistics

Assembly:	tupBel1, Jun 2006
Genebuild:	Ensembl, Oct 2006
Database version:	50.1e
Known protein-coding genes:	12
Projected protein-coding genes:	13,204
Novel protein-coding genes:	2,242
Pseudogenes:	2,313
RNA genes:	2,112
Genscan gene predictions:	101,619
Gene exons:	218,739
Gene transcripts:	17,775
Base Pairs:	2,137,225,476
Golden Path Length:	3,670,324,638
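As a rough worked example, the two length figures in the table can be compared directly. This sketch assumes that "Base Pairs" counts sequenced bases only and that "Golden Path Length" is the total scaffold length including gaps; the variable names are illustrative.

```python
# Figures copied from the statistics table above.
base_pairs = 2_137_225_476          # assumed: sequenced bases only
golden_path_length = 3_670_324_638  # assumed: scaffold length including gaps

# Fraction of the golden path covered by actual sequence.
sequenced_fraction = base_pairs / golden_path_length

# Bases of the golden path not covered by sequence (gaps, under the
# assumption stated above).
gap_bases = golden_path_length - base_pairs

print(f"Sequenced fraction: {sequenced_fraction:.1%}")
print(f"Gap bases: {gap_bases:,}")
```

Under these assumptions, roughly 58% of the golden path is sequenced, which is in line with the fragmentary, low-coverage nature of a 2X assembly.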

How the statistics are calculated
