What do the different biotypes in Ensembl mean?
The Ensembl automatic annotation system classifies genes and transcripts into biotypes including: protein coding, pseudogene, processed pseudogene, miRNA, rRNA, scRNA, snoRNA, snRNA.
For human, mouse, rat and pig, we incorporate manual annotation from Havana. For genes and transcripts that include manual annotation, we display the manually assigned biotype. The full list of Havana biotypes can be found here.
The biotypes can be grouped into protein coding, pseudogene, long noncoding and short noncoding. Examples of biotypes in each group are as follows:
- 编码基因: IGC gene, IGD gene, IG gene, IGJ gene, IGLV gene, IGM gene, IGV gene, IGZ gene, nonsense mediated decay, nontranslating CDS, non stop decay, polymorphic pseudogene, TRC gene, TRD gene, TRJ gene.
- 假基因: disrupted domain, IGC pseudogene, IGJ pseudogene, IG pseudogene, IGV pseudogene, processed pseudogene, transcribed processed pseudogene, transcribed unitary pseudogene, transcribed unprocessed pseudogene, translated processed pseudogene, TRJ pseudogene, unprocessed pseudogene
- 长链非编码RNA: 3prime overlapping ncrna, ambiguous orf, antisense, antisense RNA, lincRNA, ncrna host, processed transcript, sense intronic, sense overlapping
- 短链非编码RNA: miRNA, miRNA_pseudogene, miscRNA, miscRNA pseudogene, Mt rRNA, Mt tRNA, rRNA, scRNA, snlRNA, snoRNA, snRNA, tRNA, tRNA_pseudogene
网友评论