genomes
Data license: ODbL · Data source: Larue & Roy, 2023 · About: Minor Intron Database (WtMTA)
- taxonomy_id: INTEGER (primary key), unique identifier for each species
- species: TEXT, binomial name of the species
- family: TEXT, taxonomic family of the species
- order: TEXT, taxonomic order of the species
- phylum: TEXT, taxonomic phylum of the species
- accession: TEXT, accession number of the genome assembly
- n_minor_introns: INTEGER, total number of minor introns in the genome
- n_major_introns: INTEGER, total number of major introns in the genome
- percent_minor_introns: REAL, percentage of minor introns in the genome
- busco_score: REAL, BUSCO score assessing genome assembly completeness (vs. eukaryota_odb10)
- minor_snRNAs: TEXT, minor snRNAs found in the annotated transcriptome
- genome_version: TEXT, version of the genome assembly
- source_url: TEXT, URL for the source genome/annotation files
- source_metadata: TEXT, additional metadata from the original data source
- minor_intron+: INTEGER, indicates whether the species is inferred to contain real minor introns (1) or not (0)
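The column descriptions do not say how percent_minor_introns is derived. The two rows listed on this page are consistent with it being minor introns as a percentage of all introns (minor plus major); the sketch below assumes that formula. The function name and the zero-total handling are illustrative and are not taken from the database.

```python
def percent_minor_introns(n_minor: int, n_major: int) -> float:
    """Minor introns as a percentage of all introns (assumed formula)."""
    total = n_minor + n_major
    if total == 0:
        # Assumption: a genome with no annotated introns is reported as 0%.
        return 0.0
    return 100.0 * n_minor / total


# Intron counts copied from the two rows shown on this page.
echeneis = percent_minor_introns(723, 222669)
thalassophryne = percent_minor_introns(723, 236215)

print(f"Echeneis naucrates: {echeneis:.4f}%")
print(f"Thalassophryne amazonica: {thalassophryne:.4f}%")
```

Both values agree with the percent_minor_introns column to the displayed precision, which supports the assumed formula but does not confirm it.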
2 rows where minor_snRNAs contains "u11" and n_minor_introns = 723 sorted by percent_minor_introns descending
| taxonomy_id | species | family | order | phylum | accession | n_minor_introns | n_major_introns | percent_minor_introns | busco_score | minor_snRNAs | genome_version | source_url | source_metadata | minor_intron+ | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 173247 | Echeneis naucrates | Echeneidae | Carangiformes | Chordata | GCF_900963305.1 | 723 | 222669 | 0.3236463257412978 | 99.6 | ["u11", "u12", "u4atac", "u6atac"] | fEcheNa1.1 | https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/900/963/305/GCF_900963305.1_fEcheNa1.1 | GCF_900963305.1;PRJNA548465;SAMEA4966390;CAAHFO000000000.1;representative genome;173247;173247;Echeneis naucrates;;;latest;Chromosome;Major;Full;2019/04/11;fEcheNa1.1;SC;GCA_900963305.1;different;https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/900/963/305/GCF_900963305.1_fEcheNa1.1;;;na | 1 | 
| 390379 | Thalassophryne amazonica | Batrachoididae | Batrachoidiformes | Chordata | GCF_902500255.1 | 723 | 236215 | 0.3051431176088259 | 99.2 | ["u11", "u12", "u4atac", "u6atac"] | fThaAma1.1 | https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/902/500/255/GCF_902500255.1_fThaAma1.1 | GCF_902500255.1;PRJNA627696;SAMEA104129913;CABVOY000000000.1;representative genome;390379;390379;Thalassophryne amazonica;;;latest;Chromosome;Major;Full;2019/09/30;fThaAma1.1;SC;GCA_902500255.1;identical;https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/902/500/255/GCF_902500255.1_fThaAma1.1;;;na | 1 | 
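The filter behind this result set can be approximated locally with Python's built-in sqlite3 module. The sketch below is a minimal reconstruction under several assumptions: it uses an in-memory table with only the columns the query needs, it loads just the two rows shown above, and it treats "contains" as a substring match (LIKE) on the JSON text stored in minor_snRNAs. The page does not state whether the original filter is a substring match or a JSON array match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced version of the genomes table: only the columns used by the query.
conn.execute(
    """
    CREATE TABLE genomes (
        taxonomy_id INTEGER PRIMARY KEY,
        species TEXT,
        n_minor_introns INTEGER,
        percent_minor_introns REAL,
        minor_snRNAs TEXT
    )
    """
)

# The two rows shown in the table above.
conn.executemany(
    "INSERT INTO genomes VALUES (?, ?, ?, ?, ?)",
    [
        (390379, "Thalassophryne amazonica", 723, 0.3051431176088259,
         '["u11", "u12", "u4atac", "u6atac"]'),
        (173247, "Echeneis naucrates", 723, 0.3236463257412978,
         '["u11", "u12", "u4atac", "u6atac"]'),
    ],
)

# Assumed equivalent of: minor_snRNAs contains "u11" and n_minor_introns = 723,
# sorted by percent_minor_introns descending.
rows = conn.execute(
    """
    SELECT species, percent_minor_introns
    FROM genomes
    WHERE minor_snRNAs LIKE '%' || ? || '%'
      AND n_minor_introns = ?
    ORDER BY percent_minor_introns DESC
    """,
    ("u11", 723),
).fetchall()

for species, pct in rows:
    print(species, pct)
```

Passing the filter values as bound parameters rather than formatting them into the SQL string keeps the query safe to reuse with other snRNA names or intron counts.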
CREATE TABLE "genomes" (
  "taxonomy_id" INTEGER,
  "species" TEXT,
  "family" TEXT,
  "order" TEXT,
  "phylum" TEXT,
  "accession" TEXT,
  "n_minor_introns" INTEGER,
  "n_major_introns" INTEGER,
  "percent_minor_introns" REAL,
  "busco_score" REAL,
  "minor_snRNAs" TEXT,
  "genome_version" TEXT,
  "source_url" TEXT,
  "source_metadata" TEXT,
  "minor_intron+" INTEGER,
  PRIMARY KEY ([taxonomy_id])
);
CREATE INDEX [idx_genomes_phylum]
    ON [genomes] ([phylum]);
CREATE INDEX [idx_genomes_order]
    ON [genomes] ([order]);
CREATE INDEX [idx_genomes_family]
    ON [genomes] ([family]);
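Two column names in this schema need quoting in hand-written queries: order is an SQL keyword, and minor_intron+ contains a plus sign. The sketch below shows one way to query them, using a reduced copy of the schema in an in-memory SQLite database with the two rows from the table above. The reduced column set and the variable names are illustrative, not part of the published database. It also asks SQLite for its query plan so you can check whether idx_genomes_order is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced copy of the schema above: only the columns this example touches.
conn.executescript(
    """
    CREATE TABLE "genomes" (
      "taxonomy_id" INTEGER,
      "species" TEXT,
      "order" TEXT,
      "minor_intron+" INTEGER,
      PRIMARY KEY ([taxonomy_id])
    );
    CREATE INDEX [idx_genomes_order]
        ON [genomes] ([order]);
    """
)

conn.executemany(
    'INSERT INTO genomes (taxonomy_id, species, "order", "minor_intron+") '
    "VALUES (?, ?, ?, ?)",
    [
        (173247, "Echeneis naucrates", "Carangiformes", 1),
        (390379, "Thalassophryne amazonica", "Batrachoidiformes", 1),
    ],
)

# Double quotes mark "order" and "minor_intron+" as identifiers, not syntax.
query = 'SELECT species FROM genomes WHERE "order" = ? AND "minor_intron+" = 1'
species = [row[0] for row in conn.execute(query, ("Carangiformes",))]

# The last column of each EXPLAIN QUERY PLAN row is a human-readable detail string.
plan = " ".join(
    str(row[-1])
    for row in conn.execute("EXPLAIN QUERY PLAN " + query, ("Carangiformes",))
)

print(species)
print(plan)
```

The exact wording of the plan text varies between SQLite versions, so it is better to look for the index name in it than to compare the whole string.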