Where the Minor Things Are (WtMTA): (Yet Another) Minor Intron Database

The Where the Minor Things Are (WtMTA) intron database contains information about introns in more than 1,500 species identified by Larue & Roy, 2023 as containing minor introns, with a total of more than 250 million rows. The data include intron-level information such as type classification (major or minor), phase, and genomic coordinates for all annotated introns included in our analyses, as well as additional metadata about parent genes, transcripts, and genomes.

Intron classifications were generated using intronIC, and other intron-based metadata (e.g., introns per kbp of coding sequence) were obtained using custom Python workflows. All underlying data were sourced from publicly available genomic resources such as NCBI, Ensembl, and JGI.

Exploring the database

Unless you need the entire dataset (see the section on obtaining a local copy of the DB), the genomes table is probably the best place to start exploring. There, you can select a species of interest and drill down to its associated introns and/or transcripts for further filtering.

The results of any query can be downloaded in a number of plaintext formats (e.g., CSV), provided they don’t exceed 1 GB (see Advanced Export below the paginated results; select stream all rows to ensure the full dataset is returned). This should be sufficient to retrieve, for example, the complete intron/transcript set for any individual genome, or a subset of introns/transcripts across a number of different genomes.
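For scripted downloads, the same CSV export can be requested by URL. The sketch below only builds such a URL and does not contact any server. It relies on Datasette's `.csv` table export and its `_stream=on` parameter, which corresponds to the "stream all rows" option. The base URL `https://example.org` and the table name `introns` are placeholders assumed for illustration. The database name `WtMTA` is assumed from the database file name, and `taxonomy_id` is taken from the column names in the example query results.

```python
from urllib.parse import urlencode


def table_csv_url(base_url, database, table, filters=None, stream=True):
    """Build a Datasette CSV export URL for a table.

    `filters` maps column names to exact-match values. With stream=True the
    `_stream=on` parameter asks Datasette to return every matching row
    rather than a single page of results.
    """
    params = dict(filters or {})
    if stream:
        params["_stream"] = "on"
    url = f"{base_url.rstrip('/')}/{database}/{table}.csv"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


# Hypothetical host and table name; substitute the real ones.
url = table_csv_url(
    "https://example.org", "WtMTA", "introns", {"taxonomy_id": 3816}
)
print(url)
```

The resulting URL can be passed to any HTTP client (for example `urllib.request.urlretrieve`) to save the file, subject to the 1 GB limit mentioned above.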

Searching within tables

The genomes and transcripts tables provide limited search functionality, allowing for queries of complete words only (i.e., no wildcards). For example, to return information about all cnidarian genomes, the genomes table should be searched for cnidaria, but not (for example) cnidar*.
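The whole-word behavior is consistent with how Datasette's built-in search works: it is backed by SQLite full-text search, and by default each search term is escaped by quoting it. The toy example below illustrates the effect under the assumption of an FTS5 index with the default tokenizer. The actual search configuration of this database is not stated here, so treat it as a sketch. The table `genomes_fts`, its columns, the two rows, and the helper `quote_terms` (a simplified stand-in for Datasette's escaping) are all hypothetical.

```python
import sqlite3


def quote_terms(query):
    """Wrap each whitespace-separated term in double quotes.

    Simplified stand-in for search-term escaping: a quoted term is treated
    as literal text, so a trailing * loses its special meaning.
    """
    return " ".join(
        '"{}"'.format(term.replace('"', '""')) for term in query.split()
    )


def search(conn, fts_query):
    """Return the species whose indexed text matches an FTS5 query string."""
    rows = conn.execute(
        "SELECT species FROM genomes_fts WHERE genomes_fts MATCH ? "
        "ORDER BY species",
        (fts_query,),
    ).fetchall()
    return [row[0] for row in rows]


# Toy in-memory index (requires an SQLite build with FTS5 enabled).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE genomes_fts USING fts5(species, lineage)")
conn.executemany(
    "INSERT INTO genomes_fts VALUES (?, ?)",
    [
        ("Nematostella vectensis", "Eukaryota Metazoa Cnidaria Anthozoa"),
        ("Homo sapiens", "Eukaryota Metazoa Chordata Mammalia"),
    ],
)

whole_word = search(conn, quote_terms("cnidaria"))  # complete word
wildcard = search(conn, quote_terms("cnidar*"))     # * is not a wildcard here
raw_prefix = search(conn, "cnidar*")                # unescaped prefix query
print(whole_word, wildcard, raw_prefix)
```

In this sketch the quoted `cnidar*` is tokenized to the word `cnidar`, which does not occur in the index, whereas the unescaped form is interpreted as a prefix query.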

Obtaining a local copy of the DB

The SQLite database file was created using sqlite-utils and Datasette.

You are free to download the entire WtMTA database file via the link at the bottom of this page. After doing so, you can recreate most of the functionality of this website on a local computer/server.

To explore a local version of this database using Datasette, first install Datasette:

python3 -m pip install datasette

Then, run Datasette with the SQLite database file:

datasette -i WtMTA.db

This command will start a local web server (the default URL will be displayed by Datasette automatically), and you can explore the database interactively using your web browser. See Datasette’s documentation for details and additional options.
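Because the download is an ordinary SQLite file, it can also be queried without Datasette, for example with Python's built-in `sqlite3` module. The following is a minimal sketch, not a description of the actual schema: it assumes an `introns` table carrying the `dinucleotide_pair` and `taxonomy_id` columns seen in the example query results, and it runs against a small in-memory stand-in so that it is self-contained.

```python
import sqlite3


def dinucleotide_counts(conn, taxonomy_id):
    """Count introns per terminal dinucleotide pair for one taxonomy ID."""
    rows = conn.execute(
        "SELECT dinucleotide_pair, COUNT(*) FROM introns "
        "WHERE taxonomy_id = ? "
        "GROUP BY dinucleotide_pair "
        "ORDER BY COUNT(*) DESC, dinucleotide_pair",
        (taxonomy_id,),
    ).fetchall()
    return dict(rows)


# In-memory stand-in with an assumed, reduced schema and a few rows copied
# from the example query results (id, pair, is_minor, length, taxonomy_id).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE introns (id INTEGER PRIMARY KEY, dinucleotide_pair TEXT, "
    "is_minor INTEGER, length INTEGER, taxonomy_id INTEGER)"
)
conn.executemany(
    "INSERT INTO introns VALUES (?, ?, ?, ?, ?)",
    [
        (1, "GT-AG", 0, 359, 3816),
        (2, "GT-AG", 0, 727, 3816),
        (5, "TG-TT", 0, 941, 3816),
        (6, "GC-AG", 0, 930, 3816),
    ],
)
counts = dinucleotide_counts(conn, 3816)
print(counts)
```

To run the same function against the downloaded file, the connection could instead be opened read-only with `sqlite3.connect("file:WtMTA.db?mode=ro", uri=True)`, after checking the real table and column names.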

Data license: ODbL · Data source: Larue & Roy, 2023

Custom SQL query returning 101 rows

id dinucleotide_pair is_minor score length transcript_id ordinal_index start end taxonomy_id scored_motifs phase in_cds relative_position
1 GT-AG 0 0.0195327585023065 359 1 2 4535803 4536161 3816 AGG|GTATGCTACA...CTTTCCATATTT/ATATTTCTAACT...ATTAG|GTG 1 1 64.714
2 GT-AG 0 1.700451944193456e-05 727 1 3 4536979 4537705 3816 TGG|GTAGGATATC...ATTCCTTTATTT/TATTCCTTTATT...AACAG|CTA 2 1 86.57
3 GT-AG 0 0.0004265006076337 272 1 4 4537997 4538268 3816 CAG|GTAACTGTTC...TTAGCTTAAACT/TATTTTGTTATT...TGTAG|CAT 2 1 94.355
4 GT-AG 0 0.0006319036177959 596 2 1 64482 65077 3816 ATA|GTATGATAGA...GATCCTATATTT/CCTATATTTACA...CAAAG|TCA 0 1 52.484
5 TG-TT 0 1.000000099473604e-05 941 2 2 65310 66250 3816 TAC|TGGACGGAAT...AAATTCTCATAT/GAAATTCTCATA...ACCTT|AGT 1 1 60.49
6 GC-AG 0 1.000000099473604e-05 930 2 3 66890 67819 3816 AAG|GCATTTCTGG...ATTATCATGTCC/AAAATTATCATG...CGGAG|GAT 1 1 82.54
7 GT-AG 0 0.0006319036177959 596 3 1 169072 169667 3816 ATA|GTATGATAGA...GATCCTATATTT/CCTATATTTACA...CAAAG|TCA 0 1 52.484
8 TG-TT 0 1.000000099473604e-05 941 3 2 169900 170840 3816 TAC|TGGACGGAAT...AAATTCTCATAT/GAAATTCTCATA...ACCTT|AGT 1 1 60.49
9 GC-AG 0 1.000000099473604e-05 931 3 3 171480 172410 3816 AAG|GCATTTCTGG...ATTATCATGTCC/AAAATTATCATG...CGGAG|GAT 1 1 82.54
10 GT-AG 0 0.0006319036177959 596 4 1 125785 126380 3816 ATA|GTATGATAGA...GATCCTATATTT/CCTATATTTACA...CAAAG|TCA 0 1 52.484
11 TG-TT 0 1.000000099473604e-05 941 4 2 124612 125552 3816 TAC|TGGACGGAAT...AAATTCTCATAT/GAAATTCTCATA...ACCTT|AGT 1 1 60.49
12 GC-AG 0 1.000000099473604e-05 931 4 3 123042 123972 3816 AAG|GCATTTCTGG...ATTATCATGTCC/AAAATTATCATG...CGGAG|GAT 1 1 82.54
13 GT-AG 0 0.0006319036177959 596 5 1 2317730 2318325 3816 ATA|GTATGATAGA...GATCCTATATTT/CCTATATTTACA...CAAAG|TCA 0 1 52.484
14 TG-TT 0 1.000000099473604e-05 942 5 2 2318558 2319499 3816 TAC|TGGACGGAAT...AAATTCTCATAT/GAAATTCTCATA...ACCTT|AGT 1 1 60.49
15 GC-AG 0 1.000000099473604e-05 931 5 3 2320139 2321069 3816 AAG|GCATTTCTGG...ATTATCATGTCC/AAAATTATCATG...CGGAG|GAT 1 1 82.54
16 GT-AG 0 0.0006319036177959 596 6 1 717093 717688 3816 ATA|GTATGATAGA...GATCCTATATTT/CCTATATTTACA...CAAAG|TCA 0 1 52.484
17 TG-TT 0 1.000000099473604e-05 941 6 2 717921 718861 3816 TAC|TGGACGGAAT...AAATTCTCATAT/GAAATTCTCATA...ACCTT|AGT 1 1 60.49
18 GC-AG 0 1.000000099473604e-05 930 6 3 719501 720430 3816 AAG|GCATTTCTGG...ATTATCATGTCC/AAAATTATCATG...CGGAG|GAT 1 1 82.54
19 GT-AG 0 0.0006319036177959 596 7 1 770836 771431 3816 ATA|GTATGATAGA...GATCCTATATTT/CCTATATTTACA...CAAAG|TCA 0 1 52.484
20 TG-TT 0 1.000000099473604e-05 941 7 2 769663 770603 3816 TAC|TGGACGGAAT...AAATTCTCATAT/GAAATTCTCATA...ACCTT|AGT 1 1 60.49
21 GC-AG 0 1.000000099473604e-05 931 7 3 768093 769023 3816 AAG|GCATTTCTGG...ATTATCATGTCC/AAAATTATCATG...CGGAG|GAT 1 1 82.54
22 GT-AG 0 0.0006319036177959 596 8 1 390565 391160 3816 ATA|GTATGATAGA...GATCCTATATTT/CCTATATTTACA...CAAAG|TCA 0 1 52.484
23 TG-TT 0 1.000000099473604e-05 941 8 2 391393 392333 3816 TAC|TGGACGGAAT...AAATTCTCATAT/GAAATTCTCATA...ACCTT|AGT 1 1 60.49
24 GC-AG 0 1.000000099473604e-05 931 8 3 392973 393903 3816 AAG|GCATTTCTGG...ATTATCATGTCC/AAAATTATCATG...CGGAG|GAT 1 1 82.54
25 GT-AG 0 0.0003660273703047 58 9 1 438521 438578 3816 TAG|GTAGCATAGA...TGATCGTTGAAA/TGATAATTGATA...AACAG|AAA 1 1 42.721
26 GT-AG 0 1.000000099473604e-05 1284 9 2 436703 437986 3816 TAC|GTGCGATTCG...GGTTCTGTAATA/AATGTTATCACC...ACTAG|GAT 1 1 62.303
27 GT-AG 0 0.0021405538438473 85 9 3 436120 436204 3816 TCG|GTAATCTATT...ATATTCATAACT/TCTATATTCATA...AAGAG|AAA 1 1 80.565
28 GT-AG 0 0.0003660273703047 58 10 1 556368 556425 3816 TAG|GTAGCATAGA...TGATCGTTGAAA/TGATAATTGATA...AACAG|AAA 1 1 42.721
29 GT-AG 0 1.000000099473604e-05 1284 10 2 554550 555833 3816 TAC|GTGCGATTCG...GGTTCTGTAATA/AATGTTATCACC...ACTAG|GAT 1 1 62.303
30 GT-AG 0 0.0021405538438473 85 10 3 553967 554051 3816 TCG|GTAATCTATT...ATATTCATAACT/TCTATATTCATA...AAGAG|AAA 1 1 80.565
31 GT-AG 0 0.0003660273703047 58 11 1 257174 257231 3816 TAG|GTAGCATAGA...TGATCGTTGAAA/TGATAATTGATA...AACAG|AAA 1 1 42.721
32 GT-AG 0 1.000000099473604e-05 1284 11 2 255356 256639 3816 TAC|GTGCGATTCG...GGTTCTGTAATA/AATGTTATCACC...ACTAG|GAT 1 1 62.303
33 GT-AG 0 0.0021405538438473 85 11 3 254773 254857 3816 TCG|GTAATCTATT...ATATTCATAACT/TCTATATTCATA...AAGAG|AAA 1 1 80.565
34 GT-AG 0 0.0003660273703047 58 12 1 2018861 2018918 3816 TAG|GTAGCATAGA...TGATCGTTGAAA/TGATAATTGATA...AACAG|AAA 1 1 42.721
35 GT-AG 0 1.000000099473604e-05 1284 12 2 2019453 2020736 3816 TAC|GTGCGATTCG...GGTTCTGTAATA/AATGTTATCACC...ACTAG|GAT 1 1 62.303
36 GT-AG 0 0.0021405538438473 85 12 3 2021235 2021319 3816 TCG|GTAATCTATT...ATATTCATAACT/TCTATATTCATA...AAGAG|AAA 1 1 80.565
37 GT-AG 0 1.000000099473604e-05 537 13 1 4319949 4320485 3816 CAG|GTTATGCTTT...CTTGCCTTGCAC/ACATTTGTAACC...GATAG|GCA 2 1 50.8
38 GT-AG 0 1.452863356426882e-05 1455 13 2 4318326 4319780 3816 TAG|GTAAGCCTTA...AATGTCTAATCA/TAATGTCTAATC...TTTAG|AAC 2 1 58.063
39 GT-AG 0 1.000000099473604e-05 90 13 3 4318070 4318159 3816 AAA|GTGAGTGTAG...ACATTCTTACTA/TTCTTACTAATT...TGTAG|GCT 0 1 65.24
40 GT-AG 0 0.004745601705285 316 13 4 4317637 4317952 3816 AAG|GTATACATGT...TAATCTTTATGT/ATAATCTTTATG...TGTAG|GTT 0 1 70.298
41 GT-AG 0 1.000000099473604e-05 104 13 5 4317339 4317442 3816 CAG|GTGACGAAGA...GGATTCTTATAT/TGGATTCTTATA...CTCAG|GCA 2 1 78.686
42 GT-AG 0 0.0035287938522452 104 13 6 4317104 4317207 3816 CTG|GTACTTTTCT...ATATTCTTAAGT/TGGCTTTTCAAT...CTCAG|GTG 1 1 84.349
43 GT-AG 0 1.000000099473604e-05 632 13 7 4316399 4317030 3816 CAG|GTCACACGTC...ATAACTTTAGTT/ACTTTAGTTATG...TATAG|ATG 2 1 87.505
44 GT-AG 0 0.0100857315324956 579 14 1 2298378 2298956 3816 CGA|GTAACCGGCT...CATGCCATAAAA/CCATGTCTCATG...TTTAG|AAA 2 1 3.427
45 GT-AG 0 0.0006319036177959 596 14 2 2299750 2300345 3816 ATA|GTATGATAGA...GATCCTATATTT/CCTATATTTACA...CAAAG|TCA 0 1 38.718
46 TG-TT 0 1.000000099473604e-05 941 14 3 2300578 2301518 3816 TAC|TGGACGGAAT...AAATTCTCATAT/GAAATTCTCATA...ACCTT|AGT 1 1 49.043
47 GC-AG 0 1.000000099473604e-05 930 14 4 2302158 2303087 3816 AAG|GCATTTCTGG...ATTATCATGTCC/AAAATTATCATG...CGGAG|GAT 1 1 77.481
48 GT-AG 0 0.1261554502439858 501 15 1 4540210 4540710 3816 TGG|GTACACTTCC...ATGCCTTTAATT/TATTTGTTGATG...TGCAG|GCT 2 1 6.358
49 GT-AG 0 1.000000099473604e-05 216 15 2 4540808 4541023 3816 CAG|GTGATTTTCC...GTTACTTTATCA/TTTTTCCTCATT...TTCAG|ATT 0 1 10.763
50 GT-AG 0 1.000000099473604e-05 450 15 3 4541178 4541627 3816 CAG|GTGAAAAAAT...TTGTTATTACCC/TAGATTCTAATA...TGCAG|GAA 1 1 17.757
51 GT-AG 0 3.3217703208816803e-05 147 15 4 4542752 4542898 3816 CAA|GTACGTAAAA...CATTGCTTAACA/TGTTTACTCATT...TACAG|GGG 0 1 68.801
52 GT-AG 0 1.000000099473604e-05 478 15 5 4543367 4543844 3816 AAG|GTACGGGTTC...GTATTATTAATA/GTATTATTAATA...TGCAG|CTA 0 1 90.054
53 GT-AG 0 4.215167928265068 180 16 1 2724536 2724715 3816 CCC|GTATGCTCTT...TTTTTTTTAAAC/TTTAAACTTATT...ATTAG|GGA 2 1 11.63
54 GT-AG 0 1.000000099473604e-05 310 16 2 2724846 2725155 3816 GAG|GTCTAACTGA...GTTTTCTTGTAC/GTGTTGCTCATT...TTCAG|GTA 0 1 17.582
55 GT-AG 0 0.0091537172203787 1024 16 3 2725428 2726451 3816 TTG|GTATGCTGTG...GCTCTGTTATTT/CTGTTATTTATG...TGTAG|GTT 2 1 30.037
56 GT-AG 0 1.000000099473604e-05 4674 16 4 2726657 2731330 3816 CAG|GTTGATCTCT...CATTCCTTAATC/TCCTTTTTCATT...ATCAG|GAT 0 1 39.423
57 GT-AG 0 1.000000099473604e-05 109 16 5 2731372 2731480 3816 CAG|GTGAAAATTC...GTTTCCTTTTTT/TCCTTTTTTACT...TGCAG|TAG 2 1 41.3
58 GA-AA 0 0.0001975326584514 219 16 6 2731604 2731822 3816 TCT|GAAGTATGCA...GGAATTATGACT/ATGACTCTCATT...TTTAA|TAG 2 1 46.932
59 GT-AG 0 1.000000099473604e-05 142 16 7 2731900 2732041 3816 CAG|GTAAGTTAAA...TTTAGTTTAATC/TTTAATCTGATT...TACAG|GTG 1 1 50.458
60 GT-AG 0 6.363193456617576e-05 4499 16 8 2732263 2736761 3816 CAG|GTATAAACTC...CTTTCCTTCATA/CTTTCCTTCATA...TTTAG|GAT 0 1 60.577
61 GT-AG 0 2.741820703946869e-05 110 16 9 2736857 2736966 3816 CCG|GTAAGTATTA...ATTCTTTTGATT/ATTCTTTTGATT...TGCAG|TTG 2 1 64.927
62 GT-AG 0 0.0011511988583908 391 16 10 2737184 2737574 3816 ATT|GTAAGTTATT...ACTACTTTGACT/CTTTGACTGATG...CACAG|CCA 0 1 74.863
63 GT-AG 0 1.000000099473604e-05 82 16 11 2737701 2737782 3816 AAG|GTGGGCATAC...AGTACTTTGATA/TGAATATTTACC...ATCAG|GTT 0 1 80.632
64 GT-AG 0 1.000000099473604e-05 5052 16 12 2737882 2742933 3816 AAG|GTAAATACTT...TGTGTGTTGACA/ATTCTACTAAAT...GATAG|GTG 0 1 85.165
65 GT-AG 0 1.152183230571476e-05 87 16 13 2743147 2743233 3816 AGG|GTACTGAGTA...TATTTTTTATTG/CTATTTTTTATT...TATAG|GTT 0 1 94.918
66 GT-AG 0 1.8176032030235344e-05 87 17 1 4639987 4640073 3816 ATG|GTAAGCCTAA...TAGTTCATGATG/TAGTAGTTCATG...TACAG|AAA 0 1 0.161
67 GT-AG 0 2.4546189899519868e-05 93 17 2 4640153 4640245 3816 TTG|GTAAGCATAA...TCTTCTTTGTTC/ACTATGTTCATC...TGCAG|TGT 1 1 4.387
68 GT-AG 0 1.000000099473604e-05 123 17 3 4640464 4640586 3816 ATG|GTAAGTGCTA...TTGTTTCTGATG/TTGTTTCTGATG...TATAG|GCT 0 1 16.051
69 GT-AG 0 1.000000099473604e-05 89 17 4 4641010 4641098 3816 CAG|GTTTGTCACC...ACTATTTTGAAG/ACCAAACTAACT...TACAG|GTT 0 1 38.684
70 GT-AG 0 1.00316075406751e-05 745 17 5 4641215 4641959 3816 GAG|GTGACTCTCT...AATGTATTAAAA/AATGTATTAAAA...ATCAG|GTT 2 1 44.89
71 GT-AG 0 16.558560639924906 300 18 2 4127615 4127914 3816 CAA|GTACCCTCAT...CTTTTCTTATTT/TCTTTTCTTATT...GACAG|AGA 2 1 13.464
72 GT-AG 0 0.010374023981751 124 18 3 4128169 4128292 3816 ATG|GTACACTGTC...TTAATTTTGAAT/TTAATTTTGAAT...GGCAG|GGA 1 1 29.009
73 GT-AG 0 1.000000099473604e-05 330 18 4 4128508 4128837 3816 TCA|GTGAGTTTAT...GCATTTTTACTG/TTTTTACTGATC...TGCAG|ATC 0 1 42.166
74 GT-AG 0 2.4430797447365505e-05 93 18 5 4128862 4128954 3816 TAT|GTAAGTGTTC...AAGCCCTTTTTG/ATATATATAAGC...TGCAG|TCA 0 1 43.635
75 GT-AG 0 1.000000099473604e-05 89 18 6 4129136 4129224 3816 CAG|GTGATCTAAA...ATGTCTTTATTG/CATGTCTTTATT...TGCAG|CTG 1 1 54.712
76 GT-AG 0 0.0019818733498083 91 18 7 4129445 4129535 3816 ATG|GTATAGTGCC...GATTCCTGAACT/CTTTTTCTGATT...TACAG|TTT 2 1 68.176
77 GT-AG 0 1.3567089716023494e-05 188 18 8 4129676 4129863 3816 TGA|GTTTTGAGAA...CATGGTTTGACC/CATGGTTTGACC...AGGAG|TCC 1 1 76.744
78 GT-AG 0 0.0001978211466595 796 18 9 4130044 4130839 3816 TTG|GTACGTATTA...ATTATTTTATTA/TATTATTTTATT...TTCAG|GGC 1 1 87.76
79 GT-AG 0 1.000000099473604e-05 5381 18 10 4130884 4136264 3816 GTG|GTAAGAAATA...GCTATTTTATTT/TGCTATCTAATT...TCCAG|TCA 0 1 90.453
80 GT-AG 0 1.000000099473604e-05 307 18 11 4136328 4136634 3816 CAG|GTAATAATAC...TTCCTTTTATTA/CTTTTATTAACA...ATCAG|GTT 0 1 94.308
81 GT-AG 0 0.0001027454813397 873 19 1 2745520 2746392 3816 TAG|GTAACGGTAT...CTGTTTTTGACA/CTGTTTTTGACA...CAAAG|GAA 2 1 28.766
82 GT-AG 0 1.000000099473604e-05 371 19 2 2746701 2747071 3816 AAG|GTTAGAAACC...ACTTTCTTATTT/AACTTTCTTATT...AACAG|TTT 1 1 49.761
83 GT-AG 0 1.000000099473604e-05 79 19 3 2747227 2747305 3816 AAG|GTGCGTGGTG...ATAATTTGAATA/CTTGTACTGATC...TGCAG|ATT 0 1 60.327
84 GT-AG 0 1.000000099473604e-05 2477 19 4 2747624 2750100 3816 AAG|GTTAGCATTT...TTTGCTTTCCCT/GAATGTTTCATT...TGTAG|GTT 0 1 82.004
85 GT-AG 0 1.000000099473604e-05 115 19 5 2750248 2750362 3816 CAG|GTTATATATC...TTCTTGTTGATT/TTCTTGTTGATT...TCCAG|ATA 0 1 92.025
86 GT-AG 0 1.000000099473604e-05 456 19 6 2750414 2750869 3816 CAG|GTAACGGGAA...ATGTGCTTAATT/ATGTGCTTAATT...TTCAG|CTT 0 1 95.501
87 GT-AG 0 0.0156640026816781 156 20 1 2699783 2699938 3816 GAC|GTCTTCAAGA...GTCCCTTTGACT/CTTTGACTGAAG...TGCAG|AGA 0 1 30.435
88 TG-TT 0 1.000000099473604e-05 942 21 1 2338552 2339493 3816 TAC|TGGACGGAAT...AAATTCTCATAT/GAAATTCTCATA...ACCTT|AGT 1 1 15.214
89 CT-AC 0 3.944137525664904e-05 967 21 2 2336940 2337906 3816 TTT|CTGGTCCTTT...CCTATCCTAATA/TAAGAATTCACC...AAAAC|CTG 1 1 64.526
90 TG-TT 0 1.000000099473604e-05 942 22 1 2327541 2328482 3816 TAC|TGGACGGAAT...AAATTCTCATAT/GAAATTCTCATA...ACCTT|AGT 1 1 15.214
91 CT-AC 0 3.944137525664904e-05 968 22 2 2325928 2326895 3816 TTT|CTGGTCCTTT...CCTATCCTAATA/TAAGAATTCACC...AAAAC|CTG 1 1 64.526
92 GT-AG 0 1.000000099473604e-05 124 23 1 69508 69631 3816 CGT|GTGTAAAATT...TATATATTACTA/ATATATATTACT...TACAG|TGG 0 1 4.233
93 GT-AG 0 8.834621956367143e-05 75 23 2 68853 68927 3816 GGA|GTTTAACTCC...ATTCCCTTTTTA/GATTTATTTATT...TGCAG|AAG 1 1 10.207
94 GT-AG 0 1.000000099473604e-05 124 24 1 21841 21964 3816 CGT|GTGTAAAATT...TATATATTACTA/ATATATATTACT...TACAG|TGG 0 1 31.713
95 GT-AG 0 8.834621956367143e-05 75 24 2 22545 22619 3816 GGA|GTTTAACTCC...ATTCCCTTTTTA/GATTTATTTATT...TGCAG|AAG 1 1 76.466
96 GT-AG 0 1.000000099473604e-05 124 25 1 174098 174221 3816 CGT|GTGTAAAATT...TATATATTACTA/ATATATATTACT...TACAG|TGG 0 1 31.713
97 GT-AG 0 8.834621956367143e-05 75 25 2 173443 173517 3816 GGA|GTTTAACTCC...ATTCCCTTTTTA/GATTTATTTATT...TGCAG|AAG 1 1 76.466
98 GT-AG 0 1.000000099473604e-05 124 26 1 2344664 2344787 3816 CGT|GTGTAAAATT...TATATATTACTA/ATATATATTACT...TACAG|TGG 0 1 31.713
99 GT-AG 0 8.834621956367143e-05 75 26 2 2344009 2344083 3816 GGA|GTTTAACTCC...ATTCCCTTTTTA/GATTTATTTATT...TGCAG|AAG 1 1 76.466
100 GT-AG 0 1.000000099473604e-05 124 27 1 200292 200415 3816 CGT|GTGTAAAATT...TATATATTACTA/ATATATATTACT...TACAG|TGG 0 1 31.713
101 GT-AG 0 8.834621956367143e-05 75 27 2 199637 199711 3816 GGA|GTTTAACTCC...ATTCCCTTTTTA/GATTTATTTATT...TGCAG|AAG 1 1 76.466
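
Rows like those above can be parsed and sanity-checked with a few lines of Python. The sketch below splits one whitespace-separated row into the 14 columns and checks two relationships that hold for the rows shown: `length` equals `end - start + 1`, and `dinucleotide_pair` matches the first and last two bases of the sequence between the `|` markers in `scored_motifs`. Treating `is_minor` and `in_cds` as 0/1 flags is an assumption, as is the reading of the `scored_motifs` layout. The names `Intron`, `parse_row`, and `terminal_dinucleotides` are introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Intron:
    id: int
    dinucleotide_pair: str
    is_minor: bool
    score: float
    length: int
    transcript_id: int
    ordinal_index: int
    start: int
    end: int
    taxonomy_id: int
    scored_motifs: str
    phase: int
    in_cds: bool
    relative_position: float


def parse_row(line):
    """Parse one whitespace-separated result row into an Intron."""
    f = line.split()
    if len(f) != 14:
        raise ValueError(f"expected 14 fields, got {len(f)}")
    return Intron(
        int(f[0]), f[1], f[2] == "1", float(f[3]), int(f[4]), int(f[5]),
        int(f[6]), int(f[7]), int(f[8]), int(f[9]), f[10], int(f[11]),
        f[12] == "1", float(f[13]),
    )


def terminal_dinucleotides(scored_motifs):
    """Return 'XX-YY' from the first and last two bases between the '|' marks."""
    inner = scored_motifs.split("|")[1]
    return f"{inner[:2]}-{inner[-2:]}"


row = parse_row(
    "1 GT-AG 0 0.0195327585023065 359 1 2 4535803 4536161 3816 "
    "AGG|GTATGCTACA...CTTTCCATATTT/ATATTTCTAACT...ATTAG|GTG 1 1 64.714"
)
length_ok = row.length == row.end - row.start + 1
pair_ok = terminal_dinucleotides(row.scored_motifs) == row.dinucleotide_pair
print(length_ok, pair_ok)
```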