Supporting data for "Chromosome‐level genome assembly of the black widow spider Latrodectus elegans illuminates composition and evolution of venom and silk proteins"
We assembled the L. elegans genome using Nanopore long reads. The assembly is 1.57 Gb in size, with a contig N50 of 4.34 Mb and a scaffold N50 of 114.31 Mb. Hi-C scaffolding assigned 98.08% of the genome to 14 pseudo-chromosomes, and BUSCO (Benchmarking Universal Single-Copy Orthologs) completeness analysis showed that 98.4% of the core eukaryotic genes were completely present in the assembly. Annotation identified 506.09 Mb (32.30%) of repetitive sequences and 20,167 protein-coding genes, 81.03% of which have functional annotation terms. Phylogenetic analysis showed that L. elegans is closely related to the house spider Parasteatoda tepidariorum (Theridiidae) and that the two species diverged from a common ancestor ∼73.0 million years ago. The relatively high evolutionary rate suggests that L. elegans evolved under strong selection pressure. Based on genome-wide comparative analysis, we identified 39 toxin proteins and 26 spidroins. Among the toxins, latrotoxins experienced substantial gene duplication and diversification in the two Theridiidae spiders. L. elegans latrotoxin genes had higher Ka/Ks ratios than those in other species, suggesting that they evolved rapidly. We also found that L. elegans has remarkably more MiSp (minor ampullate spidroin) genes and that tandem duplication is the main duplication event among the spidroins.
The high-quality L. elegans genome assembly illuminates the composition and evolution of venom and silk proteins and provides a resource for their in-depth exploration and application.
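For readers less familiar with the contiguity metrics quoted above, the following is a minimal sketch of how an N50 value is conventionally computed from a list of contig or scaffold lengths. It illustrates the standard definition only and is not the pipeline used for this assembly; the function name `n50` and the toy lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example with made-up lengths (not data from this assembly).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))
```

Applied to the contig lengths of an assembly this gives the contig N50, and applied to the scaffold lengths it gives the scaffold N50.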