Supporting data for "Chromosomal-level assembly of the blood clam, Scapharca (Anadara) broughtonii, using long sequence reads and Hi-C"
A total of 75.79 Gb of clean data was generated on the PacBio and Oxford Nanopore platforms, representing approximately 86× coverage of the S. broughtonii genome. De novo assembly of these long reads produced an 884.5 Mb genome with a contig N50 of 1.80 Mb and a scaffold N50 of 45.00 Mb. Hi-C scaffolding resulted in 19 chromosomes containing 99.35% of the bases in the assembled genome. Genome annotation revealed that a considerable part of the genome (46.1%) is composed of repetitive sequences; 24,045 protein-coding genes were predicted, of which 84.7% were annotated.
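As an illustration of how the coverage and N50 figures above are conventionally derived, here is a minimal Python sketch. It assumes coverage is total sequenced bases divided by genome size, and it uses the 884.5 Mb assembly size as the denominator; the original analysis may have used an estimated genome size instead. The function names and the toy contig lengths are hypothetical and are not taken from the dataset.

```python
from typing import Iterable


def coverage(total_bases: float, genome_size: float) -> float:
    """Sequencing depth: total sequenced bases divided by genome size."""
    return total_bases / genome_size


def n50(lengths: Iterable[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() requires at least one sequence length")


# Assumption: the assembly size (884.5 Mb) stands in for the genome size.
depth = coverage(75.79e9, 884.5e6)
print(f"approximate coverage: {depth:.1f}x")

# Toy contig lengths for illustration only (not real data).
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print("toy N50:", n50(toy_contigs))
```

The same n50 function applies to contigs and scaffolds alike; only the input list of sequence lengths changes.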
We report here the chromosomal-level assembly of the S. broughtonii genome based on long-read sequencing and Hi-C scaffolding. The genomic data can serve as a reference genome for the family Arcidae and will provide a valuable resource for the scientific community and the aquaculture sector.