Supporting data for "Multi-omic dataset of patient-derived tumor organoids of neuroendocrine neoplasms"
We have generated the first multi-omic dataset (whole-genome sequencing, WGS, and RNA sequencing, RNA-seq) of patient-derived tumor organoids (PDTOs) from the rare and understudied pulmonary neuroendocrine tumors (n = 12; 6 grade 1 and 6 grade 2), and we provide data from other rare neuroendocrine neoplasms: small intestine (ileal) neuroendocrine tumors (n = 6; 2 grade 1 and 4 grade 2) and large-cell neuroendocrine carcinomas (n = 5; 1 pancreatic and 4 pulmonary). The dataset includes a matched sample of the parental tumor (primary tumor or metastasis) for most PDTOs (21/23) and longitudinal sampling of the PDTOs (1 to 2 time points), for a total of n = 47 RNA-seq and n = 33 WGS samples. Here, we provide quality control results for each technique, together with the raw and processed data and all scripts used for the genomic analyses, to facilitate reuse of the data. In addition, we report gene expression data and somatic small variant calls and describe how they were generated, in particular how we used WGS somatic calls to train a random-forest classifier to detect variants in tumor-only RNA-seq data. We also report all histopathological images used for medical diagnosis: hematoxylin and eosin-stained slides, brightfield images, and immunohistochemistry images of clinically relevant protein markers.
This dataset will be critical to future studies relying on this PDTO biobank, such as drug screens for novel therapies and experiments investigating the mechanisms of carcinogenesis in these understudied diseases.