Abstract: Cell lines are the most widely used model system in cancer research. The transcriptomic data of established prostate cancer (PCa) cell lines help researchers explore differential gene expression across the various PCa cell lines. Through large-scale data mining, we established a curated Combined Transcriptome dataset of PCa Cell lines (CTPC), which contains the transcriptomic data of 1840 samples from 9 commonly used PCa cell lines: LNCaP, LNCaP-95, LNCaP-abl, C4-2, VCaP, 22Rv1, PC3, DU145, and NCI-H660. The CTPC dataset allows researchers not only to compare gene expression across different PCa cell lines but also to retrieve the experiment information and associate the differential gene expression data with metadata, such as gene manipulation and drug treatment information. Additionally, based on the CTPC dataset, we built a platform for users to visualize the data. It is our hope that the combined CTPC dataset and the user-friendly platform will be of great service to the PCa research community.
Abstract: Androgen deprivation therapy has improved patient survival. Nevertheless, treatment resistance inevitably emerges due to the complex interplay of tumor heterogeneity and lineage plasticity. We integrated scRNAseq data from multiple studies, comprising both publicly available cohorts and data generated by our research team, and established the HuPSA (Human Prostate Single cell Atlas) and MoPSA (Mouse Prostate Single cell Atlas) datasets. Through unsupervised clustering and manual annotation, we found that both atlases consist of previously known cell clusters, including prostate adenocarcinoma (AdPCa), neuroendocrine prostate cancer (NEPCa), stromal, and immune cell populations. Our analysis also uncovered less well-described populations, including MMP7+ normal prostate club cells and two novel lineage-plastic cancerous populations, namely Progenitor-like and KRT7. Immunohistochemical staining confirmed the presence of these populations in both human and mouse PCa tissues, reinforcing their significance in PCa pathobiology. To unravel the molecular drivers of these distinct cell populations, we explored the upstream regulators of the genes enriched in these cells. Furthermore, employing HuPSA-based deconvolution, we analyzed over one thousand human PCa bulk RNAseq samples and reclassified them into different molecular subtypes, including the newly discovered KRT7 and Progenitor-like categories. Moreover, employing supervised dimensional reduction and label transfer techniques, we projected the scRNAseq data derived from C4-2B xenograft tumors onto HuPSA. Our analysis identified diverse C4-2B-derived subpopulations, including NEPCa. The HuPSA & MoPSA app (https://pcatools.shinyapps.io/HuPSA-MoPSA/) has been launched for users to visualize and quantify gene expression in both atlases at the single-cell level.
Similarly, we launched the PCaAtlas app (https://pcatools.shinyapps.io/ProAtlas_dev/) for users to visualize gene expression in reclassified human prostatic tissue samples. In conclusion, our study lays out a roadmap of PCa progression, showcasing the development of heterogeneous populations and the involvement of lineage plasticity. This understanding holds promise for guiding the development of precision medicine in the PCa field. Additionally, HuPSA and MoPSA provide valuable blueprints for analyzing and interpreting user-generated PCa single-cell RNAseq data.
A substantial volume of RNA sequencing data has been generated from cancer cell lines. However, comparing gene expression levels across cell lines requires specific bioinformatics skills, which has hindered non-bioinformaticians from fully utilizing these valuable datasets in their research. To bridge this gap, we established a curated Pan-cancer Cell Line Transcriptome Atlas (PCTA) dataset. This resource aims to provide a user-friendly platform that allows researchers without extensive bioinformatics expertise to access and leverage the information within the dataset for their studies. The PCTA dataset encompasses the expression matrix of 24,965 genes, with data from 84,385 samples derived from 5,677 studies. This compilation spans 535 cell lines, representing 114 cancer types originating from 30 tissue types. On UMAP plots, cell lines originating from the same type of tissue tend to cluster together, illustrating the dataset's ability to capture biological relationships. Additionally, an interactive web application (https://pcatools.shinyapps.io/PCTA_app/) was developed for researchers to explore the PCTA dataset. This platform allows users to examine the expression patterns of their genes of interest across a diverse array of samples.
Department of Biochemistry & Molecular Biology
Phone: (318) 675-5160
Fax: (318) 675-5180
Mailing Address:
Department of Biochemistry & Molecular Biology
LSU Health Shreveport
1501 Kings Highway
Shreveport, LA 71103
Contact the PhD Program
David Gross, PhD
Professor of Biochemistry and Molecular Biology, Graduate Program Director
david.gross@lsuhs.edu