Background & Aims
Since 1996, public data-sharing policies have led to a vast accumulation of genetic data online. These data span multiple omics-wide studies and have enabled significant progress in genetics and in understanding the molecular pathophysiology of pain conditions. However, the format of these public data is a barrier to researchers who lack the skills and tools to convert them into meaningful information, and even for those with the skills, analysing many datasets remains laborious. We built the Transcriptomics Pain Signatures Database, a collection of fully processed transcriptomics datasets that enables searching for differentially expressed genes in pain conditions. The database is available through a website that also supports meta-analysis and visualization of these datasets to identify further genes of interest.
Methods
Datasets were collected from the Gene Expression Omnibus (GEO). They were identified by searching for “pain” in GEO DataSets and restricting the results to expression profiling by array or by high-throughput sequencing in the organisms Mus musculus, Rattus norvegicus, Sus scrofa, or Homo sapiens. The abstract of each study was then read to verify eligibility.
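As an illustration of the search step, the sketch below assembles a GEO DataSets query and the corresponding NCBI E-utilities request URL. The exact query string used for the database is not given here, so the field tags, the `retmax` value, and the helper names `build_geo_term` and `build_esearch_url` are assumptions for illustration only. The code builds the URL but does not contact NCBI.

```python
from urllib.parse import urlencode

# Organisms and dataset types named in the Methods. The way they are combined
# into one query is an assumption, not the authors' recorded search string.
ORGANISMS = ["Mus musculus", "Rattus norvegicus", "Sus scrofa", "Homo sapiens"]
DATASET_TYPES = [
    "expression profiling by array",
    "expression profiling by high throughput sequencing",
]


def build_geo_term(keyword, organisms, dataset_types):
    """Combine a keyword with OR-ed organism and dataset-type filters."""
    org = " OR ".join(f'"{o}"[Organism]' for o in organisms)
    typ = " OR ".join(f'"{t}"[DataSet Type]' for t in dataset_types)
    return f"{keyword} AND ({org}) AND ({typ})"


def build_esearch_url(term, retmax=500):
    """Return an E-utilities esearch URL for the GEO DataSets ('gds') database."""
    base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    return base + "?" + urlencode({"db": "gds", "term": term, "retmax": retmax})


term = build_geo_term("pain", ORGANISMS, DATASET_TYPES)
url = build_esearch_url(term)
print(term)
print(url)
```

The records returned by such a query would still need the manual abstract review described above, since a keyword match on “pain” does not guarantee that a study is eligible.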
Microarray data were processed in R using GEOquery, and differential gene expression was detected using limma.
RNA-seq data were converted from SRA to FASTQ format and mapped to the appropriate genome using STAR. The genomes GRCm39 for mouse, Rnor_6.0 for rat, and GRCh38 for human were retrieved from the Ensembl FTP site. Differential gene expression was then detected in R using DESeq2, with sex and age as covariates when appropriate.
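The conversion and alignment steps can be sketched as two command lines assembled in Python. The options shown are assumptions, since the settings actually used are not reported here; the run accession `SRR0000000`, the directory names, and the helper functions are hypothetical. The sketch only builds the argument lists and does not run either tool.

```python
def fasterq_dump_cmd(run_accession, out_dir, threads=4):
    """Argument list for converting one SRA run to FASTQ with fasterq-dump."""
    return [
        "fasterq-dump", run_accession,
        "--split-files",            # write separate files for paired reads
        "--threads", str(threads),
        "--outdir", out_dir,
    ]


def star_align_cmd(fastq_files, genome_dir, out_prefix, threads=4):
    """Argument list for aligning reads with STAR and counting reads per gene."""
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,   # directory holding a prebuilt STAR index
        "--readFilesIn", *fastq_files,
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--quantMode", "GeneCounts",
        "--outFileNamePrefix", out_prefix,
    ]


# Hypothetical accession and paths, used only to show the shape of the calls.
dump_cmd = fasterq_dump_cmd("SRR0000000", "fastq")
align_cmd = star_align_cmd(
    ["fastq/SRR0000000_1.fastq", "fastq/SRR0000000_2.fastq"],
    "genomes/GRCm39_star_index",
    "aligned/SRR0000000_",
)
print(" ".join(dump_cmd))
print(" ".join(align_cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)` on a machine where the SRA Toolkit and STAR are installed. The per-gene counts would then be read into R for the DESeq2 analysis.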
The differential expression results were used as input to ‘fgsea’, with pathways drawn from the Gene Ontology.
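To make the pathway step concrete, the sketch below computes the weighted running-sum enrichment score that preranked gene set enrichment methods such as fgsea are built around. It is a plain-Python illustration of the statistic only, not the fgsea implementation, and it omits the permutation-based p-value estimation. The gene names, statistics, and gene sets are invented toy values.

```python
def enrichment_score(ranked_stats, gene_set, weight=1.0):
    """Signed maximum deviation of the weighted running sum over a ranked list.

    ranked_stats maps gene -> ranking statistic (for example a test statistic
    from the differential expression step); gene_set is a set of gene names.
    """
    genes = sorted(ranked_stats, key=ranked_stats.get, reverse=True)
    hits = [g in gene_set for g in genes]
    n_hits = sum(hits)
    if n_hits == 0 or n_hits == len(genes):
        raise ValueError("gene set must contain some, but not all, ranked genes")

    # Hits step up in proportion to |statistic|**weight; misses step down evenly.
    hit_total = sum(abs(ranked_stats[g]) ** weight for g, h in zip(genes, hits) if h)
    miss_step = 1.0 / (len(genes) - n_hits)

    running, best = 0.0, 0.0
    for gene, is_hit in zip(genes, hits):
        if is_hit:
            running += abs(ranked_stats[gene]) ** weight / hit_total
        else:
            running -= miss_step
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranking: positive values mean up-regulated, negative mean down-regulated.
toy_stats = {"A": 3.0, "B": 2.0, "C": 1.0, "D": -1.0, "E": -2.0, "F": -3.0}
es_up = enrichment_score(toy_stats, {"A", "B"})
es_down = enrichment_score(toy_stats, {"E", "F"})
print(es_up, es_down)
```

A set concentrated at the top of the ranking gives a positive score and a set concentrated at the bottom gives a negative one, which is how the direction of a pathway's change is read from the result.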
Finally, the results are presented through a web application built with the Django framework, where they can be analysed further.
Results
At the time of writing, the database includes 338 differential expression contrasts, over half of which come from high-throughput sequencing. Most contrasts were performed in mice and rats, and the most commonly assessed tissues were from the peripheral and central nervous systems. Most contrasts compare a pain state with control; most of the remainder examine gene expression over time or sex-related differences. The database covers a wide variety of pain types, grouped as either neuropathic or inflammatory pain.
Genes were ranked by their presence in studies across different conditions: for each highly differentially expressed gene, the number of contrasts in which it appeared was calculated. Comparing tissues showed that these shared genes were present in up to 34% of blood assays and 65% of sciatic nerve assays.
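The ranking described above can be sketched as a count of the contrasts in which each gene passes a significance filter, followed by a per-tissue breakdown. The thresholds, the record layout, and the gene and contrast names below are hypothetical, since the cut-offs used for the database are not stated here.

```python
from collections import Counter, defaultdict

# Hypothetical cut-offs for calling a gene "highly differentially expressed".
PADJ_MAX = 0.05
ABS_LOG2FC_MIN = 1.0


def significant_genes(contrast_rows):
    """Genes in one contrast that pass the assumed significance thresholds."""
    return {
        row["gene"]
        for row in contrast_rows
        if row["padj"] < PADJ_MAX and abs(row["log2fc"]) >= ABS_LOG2FC_MIN
    }


def rank_by_recurrence(contrasts):
    """List (gene, number of contrasts in which it is significant), highest first."""
    counts = Counter()
    for rows in contrasts.values():
        counts.update(significant_genes(rows))
    return counts.most_common()


def fraction_by_tissue(contrasts, tissue_of, gene):
    """Share of contrasts in each tissue in which `gene` is significant."""
    hits, totals = defaultdict(int), defaultdict(int)
    for name, rows in contrasts.items():
        tissue = tissue_of[name]
        totals[tissue] += 1
        hits[tissue] += gene in significant_genes(rows)
    return {tissue: hits[tissue] / totals[tissue] for tissue in totals}


# Toy contrasts with invented values.
contrasts = {
    "c1": [{"gene": "GeneA", "padj": 0.01, "log2fc": 2.0},
           {"gene": "GeneB", "padj": 0.20, "log2fc": 1.5}],
    "c2": [{"gene": "GeneA", "padj": 0.001, "log2fc": -1.2},
           {"gene": "GeneB", "padj": 0.03, "log2fc": 1.1}],
    "c3": [{"gene": "GeneA", "padj": 0.50, "log2fc": 0.1},
           {"gene": "GeneB", "padj": 0.04, "log2fc": 0.3}],
}
tissue_of = {"c1": "sciatic nerve", "c2": "sciatic nerve", "c3": "blood"}

ranking = rank_by_recurrence(contrasts)
gene_a_by_tissue = fraction_by_tissue(contrasts, tissue_of, "GeneA")
print(ranking)
print(gene_a_by_tissue)
```

Counting each gene at most once per contrast keeps a single study with many probes or transcripts for the same gene from inflating that gene's rank.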
Pathway analysis also showed commonality in differentially expressed pathways across tissues.
Conclusions
The database gives researchers convenient access to the large volume of publicly available transcriptomic data. It is designed to support hypothesis-free analyses by including a variety of transcriptomes from different pain conditions, organisms, tissues, and time points. Including both microarray and high-throughput sequencing data also allows more information to be obtained from the same study conditions.
The database also helps to remove variation in gene expression caused by factors other than the study condition of interest. By combining different datasets on the same phenotype with slight variations, we can isolate genes contributing to pain more thoroughly and with higher confidence.
The common differentially expressed genes were compared with known pain genes, and significant overlap was found. Notably, many differentially expressed genes were not among the known pain genes, providing a source of candidates for novel gene studies.
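One common way to assess whether such an overlap is larger than expected by chance is a one-sided hypergeometric test, sketched below. The test actually used for this comparison is not stated here, so the choice of test is an assumption, and the counts in the example are invented.

```python
from math import comb


def hypergeom_upper_tail(overlap, n_de, n_known, n_background):
    """P(X >= overlap) for X ~ Hypergeometric.

    n_background: genes measured in total
    n_known:      known pain genes among them
    n_de:         common differentially expressed genes
    overlap:      genes that are in both lists
    """
    max_k = min(n_de, n_known)
    total = comb(n_background, n_de)
    tail = sum(
        comb(n_known, k) * comb(n_background - n_known, n_de - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


# Toy numbers: 20 genes measured, 5 known pain genes, 4 DE genes, 3 shared.
p_value = hypergeom_upper_tail(overlap=3, n_de=4, n_known=5, n_background=20)
print(round(p_value, 4))
```

The result depends strongly on the background: only genes that could have been detected in the contributing datasets should be counted in `n_background`.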
Presenting Author
Calvin Surbey
Poster Authors
Topics
- Informatics, Coding, and Pain Registries