Inconsistency in large pharmacogenomic studies - Nature
- Quackenbush, John
- Wed Nov 27 2013
Acknowledgements
We thank J. Archambault for his insightful comments on the comparative study between experimental protocols used in the large pharmacogenomic studies investigated in this work. The authors would like to thank the investigators of the Cancer Genome Project, the Cancer Cell Line Encyclopedia and the GlaxoSmithKline cell line study who have made their invaluable data available to the scientific community. N.E.-H. was supported by an IRCM doctoral fellowship. A.H.B. was supported by an award from the Klarman Family Foundation and by support from NIH grant CA087969. N.J.B. was funded by The Villum Kann Rasmussen Foundation. J.Q. was supported by grants from the Dr Miriam and Sheldon G. Adelson Medical Research Foundation and from the NCI GAME-ON Cancer Post-GWAS initiative (U19 CA148065-01).
Author information
Author notes
Andrew H. Beck, Hugo J. W. L. Aerts and John Quackenbush: These authors contributed equally to this work.
Authors and Affiliations
Institut de Recherches Cliniques de Montréal, University of Montreal, Montreal, Quebec, Canada
Benjamin Haibe-Kains & Nehme El-Hachem
Ontario Cancer Institute, Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario M5G 2M9, Canada
Benjamin Haibe-Kains
Department of Systems Biology, Center for Biological Sequence Analysis, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
Nicolai Juul Birkbak
Department of Pathology, Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, 02215, Massachusetts, USA
Andrew C. Jin & Andrew H. Beck
Department of Biostatistics and Computational Biology and Center for Cancer Computational Biology, Dana-Farber Cancer Institute, Boston, 02215, Massachusetts, USA
Hugo J. W. L. Aerts & John Quackenbush
Department of Radiation Oncology & Radiology, Dana-Farber Cancer Institute, Brigham and Women’s Hospital, Harvard Medical School, Boston, 02215, Massachusetts, USA
Hugo J. W. L. Aerts
Department of Radiation Oncology, Maastricht University, Maastricht 6200 MD, The Netherlands
Hugo J. W. L. Aerts
Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, 02215, Massachusetts, USA
John Quackenbush
Authors
- Benjamin Haibe-Kains
- Nehme El-Hachem
- Nicolai Juul Birkbak
- Andrew C. Jin
- Andrew H. Beck
- Hugo J. W. L. Aerts
- John Quackenbush
Contributions
B.H.-K. conceived the study with major contributions from N.J.B., H.J.W.L.A. and J.Q. B.H.-K. and N.E.-H. collected and curated the gene expression profiles and drug phenotypic data. A.C.J. and A.H.B. collected and curated the mutation data. B.H.-K. performed all the analyses and wrote the code with contributions from N.E.-H. and A.C.J. N.E.-H., A.C.J. and A.H.B. compared the experimental protocols of the pharmacogenomic studies. B.H.-K., A.H.B., H.J.W.L.A. and J.Q. supervised the study. B.H.-K., A.H.B., H.J.W.L.A. and J.Q. wrote the manuscript with contributions from N.E.-H. and N.J.B. All authors discussed the results and commented on the manuscript.
Corresponding author
Correspondence to Benjamin Haibe-Kains.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Additional information
The R code enables one to download and process the pharmacogenomic data and to generate all the results presented in the paper and its Supplementary Information.
Extended data figures and tables
Extended Data Figure 1 Intersection between the pharmacogenomic studies in terms of drugs, cell lines and genes.
a, Venn diagram reporting the number of drugs shared between CGP and CCLE studies. b, Description of the 15 anticancer drugs screened both in CGP and CCLE studies. c, Venn diagram reporting the number of drugs shared between CGP, CCLE and GSK studies. d, Venn diagram reporting the number of cell lines shared by CGP and CCLE studies. e, Number of cell lines for each tissue type among the 471 common to CGP and CCLE studies. f, Venn diagram reporting the number of cell lines shared between CGP, CCLE and GSK studies. g, Venn diagram reporting the number of genes whose presence of mutations was assessed both in CGP and CCLE studies. h, Venn diagram reporting the number of genes whose expression was assessed both in CGP and CCLE studies.
Extended Data Figure 2 Box plot of the correlations of missense mutation profiles between identical cell lines in CGP and CCLE.
Two‐sided Wilcoxon rank‐sum test was used to test whether agreement (Cohen’s κ coefficient) was significantly higher in identical cell lines compared to different cell lines (upper‐right corner).
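As an illustration of the agreement measure named in this legend, the following is a minimal sketch of Cohen's κ for two binary mutation profiles of the same cell line. The function name `cohen_kappa` and the example profiles are hypothetical, and this is not the authors' R code; it only shows the standard definition κ = (p_o − p_e) / (1 − p_e).

```python
import math


def cohen_kappa(profile_a, profile_b):
    """Cohen's kappa for two equal-length sequences of categorical calls.

    Illustrative helper (hypothetical name): each position is one gene,
    e.g. 1 = missense mutation called, 0 = no mutation called.
    """
    if len(profile_a) != len(profile_b) or len(profile_a) == 0:
        raise ValueError("profiles must be non-empty and of equal length")
    a, b = list(profile_a), list(profile_b)
    n = len(a)
    # Observed agreement: fraction of genes with the same call in both studies.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    expected = sum((a.count(k) / n) * (b.count(k) / n) for k in set(a) | set(b))
    if expected == 1.0:
        # Both profiles are constant and identical; kappa is undefined here.
        return math.nan
    return (observed - expected) / (1.0 - expected)


# Hypothetical calls for four genes in the same cell line from two studies.
study_one = [1, 1, 0, 0]
study_two = [1, 0, 0, 0]
print(cohen_kappa(study_one, study_two))
```

A value of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance.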
Extended Data Figure 4 Consistency of IC50 values within the range of tested concentrations between CGP and CCLE.
a, Scatter plots reporting the drug sensitivity measurements, which are the IC50 values within the range of tested concentrations (thus excluding extrapolated IC50 in CGP and placeholder values in CCLE) in the 471 cell lines and for each of the 15 drugs investigated both in CGP and CCLE. b, Spearman’s rank correlation coefficient (rs) for each drug where significance of each correlation coefficient is reported using an asterisk if one‐sided P value <0.05.
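The consistency statistic used in this and the following legends can be sketched as below: Spearman's rs is the Pearson correlation of the ranks, with tied values given their average rank. The legend does not say how the one-sided P values were computed, so the permutation test here is a substitute chosen for illustration; all function names and the seed are assumptions, not the authors' implementation.

```python
import math
import random


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def one_sided_permutation_p(x, y, n_perm=2000, seed=0):
    """One-sided P value (alternative: rs > 0) by permuting one variable.

    Assumption: a permutation test stands in for whatever test the
    original analysis used.
    """
    rng = random.Random(seed)
    rx, ry = average_ranks(x), average_ranks(y)
    observed = pearson(rx, ry)
    shuffled = list(ry)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(rx, shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

With hypothetical IC50 vectors `ic50_cgp` and `ic50_ccle` for the same cell lines, `spearman_rs(ic50_cgp, ic50_ccle)` gives rs, and an asterisk would be drawn when `one_sided_permutation_p(ic50_cgp, ic50_ccle) < 0.05`.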
Extended Data Figure 5 Consistency of AUC values between CGP and CCLE.
a, Scatter plots reporting the drug sensitivity (AUC) measured in the 471 cell lines and for each of the 15 drugs investigated both in CGP and CCLE. b, Spearman’s rank correlation coefficient (rs) for each drug where significance of each correlation coefficient is reported using an asterisk if one‐sided P value <0.05.
Extended Data Figure 6 Consistency of AUC‐based gene–drug associations between CGP and CCLE.
a, Scatter plots reporting the gene–drug associations computed with AUC, as quantified by the standardized coefficient of the gene of interest in a linear model controlled for tissue type, in the 471 cell lines and for each of the 15 drugs investigated both in CGP and CCLE. b, Spearman’s rank correlation coefficient (rs) for each drug where significance of each correlation coefficient is reported using an asterisk if one‐sided P value <0.05.
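The association statistic described in this legend can be sketched as follows, under stated assumptions: expression and sensitivity are z-scored, and tissue type enters the linear model as a categorical covariate. For such a model, the coefficient of the gene equals the slope obtained after centring both variables within each tissue (the Frisch-Waugh-Lovell result), which avoids a matrix solver. The exact model specification is not given in this section, so the function names and this formulation are illustrative only.

```python
import statistics
from collections import defaultdict


def zscore(values):
    """Standardize to mean 0 and sample standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def center_within_groups(values, groups):
    """Subtract each group's mean, removing the categorical covariate's effect."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def standardized_coefficient(expression, sensitivity, tissue):
    """Coefficient of z-scored expression in sensitivity ~ expression + tissue."""
    zx = center_within_groups(zscore(expression), tissue)
    zy = center_within_groups(zscore(sensitivity), tissue)
    sxx = sum(v * v for v in zx)
    if sxx == 0:
        raise ValueError("expression does not vary within any tissue type")
    return sum(a * b for a, b in zip(zx, zy)) / sxx


# Hypothetical data: six cell lines from two tissue types.
expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
tissue = ["lung", "lung", "lung", "skin", "skin", "skin"]
sensitivity = [2.0, 4.0, 6.0, 18.0, 20.0, 22.0]
print(standardized_coefficient(expression, sensitivity, tissue))
```

Computing this coefficient for every gene in each study and then correlating the two resulting vectors per drug gives the kind of comparison shown in panel b.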
Extended Data Figure 7 Consistency of AUC‐based pathway–drug associations between CGP and CCLE.
a, Scatter plots reporting the pathway–drug associations computed with AUC, as quantified by the standardized coefficient of the gene of interest in a linear model controlled for tissue type, in the 471 cell lines and for each of the 15 drugs investigated both in CGP and CCLE. b, Spearman’s rank correlation coefficient (rs) for each drug where significance of each correlation coefficient is reported using an asterisk if one‐sided P value <0.05.
Extended Data Figure 8 Consistency of AUC‐based mutation–drug associations between CGP and CCLE.
a, Scatter plots reporting the mutation–drug associations computed with AUC, as quantified by the standardized coefficient of the gene of interest in a linear model controlled for tissue type, in the 471 cell lines and for each of the 15 drugs investigated both in CGP and CCLE. b, Spearman’s rank correlation coefficient (rs) for each drug where significance of each correlation coefficient is reported using an asterisk if one‐sided P value <0.05.
Extended Data Figure 9 Comparison of drug sensitivity measured in CGP and CCLE with GSK.
a, Scatter plots reporting the drug sensitivity measurements (IC50) of all drugs and cell lines screened both in CCLE and GSK data sets (2 drugs in 249 cell lines). b, Scatter plots reporting the drug sensitivity measurements (IC50) of all drugs and cell lines screened both in CGP and GSK data sets (5 drugs in 231 cell lines). Significance of each (positive) Spearman's rank correlation coefficient is reported as a one-sided P value.
Supplementary information
Supplementary Information
This file contains a list of abbreviations, the instructions to fully reproduce the analysis results from the R scripts, the comparison of pharmacological assays, Supplementary Tables 1-2, Supplementary Figures 1-23 and Supplementary References. (PDF 13904 kb)
Supplementary Data
This file contains Supplementary set 1, R scripts. The archive (zip) contains the R scripts and accompanying files to enable full reproducibility of the analysis results. (ZIP 11045 kb)
Supplementary Data
This zipped file contains Supplementary Data sets 2 and 3. Supplementary File 2, "Statistics for the gene-drug-associations for IC50 in CGP", reports the gene-drug associations using IC50 as the drug sensitivity measure, including the standardized coefficient, its standard error, t statistic, nominal P value and FDR for the 12,187 genes and 15 drugs screened in CGP. Supplementary File 3, "Statistics for the gene-drug-associations for IC50 in CCLE", reports the same statistics for the 12,187 genes and 15 drugs screened in CCLE. (ZIP 25103 kb)
Supplementary Data
This zipped file contains Supplementary Data sets 4-13 and a guide to the data. (ZIP 30820 kb)
About this article
Cite this article
Haibe-Kains, B., El-Hachem, N., Birkbak, N. et al. Inconsistency in large pharmacogenomic studies. Nature 504, 389–393 (2013). https://doi.org/10.1038/nature12831
Received: 15 April 2013
Accepted: 07 November 2013
Published: 27 November 2013
Issue Date: 19 December 2013
DOI: https://doi.org/10.1038/nature12831