Introduction

PANOPLY is a method to assess possible gene or pathway targets for a single sample given genomic information from DNA and RNA. This vignette demonstrates how to set up the drug-gene data used to prioritize drugs for cancer patients based on genomic data.

Drug Gene Interaction Database (DGIdb)

We curated a set of high-confidence cancer-related genes and used the curl command-line tool to download drug-gene interactions for anti-neoplastic (cancer) drugs on those genes. The gene set was too large to query in a single request, so we ran the command on smaller chunks of genes and combined the results. The example below shows the query for six well-known cancer genes, followed by a post-download step that uses Python's json.tool module to pretty-print the JSON response.

# http://dgidb.genome.wustl.edu/api
# curl http://dgidb.genome.wustl.edu/api/v1/interactions.json?drug_types=antineoplastic\&genes=TP53,HER2,ESR1,ATM,BRCA1,BRCA2 | python -mjson.tool
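
The chunking step described above can be sketched in R as follows. This is a minimal illustration, not the code used to build the package data: the chunk size and the object names (`genes`, `chunk.size`, `urls`) are assumptions, and the six example genes stand in for the full curated set, which is not shown here. The sketch only builds the query URLs; each one would then be passed to curl as in the command above.

```r
# Example genes from the command above; the full curated set is not shown.
genes <- c("TP53", "HER2", "ESR1", "ATM", "BRCA1", "BRCA2")
chunk.size <- 4  # assumed chunk size, for illustration only

# Assign each gene to a chunk, then build one query URL per chunk
chunk.id <- ceiling(seq_along(genes) / chunk.size)
gene.chunks <- split(genes, chunk.id)
base.url <- "http://dgidb.genome.wustl.edu/api/v1/interactions.json"
urls <- vapply(gene.chunks, function(g) {
    paste0(base.url, "?drug_types=antineoplastic&genes=", paste(g, collapse = ","))
}, character(1))
print(unname(urls))
```

Each URL returns its own JSON document, so the per-chunk results still have to be combined after download.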

Next we use the RJSONIO package to load the JSON file into R. We show a small example of the drug-gene interaction values downloaded from DGIdb.

library(RJSONIO)
file1 <- "dgiAntiNeo1.json"
dgiAntiNeo1 <- fromJSON(paste(readLines(file1), collapse = ""))

#      Gene     Drug           interactionType
# [1,] 'PRKCA'  'ELLAGIC ACID' 'inhibitor,competitive'
# [2,] 'PRKCA'  'BRYOSTATIN-1' 'n/a'
# [3,] 'PRKCA'  'SOPHORETIN'   'inhibitor'
# [4,] 'PRKCA'  'ENZASTAURIN'  'inhibitor'
# [5,] 'PRKCA'  'MIDOSTAURIN'  'inhibitor'
# [6,] 'PRKCA'  'AFFINITAC'    'antisense oligonucleotide'
# [7,] 'PRKCA'  'TAMOXIFEN'    'n/a'
# [8,] 'NOTCH1' 'RO4929097'    'inhibitor'
# [9,] 'NOTCH1' 'RO4929097'    'other/unknown'
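
The step from the parsed JSON list to a three-column matrix like the one above is not shown in this vignette. A possible sketch follows, using a small mock list in place of the real download. The field names (`matchedTerms`, `geneName`, `interactions`, `drugName`, `interactionType`) are assumptions about the layout of the response and should be checked against the downloaded file.

```r
# Mock of a parsed response; the field names are assumptions, not a
# confirmed description of the DGIdb output.
parsed <- list(matchedTerms = list(
    list(geneName = "PRKCA", interactions = list(
        list(drugName = "ENZASTAURIN", interactionType = "inhibitor"),
        list(drugName = "TAMOXIFEN", interactionType = "n/a"))),
    list(geneName = "NOTCH1", interactions = list(
        list(drugName = "RO4929097", interactionType = "inhibitor")))))

# Build one row per gene-drug interaction, then stack the rows
rows <- lapply(parsed[["matchedTerms"]], function(term) {
    do.call(rbind, lapply(term[["interactions"]], function(int) {
        c(Gene = term[["geneName"]], Drug = int[["drugName"]],
          interactionType = int[["interactionType"]])
    }))
})
dgi.mat <- do.call(rbind, rows)
print(dgi.mat)
```

The `[[` accessor is used instead of `$` because it works on both lists and named character vectors, which matters if the JSON parser simplifies an all-string record into a vector.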

DrugBank

We also included drug-gene targets from DrugBank. The steps included a web download and conversion of the gene ids into gene symbols. A snippet of this data appears as follows:

# Drug_name    Drug_ID  Target uniprot                          GeneID
# Afatinib     DB08916  P00533; P04626; Q15303; P08183; Q9UNQ0  EGFR;ERBB2;ERBB4;ABCB1;ABCG2
# Aflibercept  DB08885  P15692; P49763; P49765                  VEGFA;PGF;VEGFB
# Anastrozole  DB01217  P11511; P05177; P11712; P08684          CYP19A1;CYP1A2;CYP2C9;CYP3A4
# Azacitidine  DB00928  P26358; P32320                          DNMT1;CDA
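
The id-to-symbol conversion can be illustrated with a named lookup vector. In this sketch the lookup table (`up2sym`, a hypothetical name) covers only the Afatinib row, assuming the ids and symbols in that row are listed in the same order. A real run would need a complete UniProt-to-symbol mapping from an annotation resource, which is not part of this vignette.

```r
# Lookup table taken from the Afatinib row of the snippet; illustration only.
up2sym <- c(P00533 = "EGFR", P04626 = "ERBB2", Q15303 = "ERBB4",
            P08183 = "ABCB1", Q9UNQ0 = "ABCG2")

# Split the "; "-separated UniProt ids and map each one to its gene symbol
uniprot <- "P00533; P04626; Q15303; P08183; Q9UNQ0"
ids <- strsplit(uniprot, split = ";\\s*")[[1]]
gene.id <- paste(unname(up2sym[ids]), collapse = ";")
print(gene.id)
```

Ids missing from the lookup table would come back as NA, so a full pipeline should check for and handle unmapped ids before pasting.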

Combined Sources

We show the steps needed to prepare both sources so that they can be combined into one common data frame in R. For DGIdb, we rename the columns and add a Source column. For DrugBank, we split the semicolon-separated gene ids and expand the data frame to one row per drug-gene pair.

## fix up dgi source for combining
dgidb$Source <- "DGIdb"
names(dgidb) <- gsub("interactionType", "type", names(dgidb))

## fix up drugbank for combining
dbank$DRUG <- casefold(dbank$Drug_name, upper = TRUE)

udrugs.dgi <- unique(c(dgidb$Drug, dbank$DRUG))
udrugs.dgi <- udrugs.dgi[!(grepl("\\[", udrugs.dgi) | grepl("\\{", udrugs.dgi) | 
    grepl("\\(", udrugs.dgi))]

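## split the semicolon-separated gene symbols; as.character() guards against factors
glist <- strsplit(as.character(dbank$GeneID), split = ";")
dbankfix <- data.frame(Drug = NULL, Gene = NULL, type = NULL, Source = NULL)
## expand to one row per drug-gene pair, skipping drugs with no gene symbols
for (k in seq_len(nrow(dbank))) {
    if (length(glist[[k]]) > 0) {
        dbankfix <- rbind.data.frame(dbankfix, data.frame(Drug = dbank$DRUG[k], Gene = glist[[k]], 
            type = "n/a", Source = dbank[k, "Annotation From"]))
    }
}
## stack the two sources; columns are matched by name
drugdbPan <- rbind.data.frame(dgidb, dbankfix)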
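
Growing a data frame with rbind inside a loop can be slow for large tables. As an alternative, the same one-row-per-pair expansion can be sketched without a loop, using `rep()` and `lengths()`. The toy `dbank` below holds two rows from the DrugBank snippet, and the "DrugBank" source label is a placeholder assumption standing in for the "Annotation From" column.

```r
# Toy stand-in for dbank, using two rows from the DrugBank snippet.
dbank <- data.frame(DRUG = c("AFLIBERCEPT", "AZACITIDINE"),
                    GeneID = c("VEGFA;PGF;VEGFB", "DNMT1;CDA"),
                    stringsAsFactors = FALSE)

# Split the gene symbols, then repeat each drug once per gene
glist <- strsplit(dbank$GeneID, split = ";")
n.genes <- lengths(glist)
dbank.long <- data.frame(Drug = rep(dbank$DRUG, times = n.genes),
                         Gene = unlist(glist),
                         type = "n/a",
                         Source = "DrugBank",  # placeholder label, an assumption
                         stringsAsFactors = FALSE)
print(dbank.long)
```

Drugs with no gene symbols get a count of zero from `lengths()` and so drop out, matching the `length(glist[[k]]) > 0` check in the loop.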

Create Data Objects for PANOPLY Network Analyses

Using the pre-made dataset described above, drugdbPan, we create the drug gene sets and the drug-gene adjacency matrix used in the PANOPLY network analyses.

data(drugdbPan)

kable(head(drugdbPan, 20))
    Gene    Drug             type                       Source
1   PRKCA   ELLAGIC ACID     inhibitor,competitive      DGIdb
2   PRKCA   BRYOSTATIN-1     n/a                        DGIdb
3   PRKCA   SOPHORETIN       inhibitor                  DGIdb
4   PRKCA   ENZASTAURIN      inhibitor                  DGIdb
5   PRKCA   MIDOSTAURIN      inhibitor                  DGIdb
6   PRKCA   AFFINITAC        antisense oligonucleotide  DGIdb
7   PRKCA   TAMOXIFEN        n/a                        DGIdb
8   NOTCH1  RO4929097        inhibitor                  DGIdb
10  APH1A   UNII-DRL23N424R  n/a                        DGIdb
11  APH1B   UNII-DRL23N424R  n/a                        DGIdb
12  MAPK11  REGORAFENIB      inhibitor                  DGIdb
14  MAPK14  LY2228820        n/a                        DGIdb
15  BIRC3   LCL161           antagonist                 DGIdb
16  BIRC3   AT-406           antagonist                 DGIdb
17  BIRC2   AT-406           antagonist                 DGIdb
18  BIRC2   LCL161           antagonist                 DGIdb
19  BIRC2   BIRINAPANT       n/a                        DGIdb
20  NFKB1   THALIDOMIDE      n/a                        DGIdb
21  NFKB1   BARDOXOLONE      n/a                        DGIdb
22  NFKB1   BORTEZOMIB       n/a                        DGIdb

annoDrugs <- annotateDrugs(drugdbPan)
drug.gs <- annoDrugs[[1]]
drug.adj <- annoDrugs[[2]]

hist(sapply(drug.gs, length), main = "Drug set length")

hist(rowSums(drug.adj), main = "Drug targets (genes) via adjacency")

hist(colSums(drug.adj), main = "Gene targets (from drugs) via adjacency")
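
To make the two objects concrete, the sketch below builds a per-drug gene set list and a binary drug-by-gene adjacency matrix from a toy table. It is not the implementation of annotateDrugs. The orientation (rows are drugs, columns are genes) is an assumption inferred from the histogram titles above, and the names `dg`, `gs`, and `adj` are hypothetical.

```r
# Toy drug-gene table with the same Gene and Drug columns as drugdbPan.
dg <- data.frame(Gene = c("BIRC3", "BIRC3", "BIRC2", "BIRC2", "NFKB1"),
                 Drug = c("LCL161", "AT-406", "AT-406", "LCL161", "THALIDOMIDE"),
                 stringsAsFactors = FALSE)

# Gene set per drug: a named list of unique target genes
gs <- lapply(split(dg$Gene, dg$Drug), unique)

# Binary adjacency matrix; rows = drugs, columns = genes (assumed orientation)
drugs <- sort(unique(dg$Drug))
genes <- sort(unique(dg$Gene))
adj <- matrix(0, nrow = length(drugs), ncol = length(genes),
              dimnames = list(drugs, genes))
adj[cbind(dg$Drug, dg$Gene)] <- 1  # index by (drug, gene) name pairs

print(rowSums(adj))  # number of target genes per drug
print(colSums(adj))  # number of drugs targeting each gene
```

Under this orientation the row sums of the matrix equal the gene set lengths, which is why the first two histograms above summarize the same quantity from two objects.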

Session Information

Show the R session information.

sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.9 (Final)

Matrix products: default
BLAS: /usr/lib64/libblas.so.3.2.1
LAPACK: /usr/lib64/atlas/liblapack.so.3.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=C              
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] grid      parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] knitr_1.20          panoply_0.98        RColorBrewer_1.1-2  randomForest_4.6-12 Rgraphviz_2.22.0   
 [6] graph_1.56.0        BiocGenerics_0.24.0 circlize_0.4.2      gage_2.28.0         MASS_7.3-47        

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.15         highr_0.6            formatR_1.5          pillar_1.1.0         compiler_3.4.2      
 [6] XVector_0.18.0       tools_3.4.2          zlibbioc_1.24.0      digest_0.6.12        bit_1.1-12          
[11] evaluate_0.10.1      RSQLite_2.0          memoise_1.1.0        tibble_1.4.2         png_0.1-7           
[16] rlang_0.1.6          DBI_0.8              httr_1.3.1           stringr_1.3.0        Biostrings_2.46.0   
[21] S4Vectors_0.16.0     GlobalOptions_0.0.12 IRanges_2.12.0       stats4_3.4.2         bit64_0.9-7         
[26] Biobase_2.38.0       R6_2.2.2             AnnotationDbi_1.40.0 blob_1.1.0           magrittr_1.5        
[31] KEGGREST_1.18.0      shape_1.4.3          colorspace_1.3-2     stringi_1.1.7