Skip to content

Commit 71275f4

Browse files
committed
pathways
1 parent 78253bd commit 71275f4

1 file changed

Lines changed: 18 additions & 16 deletions

File tree

rna_seq.md

Lines changed: 18 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -159,7 +159,7 @@ Then, we create a sample distance heatmap:
159159

160160
```R
161161
library(RColorBrewer)
162-
library(gplots)
162+
library(gplots) # you may need to install this package
163163

164164
(mycols <- brewer.pal(8, "Dark2")[1:length(unique(samples$condition))])
165165
sampleDists <- as.matrix(dist(t(assay(rld))))
@@ -215,7 +215,7 @@ library(gage)
215215
library(gageData)
216216
```
217217

218-
Let’s use the mapIds function to add more columns to the results. The row.names of our results table has the Ensembl gene ID (our key), so we need to specify keytype=ENSEMBL. The column argument tells the mapIds function which information we want, and the multiVals argument tells the function what to do if there are multiple possible values for a single input value. Here we ask to just give us back the first one that occurs in the database. Let’s get the Entrez IDs, gene symbols, and full gene names.
218+
Let’s use the `mapIds` function to add more columns to the results. The row.names of our results table has the Ensembl gene ID (our key), so we need to specify `keytype=ENSEMBL`. The column argument tells the `mapIds` function which information we want, and the `multiVals` argument tells the function what to do if there are multiple possible values for a single input value. Here we ask to just give us back the first one that occurs in the database. Let’s get the Entrez IDs, gene symbols, and full gene names.
219219

220220
```R
221221
res$symbol = mapIds(org.Hs.eg.db,
@@ -237,26 +237,30 @@ res$name = mapIds(org.Hs.eg.db,
237237
head(res)
238238
```
239239

240+
We’re going to use the [gage](http://bioconductor.org/packages/release/bioc/html/gage.html) package for pathway analysis, and the [pathview](http://bioconductor.org/packages/release/bioc/html/pathview.html) package to draw a pathway diagram.
241+
242+
243+
The gageData package has pre-compiled databases mapping genes to KEGG pathways and GO terms for common organisms:
244+
240245
```R
241246
data(kegg.sets.hs)
242247
data(sigmet.idx.hs)
243248
kegg.sets.hs = kegg.sets.hs[sigmet.idx.hs]
244249
head(kegg.sets.hs, 3)
245250
```
246251

252+
Run the pathway analysis. See help on the gage function with ?gage. Specifically, you might want to try changing the value of same.dir
253+
247254
```R
248255
foldchanges = res$log2FoldChange
249256
names(foldchanges) = res$entrez
250-
head(foldchanges)
251-
```
252-
253-
```R
254-
# Get the results
255257
keggres = gage(foldchanges, gsets=kegg.sets.hs, same.dir=TRUE)
256-
257-
# Look at both up (greater), down (less), and statatistics.
258258
lapply(keggres, head)
259+
```
259260

261+
pull out the top 5 upregulated pathways, then further process that just to get the IDs. We’ll use these KEGG pathway IDs downstream for plotting.
262+
263+
```R
260264
# Get the pathways
261265
keggrespathways = data.frame(id=rownames(keggres$greater), keggres$greater) %>%
262266
tbl_df() %>%
@@ -270,6 +274,8 @@ keggresids = substr(keggrespathways, start=1, stop=8)
270274
keggresids
271275
```
272276

277+
Finally, the pathview() function in the pathview package makes the plots. Let’s write a function so we can loop through and draw plots for the top 5 pathways we created above.
278+
273279
```R
274280
# Define plotting function for applying later
275281
plot_pathway = function(pid) pathview(gene.data=foldchanges, pathway.id=pid, species="hsa", new.signature=FALSE)
@@ -278,12 +284,8 @@ plot_pathway = function(pid) pathview(gene.data=foldchanges, pathway.id=pid, spe
278284
tmp = sapply(keggresids, function(pid) pathview(gene.data=foldchanges, pathway.id=pid, species="hsa"))
279285
```
280286

281-
```R
282-
data(go.sets.hs)
283-
data(go.subs.hs)
284-
gobpsets = go.sets.hs[go.subs.hs$BP]
287+
#### Thanks
285288

286-
gobpres = gage(foldchanges, gsets=gobpsets, same.dir=TRUE)
289+
This material was inspired by Stephen Turner's blog post:
287290

288-
lapply(gobpres, head)
289-
```
291+
> Tutorial: RNA-seq differential expression & pathway analysis with Sailfish, DESeq2, GAGE, and Pathview: http://www.gettinggeneticsdone.com/2015/12/tutorial-rna-seq-differential.html

0 commit comments

Comments
 (0)