Commit 9a35d51

began writing rna-seq tuto
1 parent aa1977b commit 9a35d51

1 file changed

Lines changed: 32 additions & 1 deletion

rna_seq.md

@@ -27,7 +27,7 @@ So to summarize we have:
 * HBR + ERCC Spike-In Mix2, Replicate 2
 * HBR + ERCC Spike-In Mix2, Replicate 3
 
-You can download the data from [here](link)
+You can download the data from [here](http://139.162.178.46/files/tutorials/toy_rna.tar.gz)
 
 Unpack the data and go into the toy_rna directory
 

@@ -73,4 +73,35 @@ First, open up your favourite R IDE and install the necessary packages:
 ```R
 source("https://bioconductor.org/biocLite.R")
 biocLite("tximport")
+biocLite("GenomicFeatures")
+
+install.packages("readr")
+```
+
+Then load the packages:
+
+```R
+library(tximport)
+library(GenomicFeatures)
+library(readr)
+```
+
+Salmon quantified expression at the transcript level. We want to see which genes are differentially expressed, so we need to link the transcript names to the gene names. We can use our .gtf annotation for that, together with the GenomicFeatures package:
+
+```R
+txdb <- makeTxDbFromGFF("chr22_genes.gtf")
+k <- keys(txdb, keytype = "GENEID")
+df <- select(txdb, keys = k, keytype = "GENEID", columns = "TXNAME")
+tx2gene <- df[, 2:1]
+head(tx2gene)
+```
+
+Now we can import the Salmon quantification:
+
+```R
+samples <- read.table("samples.txt", header = TRUE)
+
+files <- file.path("salmon", samples$quant, "quant.sf")
+names(files) <- paste0("sample", 1:6)
+txi.salmon <- tximport(files, type = "salmon", tx2gene = tx2gene, reader = read_tsv)
 ```
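
As a note on the lines added in this commit: the `tx2gene` table is reordered with `df[, 2:1]` so that transcript IDs come first and gene IDs second, which is the column order `tximport` expects. The sketch below uses base R only and entirely made-up identifiers (`tx1`, `geneA`, `tx2gene_toy`, `tx_counts` are all hypothetical) to illustrate the core idea of collapsing transcript-level counts to gene level. It is a simplified illustration of the summing step, not what `tximport` does internally in full, since `tximport` also handles abundances and transcript lengths.

```R
# Toy tx2gene table: transcript IDs in column 1, gene IDs in column 2.
# All identifiers are invented for illustration only.
tx2gene_toy <- data.frame(
  TXNAME = c("tx1", "tx2", "tx3", "tx4"),
  GENEID = c("geneA", "geneA", "geneB", "geneB"),
  stringsAsFactors = FALSE
)

# Toy transcript-level counts for two hypothetical samples.
# matrix() fills column by column: sample1 = 10, 5, 0, 7; sample2 = 20, 1, 3, 4.
tx_counts <- matrix(
  c(10, 5, 0, 7,
    20, 1, 3, 4),
  ncol = 2,
  dimnames = list(tx2gene_toy$TXNAME, c("sample1", "sample2"))
)

# Look up the gene for each transcript row, then sum the rows gene by gene.
gene_of_tx <- tx2gene_toy$GENEID[match(rownames(tx_counts), tx2gene_toy$TXNAME)]
gene_counts <- rowsum(tx_counts, group = gene_of_tx)

print(gene_counts)
```

Each row of `gene_counts` holds the sum over all transcripts assigned to that gene, so a transcript missing from the mapping would get an `NA` group and should be checked for before summing on real data.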
