qc.md
29 additions & 4 deletions
@@ -53,18 +53,43 @@ Scythe can be run minimally with:
53
53
54
54
`scythe -a adapter_file.fasta -o trimmed_sequences.fastq sequences.fastq`
55
55
56
-
Trim the adapters in both your read files!
56
+
Try to trim the adapters in both your read files!
57
57
58
58
## Sickle
59
59
60
-
https://github.com/najoshi/sickle
60
+
Most modern sequencing technologies produce reads whose quality deteriorates towards the 3'-end, and some towards the 5'-end as well. Incorrectly called bases in both regions negatively impact assemblies, mapping, and downstream bioinformatics analyses.
61
61
62
62
We will trim each read individually down to the good quality part to keep the bad part from interfering with downstream applications.
63
63
64
-
and set the quality score to 25. This means the trimmer will work its way from both ends of each read, cutting away any bases with a quality score < 25.
64
+
To do so, we will use sickle. Sickle is a tool that uses sliding windows along with quality and length thresholds to determine where quality is low enough to trim the 3'-end of a read and where it becomes high enough to trim the 5'-end. It also discards reads that fall below a length threshold.
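The sliding-window idea can be sketched in a few lines of Python. This is a simplified illustration, not sickle's actual implementation: the function name `sliding_window_trim`, the window size of 5, and both default thresholds are assumptions chosen for the example, and the input is a list of integer quality scores that has already been decoded.

```python
def sliding_window_trim(scores, threshold=20, window=5, min_length=20):
    """Return the (start, end) slice of a read to keep, or None to discard it.

    Simplified sketch: slide a window along the read, averaging the quality
    scores inside it. The kept region starts at the first window whose mean
    reaches the threshold and ends at the first later window whose mean
    drops below it. Reads shorter than min_length after trimming are dropped.
    """
    n = len(scores)
    if n == 0:
        return None
    # Reads shorter than the window are treated as a single window.
    window = min(window, n)
    means = [sum(scores[i:i + window]) / window for i in range(n - window + 1)]

    # 5' cut: first window whose mean quality is high enough.
    start = next((i for i, m in enumerate(means) if m >= threshold), None)
    if start is None:
        return None  # quality never reaches the threshold

    # 3' cut: first window after the start whose mean quality is too low.
    end = n
    for i in range(start, len(means)):
        if means[i] < threshold:
            end = i
            break

    if end - start < min_length:
        return None  # too short after trimming
    return start, end


if __name__ == "__main__":
    # Hypothetical read: poor 5' end, good middle, poor 3' end.
    read_scores = [2] * 5 + [40] * 30 + [2] * 10
    print(sliding_window_trim(read_scores))
```

Because the decision is based on window averages, a few low-quality bases can survive at the edges of the kept region. A real trimmer may refine the cut positions further.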
65
65
66
+
First, install sickle:
66
67
67
-
What did the trimming do to the per-base sequence quality, the per sequence quality scores and the sequence length distribution?
68
+
```
69
+
git clone https://github.com/najoshi/sickle.git
70
+
cd sickle
71
+
make
72
+
```
73
+
74
+
Sickle has two modes, one for single-end and one for paired-end reads: `sickle se` and `sickle pe`.
75
+
76
+
Running sickle by itself will print the help:
77
+
78
+
`sickle`
79
+
80
+
Running sickle with either the "se" or "pe" command will give help specific to that command. Since we have paired-end reads:
81
+
82
+
`sickle pe`
83
+
84
+
Set the quality threshold to 25. This means the trimmer will work its way in from both ends of each read, cutting away bases where the quality in the sliding window falls below 25.
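To get a feeling for what a threshold of 25 means, recall that a Phred quality score Q corresponds to an estimated base-calling error probability of 10^(-Q/10), and that Sanger-style FASTQ files store each score as the ASCII character with code Q + 33. The short sketch below illustrates both conversions. The function names are made up for this example and are not part of sickle.

```python
def phred_to_error_probability(q):
    """Estimated probability that a base with Phred score q is wrong."""
    return 10 ** (-q / 10)


def sanger_char_to_phred(ch):
    """Decode one Sanger (Phred+33) FASTQ quality character to its score."""
    return ord(ch) - 33


if __name__ == "__main__":
    # Compare a few thresholds, including the 25 used in this tutorial.
    for q in (10, 20, 25, 30):
        print(q, phred_to_error_probability(q))
    # Decode a hypothetical quality string as it would appear in a FASTQ file.
    print([sanger_char_to_phred(ch) for ch in "II:5#"])
```

A threshold of 25 therefore corresponds to an estimated error rate of roughly 0.3% per base, or about 1 wrong call in 316.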
85
+
86
+
```
87
+
sickle pe -f input_file1.fastq -r input_file2.fastq -t sanger \
What did the trimming do to the per-base sequence quality, the per sequence quality scores and the sequence length distribution? Run FastQC again to find out.
68
93
69
94
What is the sequence duplication levels graph about? Why should you care about a high level of duplication, and why is the level of duplication very low for this data?