Sequencing has become an essential tool in genetic and genomic analysis. It is increasingly important for experimental scientists to gain the bioinformatics skills required to analyse the large volumes of data produced by sequencers. This course aims to equip participants with the essential informatics skills required to begin analysing data and to apply some of the most commonly used tools and resources for sequence data analysis. The programme covers prominent sequencing technologies, algorithmic theory, and principles of bioinformatics, with a strong focus on practical computational sessions using sequence analysis techniques and tools applicable to any species or genome size. A range of post-sequencing analysis applications will be covered, including quality control, reference alignment, and variant calling.
Applicants should be postdoctoral scientists, senior PhD students, junior faculty members or clinicians/healthcare professionals actively engaged in or soon to commence research involving sequencing data analysis.
After this workshop participants should be able to:
- Use the Unix command-line as a tool for data analysis
- Describe the different sequencing data file formats available
- Perform QC assessment of high-throughput sequencing data
- Explain the algorithmic concepts behind read alignment, variant calling and structural variant detection
- Perform read alignment, variant calling and structural variant detection using standard tools
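As a flavour of the file-format and QC outcomes above, here is a minimal sketch in Python. It is not taken from the course materials. It assumes the common four-line FASTQ layout and Phred+33 quality encoding, and the names `read_fastq`, `mean_phred` and `EXAMPLE` are hypothetical, chosen only for illustration.

```python
from io import StringIO
from statistics import mean


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (or a trailing blank line)
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored here
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        yield header[1:], sequence, quality


def mean_phred(quality, offset=33):
    """Mean Phred score of one read, assuming Phred+33 encoding (an assumption)."""
    return mean(ord(char) - offset for char in quality)


# Hypothetical two-read input; real data would be read from a FASTQ file.
EXAMPLE = "@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n!!II\n"

for name, sequence, quality in read_fastq(StringIO(EXAMPLE)):
    print(name, len(sequence), mean_phred(quality))
```

In practice, QC is done with dedicated tools that also examine per-position quality, adapter content and duplication. The sketch only shows how quality strings map to numeric scores.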
| Date | Module | Topic | Sessions |
| --- | --- | --- | --- |
| 15 August | | Welcome to the course | Get to know your classroom and meet your teaching assistants and fellow participants |
| 17 August | | Welcome to the course | Using Vula effectively and classroom bios |
| 22 August | 1 | Introduction to Unix/Linux | Introduction to Linux command line tools |
| 24 August | 1 | Introduction to Unix/Linux | Introduction to Linux command line tools II |
| 29 August | 2 | Introduction to sequencing technologies | Introduction to sequencing technologies |
| 31 August | 3 | Sequencing data formats and QC | Data pre-processing and QC I |
| 5 September | 3 | Sequencing data formats and QC | Data pre-processing and QC II |
| 7 September | 4 | Alignment to Reference | Alignment to Reference |
| 12 September | 4 | Alignment to Reference | No session |
| 14 September | 5 | Variant Calling - Human | Human Variant Calling |
| 19 September | 5 | Variant Calling - Human | Human Variant Calling II Structural |
| 21 September | 6 | Variant Calling - Pathogen | Science talks |
| 26 September | 6 | Variant Calling - Pathogen | Pathogen Variant Calling I |
| 28 September | 6 | Variant Calling - Pathogen | Pathogen Variant Calling II |
| 3 October | | Workshop and wrap up | FAIR and data sharing workshop |
| 5 October | | Workshop and wrap up | Course evaluation and wrap up |
The course will cover the following topics:
- Intro to Unix/Linux & running workflows
- Introduction to Sequencing Technologies
- Sequencing data pre-processing and QC
- Alignment to reference sequences
- Variant calling and annotation
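To make the alignment and variant calling topics concrete, here is a toy sketch in Python. It is not the method taught in the course or used by any particular tool. It assumes reads are already aligned without gaps and given as hypothetical `(start, sequence)` pairs. The thresholds `min_depth` and `min_fraction` are illustrative assumptions.

```python
from collections import Counter, defaultdict


def pileup(reads):
    """Count observed bases at each 0-based reference position.

    `reads` is an iterable of (start, sequence) pairs; gapless alignment
    is assumed, so base i of a read sits at reference position start + i.
    """
    counts = defaultdict(Counter)
    for start, sequence in reads:
        for offset, base in enumerate(sequence):
            counts[start + offset][base] += 1
    return counts


def call_snvs(reference, reads, min_depth=3, min_fraction=0.8):
    """Return (position, ref_base, alt_base, depth) for well-supported mismatches."""
    calls = []
    for position, bases in sorted(pileup(reads).items()):
        depth = sum(bases.values())
        if depth < min_depth or not 0 <= position < len(reference):
            continue  # too little evidence, or read hangs off the reference
        top_base, top_count = bases.most_common(1)[0]
        if top_base != reference[position] and top_count / depth >= min_fraction:
            calls.append((position, reference[position], top_base, depth))
    return calls


# Hypothetical data: three overlapping reads that all carry an A at position 3.
reference = "ACGTACGT"
reads = [(0, "ACGAAC"), (1, "CGAACG"), (2, "GAACGT")]
print(call_snvs(reference, reads))
```

Production variant callers additionally weigh base and mapping qualities, handle indels and diploid genotypes, and use statistical models rather than fixed thresholds.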
The practical sessions will be taught exclusively through Unix/Linux, so participants are required to have some previous experience using the Linux operating system in order to benefit fully from the course. Numerous introductory tutorials to the Unix/Linux operating system and command line are available online.
Course limitations
Training material availability
Training materials for this course are still being curated and disseminated. In the interim, a Git repository can be accessed here: https://github.com/WCSCourses/GSBAfrica2023