Next generation sequencing (NGS) has become an essential tool in genetic and genomic analysis. It is increasingly important for experimental scientists to gain the bioinformatics skills required to analyse the large volumes of data produced by next generation sequencers. This course will equip participants with the essential informatics skills required to begin analysing NGS data and to apply some of the most commonly used tools and resources for sequence data analysis. The programme will cover prominent sequencing technologies, algorithmic theory and principles of bioinformatics, with a strong focus on practical computational sessions using sequence analysis techniques and tools applicable to any species or genome size. The applications covered span post-sequencing analysis: QC, alignment, assembly, variant calling and RNA-Seq.
Applicants should be postdoctoral scientists, senior PhD students, junior faculty members or clinicians/healthcare professionals actively engaged in or soon to commence research involving next generation sequencing data analysis.
After this workshop participants should be able to:
• Use the Unix command line as a tool for data analysis
• Describe the different NGS data file formats available
• Perform QC assessment of high throughput sequencing data
• Explain the algorithmic concepts behind read alignment, variant calling and structural variant detection
• Perform read alignment, variant calling and structural variation detection using standard tools
• Analyse RNA-Seq and ChIP-Seq data
• Perform a genome assembly using NGS data
The course will cover the following topics:
• Intro to Unix/Linux & running workflows
• Introduction to NGS Technologies
• NGS data pre-processing and QC
• Alignment to reference sequences
• Variant calling and annotation
• ChIP-Seq
• RNA-Seq
• Genome assembly
The practical sessions will be taught exclusively on Unix/Linux, so participants are required to have some previous experience using the Linux operating system. This is essential for participants to benefit fully from the course. Numerous introductory tutorials to the Unix/Linux operating system and command line are available online.
Course limitations
The course aims to provide a hands-on introduction to bioinformatics for next generation sequencing, and should not be considered a complete education in the theoretical and mathematical foundations of the topics.
Training material availability
Training materials for this course are still being curated. In the interim, a Git repository is available at https://github.com/WCSCourses/NGS_Bio_Africa.