H3ABioNet

Pan African Bioinformatics Network for H3Africa

Genome Wide Association Studies, Quality Control and Family based analysis

Please note that this workshop has taken place. The course materials are available on this page.

General Workshop Information

Venue of workshop: Medical Campus, University of Cape Town

Dates for the workshop: 22nd February to 24th February, 2016

Workshop organisers: CBIO, Nicola Mulder, Matthew McQueen

Participation: Open application with selection

Workshop Sponsors: CBIO

Course Overview: The post-genomic era has been characterized by the rapid advance of genotyping technology, resulting in a wealth of new, high-quality data that may hold promise for the further elucidation of genetic factors underlying complex disease. Without the proper tools and methods, the ultimate utility of such rich data may be limited in scope as the field attempts to process and interpret the growing amount of information being generated. For this workshop, we will adopt a hands-on approach to navigate some of the more popular genome-wide software packages available. A general overview of the topics to be covered is as follows. After a brief introduction to the current state of genetic research, we will begin with a discussion of the key quality assurance and quality control steps that are necessary for any genome-wide analysis. We will then transition to the analysis phase with a focus on genome-wide association using unrelated samples, including an overview of the analysis of gene pathways. Next, we will discuss the use of genome-wide data to estimate heritability, the construction of genetic risk scores and pathway analysis. We will conclude with an introduction to family-based analysis and an overview of meta-analytic techniques.

Intended Audience: This workshop is aimed at beginning and intermediate level graduate students, postdoctoral researchers and faculty.

Syllabus and Tools: Participants will learn about quality control and quality assurance steps of genome-wide data, basic GWAS analysis, construction of genetic risk scores, estimation of genome-wide SNP heritability, an introduction to family-based association approaches and an overview of meta-analytic techniques.
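As a rough illustration of one syllabus topic, a genetic risk score is commonly computed as a weighted sum of an individual's risk-allele counts across a set of SNPs. The sketch below is not taken from the workshop materials: the function name `genetic_risk_score`, the genotype coding (0, 1 or 2 copies of the risk allele) and the example weights are all assumptions made for illustration.

```python
from typing import Sequence


def genetic_risk_score(allele_counts: Sequence[int], weights: Sequence[float]) -> float:
    """Return the weighted sum of risk-allele counts for one individual.

    allele_counts: copies of the risk allele at each SNP (assumed coding: 0, 1 or 2).
    weights: per-SNP effect sizes, e.g. log odds ratios from a prior GWAS (hypothetical here).
    """
    if len(allele_counts) != len(weights):
        raise ValueError("allele_counts and weights must have the same length")
    for count in allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("each allele count must be 0, 1 or 2")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


# Hypothetical individual genotyped at three SNPs, with made-up weights.
counts = [0, 1, 2]
weights = [0.10, 0.20, 0.05]
score = genetic_risk_score(counts, weights)
print(round(score, 2))
```

In practice the weights would come from an independent discovery sample, and the scoring would usually be done with dedicated software rather than hand-written code.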

Workshop Trainers: Dr. Matthew McQueen is an Associate Professor and Director of the Public Health Program in the Department of Integrative Physiology at the University of Colorado Boulder, USA. His research is focused on a multi-faceted approach to the investigation of genetic determinants underlying complex disease, with a particular interest in psychiatric, behavioral and neurologic disorders. Recent areas of research include the development and application of statistical and epidemiological methods geared towards large-scale genomic analysis in both family-based and population-based samples.

Prerequisites: Participants are encouraged to work through the following resources to enable them to gain the most from the workshop. We will provide a basic overview of the Linux environment as well as work towards gaining an understanding of R.

http://www.ee.surrey.ac.uk/Teaching/Unix/

http://www.r-tutor.com/r-introduction

Objectives: After this workshop participants should be able to:

- Understand the steps necessary to assess the quality of genome-wide data

- Use common software to conduct a basic GWAS analysis

- Conduct relationship-checks and generate components of ancestry

- Aggregate genome-wide association data using GCTA and genetic risk scores

- Understand the features of a family-based association test

- Understand the different techniques to conduct a meta-analysis of genetic results
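To give a flavour of the last objective, one widely used meta-analytic technique is the inverse-variance weighted fixed-effect model, which pools per-study effect estimates by weighting each with the reciprocal of its squared standard error. The sketch below is an illustration only and is not the method or script used in the workshop; the function name `fixed_effect_meta` and the example numbers are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fixed_effect_meta(betas: Sequence[float], ses: Sequence[float]) -> Tuple[float, float]:
    """Pool per-study effect sizes with inverse-variance weights.

    betas: effect estimates for the same SNP from each study.
    ses: the corresponding standard errors (must be positive).
    Returns the pooled effect and its standard error.
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / se ** 2 for se in ses]          # weight = 1 / variance
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


# Two hypothetical studies reporting the same SNP with equal precision.
pooled_beta, pooled_se = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
print(pooled_beta, pooled_se)
```

With equal standard errors the pooled effect is the simple average of the study estimates, and more precise studies pull the pooled estimate towards their own value. Random-effects models, which allow for between-study heterogeneity, are a common alternative.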

Workshop limitations: This workshop will only provide a foundation for continued learning in genome-wide association testing and studies. Further directions for specific genome-wide approaches will need to be tailored towards particular studies.

Registration: All potential applicants must complete the application form. Incomplete applications will NOT be reviewed. Successful participants will be notified of their selection for the workshop and will be contacted to provide a short biosketch with a recent picture.

Please note, if a participant is unable to attend this workshop after acceptance, their place will be passed on to applicants on the waiting list and not to other recommended members.

Workshop Programme:

| Time | Topic | Trainer |
| --- | --- | --- |
| 22nd February 2016 | | |
| 9:00 | Registration and Introductions | |
| 9:30 | Lecture 1: Introduction to Genome-Wide Approaches | Matthew McQueen |
| 10:30 | Tea Break | |
| 11:00 | Tutorial 1: Getting Started (Linux) | Matthew McQueen |
| 12:00 | Lunch | |
| 1:00 | Lecture 2: Quality Control Procedures for GWAS Data | Matthew McQueen |
| 2:00 | Tea Break | |
| 2:30 | Tutorial 2: Data Cleaning, Relationship Checks and Genetic Ancestry (PLINK, R) | Matthew McQueen |
| 3:30 | Workshop End | |
| 23rd February 2016 | | |
| 9:30 | Lecture 3: Genome-Wide Association Approaches | Matthew McQueen |
| 10:30 | Tea Break | |
| 11:00 | Tutorial 3: Genome-Wide Association Analysis (PLINK, R) | Matthew McQueen |
| 12:00 | Lunch | |
| 1:00 | Lecture 4: Aggregation of GWAS Information | Matthew McQueen |
| 2:00 | Tea Break | |
| 2:30 | Tutorial 4: Heritability (GCTA), Genetic Risk Score (R) and Pathway Analysis | Matthew McQueen |
| 3:30 | Workshop End | |
| 24th February 2016 | | |
| 9:30 | Lecture 5: Introduction to Family-Based Approaches | Matthew McQueen |
| 10:30 | Tea Break | |
| 11:00 | Tutorial 5: Family-Based Association Test (FBAT) | Matthew McQueen |
| 12:00 | Lunch | |
| 1:00 | Lecture 6: Meta-Analysis of Genetic Results | Matthew McQueen |
| 2:00 | Tea Break | |
| 2:30 | Tutorial 6: Meta-Analysis (R) | Matthew McQueen |
| 3:30 | Workshop End | |

Workshop materials:

Lecture 1 slides

Tutorial 1 pdf

Tutorial 1 input files and R script - please note that this is a zip file. The paths in the R script will differ and need to be changed according to your file locations.

------------------------------------------------------------------------------------------------------------------------------------------------------------------

Lecture 2 slides

Tutorial 2 pdf

Tutorial 2 files - this contains the PLINK input files in a directory called raw.zip. It is stored on Google Drive and is ~80 MB zipped. The file paths will differ based on your setup.

------------------------------------------------------------------------------------------------------------------------------------------------------------------

Lecture 3 slides

Tutorial 3 pdf

Tutorial 3 input files - please note that this is a zip file. The paths in the R script will differ and need to be changed according to your file locations.

------------------------------------------------------------------------------------------------------------------------------------------------------------------

Lecture 4 slides

Tutorial 4 pdf

Tutorial 4 input files - please note that this is a zip file. The paths in the R script will differ and need to be changed according to your file locations.

------------------------------------------------------------------------------------------------------------------------------------------------------------------

Lecture 5 slides

Tutorial 5 pdf

Tutorial 5 input files - please note that this is a zip file. The paths will differ and need to be changed according to your file locations.

------------------------------------------------------------------------------------------------------------------------------------------------------------------

Lecture 6 slides

------------------------------------------------------------------------------------------------------------------------------------------------------------------

Rscripts directory - directory of all the R scripts used in the workshop. It is a zip file, and the paths will differ according to your setup.

All input files for the workshop - this is in zip format on Google Drive, and the higher-level folder cbio2016.zip contains the fbat, plink, gcta, giant and meta directories and their files. It is 215 MB zipped and ~2 GB unzipped.

Useful urls pdf - pdf file of useful links to obtain various software used for this workshop.