H3ABioNet

Pan African Bioinformatics Network for H3Africa

Workshop Title: H3ABioNet Cloud computing hackathon

Dates and location: 22nd August to 26th August 2016, University of Pretoria, South Africa

Workshop organisers: Scott Hazelhurst, Fourie Joubert, Sumir Panji and Nicola Mulder

Workshop sponsors: H3ABioNet

Workshop overview: One of H3ABioNet’s deliverables is to investigate the use of Cloud computing for H3Africa data analysis. It was recently agreed that important pipelines should be made available as Docker containers that can be deployed on public or private clouds. The Wits Node has made a start by “containerizing” the Wits GWAS pipeline, and a Cloud Computing Task Force has recently been established in H3ABioNet to examine and implement the use of cloud-based technologies for bioinformatics pipelines relevant to H3Africa researchers. To further the work of this task force, we will be holding a Cloud computing hackathon.

Workshop objectives:

  • To bring together H3ABioNet staff and students who have some working knowledge of Cloud computing and enable them to interact with international partners with experience in Docker pipelines
  • To assess existing Docker containers for relevant pipelines
  • To develop Docker containers for specific pipelines of relevance to H3Africa
  • To work on the interoperability and portability of containers
  • To document the pipelines and make them available from the H3ABioNet website and/or GitHub
  • To document the available public and private clouds that have appropriate infrastructure for running these pipelines

Target audience/participants:   

  • H3ABioNet staff and students with working knowledge of Cloud computing and developing containers
  • International collaborators with expertise in developing Docker containers

Prerequisites and selection:

Selection will be partly through an open call for applications and partly by invitation. As this will be a working meeting, participants are expected to have good programming ability (Python is likely to be the working language) and some knowledge of some of the following areas: bioinformatics, cloud computing, Docker, and pipeline construction (we don’t expect everyone to know everything and will have multi-skilled teams).

Once applicants have been selected, they will be required to assist with the technical preparatory phase of the hackathon, which will include but is not limited to:

  • Analysis of existing cloud-based containers/pipelines of relevance to H3ABioNet, e.g. DNAnexus, CloVR, GVL (Genomics Virtual Laboratory), NGS-logistics, UCSC microbiome pipeline, Global Alliance
  • Documentation of existing public and private clouds and their operating systems and costs
  • Attending scheduled online meetings prior to the hackathon for preliminary work
  • Continuing to work on outstanding Cloud tasks once they return to their home institutions

The application form is available at: http://goo.gl/forms/Q5Om2HB4dxY1F3zA3

Application deadline: 30th June 2016.

Workshop draft outline:

Participants will choose (or be assigned to) a stream based on their skills and interests, and to ensure that each stream has sufficient participants.

Proposed streams for the H3ABioNet cloud hackathon:

  • Stream A: Illumina genotype chip QC, genotype and SNP calling
  • Stream B: Imputation and phasing
  • Stream C: Population structure, association testing and visualization
  • Stream D: Metagenomics


Time Topic Trainer
22nd August 2016
9:00 am Registration, introductions by participants and discussion on the objectives and desired outcomes for the week All participants
9:30 am Report and overview from task force on existing clouds available TBD
10:00 am Report back on overview of technologies to be used for Stream A – what is out there and what should be created TBD
10:30 am Tea break  
11:00 am Report back on overview of technologies to be used for Stream B – what is out there and what should be created TBD
11:30 am Report back on overview of technologies to be used for Stream C – what is out there and what should be created TBD
12:00 pm Report back on overview of technologies to be used for Stream D – what is out there and what should be created TBD
12:30 pm Lunch  
1:30 pm Discussion of resources and working practices to be used, e.g. Git, server access, tools, pipeline All participants
2:00 pm Breakout into Streams and formulate work plan with priorities for each group All participants
3:30 pm Tea Break  
4:00 pm Invited talk TBD
5:00 pm Workshop End  
23rd August 2016
9:00 am Report back on work plan and objectives for the day – Stream A TBD
9:20 am Report back on work plan and objectives for the day – Stream B TBD
9:40 am Report back on work plan and objectives for the day – Stream C TBD
10:00 am Report back on work plan and objectives for the day – Stream D TBD
10:30 am Tea break  
11:00 am Streams break out to implement work plan All participants
1:00 pm Lunch  
2:00 pm Streams break out to implement work plan All participants
3:30 pm Tea Break  
4:00 pm Invited talk TBD
5:30 pm Workshop End  
24th August 2016
9:00 am Report back on previous day’s accomplishments and work plan for the day – Stream A TBD
9:20 am Report back on previous day’s accomplishments and work plan for the day – Stream B TBD
9:40 am Report back on previous day’s accomplishments and work plan for the day – Stream C TBD
10:00 am Report back on previous day’s accomplishments and work plan for the day – Stream D TBD
10:30 am Tea break  
11:00 am Streams break out to implement day work plan All participants
1:00 pm Lunch  
2:00 pm Streams break out to implement day work plan All participants
3:30 pm Tea break
4:00 pm Invited talk TBD
5:30 pm Workshop End  
25th August 2016
9:00 am Report back on previous day’s accomplishments and work plan for the day – Stream A TBD
9:20 am Report back on previous day’s accomplishments and work plan for the day – Stream B TBD
9:40 am Report back on previous day’s accomplishments and work plan for the day – Stream C TBD
10:00 am Report back on previous day’s accomplishments and work plan for the day – Stream D TBD
10:30 am Tea break  
11:00 am Streams break out to implement day work plan All participants
1:00 pm Lunch  
2:00 pm Streams break out to implement day work plan All participants
3:30 pm Tea break  
4:00 pm Invited talk TBD
5:30 pm Workshop End  
26th August 2016
9:00 am Streams break out to implement day work plan All participants
10:30 am Tea break  
11:00 am Streams break out to implement day work plan All participants
1:00 pm Lunch  
2:00 pm Streams start wrapping up on tasks and creating plan for continued work on unfinished tasks All participants
3:30 pm Tea break  
4:00 pm Report on progress and plan for continued work – Stream A TBD
4:20 pm Report on progress and plan for continued work – Stream B TBD
4:40 pm Report on progress and plan for continued work – Stream C TBD
5:00 pm Report on progress and plan for continued work – Stream D TBD
5:30 pm Workshop End