Bioinformatics Group
Lab Rotation II
Max Planck Institute for Marine Microbiology, Bremen, Germany
May 8, 2017 - June 19, 2017
Advisor: Antonio Fernandez-Guerra, PhD
Project: Defining gene cluster families using community detection methods. The aim of my project was to examine the community structure of biosynthetic gene clusters (BGCs) in seawater samples collected by the TARA Oceans global seawater sampling effort. BGCs can encode the pathways that produce antibiotics, pesticides, and other compounds. The goal was to examine the distribution and novelty of these BGCs for potential agricultural, pharmaceutical, and scientific knowledge purposes. The project's collaborators had sequenced the metagenomes of these samples and assembled them into metagenome-assembled genomes (MAGs). They then used the program antiSMASH to search the genomes for regions containing BGCs. To see how related these BGCs were to one another, I used the program BiG-SCAPE to calculate the pairwise similarity between all of the BGCs. I included a reference dataset from MIBiG, a highly curated database of biosynthetic gene clusters, as a control to determine which community detection algorithm to apply to the BGCs from the TARA MAGs. For this project, I worked extensively in RStudio and maintained a detailed script of my code on GitHub. I will present my findings at the YOUMARES8 Conference on September 15, 2017.
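To give a rough idea of what grouping BGCs into families from pairwise scores looks like, here is a minimal sketch in Python. It is not the project code (that was written in R), and it does not use the community detection algorithm chosen in the project: it swaps in the simplest possible stand-in, which links any two BGCs whose distance falls at or below a cutoff and calls each connected group a family. The BGC names, the distance values, and the `cutoff` parameter are all made up for illustration, and the function `gene_cluster_families` is hypothetical.

```python
from collections import defaultdict


def gene_cluster_families(distances, cutoff):
    """Group BGCs into families.

    distances: dict mapping (bgc_a, bgc_b) -> pairwise distance in [0, 1]
    cutoff:    pairs with distance <= cutoff are linked
    Returns a sorted list of families, each a sorted list of BGC names.
    This is threshold-based connected components, a simple stand-in
    for a real community detection algorithm.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path compression.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), dist in distances.items():
        root_a, root_b = find(a), find(b)  # also registers both BGCs
        if dist <= cutoff and root_a != root_b:
            parent[root_a] = root_b

    families = defaultdict(set)
    for bgc in parent:
        families[find(bgc)].add(bgc)
    return sorted((sorted(members) for members in families.values()),
                  key=lambda members: members[0])


# Made-up toy data: three reference BGCs and one BGC from a MAG.
toy_distances = {
    ("ref_A1", "ref_A2"): 0.10,
    ("ref_A1", "ref_B1"): 0.90,
    ("ref_A2", "ref_B1"): 0.85,
    ("ref_A1", "mag_1"): 0.20,
    ("ref_A2", "mag_1"): 0.25,
    ("ref_B1", "mag_1"): 0.95,
}

families = gene_cluster_families(toy_distances, cutoff=0.3)
print(families)
```

With this toy data and a cutoff of 0.3, the BGC from the MAG lands in the same family as the two "A" references, while `ref_B1` stays on its own. The reference BGCs play the role of a control here: since their true groupings are known, one can check whether a given method and cutoff recover them before trusting the families it reports for the MAG-derived BGCs.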
Sadly, I have no pretty pictures of this project, but I have attached my presentation as a PDF (the PowerPoint file was too large).