(via Derek Johnson, National Network of Libraries of Medicine – Greater Midwest Region)
From Monday, August 14, to Wednesday, August 16, the NCBI, with involvement from several NIH institutes, will host a Biomedical Data Science hackathon at the National Library of Medicine on the NIH campus. The hackathon will focus primarily on medical informatics and on advanced bioinformatics analysis of next-generation sequencing data and metadata. To apply for this event, complete this application, which should take no more than 10 minutes. Applications are due by 3:00 PM CST on Tuesday, July 11.
This event is for students, postdocs, investigators, and other researchers already engaged in the use of medical informatics data or pipelines for genomic analyses from next-generation sequencing data. Some projects are also open to non-scientific developers, cybersecurity experts, mathematicians, or librarians. The event is open to anyone who is selected for the hackathon and willing to travel to NIH. There will be 5-7 teams, each composed of 5-6 individuals. These teams will build pipelines and tools to analyze large datasets within a cloud infrastructure. After a brief organizational session, teams will spend three days working on a challenging set of scientific problems related to a group of datasets, analyzing and combining those datasets as needed.
Datasets will come from public repositories. During the hackathon, participants will have an opportunity to include other datasets and tools for analysis. Please note that if you use your own data during the hackathon, we ask that you submit it to a public database within six months of the end of the event. All pipelines and other scripts, software, and programs generated in this hackathon will be added to a public GitHub repository designed for that purpose. A manuscript outlining the design and usage of the software tools constructed by each team may be submitted to an appropriate journal, such as the F1000Research hackathons channel.
Potential subjects for this iteration are below.
- defining single cell expression profiles and extracting them from tissues
- building robots to extract genomic signatures from primary datasets
- locating antibiotic resistance signatures in metagenomic data
- building a pipeline to make legacy code security compliant
- defining complex phenotypes from primary datasets
- developing an interactive viewer for gene expression in aging
- integrating phenvar.colorado.edu with ClinVar, electronic medical records, and ontologies
- possibly others
Please contact Dr. Ben Busby, at ben.busby@nih.gov, with any questions.