I am driven by solving biological and medical problems with computational tools.
I always wanted to become a biologist, and I learned programming as a hobby. When bioinformatics emerged as a field, I was lucky to combine my two interests and become a bioinformatician.
I have 7+ years of experience developing pipelines for large-scale genomic and transcriptomic data analyses. I program in Python, R, and Bash. To make my projects reproducible and scalable, I use Git for version control, Snakemake for pipelines, Conda and Docker for isolated environments, and R Markdown for explanation and communication.
I am very interested in Big Data analyses. Genomic data is Big Data in a sense, but current approaches to processing it lag behind industry standards. I believe that analyzing genomic data in Apache Spark is the future and that projects such as ADAM, Hail, and VariantSpark will advance genomic analyses.
Currently, I am seeking to apply my passion and expertise to help develop automated, large-scale bioinformatic analyses in an industrial or academic setting.