Who we are:
Calico is a research and development company whose mission is to understand the biology of aging and to help people lead longer and healthier lives. We aim to combine the best biomedical science with cutting-edge technology and computing.
Position description:
Biology is in many ways becoming a data science, with an ongoing explosion in the amount and quality of biological and medical data. These data can be transformative to our understanding of biology and disease, but the tools to store, process, and analyze these data lag far behind what’s needed. Calico is seeking a strong full-stack software developer to become an early member of a small, world-class team that develops, productionizes, and maintains Calico’s data platform, including its data warehouse and a code base of high-quality data processing, analysis, and visualization algorithms. Here you will work in close collaboration with some of the world’s best life scientists and enable them to achieve groundbreaking discoveries in human health. You will do so by creating something new in a company that combines the nimbleness of a start-up with a firm financial footing.
As a member of the Data Platform team, you will work alongside our UX/UI designer and other software engineers to create and maintain a platform to derive insights from data produced by Calico, by Calico’s collaborators, and from publicly available sources. Our stack includes modern JavaScript frameworks, Flask, Apache Airflow, Docker, Kubernetes, and Google Cloud Platform. The team will build and maintain a data warehouse that stores biological data and allows easy exploration, analysis, and visualization. The team will also help develop and maintain a repository of scalable, reusable tools for data processing and visualization, machine learning, and natural language processing.
Many of the problems you will help tackle involve new types of biological data, data at a scale such that the relevant methods and tools have yet to be developed, or the integration of data across multiple modalities. The data span multiple organisms (from yeast to human), scales (from molecules to entire organisms), data modalities (from sequencing to imaging to physiology), and time scales (from single time points to long-term time series).
As a founding member of this new team, you will play a key part in setting the vision for how a data platform can best act as a productivity multiplier for our biological and computational scientists as they work toward solutions that help improve human healthspan.
Position requirements:
Nice to have: