
Senior Systems Administrator (Research Computing)

Who we are:

Calico is a research and development company whose mission is to harness advanced technologies to increase our understanding of the biology that controls lifespan, and to devise interventions that enable people to lead longer and healthier lives. Executing on this mission will require an unprecedented level of interdisciplinary effort and a long-term focus for which funding is already in place.

Position description:

The Senior Systems Administrator manages the physical and virtual servers, data storage, on-premises compute cluster, and hybrid cloud environment used by Calico scientists. This role involves hands-on system management in cooperation with Calico IT and Google employees who assist with network management and infrastructure application development. The Senior Systems Administrator works closely with scientists who are leaders in their fields, advising and assisting them with their storage and high performance computing needs. We are seeking a resourceful individual with expertise in a variety of system administration functional areas who enjoys delivering high performance computing to a dynamic, creative research organization.

Responsibilities:

  • Monitor, troubleshoot, and maintain Calico’s high performance clusters, storage, and scientific virtual machines
  • Work with scientific users and core facility leaders to assess needs, propose solutions, and provision new equipment, virtual machines, operating systems and applications as needed
  • Implement and configure tools to automate monitoring and provisioning functions
  • Participate in the design and implementation of Calico’s evolving hybrid cloud environment on Google Cloud Platform (GCP) for research computing
  • Work with Calico IT to ensure the integrity and availability of data and to support disaster recovery planning

Position requirements:

  • 5+ years of experience in a research or high performance computing environment, both on premises and in the cloud
  • Advanced Linux system administration (CentOS/Ubuntu preferred)
  • Shell scripting (bash or csh) and basic utility programming (e.g. Python, C++)
  • NFS (OpenZFS, NetApp, and/or similar) and Samba network storage management
  • Slurm (or similar) cluster resource management
  • OpenHPC (or similar) cluster provisioning and monitoring
  • Puppet (or similar) deployment manager script development
  • Directory and naming services management (Active Directory, LDAP, DNS)
  • Excellent communication skills and track record of working well with scientific end users

Helpful skills and experience:

  • Nagios (or similar) infrastructure monitoring component development
  • Lustre (or similar) cluster storage management and performance tuning
  • Provisioning cloud resources using scripts, APIs, and deployment managers
  • Securing and managing PHI and PII
  • Life cycle data management for business continuance, disaster recovery and archive
  • Familiarity with biotechnology methods and applications
  • VMware virtual machine, network, and resource provisioning and monitoring
  • Docker/Kubernetes cluster deployment
  • Working knowledge of networking theory, technologies and operations
  • High performance computing in a hybrid cloud environment

 
