Site Reliability Engineer

Site Reliability Engineer (Remote) Needed - Leading Education SaaS Platform!

  • REMOTE
  • New York, New York

A bit about us:

Founded over twenty years ago, we specialize in building a SaaS education platform and are a leading curriculum solutions provider for K-12 students. Our comprehensive, dynamic, and progressive learning technology helps students develop as learners and thinkers. Our platform delivers research-proven, high-quality core and supplemental solutions in math, world languages, ELA and literacy, computer science, and biotech, as well as best-in-class K-12 professional learning services. Here, we strive to create an environment where people want to work: one where the larger team comes first, where trying new things (and sometimes failing) is encouraged, and where we pursue our mission relentlessly. We are a major disruptive force in the digital curriculum market, combining world-class research, differentiated technology, and best-in-class content with a world-class, mission-oriented team. Are you passionate about shaping the future of learning?

Why join us?

  • Competitive base salary and overall compensation package
  • 401(k) with generous company match
  • Full benefits: Medical, Dental, Vision, Life, Disability
  • Generous PTO, including vacation, sick leave, and holidays

Job Details

We are looking for a Site Reliability Engineer to join our growing team and ensure that the systems our students and teachers rely on daily are available, reliable, secure, scalable, and satisfying. We apply engineering disciplines to improve user satisfaction and prevent crises, while also responding to the inevitable errors. The existing team has a broad base of collective experience, so we can accommodate a range of experience levels for this position. We seek to automate mundane tasks so that we can research and implement exciting tools and technologies.

This role will entail the following:

  • Enhance and improve our SaaS solutions within AWS.
  • Define, operate, and refine processes for continuous integration and deployment of application software.
  • Develop CI/CD pipelines and IaaS for cloud-native applications.
  • Configure services.
  • Manage and interpret application data and logs to assist customer support teams with escalations to development.
  • Design and implement mechanisms for proactive monitoring, alerting, trend analysis, and self-healing.
  • Identify opportunities to improve DevOps processes and collaborate with the team for solutions.
  • Help define, measure, and report on SLIs and SLOs, drive the organization to meet SLOs, and support the company's ability to provide its customers with SLAs.
  • Participate in post-incident reviews to better expose system or process gaps.
  • Document procedures and site infrastructure.

You should know some of the following:

  • Site Reliability, SRE
  • AWS, SaaS
  • Infrastructure as Code (IaC), CloudFormation (currently used), Terraform, Chef/OpsWorks, Elastic Beanstalk
  • CI/CD, Pipelines, Jenkins, Bamboo, Git, Jira
  • Container Orchestration, Fargate, ECS, Kubernetes
  • Python, Bash
  • System logs/metrics, Splunk
  • Application Performance Management Tools, New Relic, Datadog

Managed by Jobot Pro
Location: REMOTE
Job Type: Permanent