Site Reliability Engineer
- $125,000 - $160,000
- Nashville, TN
Site Reliability Engineer needed ASAP!
A bit about us:
We are a technology company that offers a suite of solutions ranging from social media to decentralized / Web3 technical innovations. We were established upon a foundation of respect for privacy and personal data, free speech, free markets, and ethical, transparent corporate policy.
Why join us?
· Medical coverage, 80% paid for the employee
· Dental and Vision
· Unlimited Vacation Time
· Speak freely on any social media accounts without getting pulled into HR for "a discussion"
Job Details
As SRE you will:
· Run our infrastructure with Ansible, Terraform, Chef, SaltStack, or Puppet, following DevOps best practices
· Make monitoring and alerting fire on symptoms and not on outages.
· Document every action so your findings turn into repeatable actions–and then into automation.
· Design, build and maintain core infrastructure pieces that allow Parler to scale to support hundreds of thousands of concurrent users.
· Improve the deployment process to make it as boring as possible
· Debug production issues across services and levels of the stack
· Be on a PagerDuty rotation to respond to availability incidents and provide support for service engineers with customer incidents
· Use your on-call shift to prevent incidents from ever happening
· Have experience with Nginx, HAProxy, Docker, Kubernetes, Terraform, ProxySQL or similar technologies
· Projects you could work on: coding infrastructure automation with Chef and Terraform
· Develop a relationship with a product group, define their SLAs, share our data on those SLAs and improve their reliability
Areas of expertise/contribution for Leveling
· Have (decently) strong programming skills in PHP (creating tools / accessing data)
· Ability to use Chef and Ansible to efficiently manage our infrastructure
· Intermediate-level Unix knowledge
· Load balancing the application using proxying, image serving via Object Store and CDN, as well as containerizing our system for Kubernetes
· Backend storage management and scaling
· Disaster Recovery and High Availability strategy