Our client is a global digital transformation tech company. They create digital solutions powered by innovative technologies and incorporate emerging trends to help businesses transform and grow. They are looking for enthusiasts to complement their teams of experts.
ABOUT THE PROJECT:
The project is the #1 text communications technology delivering fast, easy, and effective solutions for businesses across a wide variety of industries.
They are currently looking for a high-performing, experienced Senior Site Reliability Engineer to join their growing SRE team. This person will join a diverse group of SREs focused on scaling the company's cloud infrastructure and CI/CD processes to support its accelerating growth.
Site Reliability Engineers are responsible for keeping all user-facing services and other production systems running smoothly. They are a blend of pragmatic operators and software craftspeople who apply sound engineering principles, operational discipline, and mature automation to the company's environments and codebase. They specialize in systems, cloud infrastructure, release engineering, and observability, and they enable the product team to move fast.
As a member of the SRE team, you will take ownership of the overall performance and reliability of the infrastructure, the robustness of the deployment pipeline, and timely and effective incident response and resolution.
RESPONSIBILITIES:
- Design, build, scale, and maintain core infrastructure in GCP;
- Manage our infrastructure with Terraform and Ansible;
- Advance the adoption of cloud-native technologies;
- Create efficient and effective CI/CD processes;
- Design and deploy self-healing infrastructure;
- Monitor and alert on service health metrics;
- Debug production issues across services;
- Lead and mentor by setting the example;
- Improve documentation all around;
- Create and maintain runbooks;
- Run blameless postmortems;
- Develop automation.
REQUIREMENTS:
- 5+ years of experience working in a Site Reliability Engineering (SRE) or DevOps role;
- 3+ years of experience with cloud platforms and technologies (GCP, AWS);
- Experience with a scripting language (Python) and a shell language (Bash);
- Experience with container-based deployments and orchestration tools (Kubernetes, Helm);
- Deep understanding of DevOps culture, SRE principles, and Agile methodologies;
- Hands-on technical experience with Terraform and Ansible;
- Experience implementing security best practices and “shifting security left”;
- Strong desire to collaborate asynchronously, with a focus on robust documentation;
- Process-oriented approach, driven to iterate on existing processes or create new ones;
- Excellent communication skills and empathy for end users and internal customers;
- Experience identifying SLOs/SLIs that will align the team to meet objectives;
- Strong intuition about system design, robustness, and scalability;
- Ability to troubleshoot problems with existing code and systems;
- Passion for stable and secure systems management practices;
- Ability to orchestrate and automate complex tasks;
- Proactive, grab-a-shovel, go-for-it attitude;
- Outstanding problem-solving skills.
THEY OFFER:
- An environment that allows you to maximize your productivity and gives you the freedom to think and collaborate beyond the next line of code or deadline;
- A great team that has fun, loves what it does, relaxes when it needs to, and delivers;
- Regular performance-based salary and career development reviews;
- Medical insurance (health), employee assistance program;
- Paid vacation, holidays, and sick leave;
- 24/7 gym access, personal fitness instructor;
- Massage in the office, personal wellness consultant;
- English classes with native speakers, and partially or fully reimbursed personal training and conferences;
- Referral program;
- Team-building activities and plenty of fun to help you take a break and relax.