Mission Context
In a rapidly evolving environment, we are looking for a Service Reliability Engineer (SRE) to strengthen a team dedicated to the performance, reliability, and continuity of Cloud and Data services.
Your role will be essential in ensuring service stability and efficiency while supporting development teams in building a culture of quality and continuous improvement.
Your Responsibilities
- Support and contribute to the migration to cloud solutions.
- Work closely with development and production teams to ensure service availability, performance, and compliance with service level objectives (SLOs).
- Identify, assess, and anticipate production risks and their potential impact on service quality.
- Manage incidents efficiently to minimize impact on users.
- Set up and maintain monitoring and alerting systems, including dashboards to track service health.
- Actively participate in the SRE Guild and collaborate with Incident Managers (TIM, LIM).
- Promote a culture of operational excellence, continuous improvement, and knowledge sharing.
Profile
- Master's degree or equivalent professional experience.
- At least 5 years of experience in infrastructure reliability or critical production environments.
- Strong command of IT management tools (CMDB, application repository), monitoring tools (Dynatrace, Opnet, BAM, etc.), and ITSM tools (ServiceNow).
- Solid understanding of Incident, Problem, and Change Management processes.
- Excellent analytical skills, attention to detail, and a mindset focused on improvement and optimization.
- Fluent in English and in either French or Dutch (written and spoken).
Soft Skills
- Strong team spirit and communication skills.
- Autonomous, organized, and proactive.
- Customer-oriented with the ability to explain technical topics in simple terms.
- Stress-resistant and adaptable to a dynamic, multicultural environment.
- Willing to provide support outside of business hours when required (on-call).