Site Reliability Engineer (SRE) / Ingénieur SRE

Juniors accepted

Permanent contract

45k€/year

Azure

Kubernetes

Management

✨ [EN] The Role ✨

As a Site Reliability Engineer (SRE), you will contribute to the reliability, performance, and scalability of platforms and applications used by researchers and engineers working on complex scientific challenges.

Depending on the projects, your responsibilities may involve:

Discngine’s internal infrastructure, which supports our products and services
Our customers’ technical environments, mainly companies in the pharmaceutical and life sciences industries

You will therefore work in a variety of contexts, ranging from operating internal platforms to providing technical support to customers using our solutions.

You will work at the intersection of infrastructure, development, and scientific users, playing a key role in understanding the issues users face and making sure they are resolved for good.

🚀 Main Responsibilities 🚀

Platform Reliability and Operations

Design and maintain the infrastructure required to run scientific applications (cloud, containers, distributed services)
Implement SRE best practices: observability, monitoring, alerting, incident management
Improve service availability, performance, and resilience
Automate operations and deployments (CI/CD, infrastructure as code)

Advanced Technical Support

Diagnose and resolve complex production incidents
Contribute to continuous system improvement following incidents (post-mortems, automation)
Collaborate with development teams to improve application robustness

Interaction with Clients and Scientific Teams

Work directly with users and clients to understand their issues and identify root causes of incidents
Participate in the technical analysis of their environments and workflows
Propose technical solutions tailored to their scientific use cases

💻 Technical Environment 💻

Linux / Windows
Cloud and distributed infrastructures (AWS / OCI / Azure / GCP)
Kubernetes
CI/CD
Monitoring and observability
Scripting and automation (the exact tech stack will depend on the projects and team needs)

Reference: jobsforukrainefrance-welcomekit-co-+Discngine-Site-Reliability-Engineer-SRE-Ingenieur-SRE
