Senior Data Platform Engineer - DataOps (Montreal, Ottawa, Calgary locations)

GHGSat

Software Engineering, Data Science
Montreal, QC, Canada
Posted on Aug 9, 2024

About GHGSat
GHGSat offers greenhouse gas detection, measurement, and monitoring services to industrial and government customers around the world. We use our own satellites and aircraft sensors, combined with third-party data, to help industrial emitters better understand, control, and reduce their emissions. GHGSat’s capability is unique: The company provides high-resolution, local measurements of atmospheric methane and carbon dioxide concentrations from space, enabling us to detect greenhouse gas emitters and visualize and quantify their emissions.

Job Description:

As a Senior Data Platform Engineer, you will be responsible for designing, building, and maintaining our data infrastructure with a focus on automation, efficiency, and scalability. You will work closely with cross-functional teams to develop CI/CD pipelines, manage cloud data services, and ensure high performance of data products. Your expertise in DevOps, cloud platforms, and data technologies will be crucial to our success.

Key Responsibilities:

  • Data Infrastructure Management: Provision and manage data infrastructure in cloud environments such as AWS, including services like Redshift, Glue, SageMaker, DMS, RDS (PostgreSQL), EKS, and ECR.
  • Infrastructure as Code (IaC): Develop and maintain IaC using Terraform to automate the provisioning and management of cloud resources.
  • CI/CD Pipelines: Develop CI/CD pipelines and automation tooling in GitLab to enable fast, reliable deployment of data products across multiple environments.
  • Observability and Telemetry: Implement observability and telemetry solutions to monitor and manage the health and performance of data infrastructure (see the first sketch after this list).
  • ETL Processes: Design and implement ETL processes using tools such as Airflow and dbt, and data lake table formats such as Delta, Hudi, or Iceberg (see the DAG sketch after this list).
  • Container Orchestration: Utilize container-based orchestration services such as Kubernetes, ECS, and Fargate for managing and scaling containerized applications.
  • Documentation and Support: Maintain comprehensive documentation of system configurations and procedures, including runbooks for live support.
  • Performance Improvement: Proactively troubleshoot and optimize data products to enhance performance and integrate best practices throughout the data development lifecycle.
  • Collaboration: Work closely with data scientists, data analysts, and other engineering teams to understand data requirements and ensure the availability and reliability of data services.
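
To make the observability and telemetry responsibility concrete, here is a minimal sketch of publishing a custom CloudWatch metric with boto3 after a pipeline step; the namespace, metric name, dimension, and region here are hypothetical placeholders, not an actual GHGSat convention, and real pipelines would typically emit such metrics from within Airflow tasks or Kubernetes jobs:

```python
import boto3

# Minimal telemetry sketch: publish a custom metric after a pipeline step.
# Namespace, metric name, dimension, and region are hypothetical placeholders.
cloudwatch = boto3.client("cloudwatch", region_name="ca-central-1")


def report_rows_loaded(pipeline: str, rows: int) -> None:
    """Publish a row-count metric so dashboards and alarms can track pipeline health."""
    cloudwatch.put_metric_data(
        Namespace="DataPlatform/ETL",  # hypothetical namespace
        MetricData=[
            {
                "MetricName": "RowsLoaded",
                "Dimensions": [{"Name": "Pipeline", "Value": pipeline}],
                "Value": float(rows),
                "Unit": "Count",
            }
        ],
    )


report_rows_loaded("emissions_etl", 1234)
```

Metrics published this way can back CloudWatch alarms, which is one common route to the proactive troubleshooting the responsibilities above describe.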
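
And as an illustration of the ETL responsibility, a minimal Airflow DAG sketch that chains a Python extract step into a dbt run. It assumes Airflow 2.4+ (for the `schedule` argument); the DAG id, task names, and dbt selector are hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator


def extract_emissions(**context):
    """Placeholder extract step: pull raw readings from a source system."""
    # Real code would land raw files in S3 or stage rows in RDS/Redshift.
    pass


with DAG(
    dag_id="emissions_etl",  # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; use schedule_interval on older versions
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_emissions)
    transform = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --select staging",  # hypothetical dbt selector
    )
    extract >> transform  # run the dbt transformation after the extract succeeds
```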

Required Qualifications:

  • Experience: 5+ years of hands-on DevOps or DataOps experience in a similar role.
  • Proficiency in Terraform, GitLab, SQL, and Python.
  • Experience with Airflow, Kubernetes, Docker, dbt, and Linux.
  • Strong knowledge of AWS services such as DMS, RDS (PostgreSQL), EKS, and ECR.
  • Familiarity with data lake table formats like Delta, Hudi, or Iceberg (see the sketch after this list).
  • Expertise in container-based orchestration services like Kubernetes, ECS, and Fargate.
  • Strong problem-solving skills with the ability to troubleshoot and resolve complex issues.
  • Experience designing and building data models using data warehousing concepts to enable access to data across multiple domains.
  • Communication: Excellent communication skills with the ability to collaborate effectively with cross-functional teams.
  • Education: Bachelor's degree in Computer Science, Engineering, or a related field. An advanced degree or relevant certifications are a plus.
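
For the table-format qualification above, a minimal PySpark sketch of writing a Delta table. It assumes the delta-spark package (io.delta:delta-core) is on the Spark classpath; the schema, sample rows, and S3 path are hypothetical illustrations only:

```python
from pyspark.sql import SparkSession

# Delta Lake quickstart-style session configuration; assumes the delta-spark
# package (io.delta:delta-core) is available on the Spark classpath.
spark = (
    SparkSession.builder.appName("delta-sketch")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config(
        "spark.sql.catalog.spark_catalog",
        "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    )
    .getOrCreate()
)

# Hypothetical schema, rows, and path for illustration only.
df = spark.createDataFrame(
    [(1, "CH4", 1912.5), (2, "CO2", 417.2)],
    ["site_id", "gas", "reading"],
)
df.write.format("delta").mode("append").save("s3a://example-bucket/emissions_delta/")
```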

Preferred Qualifications:

  • Experience with additional cloud data platforms or services.
  • Knowledge of other programming languages or data tools.
  • Relevant certifications in cloud platforms (AWS, Azure, GCP) or Terraform.