Site Reliability Engineer - Observability

Flinks

Software Engineering
Montreal, QC, Canada
Posted on Aug 22, 2025

About Flinks 🚀

Flinks is where financial data moves—with purpose, trust, and impact.

We’re on a mission to simplify access to financial data and help businesses build better, faster, and more secure financial products and experiences. Since 2016, we’ve been bridging the gap between fintechs, financial institutions, and consumers by enabling seamless, secure data connectivity.

From instant account funding to smarter lending, our solutions help power some of the most innovative financial products in North America. We partner with lenders, banks, and fintechs to streamline onboarding, prevent fraud, and fuel real-time decision-making with enriched, reliable data.

As pioneers in Canada’s open banking movement, we're not waiting for the future—we're building it. If you're bold, curious, and ready to help shape the future of finance, we’d love to meet you.

What You'll Be Doing 🔥

As the Observability SRE, you will own the end-to-end observability, monitoring, and reliability strategy across all Flinks product lines. Your mission is to ensure every product—Data Connectivity, Payments, Enrichment, and Document Services—has the right telemetry, actionable alerts, and reliability insights.

  • Company-wide Observability & Monitoring: Define and maintain an observability framework across products; ensure coverage for APIs, scraping systems, payments, enrichment, and document services; establish SLIs/SLOs aligned to client expectations.
  • Alerting & Incident Management: Build consistent, low-noise alerting rules; integrate observability into Incident.io workflows; lead cross-product root cause analysis (RCA); maintain a “single source of truth” for reliability metrics.
  • Reliability Analysis & Insights: Deliver monthly/quarterly scorecards linking reliability to client outcomes (e.g., churn risk, adoption blockers); analyze trends and recurring failures; translate data into executive insights.
  • Automation & AI-Enabled Observability: Automate anomaly detection, escalation, and self-healing; partner with the AI team; optimize logging and monitoring spend.
  • Collaboration & Enablement: Champion observability practices across teams; train PMs, QA, and Engineers; ensure insights influence roadmaps; collaborate with Tech Leadership to build observability in from the start.

Who You Are 💪

  • Experience: 5–8 years in SRE, Observability, or Reliability roles, ideally across multiple product environments (fintech, SaaS, or data platforms).
  • Technical Skills: Strong in observability tooling (Grafana, Prometheus, OpenTelemetry, ELK); hands-on experience with tracing and profiling tools (APM, OTel, Pyroscope); experience with distributed systems, APIs, and data pipelines; strong automation skills (Kubernetes).
  • Programming: Strong skills in at least one programming language; C# and Go are preferred, but experience in other languages is also valued.
  • Mindset:
    • Systems thinker who sees the big picture.
    • Business-aware, connecting reliability to retention and profitability.
    • Proactive, anticipating failures before they occur.
    • Collaborative, working across product, QA, engineering, and reliability.

Great to Have

  • Experience in fintech or high-availability SaaS environments.
  • Familiarity with payments infrastructure and fraud detection systems.
  • Contributions to open-source observability tools or frameworks.

Why This Role Matters at Flinks 💡

  • Ensures all products have consistent reliability and observability standards.
  • Provides a single source of truth for performance and reliability across the org.
  • Directly improves client trust, profitability, and operational efficiency.
  • Enables proactive stability management across Flinks’ core product lines.
  • Supports our shift to a cohesive, reliable, platform-first mindset at scale.

The Interview Process 🏗

  • Head of People
  • Director of IT Ops
  • Technical Challenge
  • Panel Interview