📌 Hiring Update: Site Reliability Engineer (SRE) – Tidecrest Reliability

Title: Site Reliability Engineer (SRE)

Quick Summary
Tidecrest Reliability is hiring a Site Reliability Engineer to harden uptime, tune performance, and remove toil across containerized, multi-region cloud services. You will champion service level objectives, automate safe deployments, and improve observability so product teams can ship confidently. We welcome strong entry-level candidates with solid fundamentals alongside experienced engineers who enjoy mentorship and pragmatic operations.

Project Category or Industry
Cloud platform engineering for data-driven B2B SaaS

Type
Full-time employment

Experience Level
Entry-level to mid-level, with structured mentorship and clear growth paths for motivated recent graduates who demonstrate strong systems fundamentals

Duration
Permanent role

Location
Remote-first across the Americas, EMEA, and APAC, with at least 4 hours of overlap within the 09:00–18:00 UTC window; optional hub days in Seattle and Belfast

Salary
USD 75,000–115,000 base depending on location and experience, plus annual performance bonus and comprehensive benefits

Payment Mode
Monthly payroll via bank transfer; contractor arrangements available where local employment is not supported

Hiring Company Name
Tidecrest Reliability

Required Skills or Tools
Comfort with Linux, containers, and networking; experience with Kubernetes, infrastructure as code, CI/CD, and modern observability stacks; ability to write automation in a language such as Python, Go, or Bash; clear written communication and a bias toward measurable outcomes.

Project Details

Project Description
You will join the reliability group responsible for the health, performance, and delivery safety of Tidecrest’s customer-facing services. The team builds paved roads for product squads—secure defaults, fast feedback, and great tooling—so features reach customers without trading off stability. The work blends greenfield automation with iterative hardening of existing systems.

Core Responsibilities and Expected Deliverables

Define and track SLIs and SLOs for critical services; drive error budget policy and reliability reviews.
Build and maintain CI/CD pipelines with progressive delivery, canary analysis, and automated rollback.
Operate and optimize Kubernetes clusters, including autoscaling, network policies, ingress, and service mesh where appropriate.
Implement end-to-end observability: metrics, logs, and traces; create actionable dashboards and alerts tied to user impact.
Reduce toil through runbooks, incident tooling, and self-service platform capabilities; champion post-incident learning.
Deliver well-scoped pull requests with rollout plans, security checks, and documentation.

Required Experience and Preferred Qualifications

Foundation in Linux internals, TCP/IP networking, DNS, and containerization.
Exposure to distributed systems concepts, caching, queues, and backpressure.
Awareness of secure supply chain practices, secrets management, and compliance frameworks such as SOC 2.
Nice to have: GitOps (Argo CD or Flux), policy as code (Open Policy Agent), cost visibility (FinOps), and chaos or load testing.
Certifications are welcome but not required.

Tools or Platforms to Be Used

Cloud: AWS or GCP (EKS/GKE, IAM, VPC, S3/GCS, CloudFront/Cloud CDN, RDS/Cloud SQL)
Orchestration and packaging: Kubernetes, Helm, Kustomize, container registries with image scanning and signing
Infrastructure as code: Terraform or Pulumi; secret management via Vault or cloud KMS
CI/CD: GitHub Actions or GitLab CI with artifact promotion and automated release gates
Observability: Prometheus, Grafana, Loki, Tempo or OpenTelemetry, Alertmanager or PagerDuty
Security: Trivy or Grype for image scanning, Sigstore/Cosign signing, baseline runtime policies

Language Requirement
English is required for daily collaboration; additional languages are a plus but not required.

Communication Style
Asynchronous-first via Slack and GitHub with weekly Zoom stand-ups, design reviews, and post-incident retrospectives. Lightweight RFCs in Notion capture decisions and trade-offs.

Time Commitment or Working Window
Approximately 40 hours per week with flexible scheduling; core collaboration happens within the 09:00–18:00 UTC overlap window. Participation in a compensated on-call rotation begins after onboarding and shadowing.

Payment Terms
Monthly salary with an annual performance review and bonus eligibility. For contractors, milestone-based deliverables with biweekly invoicing and net-15 payment terms.

Evaluation Criteria

Demonstrated automation and infrastructure-as-code skills
Practical problem-solving in a time-boxed take-home focused on CI/CD and reliability trade-offs
Collaboration during a live pairing session and clarity explaining failure modes and mitigations
Communication, ownership, and reliability assessed through interviews and references
Evidence of observability-driven operations and security-minded practices

Other Requirements
Standard NDA upon offer acceptance, identity verification, and reference checks compliant with local laws. Light-touch time tracking for contractors. Adherence to secure development lifecycle, change management, and incident response guidelines.

About the Company
Tidecrest Reliability is a remote-first engineering company that builds resilient platforms for data-intensive SaaS products. Founded in 2019, we operate small, autonomous teams that value pragmatism, measurable outcomes, and a strong reliability culture. Our engineers are distributed across North America and Europe with collaboration hubs in Seattle and Belfast. Learn more at https://tidecrestreliability.com or contact careers@tidecrestreliability.com.
