Pratik Parikh

Lead DevOps Engineer @ Simbian

I architect and operate cloud-native platforms — Kubernetes, Kafka, and observability at production scale, across every major cloud and on-prem.

About Me

Cloud-native engineer, platform architect, systems thinker.

I lead DevOps at Simbian, where AI agents take on cybersecurity — and where I own everything from multi-cluster, multi-cloud Kubernetes to shipping our product into customer on-prem and airgapped environments. Before that, I spent four years at Falkonry running multi-region infrastructure for time-series AI: Kubernetes, Kafka, Pulsar, HDFS, high-speed datastreams and a large open-source data stack under real industrial load.

My work sits at the intersection of platform engineering and architecture: building observability that scales, release processes that let one team ship 50+ microservices to many environments, and developer platforms that replace "works on my machine" with ephemeral, reproducible environments. Today that means keeping multiple Kubernetes clusters across three clouds above 99.5% uptime.

Outside of work I organize CNCF Mumbai tech events, speak at conferences and meetups, and am a Grafana Champion; this community work is how I contribute to open source. I care about operational maturity (systems that are boring at 3 AM) and about getting things done across team boundaries.

$ status --current

  • Leading DevOps/SRE/Cloud at Simbian — AI agents for security operations
  • CNCF Mumbai organizer · Grafana Champion · conference speaker
  • Mumbai, India

  • 6+ years in production
  • 4 clouds operated
  • 35+ Kubernetes clusters
  • 100+ services on CI/CD
  • >99.5% uptime
  • 90% MTTD improvement

Skills & Technologies

The stack I run in production — not a logo wall, but tools I've operated, broken, and fixed at scale.

Cloud Platforms

  • AWS
  • Azure
  • Google Cloud
  • OCI

Kubernetes & Containers

  • Kubernetes
  • Docker
  • Helm
  • Rancher
  • Operator SDK

Infrastructure as Code

  • Terraform
  • Ansible
  • CloudFormation
  • Serverless
  • Azure ARM
  • Azure Bicep

Observability

  • Prometheus
  • Grafana
  • Thanos
  • OpenSearch
  • Fluent Bit
  • Fluentd
  • OpenTelemetry

Data & Streaming

  • Kafka
  • Pulsar
  • HDFS
  • Elasticsearch
  • OpenSearch
  • MongoDB
  • PostgreSQL
  • Redis
  • ClickHouse

Delivery & Tooling

  • GitHub Actions
  • Argo
  • NGINX
  • Go
  • Python
  • Node.js
  • Bash
  • Linux

Projects

Production systems I've architected and operated — Kubernetes platforms, Kafka pipelines, observability stacks and developer tooling.

HA Observability Platform

End-to-end observability across multi-cloud Kubernetes fleets — metrics, logs, traces and alerting built from scratch and operated in production.

Architecture

Per-cluster Prometheus in agent mode remote-writes to a central hub; Thanos adds HA, deduplication and long-term object storage; Grafana handles dashboards, alerting and cloud-provider plugins.

MTTD improved by 90%, MTTR by 50%

  • Prometheus
  • Thanos
  • Grafana
  • Kubernetes
  • Fluent Bit
  • OpenTelemetry
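
To illustrate the deduplication step in the architecture above, here is a minimal Python sketch. It assumes HA replicas emit the same series and differ only in a label named "replica" (a hypothetical label name), and it merges samples by timestamp. Thanos's real query-time deduplication is more involved, so treat this as the idea rather than the implementation.

```python
from collections import defaultdict

REPLICA_LABEL = "replica"  # assumed label name that distinguishes HA replicas


def dedup_series(series_list, replica_label=REPLICA_LABEL):
    """Merge series whose labels are identical once the replica label is dropped.

    series_list: iterable of (labels_dict, [(timestamp, value), ...]).
    Returns {frozenset_of_label_pairs: sorted [(timestamp, value), ...]}.
    """
    merged = defaultdict(dict)
    for labels, samples in series_list:
        key = frozenset((k, v) for k, v in labels.items() if k != replica_label)
        for ts, value in samples:
            # Keep the first value seen for a timestamp; replicas should agree.
            merged[key].setdefault(ts, value)
    return {key: sorted(points.items()) for key, points in merged.items()}


# Two replicas scrape the same target; each one misses a different sample.
replica_a = ({"__name__": "up", "cluster": "c1", "replica": "a"}, [(10, 1.0), (20, 1.0)])
replica_b = ({"__name__": "up", "cluster": "c1", "replica": "b"}, [(20, 1.0), (30, 1.0)])
result = dedup_series([replica_a, replica_b])
```

The two inputs collapse into a single series covering timestamps 10, 20 and 30, which is why a gap on one replica does not show up in dashboards.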

Kafka Streaming Backbone

Operated and scaled Kafka and Pulsar clusters powering time-series AI pipelines — high-throughput ingest with strict ordering and retention guarantees.

Architecture

Multi-broker clusters on Kubernetes with rack-aware replica placement, workload-isolated topics, tiered retention, and consumer-lag SLOs wired into Prometheus alerting.

  • Kafka
  • Pulsar
  • Kubernetes
  • Prometheus
  • Terraform
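
The consumer-lag SLO mentioned above comes down to simple arithmetic: lag per partition is the log end offset minus the last committed offset. The sketch below is illustrative only; the function names, offsets and the 250-message threshold are hypothetical, and in practice the numbers would come from an exporter and the check would live in a Prometheus alert rule.

```python
def consumer_lag(end_offsets, committed_offsets):
    """Per-partition lag: log end offset minus last committed offset.

    A partition with no committed offset is treated as committed at 0
    (an assumption made for this sketch).
    """
    return {
        partition: max(end - committed_offsets.get(partition, 0), 0)
        for partition, end in end_offsets.items()
    }


def breaches_slo(lag_by_partition, max_partition_lag):
    """True when any single partition lags by more than the allowed amount."""
    return max(lag_by_partition.values(), default=0) > max_partition_lag


# Hypothetical offsets for a three-partition topic.
end_offsets = {0: 1500, 1: 1480, 2: 1510}
committed = {0: 1490, 1: 1200, 2: 1510}

lag = consumer_lag(end_offsets, committed)
alert = breaches_slo(lag, max_partition_lag=250)
```

Checking the worst partition rather than the total keeps one stuck consumer from hiding behind healthy ones.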

Release Orchestration Platform

CI/CD and release automation for 50+ microservices deploying to Kubernetes clusters and cloud services across staging, testing and many production environments.

Architecture

Trigger-based release pipelines push artifacts and images to internal and customer registries; a Helm-based operator (Operator SDK) drives rollouts with pause/resume for database migration failures.

Scaled from one hand-rolled environment to many, fully automated

  • GitHub Actions
  • Helm
  • Operator SDK
  • Kubernetes
  • Docker
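
The pause/resume behaviour described above can be pictured as a small state machine: a rollout runs its steps in order, pauses on the first failure, and picks up from the failed step when resumed. This is a hedged Python sketch of that idea, not the operator's actual logic; the Rollout class, step names and the flaky migration are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Rollout:
    steps: List[Tuple[str, Callable[[], None]]]
    next_index: int = 0
    state: str = "pending"
    paused_on: Optional[str] = None

    def run(self) -> str:
        """Run remaining steps; pause at the first failing step."""
        self.state = "running"
        self.paused_on = None
        while self.next_index < len(self.steps):
            name, action = self.steps[self.next_index]
            try:
                action()
            except Exception:
                # Stop here; a later run() retries this same step.
                self.state = "paused"
                self.paused_on = name
                return self.state
            self.next_index += 1
        self.state = "done"
        return self.state


attempts = {"migrate": 0}
log = []


def migrate():
    """Hypothetical migration that fails on its first attempt only."""
    attempts["migrate"] += 1
    if attempts["migrate"] == 1:
        raise RuntimeError("migration failed")


rollout = Rollout(steps=[("migrate-db", migrate), ("deploy", lambda: log.append("deployed"))])
first = rollout.run()            # pauses on the failed migration
paused_step = rollout.paused_on
second = rollout.run()           # resume: retries the migration, then deploys
```

Because the index only advances after a step succeeds, nothing downstream of a failed migration runs until someone resumes the rollout.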

Internal Developer Platform

Feature-branch deployments and ephemeral test environments so developers ship and validate without touching shared infrastructure or local hacks.

Architecture

Automated environment lifecycle — creation, dependency wiring, database initialization, cleanup and reporting — for every application component, per branch.

Replaced local-only developer setups entirely

  • Kubernetes
  • Helm
  • GitHub Actions
  • Python
  • PostgreSQL
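
One small piece of a per-branch environment lifecycle is turning a Git branch name into a valid Kubernetes namespace name (a DNS-1123 label: lowercase alphanumerics and hyphens, at most 63 characters). The sketch below shows one way to do that; the "env" prefix, the hash suffix and the function name are assumptions for illustration, not the platform's actual naming scheme.

```python
import hashlib
import re

MAX_LABEL_LEN = 63  # upper bound for a DNS-1123 label


def namespace_for_branch(branch, prefix="env"):
    """Derive a stable, valid namespace name from a branch name."""
    # Lowercase and collapse every run of other characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    # Short hash keeps names unique even after truncation.
    digest = hashlib.sha1(branch.encode("utf-8")).hexdigest()[:8]
    # Leave room for the prefix, the digest and two joining hyphens.
    room = MAX_LABEL_LEN - len(prefix) - len(digest) - 2
    slug = slug[:room].strip("-")
    return f"{prefix}-{slug}-{digest}" if slug else f"{prefix}-{digest}"


name = namespace_for_branch("feature/ABC-123_new login")
long_name = namespace_for_branch("x" * 200)
```

The hash suffix means two branches that sanitize to the same slug still get separate environments, and the same branch always maps to the same namespace, which makes cleanup straightforward.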

Security Scanning Pipeline

Automated container and OS-level vulnerability management feeding SOC2 compliance and customer-facing security reporting.

Architecture

Trivy, AWS ECR scanning and Clair (Quay) cover application images; OS package scanning with automatic fixes; version-diff analysis with automated report generation and delivery.

  • Trivy
  • AWS ECR
  • Clair
  • Python
  • GitHub Actions
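
Version-diff analysis, as mentioned above, can be reduced to set arithmetic over the findings of two scans. The following Python sketch assumes each scan has already been flattened to a list of vulnerability IDs; the function name and the placeholder IDs are hypothetical, and real scanner output carries more detail such as package and severity.

```python
def diff_findings(previous, current):
    """Compare vulnerability IDs found in two image versions."""
    prev, curr = set(previous), set(current)
    return {
        "fixed": sorted(prev - curr),        # present before, gone now
        "introduced": sorted(curr - prev),   # new in the current version
        "unchanged": sorted(prev & curr),    # still present
    }


# Placeholder IDs, not real CVEs.
previous_scan = ["CVE-0000-0001", "CVE-0000-0002"]
current_scan = ["CVE-0000-0002", "CVE-0000-0003"]

report = diff_findings(previous_scan, current_scan)
```

A report built this way answers the two questions a customer usually asks about a release: what got fixed, and what is new.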

Fleet Automation with Ansible

Configuration management and drift-free operations for cloud instances and the monitoring stack itself.

Architecture

Playbooks automate EC2 patching, Grafana dashboard and alert sync, and OpenSearch monitor provisioning — observability config managed as code.

  • Ansible
  • AWS EC2
  • Grafana
  • OpenSearch

Talks & Conferences

I speak about FOSS, Kubernetes, cloud, observability, platform engineering, and really any piece of software that interests me (Kafka and OpenSearch among others), always grounded in production war stories, not slideware.

conference · Jun 2026

Telemetry at Scale: Best Ways to Use Prometheus and Thanos with Grafana

GrafanaCON Local Pune · Pune, India

From zero monitoring to a robust multi-cluster Kubernetes observability stack — architectures from single cluster to multi-cloud, with pros, cons and use cases. Hosted at the InfraCloud office.

  • Observability
  • Prometheus
  • Grafana
  • Thanos

conference · Jun 2025

Operating OpenSearch — The Kubernetes Way

OpenSearchCon India 2025 · Bengaluru, India

Running OpenSearch effectively on Kubernetes takes a well-structured approach — how to deploy, configure and operate it. Presented at the Sheraton Grand Hotel, Bengaluru.

  • OpenSearch
  • Kubernetes
  • Observability

conference · Apr 2025

Internal Development Platform — The What, the How and the Why

DevOpsDays Atlanta 2025 · Atlanta, USA

Ignite talk on what an internal development platform is, why it matters and how to build one — delivered at the Historic Academy of Medicine and livestreamed on YouTube.

  • Platform Engineering
  • Developer Experience

Latest Writing

Notes from production — observability, streaming, Kubernetes and the occasional incident retrospective.

Get In Touch

Infrastructure questions, speaking invitations, architecture reviews, or mentoring — my inbox is open.

$ topmate --book-session

Book a 1:1 session

Career guidance, DevOps interview prep, architecture reviews, or deep dives on Kubernetes, observability and Kafka — book time with me directly on Topmate.