Sanius Health · Nov 2023 — Present

Healthcare cloud data platform

Unified wearable event streams and scheduled operational feeds into one governed platform for analytics and product reporting.

Streaming ingestion architecture · Lakehouse platform design · Cross-functional platform standards

Kafka · ADLS Gen2 · Delta Lake · Spark · Druid · Superset · Azure

Business problem

The team needed one trusted data platform to combine wearable streams and operational batch feeds without splitting analytics across separate systems or forcing every new source into a custom pipeline.

Platform scope

Wearable telemetry plus operational batch sources

Cadence

Real-time events and scheduled loads in one platform

Consumers

Product, operations, analytics, and downstream app services

Thinking model

  • Unify streaming and batch ingestion before optimizing downstream models.
  • Treat governance as part of the system design, not a reporting-only concern.
  • Keep operational ownership simple enough that product and analytics teams can move quickly.

Constraints

  • Source cadence varied widely, so the platform had to support real-time and scheduled delivery without creating separate operating models.
  • Application workflows and analytics models needed to stay aligned as the platform evolved.

Architecture

Ingest

Wearable + app events

Kafka ingestion

Storage

ADLS Gen2 + Delta Lake

Process

Spark transforms

Serve

Druid + Superset analytics

Ops

Azure Monitor + Log Analytics

Operational guardrails

Contract validation

Source-specific schemas were checked before curated tables were promoted.
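As an illustrative sketch of this guardrail (not the production code), a promotion step can partition incoming records into valid and rejected sets against a source-specific schema, and promote the curated table only when nothing is rejected. The schema fields and record shapes here are hypothetical examples.

```python
# Hypothetical source-specific contract for a wearable heart-rate feed.
HEART_RATE_SCHEMA = {
    "device_id": str,
    "recorded_at": str,  # ISO-8601 timestamp string
    "bpm": int,
}

def validate_against_contract(records, schema):
    """Partition records into (valid, rejected) against a field/type schema.

    Promotion to the curated layer proceeds only when `rejected` is empty.
    """
    valid, rejected = [], []
    for record in records:
        ok = all(
            field in record and isinstance(record[field], expected)
            for field, expected in schema.items()
        )
        (valid if ok else rejected).append(record)
    return valid, rejected
```

In practice the same check would run inside the ingestion job, with rejected records routed to a quarantine location instead of silently dropped.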

Freshness gating

Publishing depended on completeness and freshness signals, not just job success.
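A minimal sketch of such a publish gate, assuming hypothetical completeness and staleness thresholds: the decision combines job success with completeness and freshness signals rather than treating job success alone as sufficient.

```python
from datetime import datetime, timedelta, timezone

def can_publish(job_succeeded, rows_loaded, rows_expected, latest_event_time,
                now=None, min_completeness=0.95, max_staleness=timedelta(hours=1)):
    """Gate publishing on completeness and freshness, not just job success.

    Thresholds (95% completeness, 1-hour staleness) are illustrative defaults.
    """
    now = now or datetime.now(timezone.utc)
    complete_enough = rows_expected > 0 and rows_loaded / rows_expected >= min_completeness
    fresh_enough = (now - latest_event_time) <= max_staleness
    return job_succeeded and complete_enough and fresh_enough
```

A job that finishes green but loads only half the expected rows, or whose newest event is hours old, would be held back from publishing under this gate.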

Shared observability

Data and app teams used the same monitoring surface for triage and escalation.

Ownership boundaries

Raw, curated, and served layers were intentionally separated to prevent drift.

Flow checkpoints

  • Wearable + app events → Kafka ingestion (event streams)
  • Kafka ingestion → ADLS Gen2 + Delta Lake (raw landing)
  • ADLS Gen2 + Delta Lake → Spark transforms (curation)
  • Spark transforms → Druid + Superset analytics (serving models)
  • Spark transforms → Azure Monitor + Log Analytics (runtime signals)

Design note

Streaming and batch sources were normalized into one platform contract so teams did not need parallel reporting paths.
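One way to picture that shared contract (field names here are invented for illustration): both a streaming wearable event and a scheduled batch row are mapped into the same record shape, so downstream models read a single schema regardless of delivery cadence.

```python
def from_stream_event(event):
    """Map a hypothetical streaming wearable event into the shared contract."""
    return {"source": event["device"], "ts": event["emitted_at"],
            "metric": event["type"], "value": event["payload"]}

def from_batch_row(row):
    """Map a hypothetical scheduled operational row into the same contract."""
    return {"source": row["system_id"], "ts": row["load_date"],
            "metric": row["measure_name"], "value": row["measure_value"]}
```

Because both adapters emit identical keys, curated models and reports never need to know whether a record arrived via Kafka or a scheduled load.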

Design note

Processing boundaries stayed explicit, which kept ownership clear between raw capture, curated models, and served analytics.

Delivery

Platform work

  • Established ingestion contracts for mixed source cadence across real-time and scheduled feeds.
  • Implemented Spark transformation layers on Delta-backed storage for reusable curation.
  • Aligned the data platform with Azure App Service, Static Web Apps, Functions, and PostgreSQL-backed product systems.
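The curation step above can be sketched in plain Python (standing in for the Spark/Delta job, with hypothetical field names): deduplicate raw records and keep the latest version per key, the same latest-wins merge pattern a Delta MERGE would apply when promoting raw landings into curated tables.

```python
def curate_latest(raw_records, key="device_id", version="recorded_at"):
    """Keep the latest record per key: a latest-wins dedupe for curation.

    Illustrative logic only; the real job ran as Spark transforms over
    Delta-backed storage rather than in-memory Python.
    """
    latest = {}
    for record in raw_records:
        k = record[key]
        if k not in latest or record[version] > latest[k][version]:
            latest[k] = record
    return list(latest.values())
```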

Quality controls

  • Schema and contract validation before curated table promotion.
  • Publishing gates based on completeness and freshness checks.

Observability

  • Pipeline and service alerts routed through Azure Monitor and Log Analytics.
  • Failure triage runbooks shared across data and application teams.

Impact

Platform footprint

One shared platform contract for streaming wearable and scheduled operational data.

Delivery model

Cross-functional standards coordinated across a four-engineer team.

Decision speed

Real-time analytics became available without splitting data into parallel reporting systems.

Tradeoffs

  • Prioritized platform consistency over quick source-specific pipelines.
  • Accepted stricter ingestion contracts to reduce long-term downstream model drift.

Confidentiality note

  • Domain entities and internal naming are abstracted; the architectural patterns and delivery model are preserved.

Work with me

Need one platform for streaming and batch data?

I help teams design shared contracts, transformation layers, and operational controls before the platform fragments.

Plan the platform