
Massive cost reduction, scalable real-time platform
Real-time telemetry from thousands of vehicles, point-of-sale events from hundreds of locations, scheduling data from a 30-year-old planning system, and external feeds (weather, traffic, fuel prices) all flowed into an Oracle warehouse that was struggling at 12 TB and would not scale to the 50 TB projected within three years. Oracle licensing alone was approaching seven figures annually. Real-time use cases (fleet rerouting, dynamic pricing) were technically blocked by batch-only access patterns.
Designed a lakehouse architecture on AWS: S3 for storage, Glue and EMR for batch processing, Kinesis for streaming, and Athena and Redshift Serverless for querying. Built a strict zoning model (raw → cleaned → modeled → mart) with full lineage. Migrated workloads in waves: dashboards first, then ad-hoc analytics, then real-time pipelines. Decommissioned the Oracle analytical instance after a 6-month parallel-run validation.
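
As an illustration of the zoning model, the sketch below shows one way a strict raw → cleaned → modeled → mart progression with lineage could be expressed. It is not the project's actual implementation: the names Zone, Dataset, promote and lineage, the bucket name "example-lake", and the one-prefix-per-zone S3 layout are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional, Tuple


class Zone(IntEnum):
    """The four zones, ordered so a dataset can only move forward."""
    RAW = 0
    CLEANED = 1
    MODELED = 2
    MART = 3


@dataclass(frozen=True)
class Dataset:
    name: str
    zone: Zone
    parents: Tuple["Dataset", ...] = ()  # upstream datasets, kept for lineage

    def s3_prefix(self, bucket: str = "example-lake") -> str:
        # Hypothetical layout: one top-level prefix per zone.
        return f"s3://{bucket}/{self.zone.name.lower()}/{self.name}/"


def promote(source: Dataset, name: Optional[str] = None) -> Dataset:
    """Create the next-zone dataset; zones cannot be skipped or reversed."""
    if source.zone == Zone.MART:
        raise ValueError("mart is the final zone")
    next_zone = Zone(source.zone + 1)
    return Dataset(name or source.name, next_zone, parents=(source,))


def lineage(dataset: Dataset) -> List[str]:
    """Follow the first parent back to raw; return prefixes oldest first."""
    chain = []
    node: Optional[Dataset] = dataset
    while node is not None:
        chain.append(node.s3_prefix())
        node = node.parents[0] if node.parents else None
    return list(reversed(chain))


# Usage: promote a raw telemetry feed through every zone, then trace it back.
raw = Dataset("vehicle_telemetry", Zone.RAW)
mart = promote(promote(promote(raw)))
for prefix in lineage(mart):
    print(prefix)
```

Because promote is the only way to construct a downstream dataset in this sketch, every mart table carries a chain back to its raw source, which is the property "full lineage" refers to. A production setup would more likely record this in a catalog than in in-memory objects.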